How s3fs caching works

…or how it took one day just to add one line of code

s3fs.S3FileSystem.cachable = False

Adding caching under the hood and not mentioning it in the documentation – that is a dirty trick.

My case was a Lambda processing S3 files. When a file arrives on S3, a Lambda processes it and triggers the next Lambda. The next Lambda works fine only the first time.

The first Lambda uses only boto3 and has no problem.

The second Lambda uses s3fs.

The second invocation of the Lambda reuses the already initialized context, so s3fs thinks it knows what objects are on S3 – but it is wrong!

So… I found this issue – thank you, jalpes196!

Another way is to invalidate the cache…

from s3fs.core import S3FileSystem

s3 = S3FileSystem(anon=False)
s3.invalidate_cache()  # drop the cached directory listings
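The behaviour is easier to see with a toy model. Below is a minimal, stdlib-only sketch (not the real s3fs code) of the instance caching that fsspec/s3fs performs: constructing a filesystem with the same arguments returns the same object, so cached directory listings survive into the next warm Lambda invocation unless the `cachable` flag is flipped to `False`.

```python
class CachedFS:
    """Toy model of fsspec/s3fs instance caching."""

    cachable = True   # the flag the one-liner at the top flips to False
    _instances = {}   # class-level instance cache, keyed by ctor args

    def __new__(cls, **kwargs):
        key = tuple(sorted(kwargs.items()))
        if cls.cachable and key in cls._instances:
            # Warm container: same arguments -> same (stale) instance.
            return cls._instances[key]
        obj = super().__new__(cls)
        obj.dircache = {}  # cached directory listings live here
        if cls.cachable:
            cls._instances[key] = obj
        return obj


# First invocation: the instance caches a listing.
fs1 = CachedFS(anon=False)
fs1.dircache["bucket/"] = ["old.csv"]

# Second invocation in the same (warm) context: same object, stale cache.
fs2 = CachedFS(anon=False)
assert fs2 is fs1
assert fs2.dircache["bucket/"] == ["old.csv"]

# With cachable = False, every invocation starts with a clean instance.
CachedFS.cachable = False
fs3 = CachedFS(anon=False)
assert fs3 is not fs1
assert fs3.dircache == {}
```

Because `cachable` is a class attribute, setting it once at module import time in the Lambda affects every instance constructed afterwards.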

Daily systemd commands

Continue reading

Why one should use Firefox in 2020

I had switched from Google Chrome to Chromium for security and privacy reasons. Now I am switching from Chromium to Firefox because of several issues.

Chromium stopped shipping deb packages and started using Snap. Snap runs in a cgroup (probably) and hides very important folders from the OS:

  • /tmp
  • ~/.ssh

My access to a payment website was rejected because its certificates are stored in ~/.ssh.

System tmp
When I download junk files or attachments I store them in /tmp, and on the next system reboot /tmp is cleaned. Since I can’t access /tmp from Chromium, I started using ~/tmp/ and now have tons of useless files.

When I switched to Firefox I noticed that this browser is much faster than Chrome.

After migrating to Snap, Chromium does not work correctly with D-Bus.

Firefox is faster

The one downside so far: no easy way to add a custom search engine.

Sort AWS S3 keys by size

A naive script that sorts the keys in an S3 folder by size. It will not work if the keys contain spaces.

Here is a usage example

aws s3 ls BUCKETNAME/signals/wifi/  |  ~/bin/aws-s3-sort.rb
#!/usr/bin/env ruby

# Read the output of `aws s3 ls` from stdin.
content = STDIN.read

lines = content.split("\n")

# Columns of `aws s3 ls`: date, time, size, key.
key_size = {}
lines.each do |line|
  cells = line.split(' ')
  key_size[cells[2]] = cells[3] # size => key (duplicate sizes overwrite)
end

# Sort numerically by size.
sorted = key_size.sort_by { |k, v| k.to_i }.to_h

sorted.each do |key, value|
  puts "#{key} -> #{value}"
end

Extract a huge number of files from AWS s3 glacier

…or how to run multiple commands in parallel

Continue reading

Dynamodb Table Export and local Import

Continue reading

Pull remote files from sftp server

These days we use the cloud for almost everything, but sometimes we need to pull files from an SFTP server. Here are two solutions for that.

Pull and remove with sftp

This solution pulls the files and then removes them from the remote. There is a gotcha: if you expect a lot of files, one may arrive while the “get -r …” command is executing, and the “rm *” will then remove it without it ever being downloaded. So this is suitable only if you expect a few files a day or week.

Create a batch file with the following commands:

get -r upload/* incoming/
rm upload/*

Then add cron

0 5 * * * /usr/bin/sftp -b [email protected]

Only pulling with lftp

When I don’t have permission to remove the files from the remote SFTP server, I use the following off-the-shelf approach.

This cron job synchronizes all files to /home/USERNAME/incoming:

0 5 * * *  /usr/bin/lftp -u USERNAME,none -e 'mirror --newer-than="now-7days" --only-newer --exclude .ssh --only-missing / /home/USERNAME/incoming; quit' s

deploy pg gem with postgres 10

When the Postgres in your distribution is stuck at version 10 and you have to upgrade to postgres-11, a good way to do a Capistrano deploy is the following.

Do the system install with:

yum install postgresql10-contrib postgresql10-devel

Then, in your /shared/.bundle/config, add a line pointing to the location of the pg libraries:

BUNDLE_PATH: "/opt/application/shared/bundle"
BUNDLE_BUILD__PG: "--with-pg-config=/usr/pgsql-10/bin/pg_config"
BUNDLE_WITHOUT: "development:test"

Thanks to my colleague Kris for finding the solution.

Organizing terraform modules in application stacks for free

Continue reading

Switch configuration lines using comments

Continue reading

© 2021 Gudasoft
