…or how to run multiple commands in parallel

You can first try s3cmd, and if it doesn't work, go for the more advanced solution below, which scales to millions of files.

s3cmd restore \
    --recursive s3://bucket.raw.rifiniti.com
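Note that a restore request only queues the retrieval; the object becomes readable later (typically a few hours for the Standard tier). You can poll a single object with `aws s3api head-object`, whose `Restore` field shows `ongoing-request="true"` while Glacier is still working. The bucket and key below are placeholders, and the snippet only prints the command it would run:

```shell
# Placeholders -- substitute your own bucket and key.
bucket="bucket.raw.rifiniti.com"
key="some/object.gz"

# head-object reports a Restore field such as:
#   ongoing-request="true"                       (still restoring)
#   ongoing-request="false", expiry-date="..."   (restored)
echo "aws s3api head-object --bucket $bucket --key $key --query Restore"
```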

To bulk-request files to be restored from Glacier I use this script. I hope it will be useful to you as well.

#!/bin/bash
# Restore S3 objects from Glacier, optionally filtered by prefix.
# The prefix is optional!
# How to use:
#  ./export-prefix.sh bucketName 30 2019-04-30
#  ./export-prefix.sh bucketName 30
export bucket=$1

# How many days to keep the restored objects
export day=$2
export prefix=$3

# aws2 is the AWS CLI v2 preview binary; use "aws" if that is what you have
if [ -z "$prefix" ]; then
  cmd="aws2 s3api list-objects --bucket $bucket"
else
  cmd="aws2 s3api list-objects --bucket $bucket --prefix $prefix"
fi

# No "jq -r": keys stay quoted, so keys containing spaces survive
# in the generated commands.
readarray -t KEYS < <($cmd | jq '.Contents[] | select( .StorageClass != "STANDARD" ) | ."Key"')

# Start from a clean file so re-runs do not append duplicates
: > /tmp/commands.sh
for key in "${KEYS[@]}"; do
  echo "aws s3api restore-object --bucket $bucket --key ${key} --restore-request '{\"Days\":$day,\"GlacierJobParameters\":{\"Tier\":\"Standard\"}}'" >> /tmp/commands.sh
done

echo "Generated file /tmp/commands.sh"

echo "Splitting the huge file into small files: /tmp/sub-commands*"
split -l 1000 /tmp/commands.sh /tmp/sub-commands.sh.
chmod a+x /tmp/sub-commands*

The script generates /tmp/commands.sh with all the commands you need to run.
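The jq filter in the script is what limits the restore requests to archived objects. A quick sanity check with a made-up list-objects payload (sample data, not from a real bucket):

```shell
# Sample list-objects output (made up) with one GLACIER and one
# STANDARD object; the filter should keep only the GLACIER key.
cat > /tmp/sample.json <<'EOF'
{"Contents":[
  {"Key":"keep me.gz","StorageClass":"GLACIER"},
  {"Key":"skip.gz","StorageClass":"STANDARD"}
]}
EOF

# Same filter as in the script: the key stays quoted, so spaces are safe
jq '.Contents[] | select( .StorageClass != "STANDARD" ) | ."Key"' /tmp/sample.json
# → "keep me.gz"
```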

When you have a lot of files it may not be possible to run the bash script in one go, because it can get killed at some point. To avoid this, we split /tmp/commands.sh into parts; that is what the last part of the shell script does.
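To see how split names the chunks (it appends suffixes aa, ab, ac, ... to the given prefix), here is a tiny self-contained demo using throwaway files:

```shell
# Make a 25-line dummy commands file and split it into 10-line chunks.
seq 25 | sed 's/^/echo line /' > /tmp/demo-commands.sh
split -l 10 /tmp/demo-commands.sh /tmp/demo-sub-commands.sh.

# Three chunks: ...aa and ...ab hold 10 lines each, ...ac the last 5.
ls /tmp/demo-sub-commands.sh.*
wc -l /tmp/demo-sub-commands.sh.*
```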

Now use this snippet to run the commands file by file.

for x in /tmp/sub-commands*; do
  echo "working on $x"
  bash "$x"
done

Or, if you have GNU parallel installed, you can run them much faster with:

for x in /tmp/sub-commands*; do
  echo "working on $x"
  parallel -j 10 < "$x"
done
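If GNU parallel is not installed, `xargs -P` gives a similar effect and ships with findutils on most systems (the `-P` flag is widespread, though not strictly POSIX). The demo below creates two tiny stand-in chunk files rather than touching /tmp/sub-commands*:

```shell
# Demo stand-ins for the split chunk files.
printf 'echo A\n' > /tmp/demo-chunk.aa
printf 'echo B\n' > /tmp/demo-chunk.ab

# Run up to 10 chunk files concurrently, one bash process per file.
ls /tmp/demo-chunk.* | xargs -P 10 -n 1 bash
```

For the real files the equivalent would be `ls /tmp/sub-commands* | xargs -P 10 -n 1 bash`; output order is not guaranteed, since chunks run concurrently.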

Update: Make the script work with keys containing spaces

Update2: Make it work with a lot of files and add parallel example