Help with input line replacement string -
From: Craig Carl
Subject: Help with input line replacement string -
Date: Sun, 1 Apr 2012 12:34:41 -0700
All -
I need a little help munging a string that I need to pass through parallel.
I'm using parallel to distribute some S3 download tasks using
s3cmd. s3cmd takes a couple of arguments -
s3cmd get <object to get> <path to put object>
<object to get> is easy; I'm having a hard time with <path to put object>.
I get a list of objects to pipe to parallel like this -
#s3cmd ls --recursive s3://datasets.elasticmapreduce/ngrams/books/ |
awk '{ print $4}'
s3://datasets.elasticmapreduce/ngrams/books/20090715/chi-sim-all/1gram/data
s3://datasets.elasticmapreduce/ngrams/books/20090715/chi-sim-all/2gram/data
s3://datasets.elasticmapreduce/ngrams/books/20090715/chi-sim-all/3gram/data
s3://datasets.elasticmapreduce/ngrams/books/20090715/chi-sim-all/4gram/data
I pipe that to parallel like this -
#s3cmd ls --recursive s3://datasets.elasticmapreduce/ngrams/books/ |
awk '{print $4}' | parallel -j0 --sshloginfile hosts /usr/bin/s3cmd
--no-progress get {} <path to put object>
Using the above example, I need <path to put object> to be -
./ngrams/books/20090715/chi-sim-all/1gram/data
./ngrams/books/20090715/chi-sim-all/2gram/data
./ngrams/books/20090715/chi-sim-all/3gram/data
./ngrams/books/20090715/chi-sim-all/4gram/data
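The mapping from object URL to local path can be sketched in plain shell, outside of parallel, roughly like this. The function name to_local_path is hypothetical, and the sketch uses POSIX prefix removal (the # form) rather than a pattern substitution; for these inputs the result should be the same.

```shell
# Hypothetical helper: map an S3 object URL to a local relative path.
to_local_path() {
    # Strip the bucket prefix from the URL, then prepend "." so the
    # remaining key becomes a path relative to the current directory.
    printf '.%s\n' "${1#s3://datasets.elasticmapreduce}"
}

# Example call with one of the object URLs listed above.
to_local_path s3://datasets.elasticmapreduce/ngrams/books/20090715/chi-sim-all/1gram/data
```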
An easy bit of bash will build the string,
${<object to get>/s3\:\/\/datasets.elasticmapreduce/.}
but I can't figure out how to get that working with parallel. I've tried -
#s3cmd ls --recursive s3://datasets.elasticmapreduce/ngrams/books/ |
awk '{print $4}' | parallel -j0 --sshloginfile hosts /usr/bin/s3cmd
--no-progress get {} ${{}/s3\:\/\/datasets.elasticmapreduce/.}
#s3cmd ls --recursive s3://datasets.elasticmapreduce/ngrams/books/ |
awk '{print $4}' | parallel -j0 --sshloginfile hosts /usr/bin/s3cmd
--no-progress get {} "${"{}"/s3\:\/\/datasets.elasticmapreduce/.}"
#s3cmd ls --recursive s3://datasets.elasticmapreduce/ngrams/books/ |
awk '{print $4}' | parallel -j0 --sshloginfile hosts /usr/bin/s3cmd
--no-progress get {} '${'{}'/s3\:\/\/datasets.elasticmapreduce/.}'
Plus a couple of others. I get a "bad substitution" error no matter
what I try. I'm wondering if there is a way I could build the path as
part of the 's3cmd ls' command and then use {n} to get the path, but
I'm open to any suggestions.
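One way the {n} idea might look, as an untested sketch: have awk emit the object URL and the local path as two space-separated columns, then let parallel split each line with --colsep and refer to the columns as {1} and {2}. The helper name add_local_path is hypothetical, and the sketch assumes that no object key contains a space.

```shell
# Hypothetical helper: read S3 URLs on stdin and print "<url> <local path>",
# where the local path is the URL with the bucket prefix replaced by ".".
add_local_path() {
    awk '{
        path = $0
        sub(/^s3:\/\/datasets\.elasticmapreduce/, ".", path)
        print $0, path
    }'
}

# Assumed usage (left commented out, since it needs s3cmd, parallel, and
# the hosts file). --colsep splits each input line into {1} and {2}.
#
# s3cmd ls --recursive s3://datasets.elasticmapreduce/ngrams/books/ |
#   awk '{print $4}' | add_local_path |
#   parallel -j0 --colsep ' ' --sshloginfile hosts \
#     /usr/bin/s3cmd --no-progress get {1} {2}
```

Because the path is computed before parallel sees the line, no shell expansion needs to survive quoting on the way to the remote hosts.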
Thanks,
Craig