
Re: improve performance of a script

From: Eduardo A. Bustamante López
Subject: Re: improve performance of a script
Date: Wed, 26 Mar 2014 06:47:12 -0700
User-agent: Mutt/1.5.21 (2010-09-15)

(I forgot to CC the list in my first reply)

On Tue, Mar 25, 2014 at 07:12:16AM -0700, xeon Mailinglist wrote:
> For each file inside the directory $output, I do a cat to the file and 
> generate a sha256 hash. This script takes 9 minutes to read 105 files, with 
> the total data of 556MB and generate the digests. Is there a way to make this 
> script faster? Maybe generate digests in parallel?
> for path in $output
> do
>     # sha256sum
>     digests[$count]=$( $HADOOP_HOME/bin/hdfs dfs -cat "$path" | sha256sum | awk '{ print $1 }')
>     (( count ++ ))
> done
> Thanks,
You were already told in #bash on Freenode that this is not a bash
issue, and yet you report it as a bug.

Once bash runs the commands, it has no relation at all with their
performance.

Rather, ask the Hadoop people, and perhaps your operating system's
support channels, what you can do to optimize that. Maybe it cannot
be optimized at all; it depends on what the bottleneck is (disk,
network, etc.).
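That said, if profiling shows the hashing itself (rather than HDFS I/O)
is the bottleneck, the "generate digests in parallel" idea from the
original question can be tried from the shell with xargs -P. This is
only a sketch under assumptions: it hashes local temporary files rather
than the hdfs -cat pipeline, it assumes paths contain no whitespace,
and the parallelism degree of 4 is an arbitrary pick:

```shell
# Sketch: compute sha256 digests of several files in parallel.
# -n 1 runs one sha256sum per file; -P 4 runs up to 4 at once.
# Hypothetical sample files stand in for the real HDFS paths.
dir=$(mktemp -d)
printf 'aaa' > "$dir/f1"
printf 'bbb' > "$dir/f2"

# Collect just the hash column, as the original script's awk did.
digests=$(printf '%s\n' "$dir"/* | xargs -n 1 -P 4 sha256sum | awk '{ print $1 }')
count=$(printf '%s\n' "$digests" | wc -l)

echo "$count"          # number of digests produced
rm -rf "$dir"
```

Whether this helps at all depends on the bottleneck: parallel hashing
only pays off when CPU is the limit, and parallel hdfs -cat calls may
even make a disk- or network-bound workload slower.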

Eduardo Alan Bustamante López
