bug-bash
From: Chet Ramey
Subject: Re: using mapfile is extremely slow compared to old-fashioned ways to read files
Date: Thu, 26 Mar 2009 16:52:22 -0400
User-agent: Thunderbird 2.0.0.21 (Macintosh/20090302)

Lennart Schultz wrote:

> Bash Version: 4.0
> Patch Level: 10
> Release Status: release
> 
> Description:
> 
> I have a bash script which reads about 250000 lines of xml code generating
> about 850 files with information extracted from the xml file.
> It uses the construct:
> 
> while read line
> do
>    case "$line" in
>    ....
> done < file
> 
> and this takes a little less than 2 minutes
> 
> Trying to use mapfile I changed the above construct to:
> 
> mapfile  < file
> for i in "${MAPFILE[@]}"
> do
>    line=$(echo $i) # strip leading blanks
>    case "$line" in
>    ....
> done
> 
> With this change the job now takes more than 48 minutes. :(

The most important thing is using the right tool for the job.  If you
have to introduce a command substitution for each line read with mapfile,
you probably don't have the problem mapfile is intended to solve:
quickly reading exact copies of lines from a file descriptor into an
array.
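A minimal sketch of that intended use (the input file and its contents here are hypothetical): read the whole file into an array with one builtin call, then iterate the array with no subshells and no per-line reads.

```shell
#!/usr/bin/env bash
# Hypothetical input file for illustration.
printf 'alpha\n  beta\ngamma\n' > lines.txt

# mapfile reads the entire file into an array in a single builtin call;
# -t strips the trailing newline from each element.
mapfile -t lines < lines.txt

echo "read ${#lines[@]} lines"

# Iterating the array forks no processes and preserves each line exactly,
# including leading whitespace.
for line in "${lines[@]}"; do
    printf '[%s]\n' "$line"
done

rm -f lines.txt
```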

If another approach works better, you should use it.

If you're interested in why the mapfile solution is slower, you could
run the loop using a version of bash built for profiling and check
where the time goes.  I believe you'd find that the command substitution
is responsible for much of it, and the rest is due to the significant
increase in memory usage resulting from the 250000-line array (which
also slows down fork and process creation).
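For comparison, the leading-blank stripping that motivated the per-line command substitution can be done with parameter expansion, which runs entirely inside the shell and forks nothing. A sketch (not the original script):

```shell
#!/usr/bin/env bash
line='   indented text'

# "${line%%[![:space:]]*}" expands to the leading whitespace itself
# (everything before the first non-space character); removing that
# prefix from $line trims the line without spawning a subshell.
trimmed=${line#"${line%%[![:space:]]*}"}

printf '[%s]\n' "$trimmed"    # prints: [indented text]
```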

Chet
-- 
``The lyf so short, the craft so long to lerne.'' - Chaucer

Chet Ramey, ITS, CWRU    address@hidden    http://cnswww.cns.cwru.edu/~chet/



