sed-devel

Filtering out carriage returns from terminal progress indicators


From: R. Diez
Subject: Filtering out carriage returns from terminal progress indicators
Date: Sun, 17 Feb 2019 11:15:04 +0100

Hi there:

I actually know next to nothing about sed. But I still would like to benefit 
from it. How cheeky of me. 8-)

I have written a convenient script that runs a command and tees all output to
a log file. It is here:

https://github.com/rdiez/Tools/tree/master/Background

I want the following feature for this script:

"The log file should optimise away the carriage return trick often used to update a 
progress indicator in place on the current console line."

For detailed information about what I would like to filter out, please read the
long comment above this line in my script:

declare -r FILTER_WITH_COL=false

I have tried implementing this filtering by piping all text output to the 
following command:

col -b -p -x

That is the standard solution, and it works well for short operations. However, I just tried copying a large number of files with rsync, and that generates loads of progress updates on a single text line. And I mean loads of them.

It turns out that 'col' reads whole lines first (until a line feed character), and filters out the carriage returns afterwards. This means that it can use a lot of memory if the lines become very long while the progress indicator is running. And I have just hit this case with my rsync progress indicator:

col: Cannot allocate memory

All other similar solutions I have seen suffer from this problem. For example, the following drops all fields except the last one. By defining the field separator as \r, we achieve a similar filtering effect. Not only does it not handle backspace characters (\b) as 'col' does, but it also still consumes too much memory if lines get too long:

awk -F'\r' '{print $NF}'

I have seen that "sed" processes data as a stream. That is exactly what
I need.

I need to read until the first \r or \n, whichever comes first. I think that it does not usually matter if this chunk gets very long. If a \n came in, then I want to output the chunk. If a \r came in, I want to discard it.
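
To make the rule above concrete, here is a minimal sketch of it in Python rather than sed, purely as an illustration. The function name filter_progress and the block_size parameter are made up for this example. It only ever holds the text seen since the last \r or \n, so memory use is bounded by the longest single progress update and not by the whole line.

```python
import io


def filter_progress(src, dst, block_size=65536):
    """Copy src to dst, dropping any text that a later \\r would restart.

    src and dst are binary streams. Only the bytes seen since the last
    \\r or \\n are kept in memory.
    """
    chunk = bytearray()  # bytes since the last \r or \n
    while True:
        # read1() returns whatever is available (up to block_size bytes)
        # instead of waiting for a full block, so output is not delayed.
        block = src.read1(block_size)
        if not block:
            break  # end of input
        for byte in block:
            if byte == 0x0A:    # \n: the chunk is final, write it out
                chunk.append(byte)
                dst.write(bytes(chunk))
                chunk.clear()
            elif byte == 0x0D:  # \r: the chunk was a progress update, drop it
                chunk.clear()
            else:
                chunk.append(byte)
    if chunk:
        dst.write(bytes(chunk))  # trailing text without a final \n


# Small demonstration with an in-memory stream.
demo_in = io.BytesIO(b" 10%\r 50%\r100%\ndone\n")
demo_out = io.BytesIO()
filter_progress(demo_in, demo_out)
print(demo_out.getvalue())  # b'100%\ndone\n'
```

As a real filter it would presumably be called as filter_progress(sys.stdin.buffer, sys.stdout.buffer). Note that, taken literally, the rule also empties any line that ends in \r\n, because the \r discards the text before the \n arrives.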

It would not be a perfect filtering, but it could be enough. This is a 
problematic case:

echo $'123\rab\n'

The result should be:

ab3

But the filtering I am proposing would output:

ab
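
The difference between the two results can be reproduced with a short Python sketch. Both function names are hypothetical and only serve this example. render_line emulates what a terminal shows by overwriting characters in place, which requires keeping the whole line in memory, while keep_last_segment applies the simpler rule of keeping only the text after the last \r.

```python
def render_line(text):
    """Return what a terminal would display for one line containing \\r."""
    cells = []  # characters currently visible on the line
    col = 0     # cursor column
    for ch in text:
        if ch == "\r":
            col = 0  # cursor returns to the start; nothing is erased
        else:
            if col < len(cells):
                cells[col] = ch   # overwrite an existing character
            else:
                cells.append(ch)  # extend the line
            col += 1
    return "".join(cells)


def keep_last_segment(text):
    """Keep only the text after the last \\r (the simpler filtering)."""
    return text.rsplit("\r", 1)[-1]


print(render_line("123\rab"))        # ab3
print(keep_last_segment("123\rab"))  # ab
```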

Such filtering would also not deal with backspace characters (\b), which are often used for progress indication too. But it would still be better than nothing.

Can sed do that kind of filtering? Would it be too slow if the progress indication changes often? Is there a finished sed script already? I do not think I can do that myself.

Thanks in advance,
  rdiez


