From: Carl Edquist
Subject: Re: RFE: enable buffering on null-terminated data
Date: Sun, 10 Mar 2024 15:36:32 -0500 (CDT)
Hi Zack,

This sounds like a potentially useful feature (it'd probably belong with a corresponding new buffer mode in setbuf(3)) ...
> Filenames should be passed between utilities in a null-terminated fashion, because the null byte is the only byte that can't appear within one.
Out of curiosity, do you have an example command line for your use case?
> If I want to buffer output data on null bytes, the closest I can get is 'stdbuf --output=0', which doesn't buffer at all. This is pretty inefficient.
I'm just thinking that find(1), for instance, will end up calling write(2) exactly once per filename (-print or -print0) if run under stdbuf unbuffered, which is the same as you'd get with a corresponding stdbuf line-buffered mode (newline or null-terminated).
It seems that where line buffering improves performance over unbuffered is when there are several calls to (for example) printf(3) in constructing a single line. find(1), and some filters like grep(1), will write a line at a time in unbuffered mode, and thus don't seem to benefit at all from line buffering. On the other hand, cut(1) appears to putchar(3) a byte at a time, which in unbuffered mode will (like you say) be pretty inefficient.
So, depending on your use case, a new null-terminated line buffered option may or may not actually improve efficiency over unbuffered mode.
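To make the comparison concrete, here is a toy model in Python (not glibc's actual stdio code). The class `DelimiterBufferedWriter` and the helper `count_writes` are hypothetical names made up for this sketch. It counts how many underlying writes would happen when a producer emits whole records at a time (as find(1) seems to) versus a byte at a time (as cut(1) seems to), with and without flushing on a delimiter.

```python
class DelimiterBufferedWriter:
    """Toy model of a stdio-like output buffer; an assumption-laden sketch,
    not real libc behaviour."""

    def __init__(self, delimiter=None, capacity=4096):
        # delimiter=None models unbuffered mode: every call goes straight out.
        self.delimiter = delimiter
        self.capacity = capacity
        self.pending = bytearray()
        self.write_calls = 0      # stand-in for the number of write(2) calls
        self.output = bytearray()

    def _emit(self, data):
        if data:
            self.write_calls += 1
            self.output += data

    def write(self, data):
        if self.delimiter is None:
            self._emit(data)
            return
        self.pending += data
        # Flush through the last delimiter seen, or everything once the
        # buffer is full and still holds no delimiter.
        cut = self.pending.rfind(self.delimiter) + 1
        if cut == 0 and len(self.pending) >= self.capacity:
            cut = len(self.pending)
        if cut:
            self._emit(bytes(self.pending[:cut]))
            del self.pending[:cut]

    def close(self):
        # Flush whatever is left, as a stream does when it is closed.
        self._emit(bytes(self.pending))
        self.pending.clear()


def count_writes(chunks, delimiter):
    """Feed each chunk as one producer call; return the underlying write count."""
    writer = DelimiterBufferedWriter(delimiter)
    for chunk in chunks:
        writer.write(chunk)
    writer.close()
    return writer.write_calls


names = [b"a.txt", b"dir/b c.txt", b"new\nline.txt"]
# find-like producer: one call per complete null-terminated record.
whole_records = [name + b"\0" for name in names]
# cut-like producer: one call per byte.
single_bytes = [bytes([b]) for record in whole_records for b in record]

print("whole records:", count_writes(whole_records, None),
      "unbuffered vs", count_writes(whole_records, b"\0"), "null-buffered")
print("single bytes: ", count_writes(single_bytes, None),
      "unbuffered vs", count_writes(single_bytes, b"\0"), "null-buffered")
```

In this model the whole-record producer costs three writes either way, while the byte-at-a-time producer drops from 31 writes to three once output is flushed on the null byte.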
You can run your commands under strace like

    stdbuf --output=X strace -c -ewrite command ... | ...

to count the number of actual writes for each buffering mode.

Carl

PS, "find -printf" recognizes a '\c' escape to flush the output, in case that helps. So "find -printf '%p\0\c'" would, for instance, already behave the same as "stdbuf --output=N find -print0" with the new stdbuf output mode you're suggesting.
(Though again, this doesn't actually seem to be any more efficient than running "stdbuf --output=0 find -print0")
On Sun, 10 Mar 2024, Zachary Santer wrote:
> Was "stdbuf feature request - line buffering but for null-terminated data"
>
> See below.
>
> On Sun, Mar 10, 2024 at 5:38 AM Pádraig Brady <P@draigbrady.com> wrote:
>> On 09/03/2024 16:30, Zachary Santer wrote:
>>> 'stdbuf --output=L' will line-buffer the command's output stream. Pretty useful, but that's looking for newlines. Filenames should be passed between utilities in a null-terminated fashion, because the null byte is the only byte that can't appear within one. If I want to buffer output data on null bytes, the closest I can get is 'stdbuf --output=0', which doesn't buffer at all. This is pretty inefficient. 0 means unbuffered, and Z is already taken for, I guess, zebibytes. --output=N, then? Would this require a change to libc implementations, or is it possible now?
>>
>> This does seem like useful functionality, but it would require support for libc implementations first.
>>
>> cheers,
>> Pádraig