bug-datamash

Re: "Segmentation fault" when input contains embedded NUL characters


From: Shawn Wagner
Subject: Re: "Segmentation fault" when input contains embedded NUL characters
Date: Tue, 10 Nov 2020 06:16:48 -0800

Assaf hasn't replied to anything I've sent to the mailing list since April. I've been thinking about making another effort to reach him and, if I can't get a response, talking to the GNU folks about what's involved in taking over maintainership.

On Tue, Nov 10, 2020 at 5:28 AM Catalin Patulea <cat@vv.carleton.ca> wrote:
Hello,

$ datamash --version
datamash (GNU datamash) 1.6

$ dd if=/dev/zero bs=100 count=1 | datamash countunique 1
1+0 records in
1+0 records out
100 bytes copied, 0.000125612 s, 796 kB/s
Segmentation fault

backtrace:

(gdb) bt
#0  0x000055555555c95c in field_op_get_string_ptrs (op=0x55555557a5f0, sort_case_sensitive=sort_case_sensitive@entry=true, sort=true)
    at src/field-ops.c:278
#1  0x000055555555d194 in count_unique_values (op=<optimized out>, case_sensitive=true) at src/field-ops.c:640
#2  0x000055555555d5ec in field_op_summarize (op=0x55555557a5f0) at src/field-ops.c:963
#3  0x000055555555f5cb in summarize_field_ops () at src/datamash.c:539
#4  0x000055555555f88a in process_group (line=0x7fffffffe340) at src/datamash.c:589
#5  0x000055555555fab7 in process_file () at src/datamash.c:651
#6  0x000055555555786b in main (argc=<optimized out>, argv=0x7fffffffe5c8) at src/datamash.c:1291
(gdb) fra 1
#1  0x000055555555d194 in count_unique_values (op=<optimized out>, case_sensitive=true) at src/field-ops.c:640
640 in src/field-ops.c
(gdb) fra 0
#0  0x000055555555c95c in field_op_get_string_ptrs (op=0x55555557a5f0, sort_case_sensitive=sort_case_sensitive@entry=true, sort=true)
    at src/field-ops.c:278
278 in src/field-ops.c

Simply put, field_op_get_string_ptrs (and probably datamash in general) assumes that input will not contain embedded NULs:
https://github.com/agordon/datamash/blob/v1.6/src/field-ops.c#L279
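To illustrate how this kind of assumption can break, here is a minimal sketch. It is not taken from the datamash source. It assumes a design in which string values are stored back to back in one buffer, each followed by a NUL terminator, while the number of stored values is tracked separately. The function name count_packed_strings is hypothetical. Under that assumption, a single field made of 100 NUL bytes looks like 101 strings to any code that splits the buffer on NULs, so an array sized for one value would be overrun.

```c
#include <stddef.h>

/* Hypothetical illustration, not datamash code.
   Count the strings that a "split on NUL" walk would find in
   BUF[0..LEN), i.e. the number of NUL terminators it contains. */
size_t
count_packed_strings (const char *buf, size_t len)
{
  size_t n = 0;
  for (size_t i = 0; i < len; i++)
    if (buf[i] == '\0')
      n++;
  return n;
}
```

If the separately tracked value count and this terminator count disagree, the input contained embedded NULs somewhere.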

For my application, the embedded NULs are an accident, and I can resolve that and resume using datamash. datamash does not need to support inputs with embedded NULs. But it should not crash on such inputs, either. Perhaps output a message warning the user that such inputs are not supported.
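A check along these lines might be enough to reject such input up front. This is only a sketch: line_has_embedded_nul is a hypothetical helper, not an existing datamash function, and it assumes the caller knows the true byte length of each line (for example, the value returned by getline) rather than relying on strlen.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of datamash.
   Return true if any of the first LEN bytes of LINE is a NUL byte.
   LEN must be the real byte count of the line, not strlen (LINE),
   since strlen stops at the first NUL. */
bool
line_has_embedded_nul (const char *line, size_t len)
{
  return memchr (line, '\0', len) != NULL;
}
```

A caller could run this on each line as it is read and exit with an "embedded NUL characters are not supported" message instead of continuing.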

I was considering writing a patch. Is the GitHub repository actively watched for pull requests? I noticed many patches on the mailing list still awaiting review.

Thanks for a great tool!

Catalin
