bug-grep


From: Eric Blake
Subject: bug#32704: Can grep search for a line feed and a null character at the same time?
Date: Sat, 15 Sep 2018 15:27:08 -0500
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.0

On 9/15/18 12:57 PM, address@hidden wrote:

But is it at least possible to find “\x0A\x00” with grep?

If you bend the rules by throwing -P into the mix, yes :)

So it is possible to find “\x0A\x00” alone, but for example “\x74\x00\x0D\x00\x0A\x00\x74\x00\x65\x00” is impossible to find with the “-P” option?

Correct. It is impossible to find the record terminator in the middle of a pattern, whether that terminator is \n (default) or NUL (-z). It is therefore impossible to find a multi-record match using grep. The string you listed contains both \x00 and \x0a, so regardless of which of those two bytes you pick as the record terminator, it is impossible to use grep to find that substring in your file. You'll have to resort to a tool that supports multiline matching, since grep is not such a tool.
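To make the record-terminator point concrete, here is a small sketch, assuming GNU grep. The temporary file and the variable names are made up for illustration. The same pattern is tried once with the default newline terminator and once with -z, where NUL is the terminator and a newline is ordinary data inside the record:

```shell
# Hypothetical sample data: "foo" and "bar" on two separate lines, no NUL bytes.
sample=$(mktemp)
printf 'foo\nbar\n' > "$sample"

# Default mode: records end at \n, so no single record holds both words.
if grep -q 'foo[[:space:]]\+bar' "$sample"; then
  default_mode=match
else
  default_mode=no-match
fi

# -z mode: records end at NUL, so the whole file is one record here
# and the newline between the words can be matched by [[:space:]].
if grep -z -q 'foo[[:space:]]\+bar' "$sample"; then
  null_mode=match
else
  null_mode=no-match
fi

echo "default: $default_mode, -z: $null_mode"
rm -f "$sample"
```

This only works because the sample contains no NUL bytes. Once the data has both \n and NUL in the span you want, neither choice of terminator keeps that span inside one record.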

It IS possible, of course, to change your data, for example:

tr '\0' '\377' < file | grep "$modified_pattern" | tr '\377' '\0'

assuming that the byte \xff (\377 in the octal notation that tr accepts) doesn't appear anywhere else in the file; although it may make matching harder if you no longer have the right record terminators. Or, if your input data is encoded in UTF-16, it's easiest to convert it into UTF-8 for the grep:

iconv -f UTF-16 -t UTF-8 < file | grep "$modified_pattern" \
  | iconv -f UTF-8 -t UTF-16
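As a rough illustration of the conversion approach, the sketch below builds a small UTF-16 sample and searches it before and after conversion. It assumes GNU grep and an iconv that knows the name UTF-16LE, and it assumes the data is little-endian without a byte-order mark, which is what the byte sequence quoted above suggests. The sample text and variable names are invented for the example:

```shell
# Hypothetical sample: "first", CR LF, "test", CR LF, stored as UTF-16LE.
sample=$(mktemp)
printf 'first\r\ntest\r\n' | iconv -f UTF-8 -t UTF-16LE > "$sample"

# Raw search: every other byte is NUL, so the plain text "test" never
# appears contiguously. -a forces grep to treat the binary data as text;
# "|| true" absorbs the exit status 1 that grep returns when nothing matches.
raw_count=$(LC_ALL=C grep -a -c 'test' "$sample" || true)

# After conversion to UTF-8 the NUL bytes are gone, so -z makes the whole
# stream one record and the CR LF between "t" and "te" can be matched.
converted_count=$(iconv -f UTF-16LE -t UTF-8 < "$sample" \
  | grep -z -c 't[[:space:]]\+te' || true)

echo "raw: $raw_count, converted: $converted_count"
rm -f "$sample"
```

The -z in the second search is only there because the example pattern crosses a line break; for a pattern that stays within one line, plain grep on the converted stream is enough.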

--
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org




