bug#32704: Can grep search for a line feed and a null character at the same time?

bug-grep

From:	Eric Blake
Subject:	bug#32704: Can grep search for a line feed and a null character at the same time?
Date:	Tue, 11 Sep 2018 12:03:17 -0500
User-agent:	Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.0

On 9/11/18 11:25 AM, address@hidden wrote:

Hello,
I found someone who asked the same question on “Stack Overflow”; it is still unanswered, and this person did not ask it on the mailing list.
Here are the details of that question, which is nearly identical to my case:
https://stackoverflow.com/questions/50295772/can-grep-search-for-a-line-feed-and-a-null-character-at-the-same-time


Per 'info grep':

  15. How can I match across lines?

     Standard grep cannot do this, as it is fundamentally line-based.
     Therefore, merely using the ‘[:space:]’ character class does not
     match newlines in the way you might expect.

     With the GNU ‘grep’ option ‘-z’ (‘--null-data’), each input and
     output “line” is null-terminated; *note Other Options::.  Thus, you
     can match newlines in the input, but typically if there is a match
     the entire input is output, so this usage is often combined with
     output-suppressing options like ‘-q’, e.g.:

          printf 'foo\nbar\n' | grep -z -q 'foo[[:space:]]\+bar'

     If this does not suffice, you can transform the input before giving
     it to ‘grep’, or turn to ‘awk’, ‘sed’, ‘perl’, or many other
     utilities that are designed to operate across lines.

Grep does not have the ability to match hex or octal backslash sequences, and a literal newline in the pattern is taken as a separation of patterns. Use of [:space:] to include newline alongside other things sort of works. But maybe we really do have a bug - when -z is in effect, I'd expect NUL, rather than newline, to be the byte that separates patterns in the pattern argument (and thus expressing a literal newline, as in shells that understand $'\n', to be viable for writing a single pattern that matches exactly one newline byte at the end of a NUL-separated record).
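
The two behaviors described above can be checked with a small sketch, assuming GNU grep and a POSIX shell. The sample strings and the variable names (nl, count_split, line_mode, null_mode) are made up for illustration; the -z pattern is the one from the 'info grep' excerpt.

```shell
# A variable holding one literal newline (portable alternative to $'\n').
nl='
'

# A newline inside the pattern argument splits it into two patterns,
# so this counts lines matching "foo" OR "bar", not "foo<newline>bar".
count_split=$(printf 'foo\nbar\nbaz\n' | grep -c "foo${nl}bar")
echo "lines matched by the split pattern: $count_split"

# Without -z, grep sees "foo" and "bar" as separate lines, so a pattern
# that needs whitespace between them cannot match.
if printf 'foo\nbar\n' | grep -q 'foo[[:space:]]\+bar'; then
  line_mode=match
else
  line_mode=nomatch
fi
echo "line mode: $line_mode"

# With -z (GNU extension), records end at NUL, so the newline is ordinary
# data inside the record and [[:space:]] can match it.
if printf 'foo\nbar\n' | grep -z -q 'foo[[:space:]]\+bar'; then
  null_mode=match
else
  null_mode=nomatch
fi
echo "null-data mode: $null_mode"
```

Only the -z run should report a match, which is why [:space:] "sort of works": it matches a newline, but also any other whitespace character.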

That said, your EASIEST approach is to use iconv to recode your file out of UTF-16 (which is NOT conducive to multi-byte processing), into something friendlier like UTF-8, and then use grep on the converted file.
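
A minimal sketch of that recoding approach, assuming an iconv that knows the UTF-16LE and UTF-8 encoding names. The sample text, the temporary file, and the variable names (sample, matches) are hypothetical; the real file's byte order may differ (UTF-16BE, or plain UTF-16 with a byte-order mark), so the -f argument would need adjusting.

```shell
# Create a small UTF-16LE sample file to stand in for the real input.
# In this encoding each line feed is stored as the two bytes 0A 00,
# which is where the "line feed followed by NUL" in the question comes from.
sample=$(mktemp)
printf 'alpha\nbeta\n' | iconv -f UTF-8 -t UTF-16LE > "$sample"

# Recode to UTF-8 on the fly and let grep work line by line as usual.
matches=$(iconv -f UTF-16LE -t UTF-8 "$sample" | grep -c 'beta')
echo "matching lines after recoding: $matches"

rm -f "$sample"
```

After recoding, the line feeds are single bytes again and no NUL bytes remain, so ordinary line-based patterns apply.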


--
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org

Current Thread

bug#32704: Can grep search for a line feed and a null character at the same time?, 21naown, 2018/09/11
- bug#32704: Can grep search for a line feed and a null character at the same time?, Eric Blake <=
  - bug#32704: Can grep search for a line feed and a null character at the same time?, Paul Eggert, 2018/09/11
    - bug#32704: Can grep search for a line feed and a null character at the same time?, Eric Blake, 2018/09/11
    - bug#32704: Can grep search for a line feed and a null character at the same time?, 21naown, 2018/09/15
    - bug#32704: Can grep search for a line feed and a null character at the same time?, Eric Blake, 2018/09/15
    - bug#32704: Can grep search for a line feed and a null character at the same time?, 21naown, 2018/09/15
    - bug#32704: Can grep search for a line feed and a null character at the same time?, Assaf Gordon, 2018/09/15
    - bug#32704: Can grep search for a line feed and a null character at the same time?, Eric Blake, 2018/09/15
    - bug#32704: Can grep search for a line feed and a null character at the same time?, 21naown, 2018/09/17
