bug-grep

bug#18266: grep -P and invalid exits with error


From: Paul Eggert
Subject: bug#18266: grep -P and invalid exits with error
Date: Fri, 29 Aug 2014 06:43:45 -0700
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.0

Thanks, but that patch seems to depend on libpcre internals, in that it "knows" that pcre_exec cannot possibly succeed without first checking its entire input buffer for invalid UTF-8 bytes. Even if that's true now, it reflects a performance bug that might be fixed in a future libpcre version.

Also, I don't see why grep needs to copy the buffer when there's an encoding error. Why not simply rerun the matcher on the initial prefix that doesn't have an encoding-error byte, and then (if that doesn't find a match), try matching the suffix after the encoding-error byte? This approach would not only avoid the buffer copy, it would avoid knowledge of libpcre internals.
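
To make the suggestion concrete, here is a minimal sketch of the prefix-then-suffix search. It is not grep's code and it does not call libpcre: the result codes, the matcher_fn interface, search_around_errors, and toy_match are all hypothetical names invented for illustration. The sketch assumes only that the matcher, when it rejects a buffer, can report the offset of the offending byte.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical result codes, not libpcre's.  */
enum { MATCH_FOUND = 0, MATCH_NONE = -1, MATCH_BAD_ENCODING = -2 };

/* Hypothetical matcher interface: search BUF[0..LEN).  On MATCH_FOUND
   set *START to the match offset; on MATCH_BAD_ENCODING set *BAD to
   the offset of the offending byte.  */
typedef int (*matcher_fn) (char const *buf, size_t len,
                           size_t *start, size_t *bad);

/* Search BUF[0..LEN) in place.  When the matcher reports an encoding
   error, rerun it on the clean prefix before the bad byte; if that
   finds nothing, continue with the suffix after the bad byte.  */
int
search_around_errors (matcher_fn match, char const *buf, size_t len,
                      size_t *start)
{
  size_t base = 0;
  for (;;)
    {
      size_t off = 0, bad = 0, unused = 0;
      int r = match (buf + base, len - base, &off, &bad);
      if (r != MATCH_BAD_ENCODING)
        {
          if (r == MATCH_FOUND)
            *start = base + off;
          return r;
        }

      /* The reported byte must lie inside the range just searched;
         this also guarantees that the loop makes progress.  */
      assert (bad < len - base);

      /* Rerun on the prefix, which holds no encoding-error byte.  */
      r = match (buf + base, bad, &off, &unused);
      if (r != MATCH_NONE)
        {
          if (r == MATCH_FOUND)
            *start = base + off;
          return r;
        }

      /* No match in the prefix: skip the bad byte, try the suffix.  */
      base += bad + 1;
    }
}

/* Toy matcher for demonstration only: finds the literal "ab" and
   treats byte 0xFF as an encoding error.  It checks the whole buffer
   for errors before matching.  */
int
toy_match (char const *buf, size_t len, size_t *start, size_t *bad)
{
  char const *err = memchr (buf, 0xFF, len);
  if (err)
    {
      *bad = (size_t) (err - buf);
      return MATCH_BAD_ENCODING;
    }
  for (size_t i = 0; i + 1 < len; i++)
    if (buf[i] == 'a' && buf[i + 1] == 'b')
      {
        *start = i;
        return MATCH_FOUND;
      }
  return MATCH_NONE;
}
```

The driver never copies the buffer, and it does not care whether the matcher validates its whole input before matching or stops at the first match; it reacts only to what the matcher returns. One caveat of this sketch is that no match can span an encoding-error byte, and patterns with anchors or lookbehind may behave differently at the split points.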




