[platform-testers] new snapshot available: grep-2.20.72-d512

From: Jim Meyering
Subject: [platform-testers] new snapshot available: grep-2.20.72-d512
Date: Wed, 29 Oct 2014 11:29:59 -0700

Thanks to many fixes and improvements by Paul Eggert and Norihiro Tanaka,
here is a pre-release snapshot:

grep snapshot:      1.2 MB

Here is the NEWS so far:

** Improvements

  Performance has been greatly improved for searching files containing
  holes, on platforms where lseek's SEEK_DATA flag works efficiently.
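  As an illustration of the idea only (not grep's actual C code), the
  following Python sketch walks a file's data extents with lseek and
  SEEK_DATA/SEEK_HOLE, so that holes are never read.  The function names
  data_extents and read_skipping_holes are hypothetical, and the
  fallback for platforms without SEEK_DATA is an assumption about how
  one might degrade gracefully.

```python
import errno
import os


def data_extents(fd, size):
    """Yield (start, end) byte ranges that the OS reports as data."""
    seek_data = getattr(os, "SEEK_DATA", None)
    seek_hole = getattr(os, "SEEK_HOLE", None)
    if seek_data is None or seek_hole is None:
        # Assumption: without SEEK_DATA, treat the whole file as data.
        if size:
            yield (0, size)
        return
    offset = 0
    while offset < size:
        try:
            start = os.lseek(fd, offset, seek_data)
        except OSError as e:
            if e.errno == errno.ENXIO:
                # Only a hole remains between offset and end of file.
                return
            # Seeking for data failed for another reason: read the rest.
            yield (offset, size)
            return
        end = os.lseek(fd, start, seek_hole)
        yield (start, end)
        offset = end


def read_skipping_holes(path):
    """Return (data, extents): the bytes of all data extents, and the extents."""
    size = os.path.getsize(path)
    chunks, extents = [], []
    fd = os.open(path, os.O_RDONLY)
    try:
        for start, end in data_extents(fd, size):
            pos = start
            while pos < end:
                # pread does not depend on the offset moved by lseek above.
                block = os.pread(fd, min(end - pos, 1 << 16), pos)
                if not block:
                    break
                chunks.append(block)
                pos += len(block)
            extents.append((start, end))
    finally:
        os.close(fd)
    return b"".join(chunks), extents
```

  On a file system that does not track holes, the loop typically sees
  one extent covering the whole file, so the result is the same and
  only the speedup is lost.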

  Performance has improved for rejecting data that cannot match even
  the first part of a nontrivial pattern.

  Performance has improved for very long strings in patterns.

  If a file contains data improperly encoded for the current locale,
  and this is discovered before any of the file's contents are output,
  grep now treats the file as binary.
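  A rough Python sketch of this kind of check, assuming the locale's
  encoding is passed in explicitly as a codec name: it reports whether
  a file's leading bytes fail to decode.  The name prefix_looks_binary
  is hypothetical, and this is not a description of grep's internal
  logic, which may apply further tests.

```python
import codecs


def prefix_looks_binary(prefix: bytes, encoding: str = "utf-8") -> bool:
    """Return True if `prefix` contains bytes that are invalid in `encoding`."""
    decoder = codecs.getincrementaldecoder(encoding)(errors="strict")
    try:
        # final=False tolerates a multibyte character that is merely
        # cut off at the end of the prefix, rather than malformed.
        decoder.decode(prefix, final=False)
    except UnicodeDecodeError:
        return True
    return False
```

  For example, the Latin-1 bytes b"caf\xe9 au lait\n" are rejected when
  checked as UTF-8 but accepted when checked as "latin-1", which
  mirrors how the same file can be text in one locale and binary in
  another.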

  grep -P no longer reports an error and exits when given invalid UTF-8 data.
  Instead, it considers the data to be non-matching.

** Bug fixes

  grep no longer mishandles patterns that contain \w or \W in multibyte
  locales.

  grep would fail to count newlines internally when operating in non-UTF8
  multibyte locales, leading it to print potentially many lines that did
  not match.  E.g., the command, "seq 10 | env LC_ALL=zh_CN src/grep -n .."
  would print output implying that the match, "10", was on line 1.
  [bug introduced in grep-2.19]

  grep in a non-UTF8 multibyte locale could mistakenly match in the middle
  of a multibyte character when using a '^'-anchored alternate in a pattern,
  leading it to print non-matching lines.  [bug present since "the beginning"]

  grep -E rejected unmatched ')', instead of treating it like '\)'.
  [bug present since "the beginning"]

** Changes in behavior

  The GREP_OPTIONS environment variable is now obsolescent, and grep
  now warns if it is used.  Please use an alias or script instead.

  In locales with multibyte character encodings other than UTF-8,
  grep -P now reports an error and exits instead of misbehaving.

  When searching binary data, grep now may treat non-text bytes as
  line terminators.  This can boost performance significantly.

  grep -z no longer automatically treats the byte '\200' as binary data.

Changes in grep since v2.20:

Jim Meyering (13):
      maint: post-release administrivia
      build: don't redirect directly to $@
      build: improve rule to generate egrep+fgrep scripts
      maint: generate distributed THANKS from VC'd
      doc: update HACKING
      maint: split long lines, and enforce the 80-column limit
      maint: avoid distcheck failure
      tests: add expect-to-fail test for a glibc regexp bug
      doc: move NEWS note about GREP_OPTIONS into proper section
      maint: suppress a false-positive -Wcast-align warning
      grep: avoid stack buffer read-underrun and overrun
      tests: make new test script executable
      gnulib: update to latest; bootstrap, too

Norihiro Tanaka (13):
      dfa: speed-up at initial state
      dfa: separate dfaexec function to help optimization by compiler
      grep: fix subscript error when testing whether empty lines match
      dfa: check end of input buffer after transition in non-UTF8 multibyte locale
      dfa: factor out a new nontrivial block of duplicated code
      dfa: test for just-fixed bug
      dfa: fix a theoretical bug
      grep: initialize validation_boundary properly before use
      dfa: process all MBCSET constructs via glibc's matcher
      dfa: remove two erroneous clauses from a now-unused function
      tests: add test for grep -P fix
      dfa: avoid false match in a non-UTF8 multibyte locale
      dfa: make \w and \W work in multibyte locales

Paul Eggert (46):
      build: update gnulib submodule to latest
      grep: use system strstr if available and fast
      grep: undo part of previous change
      doc: use gnulib fdl module
      maint: remove grep.spec
      build: don't make output files read-only
      build: avoid -Wstack-protector
      grep: with -E, unmatched ')' matches itself
      doc: Document -r vs --exclude more carefully.
      doc: prefer @env to @code
      doc: document LANGUAGE
      grep: fix integer-width bugs in undossify_input etc.
      grep: -P now treats invalid UTF-8 input as non-matching
      grep: port recent fix to older pcre version
      grep: fix false matches with -P '...$' and invalid UTF-8
      grep: fix false matches with -P '...$' and invalid UTF-8
      doc: bug tracker has moved to
      grep: make GREP_OPTIONS obsolescent
      grep: diagnose -P in non-UTF-8 multibyte locale
      grep: remove/refactor unnecessary code about line splitting
      grep: speed up -P on files containing many multibyte errors
      grep: use bool for boolean in grep.c
      grep: treat a file as binary if its prefix contains encoding errors
      grep: improve performance for older glibc
      grep: use mbclen cache more effectively
      grep: avoid false alarms for mb_clen and to_uchar
      grep: use mbclen cache in one more place
      grep: port -P speedup to hosts lacking PCRE_STUDY_JIT_COMPILE
      grep: fix -P speedup bug with empty match
      grep: refactor binary-vs-unknown-vs-text flags for clarity
      grep: -z no longer considers '\200' to be binary data
      grep: non-text bytes in binary data may be treated as line ends
      grep: minor -P speedup with jit_stack
      grep: improve -P performance in typical cases
      grep: skip past holes efficiently
      grep: port to platforms lacking SEEK_DATA
      grep: speed up processing of holes before EOF on Solaris
      grep: scan for valid multibyte strings more quickly
      grep: don't check extensively for invalid prefix bytes unless -P
      maint: generalize the -Wcast-align fix
      dfa: minor tweaks, mostly to remove __attribute__ ((noinline))
      doc: clarify exit status
      doc: modernize and simplify man page
      grep: fix off-by-one bug in -P optimization
      grep: fix grep -P crash
      tests: work around older libpcre bugs when testing -P and UTF-8

Changes in gnulib since v2.20:

* gnulib 98ca2c0...8415b67 (95):
  > socketlib, sockets, sys_socket: Use AC_REQUIRE to pacify autoconf.
  > iconv: avoid false detection of non-working iconv
  > bootstrap: print more diagnostics for missing programs
  > bootstrap: only update the gnulib submodule
  > symlinkat: port to AIX 7.1
  > readlinkat: port to AIX 7.1
  > remove spurious {
  > modules/fcntl: fix error reporting by dupfd
  > basename, dirname: Improve documentation.
  > exclude: declare exclude_patopts static
  > autoupdate
  > dirname: support compilation with C++
  > qsort_r: include <config.h>
  > avltree-list: avoid compiler warnings
  > qsort_r: new module, for GNU-style qsort_r
  > strerror_r-posix: support compilation with C++
  > fcntl-h: fix compilation with Intel C++ compiler
  > autoupdate
  > mountlist: use /proc/self/mountinfo when available
  > users.txt: add cmogstored
  > gnulib-tool: Sync with build-aux/bootstrap options
  > gnulib-tool: Fallback to wget when rsync fails
  > maintainer-makefile: add syntax check for useless ';;'
  > pthread, pthread_sigmask, threadlib: port to Ubuntu 14.04
  > error: drop spurious semicolon
  > gnulib-common.m4: port to GCC 4.2.1 and Sun Studio 12 C++
  > manywarnings: add GCC 4.9 warnings
  > vasnprintf: fix bugs in width computation
  > vasnprintf: Avoid signed/unsigned comparison warning.
  > parse-datetime: Avoid signed/unsigned comparison warning
  > qsort_r: new module, for GNU-style qsort_r
  > vla: new module
  > localename: make gl_locale_name_thread really thread-safe on Windows
  > getpass: don't assume struct termios
  > getdtablesize: fall back on sysconf (_SC_OPEN_MAX)
  > vararrays: modernize AC_C_VARARRAYS for C11
  > relocatable-prog-wrapper: port gettext to OS X 10.8 + GCC 4.8.1
  > sys_select: fix FD_ZERO problem on Solaris 10
  > accept: document Solaris 10 type glitch
  > extern-inline: port to FreeBSD, DragonFly
  > autoupdate
  > Use consistent style to check DEBUG macro in regex_internal.c
  > openat-die: use _Noreturn markup
  > test-open: port to cygwin, which lacks Fortify
  > localename: Enforce declarations before statements.
  > test-userspec: don't look up numeric user names
  > localcharset, localename: MS-Windows support for non-default locales
  > announce-gen: avoid failure when Digest::SHA is installed
  > gettext: revert "update macros to version 0.19"
  > regex: don't deref NULL upon heap allocation failure
  > give projects more flexibility in set_prog_name arguments
  > regex: fix memory leak in compiler
  > announce-gen: avoid perl warnings
  > localename: avoid -Wsuggest-attribute={const,pure} warnings
  > nl_langinfo: Fix last change.
  > Define macros for glibc
  > Sync up error.c with glibc
  > nl_langinfo: fix build under mingw
  > mountlist: do not classify a bind-mounted dir entry as "dummy"
  > less syntax-check noise when SIGPIPE is ignored
  > nl_langinfo: CODESET on MS-Windows and more items from localeconv
  > Bruno Haible has stepped down as maintainer.
  > mktime: merge #if/#ifdef usage from glibc
  > git-version-gen: improve option descriptions
  > regex: fix memory leak in compiler
  > regex: merge patch from libc
  > acl: port to gcc -Wredundant-decls
  > parse-duration: eliminate 68-year duration limit
  > pthread: don't assume AC_CANONICAL_HOST, port better to Solaris, etc.
  > pthread: define thread-safe macros on some platforms
  > regex: don't be multithreaded if USE_UNLOCKED_IO.
  > gettext: update macros to version 0.19
  > select,poll: fix console handle check on windows 8
  > select: fix waiting on anonymous pipes on MS-Windows
  > times: fix to return non constant value on MS-Windows
  > isatty: fix to work on windows 8
  > maint: fix typo in fdl.texi
  > mountlist: avoid hasmntopt const type warning on solaris
  > maintainer-makefile: delete obsolete code
  > maintainer-makefile: avoid spurious error messages
  > rename: avoid unused-but-set-variable compiler warning
  > maint: add ChangeLog entry missing in previous commit
  > rename: mark a label as potentially unused
  > gnulib-common.m4: Fix typo in _GL_UNUSED_LABEL.
  > acl: apply pure attribute to two functions
  > gnulib-common.m4: add _GL_UNUSED_LABEL
  > dup2, fcntl, fcntl-h: port to AIX 7.1
  > printf, config.rpath: Port to FreeBSD 10.
  > ftoastr: work around compiler bug in IBM xlc 12.1
  > valgrind-tests: fixed misleading help message
  > isfinite, isinf, isnan tests: fix for little-endian PowerPC
  > exclude-tests: port to AIX 7.1
  > pthread_sigmask, timer-time: use gl_THREADLIB only if needed
  > gnulib-tool: wget translations using --no-verbose rather than --quiet
  > gnulib-tool: adjust translation wget to avoid a https redirection
