bug-grep

Rational Range Interpretation patches, 2/3


From: Aharon Robbins
Subject: Rational Range Interpretation patches, 2/3
Date: Mon, 16 Jan 2012 22:25:41 +0200
User-agent: Heirloom mailx 12.4 7/29/08

From 366cc2f4170f8dfbaa2137602e4ccc35e854766a Mon Sep 17 00:00:00 2001
From: Arnold D. Robbins <address@hidden>
Date: Mon, 16 Jan 2012 22:07:40 +0200
Subject: [PATCH 2/2] Document Rational Range Interpretation.

---
 doc/grep.texi |   21 ++++++++++++++++-----
 1 files changed, 16 insertions(+), 5 deletions(-)

diff --git a/doc/grep.texi b/doc/grep.texi
index de73d7f..dc27e52 100644
--- a/doc/grep.texi
+++ b/doc/grep.texi
@@ -939,9 +939,7 @@ They are omitted (i.e., false) by default and become true when specified.
 @cindex character type
 @cindex national language support
 @cindex NLS
-These variables specify the locale for the @code{LC_COLLATE} category,
-which determines the collating sequence
-used to interpret range expressions like @samp{[a-z]}.
+These variables specify the locale for the @code{LC_COLLATE} category.
 
 @item LC_ALL
 @itemx LC_CTYPE
@@ -1202,7 +1200,12 @@ For example, the regular expression
 Within a bracket expression, a @dfn{range expression} consists of two
 characters separated by a hyphen.
 It matches any single character that
-sorts between the two characters, inclusive, using the locale's
+sorts between the two characters, inclusive,
+using the machine's character set.
+
+Up to and including version 2.10 of @command{grep},
+range expressions would match any single character that sorted between
+the two characters, inclusive, using the current locale's
 collating sequence and character set.
 For example, in the default C
 locale, @samp{[a-d]} is equivalent to @samp{[abcd]}.
@@ -1211,9 +1214,17 @@ characters in dictionary order, and in these locales @samp{[a-d]} is
 typically not equivalent to @samp{[abcd]};
 it might be equivalent to @samp{[aBbCcDd]}, for example.
 To obtain the traditional interpretation
-of bracket expressions, you can use the @samp{C} locale by setting the
+of bracket expressions, it was necessary to use the @samp{C} locale
+by setting the
 @env{LC_ALL} environment variable to the value @samp{C}.
 
+Since the current POSIX standard now makes the behavior of range expressions
+be implementation-defined, instead of requiring the locale's
+collating order, @command{grep} has reverted to the traditional Unix
+behavior of defining ranges based on the machine character set.@footnote{This
+is known as ``Rational Range Interpretation,'' a lovely phrase
+coined by Karl Berry.}
+
 Finally, certain named classes of characters are predefined within
 bracket expressions, as follows.
 Their interpretation depends on the @code{LC_CTYPE} locale;
-- 
1.7.1
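
For readers who want to see the difference the patch documents, here is a
minimal sketch in Python. It is not grep's implementation. The function names
and the DICT_ORDER collating sequence ("AaBbCc...Zz") are assumptions made up
for illustration; a real locale's collating order may differ. It compares a
range evaluated by character-set code points with the same range evaluated by
a dictionary-style collating sequence.

```python
import string

# Hypothetical dictionary-style collating sequence: "AaBbCc...Zz".
# This is an illustrative assumption, not any real locale's definition.
DICT_ORDER = "".join(
    upper + lower
    for upper, lower in zip(string.ascii_uppercase, string.ascii_lowercase)
)


def codepoint_range(lo, hi, candidates):
    """Characters whose code point lies between lo and hi, inclusive.

    This mirrors the "machine character set" interpretation of a range.
    """
    return [c for c in candidates if ord(lo) <= ord(c) <= ord(hi)]


def collation_range(lo, hi, candidates, order=DICT_ORDER):
    """Characters that sort between lo and hi, inclusive, in `order`.

    This mirrors the older, collation-based interpretation of a range.
    """
    rank = {c: i for i, c in enumerate(order)}
    return [
        c for c in candidates
        if c in rank and rank[lo] <= rank[c] <= rank[hi]
    ]


letters = string.ascii_letters  # lowercase a-z followed by uppercase A-Z

by_codepoint = codepoint_range("a", "d", letters)
by_collation = collation_range("a", "d", letters)

print("".join(by_codepoint))
print("".join(sorted(by_collation, key=DICT_ORDER.index)))
```

Under the code-point reading, [a-d] covers only the four lowercase letters.
Under the assumed dictionary ordering, it also picks up the uppercase letters
that sort between "a" and "d". That is the [aBbCcDd] case the manual text
mentions.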
