bug-sed

bug#31526: Range [a-z] does not follow collate order from locale.


From: Bize Ma
Subject: bug#31526: Range [a-z] does not follow collate order from locale.
Date: Fri, 18 May 2018 17:58:05 -0400

Package: sed
Version: 4.4-2
Severity: important

Dear Maintainer,

With the locale set to en_US.utf8, the expected collating order is:

    $ printf '%b' $(printf '\\U%x\\n' {32..127}) | sort | tr -d '\n'
    `^~<=>| _-,;:!?/.'"()address@hidden&#%+0123456789aAbBcCdDeEfFgGhHiIjJkKlLmMnNoOpPqQrRsStTuUvVwWxXyYzZ
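
For contrast, here is a minimal sketch of the same kind of sort under the
C locale, where ordering follows byte values, so the uppercase letters
come before the lowercase ones instead of being interleaved. It assumes a
POSIX shell; sort_chars_c is a made-up helper name used only for this
illustration.

```shell
# Hypothetical helper: sort the arguments (one character each) by byte
# value under the C locale and print them joined on a single line.
sort_chars_c() {
    printf '%s\n' "$@" | LC_ALL=C sort | tr -d '\n'
    echo
}

sort_chars_c a B A b   # prints: ABab
```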

It is expected that the range [a-z] will match 'aAbBcCdD…', that is, both
lowercase and uppercase letters.
But it does not:

    $ printf '%b' $(printf '\\U%x' {32..127}) | sed 's/[^a-z]//g'
    abcdefghijklmnopqrstuvwxyz

However, the range [a-Z] does match all letters, lowercase and uppercase:

    $ printf '%b' $(printf '\\U%x' {32..127}) | sed 's/[^a-Z]//g'
    ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz
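
For comparison, here is a minimal sketch of two invocations whose results
do not depend on the collation order. It assumes a POSIX shell and a
POSIX-conforming sed. Forcing the C locale makes [a-z] a plain byte-value
range, and the POSIX class [[:alpha:]] names the set of letters directly
instead of relying on range endpoints. The helper names keep_lower and
keep_alpha are illustrative only.

```shell
# Hypothetical helpers; LC_ALL=C forces byte-value range semantics.

# Keep only the bytes 'a' through 'z'.
keep_lower() { printf '%s\n' "$1" | LC_ALL=C sed 's/[^a-z]//g'; }

# Keep only alphabetic characters, named by class rather than by range.
keep_alpha() { printf '%s\n' "$1" | LC_ALL=C sed 's/[^[:alpha:]]//g'; }

keep_lower 'aAbBzZ09_'   # prints: abz
keep_alpha 'aAbBzZ09_'   # prints: aAbBzZ
```

Neither form says what [a-z] should match under en_US.utf8; they only
sidestep the question.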

If this is the correct way for sed to work, then, if you please:

    - What is the rationale leading to such a decision?
    - Where is it documented?
    - Where is it implemented in the code?
    - Why does the manual document otherwise?

