bug-grep

[bug #28275] Ranges like [a-z] incorrectly match in UTF systems


From: Makar
Subject: [bug #28275] Ranges like [a-z] incorrectly match in UTF systems
Date: Sun, 13 Dec 2009 14:06:06 +0000
User-agent: Mozilla/5.0 (X11; U; Linux x86_64; ru-RU; rv:1.9.1.5) Gecko/20091129 Sabayon Firefox/3.5.5

URL:
  <http://savannah.gnu.org/bugs/?28275>

                 Summary: Ranges like [a-z] incorrectly match in UTF systems
                 Project: grep
            Submitted by: tkzv
            Submitted on: Sun 13 Dec 2009 14:06:05
                Category: None
                Severity: 3 - Normal
              Item Group: None
                  Status: None
                 Privacy: Public
             Assigned to: None
             Open/Closed: Open
         Discussion Lock: Any

    _______________________________________________________

Details:

In a UTF-8 locale, when basic or extended regular expressions are selected,
ranges like [a-z] or [а-я] seem to match many more characters than they
should. Simply enumerating all the characters, e.g.
[abcdefghijklmnopqrstuvwxyz] or [абвгдеёжзийклмнопрстуфхцчшщъыьэюя], works
fine.
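
The enumeration workaround can be generated rather than typed by hand. The
Python snippet below is only a sketch: the helper name enumerate_range is
hypothetical, and it assumes a range is meant in Unicode code point order,
which is not necessarily how grep interprets a range in a given locale. Note
that ё (U+0451) lies outside а-я by code point, so it is passed in separately.

```python
def enumerate_range(first: str, last: str, extra: str = "") -> str:
    """Build a bracket expression listing every code point from first to last.

    Hypothetical helper: assumes code point order and does not escape
    characters that are special inside brackets (']', '^', '-').
    """
    chars = "".join(chr(cp) for cp in range(ord(first), ord(last) + 1))
    return "[" + chars + extra + "]"


latin = enumerate_range("a", "z")
cyrillic = enumerate_range("а", "я", extra="ё")  # ё is not between а and я

print(latin)
print(cyrillic)
```

The resulting strings can be pasted into a pattern in place of the range.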

If Perl regular expressions are selected (-P switch), ranges of ASCII-only
characters like [a-z] work correctly, but multibyte characters (in both ranges
and enumerations) are interpreted as several 1-byte characters.
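
What "interpreted as several 1-byte characters" means can be illustrated
outside grep. The sketch below uses Python's re module, not grep or PCRE, so it
only models the byte-wise reading and is not a reproduction of the bug itself.
The names text_pat and byte_pat are made up for the illustration. Encoded as
UTF-8, [а-я] becomes the bytes D0, B0, '-', D1, 8F between the brackets, i.e.
a class of single bytes: D0, the range B0-D1, and 8F.

```python
import re

# The same pattern source, once as text (code points) and once as UTF-8 bytes.
text_pat = re.compile("[а-я]")
byte_pat = re.compile("[а-я]".encode("utf-8"))

lower = "ж"  # U+0436, UTF-8 bytes D0 B6
upper = "Ж"  # U+0416, UTF-8 bytes D0 96

# As text, the class matches the lowercase letter and rejects the uppercase one.
text_lower = text_pat.fullmatch(lower) is not None
text_upper = text_pat.search(upper) is not None

# As bytes, the class matches only one byte at a time, so it cannot cover the
# whole two-byte letter, yet it does hit the lead byte D0 of the uppercase one.
byte_lower_full = byte_pat.fullmatch(lower.encode("utf-8")) is not None
byte_upper_hit = byte_pat.search(upper.encode("utf-8"))

print(text_lower, text_upper)
print(byte_lower_full, byte_upper_hit.group() if byte_upper_hit else None)
```

In this model the byte-wise class both fails to match a whole lowercase letter
and partially matches an uppercase one, which is the kind of mismatch
described above.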

    _______________________________________________________

Reply to this item at:

  <http://savannah.gnu.org/bugs/?28275>

_______________________________________________
  Message sent via/by Savannah
  http://savannah.gnu.org/