[bug #28275] Ranges like [a-z] incorrectly match in UTF systems

bug-grep

From:	Makar
Subject:	[bug #28275] Ranges like [a-z] incorrectly match in UTF systems
Date:	Sun, 13 Dec 2009 14:06:06 +0000
User-agent:	Mozilla/5.0 (X11; U; Linux x86_64; ru-RU; rv:1.9.1.5) Gecko/20091129 Sabayon Firefox/3.5.5

URL:
  <http://savannah.gnu.org/bugs/?28275>

                 Summary: Ranges like [a-z] incorrectly match in UTF systems
                 Project: grep
            Submitted by: tkzv
            Submitted on: Sun 13 Dec 2009 14:06:05
                Category: None
                Severity: 3 - Normal
              Item Group: None
                  Status: None
                 Privacy: Public
             Assigned to: None
             Open/Closed: Open
         Discussion Lock: Any

    _______________________________________________________

Details:

In a UTF-8 locale, when basic or extended regular expressions are selected,
ranges like [a-z] or [а-я] seem to match many more characters than they should.
Simply enumerating all the characters, e.g. [abcdefghijklmnopqrstuvwxyz] or
[абвгдеёжзийклмнопрстуфхцчшщъыьэюя], works fine.
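
As a point of comparison, here is a minimal Python sketch of the behaviour the
report expects, where a range covers only the characters between its endpoints.
It assumes code point ordering, which is what Python's `re` module uses; it is
not grep's implementation and does not reproduce the bug. The helper name
`range_vs_enum` and the sample characters are made up for illustration.

```python
import re

LATIN_ENUM = "abcdefghijklmnopqrstuvwxyz"
CYRILLIC_ENUM = "абвгдеёжзийклмнопрстуфхцчшщъыьэюя"


def range_vs_enum(range_class, enumerated, candidates):
    """Return the candidates on which the range and the enumeration disagree.

    Python's `re` compares range endpoints by Unicode code point, so this
    models a "code point order" reading of the range, not grep's behaviour.
    """
    range_re = re.compile(range_class)
    # The enumerations above contain only letters, so no escaping is needed.
    enum_re = re.compile("[" + enumerated + "]")
    return [
        ch
        for ch in candidates
        if bool(range_re.fullmatch(ch)) != bool(enum_re.fullmatch(ch))
    ]


if __name__ == "__main__":
    print(range_vs_enum("[a-z]", LATIN_ENUM, "azAZ09_"))
    print(range_vs_enum("[а-я]", CYRILLIC_ENUM, "аяёАЯ"))
```

Note that under code point ordering the range and the enumeration still differ
on one letter: ё is U+0451, which lies outside а-я (U+0430 to U+044F), so the
enumeration above matches it and the range does not.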

If Perl regular expressions are selected (-P switch), ranges of ASCII-only
characters like [a-z] work correctly, but multibyte characters (in both ranges
and enumerations) are interpreted as sequences of several 1-byte characters.
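
The effect of treating a multibyte character as several 1-byte characters can
be sketched in Python by matching the same bracket expression once on text and
once on its UTF-8 bytes. This is only an analogy for the symptom described
above, not the code path grep -P actually takes; the function names are
hypothetical.

```python
import re


def matches_as_chars(char_class, text):
    """Match with the pattern and the text treated as Unicode characters."""
    return re.search(char_class, text) is not None


def matches_as_bytes(char_class, text):
    """Match with the pattern and the text treated as raw UTF-8 bytes.

    A class such as [я] becomes a set of two separate bytes (0xD1 and 0x8F),
    so any text containing either byte matches.
    """
    pattern = char_class.encode("utf-8")
    return re.search(pattern, text.encode("utf-8")) is not None


if __name__ == "__main__":
    # "я" is U+044F (bytes D1 8F); "с" is U+0441 (bytes D1 81).
    print(matches_as_chars("[я]", "с"))
    print(matches_as_bytes("[я]", "с"))
```

In character mode [я] does not match "с", but in byte mode it does, because the
two letters share the lead byte 0xD1.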




    _______________________________________________________

Reply to this item at:

  <http://savannah.gnu.org/bugs/?28275>

_______________________________________________
  Message sent via/by Savannah
  http://savannah.gnu.org/

Current Thread

[bug #28275] Ranges like [a-z] incorrectly match in UTF systems, Makar <=
- [bug #28275] Ranges like [a-z] incorrectly match in UTF systems, Paolo Bonzini, 2009/12/14
  - [bug #28275] Ranges like [a-z] incorrectly match in UTF systems, Makar, 2009/12/14
    - [bug #28275] Ranges like [a-z] incorrectly match in UTF systems, Paolo Bonzini, 2009/12/14
    - [bug #28275] Ranges like [a-z] incorrectly match in UTF systems, Makar, 2009/12/14
    - [bug #28275] Ranges like [a-z] incorrectly match in UTF systems, Paolo Bonzini, 2009/12/14
    - [bug #28275] Ranges like [a-z] incorrectly match in UTF systems, Norihirio Tanaka, 2009/12/16
    - [bug #28275] Ranges like [a-z] incorrectly match in UTF systems, Makar, 2009/12/21
    - [bug #28275] grep -P should use PCRE_UTF8, Paolo Bonzini, 2009/12/22
