Re: character ranges in regular expressions

bug-grep

From:	Bruno Haible
Subject:	Re: character ranges in regular expressions
Date:	Thu, 23 Sep 2010 23:55:25 +0200
User-agent:	KMail/1.9.9

Paolo,

> Bruno, ... Can you shed light on what __collseq_table_lookup is supposed 
> to mean?

It is a runtime lookup function into a table that maps Unicode characters to
uint32_t values. For a 'char' value, the most efficient way to implement
a mapping from 'char' to uint32_t is through an array: uint32_t[UCHAR_MAX+1].
For a 'wchar_t' value whose width is up to 21 bits, the data structure we
use in glibc (and also in gnulib / libunistring) is a 3-level lookup table.
See the file locale/programs/3level.h for details.
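
To make the idea of a 3-level table concrete, here is a minimal sketch. It is
not the layout used in locale/programs/3level.h. The 7+7+7 bit split, the
pointer-based levels, and all names (table3, table3_get, table3_set, ...) are
assumptions chosen for illustration. The point it shows is that only the
blocks that are actually populated need to be allocated, and a lookup costs
three array indexings.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical split of a 21-bit value into three 7-bit indices. */
enum { T3_BITS = 7, T3_FANOUT = 1 << T3_BITS, T3_MASK = T3_FANOUT - 1 };

typedef struct {
    uint32_t **level1[T3_FANOUT]; /* NULL: nothing stored in this 2^14 block */
    uint32_t fallback;            /* value returned for unset entries */
} table3;

static void table3_init(table3 *t, uint32_t fallback)
{
    for (int i = 0; i < T3_FANOUT; i++)
        t->level1[i] = NULL;
    t->fallback = fallback;
}

/* Lookup: three indexings, with an early exit for unpopulated blocks. */
static uint32_t table3_get(const table3 *t, uint32_t wc)
{
    if (wc >> (3 * T3_BITS))
        return t->fallback;               /* outside the 21-bit domain */
    uint32_t **mid = t->level1[wc >> (2 * T3_BITS)];
    if (mid == NULL)
        return t->fallback;
    uint32_t *leaf = mid[(wc >> T3_BITS) & T3_MASK];
    if (leaf == NULL)
        return t->fallback;
    return leaf[wc & T3_MASK];
}

/* Store a value, allocating the middle and leaf blocks on demand.
   Returns 0 on success, -1 if memory allocation fails. */
static int table3_set(table3 *t, uint32_t wc, uint32_t value)
{
    assert(wc < (UINT32_C(1) << (3 * T3_BITS)));
    uint32_t i1 = wc >> (2 * T3_BITS);
    uint32_t i2 = (wc >> T3_BITS) & T3_MASK;
    uint32_t i3 = wc & T3_MASK;

    if (t->level1[i1] == NULL) {
        uint32_t **mid = malloc(T3_FANOUT * sizeof *mid);
        if (mid == NULL)
            return -1;
        for (int k = 0; k < T3_FANOUT; k++)
            mid[k] = NULL;
        t->level1[i1] = mid;
    }
    if (t->level1[i1][i2] == NULL) {
        uint32_t *leaf = malloc(T3_FANOUT * sizeof *leaf);
        if (leaf == NULL)
            return -1;
        for (int k = 0; k < T3_FANOUT; k++)
            leaf[k] = t->fallback;
        t->level1[i1][i2] = leaf;
    }
    t->level1[i1][i2][i3] = value;
    return 0;
}

/* Release every allocated block and return the table to its empty state. */
static void table3_free(table3 *t)
{
    for (int i = 0; i < T3_FANOUT; i++) {
        if (t->level1[i] == NULL)
            continue;
        for (int k = 0; k < T3_FANOUT; k++)
            free(t->level1[i][k]);
        free(t->level1[i]);
        t->level1[i] = NULL;
    }
}
```

With this sketch, a table holding only ASCII entries allocates one middle
block and one leaf block, instead of a flat array of 2^21 uint32_t values.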

In regcomp.c and regexec.c, the _NL_COLLATE_COLLSEQWC field of the LC_COLLATE
part of the locale is encoded in this way. In glibc/locale/programs/ld-collate.c
this field is constructed from a table called 'collate->wcseqorder'.
This table is used in regular expression matching and wildcard matching.
It is derived from the LC_COLLATE portion of the locale input file, but it
does not carry all of the information found there.
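
As an illustration of how such a sequence table could serve range matching,
the sketch below uses the simple 'char' case (a uint32_t[UCHAR_MAX+1] array)
and tests a range [lo-hi] by comparing sequence values rather than byte
values. This is an assumption about the general technique, not a copy of the
regexec.c code. The names (seq_table, seq_in_range, ...) and the toy ordering
"a A b B c C ..." are invented for the example, and the toy ordering assumes
an ASCII-compatible execution character set.

```c
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-locale table: one sequence value per 'char' value. */
typedef struct {
    uint32_t seq[UCHAR_MAX + 1];
} seq_table;

/* Sequence value equals byte value, so ranges behave as in the "C" locale. */
static void seq_table_init_identity(seq_table *t)
{
    for (int i = 0; i <= UCHAR_MAX; i++)
        t->seq[i] = (uint32_t) i;
}

/* Toy ordering (an assumption for illustration): letters are interleaved
   as a A b B ... z Z and placed after all non-letter byte values. */
static void seq_table_set_toy_order(seq_table *t)
{
    seq_table_init_identity(t);
    for (int k = 0; k < 26; k++) {
        t->seq['a' + k] = 1000 + 2 * (uint32_t) k;
        t->seq['A' + k] = 1000 + 2 * (uint32_t) k + 1;
    }
}

/* A character is in [lo-hi] if its sequence value lies between the
   sequence values of the two endpoints. */
static bool seq_in_range(const seq_table *t, unsigned char lo,
                         unsigned char hi, unsigned char c)
{
    return t->seq[lo] <= t->seq[c] && t->seq[c] <= t->seq[hi];
}
```

Under the identity table, [a-z] does not contain 'B'. Under the toy ordering
it does, while 'Z' still falls outside because it sorts after 'z'.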

Bruno
