bug-bash

From: Chet Ramey
Subject: Re: international alnum characters not part of words even with locale set
Date: Thu, 2 Jan 2003 10:41:34 -0500

> The bindable functions backward-word and forward-word can be used
> to move across word boundaries. However, these functions do not
> seem to care about the current locale (LC_CTYPE).
> 
> For example, I have 'export LC_CTYPE=en_GB' in my .bash_profile.
> This locale supports ISO8859-1 characters. Type 'cat lösenord'
> in bash, and press ctrl+left or whatever key that is bound to
> backward-word. The cursor will now move to the 's', because it
> thinks 'ö' is a word delimiter. This happens in bash as well as in
> other readline programs. (ö has byte-value 246.)
> 
> I also have these in my .inputrc:
> 
>   set output-meta on
>   set input-meta on
>   set convert-meta off
> 
> I think the problem is in chardefs.h, in the definition of NON_NEGATIVE:
> 
>   #define NON_NEGATIVE(c) ((unsigned char)(c) == (c))
> 
> (NON_NEGATIVE is used in _rl_lowercase_p, _rl_uppercase_p,
> _rl_pure_alphabetic and ALPHABETIC in the same header file.) This
> does not take into account the fact that input-meta/meta-flag is set
> to true. Maybe this definition would be more correct:
> 
>   #define NON_NEGATIVE(c) (_rl_meta_flag || (unsigned char)(c) == (c))
> 
> (I haven't tested it though.)

The intent of NON_NEGATIVE is to prevent buffer overflows and potential
core dumps with the ctype.h macros.

Consider a system where chars are signed, and the is* ctype.h macros are
implemented using table lookups.  isprint('\200') is therefore equivalent
to isprint(-128), which is undefined (the argument must be representable
as an unsigned char or be equal to EOF) and will potentially cause a core
dump.
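
A minimal sketch of that situation, assuming a standard C compiler.  The
helper names (byte_as_plain_char, guarded_isprint, signed_char_self_check)
are hypothetical and do not come from readline; only the NON_NEGATIVE
definition is the one quoted above.

```c
#include <assert.h>
#include <ctype.h>
#include <limits.h>

/* Guard as quoted above from readline's chardefs.h. */
#define NON_NEGATIVE(c) ((unsigned char)(c) == (c))

/* Hypothetical helper: the int that an is* macro receives when handed
   a plain char holding this byte.  Negative for bytes >= 0x80 when
   char is signed (the conversion is implementation-defined). */
int byte_as_plain_char(unsigned char byte)
{
    char ch = (char)byte;
    return ch;
}

/* Hypothetical wrapper: only consult isprint() when the guard passes. */
int guarded_isprint(int c)
{
    return NON_NEGATIVE(c) && isprint(c) != 0;
}

void signed_char_self_check(void)
{
    int c = byte_as_plain_char(0x80);   /* the '\200' from the text */

    if (CHAR_MIN < 0)
        assert(c < 0);                  /* signed char: invalid isprint() argument */
    else
        assert(c == 0x80);              /* unsigned char: already in range */

    assert(guarded_isprint('A') == 1);  /* ordinary ASCII still works */
    assert(guarded_isprint(-128) == 0); /* rejected before any table lookup */
}
```

Calling signed_char_self_check() from a main() exercises both the
signed-char and unsigned-char cases, whichever the platform uses.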

It's tempting to work around this by simply casting to `unsigned char'
when calling the is* macros, but this mishandles EOF:  where c == EOF
(typically -1), (unsigned char) (c) == 255, which is a printable character
in many locales.
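
The EOF problem can be sketched as follows.  This is an illustration
only: cast_isprint, eof_after_cast and eof_cast_self_check are
hypothetical names, and the 255 result assumes EOF is -1 with 8-bit
bytes, which is common but not required.

```c
#include <assert.h>
#include <ctype.h>
#include <limits.h>
#include <stdio.h>

/* Hypothetical wrapper using the tempting cast. */
int cast_isprint(int c)
{
    return isprint((unsigned char)c) != 0;
}

/* What the cast turns EOF into: an ordinary byte value. */
int eof_after_cast(void)
{
    return (unsigned char)EOF;
}

void eof_cast_self_check(void)
{
    int folded = eof_after_cast();

    assert(EOF < 0);                            /* guaranteed by the C standard */
    assert(folded >= 0 && folded <= UCHAR_MAX); /* now looks like a real byte */
    if (EOF == -1)
        assert(folded == UCHAR_MAX);            /* 255 with 8-bit bytes */

    /* The wrapper cannot tell EOF apart from that byte. */
    assert(cast_isprint(EOF) == cast_isprint(folded));
}
```

Whether that byte counts as printable depends on the locale in effect;
the point of the sketch is only that EOF and the byte become
indistinguishable after the cast.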

Your patch, while it may solve your particular problem, re-exposes the
problem NON_NEGATIVE was intended to solve.
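
As a rough sketch of why, assuming the meta flag is set: the proposed
definition short-circuits on the flag, so negative values are no longer
filtered out.  Here demo_meta_flag is a hypothetical stand-in for
readline's _rl_meta_flag, and the *_accepts helpers are illustrative
names only.

```c
#include <assert.h>

/* Stand-in for readline's _rl_meta_flag; hypothetical, for illustration. */
int demo_meta_flag = 1;

/* Original guard and the proposed variant, both as quoted in the thread
   (with the flag name swapped for the stand-in above). */
#define NON_NEGATIVE_ORIG(c)    ((unsigned char)(c) == (c))
#define NON_NEGATIVE_PATCHED(c) (demo_meta_flag || (unsigned char)(c) == (c))

int original_accepts(int c) { return NON_NEGATIVE_ORIG(c) ? 1 : 0; }
int patched_accepts(int c)  { return NON_NEGATIVE_PATCHED(c) ? 1 : 0; }

void patch_self_check(void)
{
    assert(original_accepts(246) == 1);  /* byte 246 as a non-negative int */
    assert(original_accepts(-10) == 0);  /* byte 246 held in a signed char */
    assert(patched_accepts(-10) == 1);   /* guard bypassed by the flag */
    assert(patched_accepts(-128) == 1);  /* the isprint(-128) case again */
}
```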

Chet

-- 
``The lyf so short, the craft so long to lerne.'' - Chaucer
( ``Discere est Dolere'' -- chet )

Chet Ramey, ITS, CWRU    chet@po.CWRU.Edu    http://cnswww.cns.cwru.edu/~chet/
