Re: Erroneous assumption in isblank.c

bug-gnulib


From:	Bruno Haible
Subject:	Re: Erroneous assumption in isblank.c
Date:	Tue, 5 Oct 2010 11:17:38 +0200
User-agent:	KMail/1.9.9

Hi,

John Darrington wrote:
> In lib/isblank.c I see the following:
> 
>  /* The "blank" characters are '\t', ' ',
>      U+1680, U+180E, U+2000..U+2006, U+2008..U+200A, U+205F, U+3000, and none
>      except the first two is present in a common 8-bit encoding.  Therefore
>      the substitute for other platforms is not more complicated than this.  */
>   return (c == ' ' || c == '\t');
> 
> This is incorrect.  In iso-8859-1 (a very common 8-bit encoding), U+00A0 is
> the non-breaking-space character.

U+00A0 NO-BREAK SPACE is a glyph that carries no ink, but in other respects
it behaves like a non-blank punctuation character. In particular, its very
definition is that, unlike U+0020 SPACE, it is not an opportunity for line
breaking.

The function isblank() is not used in graphical rendering engines; it is used
in programs that do line breaking, such as 'fold':
  coreutils/src/fold.c:178:  if (isblank (to_uchar (line_out[logical_end])))
For this reason, isblank(U+00A0) *must* return false. Otherwise many programs
would treat it like U+0020 SPACE and break lines at it.
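
To make the consequence concrete, here is a minimal sketch, not the gnulib or
coreutils code. The names blank_sketch, last_break_opportunity and
check_sketch are hypothetical, and the sample bytes assume ISO-8859-1, where
0xA0 encodes U+00A0. It mirrors the substitute quoted above and shows how a
fold-like scan for the last break opportunity skips the no-break space:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical name; same test as the substitute quoted from lib/isblank.c. */
static int
blank_sketch (int c)
{
  return (c == ' ' || c == '\t');
}

/* Hypothetical helper: index of the last blank in buf[0..len), or -1 if
   there is none.  A fold-like program would break the line at that index. */
static long
last_break_opportunity (const unsigned char *buf, size_t len)
{
  size_t i = len;
  while (i > 0)
    {
      i--;
      if (blank_sketch (buf[i]))
        return (long) i;
    }
  return -1;
}

/* Small self-check over the bytes of "10<NBSP>km wide" in ISO-8859-1. */
static void
check_sketch (void)
{
  static const unsigned char text[] =
    { '1', '0', 0xA0, 'k', 'm', ' ', 'w', 'i', 'd', 'e' };

  /* The no-break space is not blank, so it is never a break candidate.  */
  assert (!blank_sketch (0xA0));

  /* The only break opportunity is the ordinary space at index 5.  */
  assert (last_break_opportunity (text, sizeof text) == 5);

  /* Looking only at "10<NBSP>km" (first 5 bytes): no place to break.  */
  assert (last_break_opportunity (text, 5) == -1);
}
```

If blank_sketch returned true for 0xA0, the second scan would report index 2
and "10" could be separated from "km", which is exactly what a no-break space
is meant to prevent.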

Bruno
