coreutils

Re: performance bug of `wc -m`


From: Bruno Haible
Subject: Re: performance bug of `wc -m`
Date: Mon, 21 May 2018 01:43:01 +0200
User-agent: KMail/5.1.3 (Linux/4.4.0-124-generic; KDE/5.18.0; x86_64; ; )

Kaz Kylheku wrote in 
https://lists.gnu.org/archive/html/coreutils/2018-05/msg00036.html :

> In what situation are there printable characters in the range [0, UCHAR_MAX)
> that have a width > 1?

That's the wrong question. The question is which printable characters in this
range have a width other than 1, i.e. width > 1 or <= 0.

The program below shows that the answer (on a glibc system) is:
The character 0x00AD (= SOFT HYPHEN) is printable but has width == 0.

But if you constrain yourself to the range [0x00, 0x7F], i.e. the ASCII range,
then the "not printable" condition is equivalent to "width <= 0".

Bruno


======================================================================
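#define _GNU_SOURCE 1   /* needed for the wcwidth declaration in <wchar.h> */
#include <locale.h>
#include <stdio.h>
#include <wchar.h>
#include <wctype.h>

int
main (void)
{
  unsigned int i;

  /* iswprint and wcwidth depend on LC_CTYPE; stop if the UTF-8 locale
     is not installed, since the output would then describe the "C" locale.  */
  if (setlocale (LC_ALL, "en_US.UTF-8") == NULL)
    {
      fprintf (stderr, "cannot set locale en_US.UTF-8\n");
      return 1;
    }

  /* Columns: code point, printable (0 or 1), width.  */
  for (i = 0; i < 0x100; i++)
    printf ("0x%02x  %d %d\n", i, iswprint (i) != 0, wcwidth (i));

  return 0;
}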



