
Re: Alignment bug in ls with UTF-8 filenames under Mac OS X

From: Bruno Haible
Subject: Re: Alignment bug in ls with UTF-8 filenames under Mac OS X
Date: Wed, 17 Jan 2007 23:01:52 +0100

Eric Blake wrote:
> coreutils does not handle multi-byte locales well.


> The problem is that no one has yet written a patch that makes it
> easy to handle multibyte locales without penalizing single-byte locales.

There are patches for multibyte locale support for many of the text
utilities, written in 2001. They are based on the mbchar and mbiter
modules that are now in gnulib. But regardless of how they were
written, Jim preferred not to use them:

  - If the code used multibyte functions always, it was too much of a
    slowdown compared to the older implementation that worked only for
    unibyte locales. Everyone agreed on this.

  - If the code used a runtime test of the form

      if (MB_CUR_MAX > 1)
        ... code which uses mb* functions ...
      else
        ... unibyte code ...

    Jim objected that there was too much code duplication between the
    multibyte and the unibyte branch.

  - If the code used macros that can expand to multibyte or unibyte
    primitives, depending on the situation, one could put the code
    that uses these macros into a separate file, say,
    fold-subroutines.h, and in the main fold.c write

       #define DO_MULTIBYTE 1
       #include "fold-subroutines.h" /* defines fold_multibyte */
       #undef DO_MULTIBYTE

       #define DO_UNIBYTE 1
       #include "fold-subroutines.h" /* defines fold_unibyte */
       #undef DO_UNIBYTE

       if (MB_CUR_MAX > 1)
         fold_multibyte (...);
       else
         fold_unibyte (...);

    Here Jim said that it was too many macros for him.
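
To make the third approach concrete, here is a minimal single-file
sketch. It is not taken from the 2001 patches: instead of including a
header twice, it uses one function-generating macro, and it counts
characters rather than folding lines. The names DEFINE_COUNT_CHARS,
next_char_unibyte, next_char_multibyte and count_chars are hypothetical
and exist only for this illustration. The loop body is written once, and
the dispatch on MB_CUR_MAX happens once per call, not once per character.

```c
#include <assert.h>
#include <stdlib.h>   /* MB_CUR_MAX */
#include <string.h>   /* memset */
#include <wchar.h>    /* mbrtowc, mbstate_t */

/* Hypothetical primitive, unibyte case: every character is one byte.  */
static size_t
next_char_unibyte (const char *p, size_t avail, mbstate_t *state)
{
  (void) p;
  (void) avail;
  (void) state;
  return 1;
}

/* Hypothetical primitive, multibyte case: ask mbrtowc how many bytes the
   next character occupies.  Invalid or incomplete sequences, and the NUL
   character, are treated as a single byte (an assumption of this sketch).  */
static size_t
next_char_multibyte (const char *p, size_t avail, mbstate_t *state)
{
  size_t n = mbrtowc (NULL, p, avail, state);
  if (n == (size_t) -1 || n == (size_t) -2)
    {
      memset (state, 0, sizeof *state);   /* reset after an error */
      return 1;
    }
  return n == 0 ? 1 : n;
}

/* One loop body, instantiated twice with different primitives.  */
#define DEFINE_COUNT_CHARS(NAME, NEXT_CHAR)                     \
  static size_t                                                 \
  NAME (const char *s, size_t len)                              \
  {                                                             \
    mbstate_t state;                                            \
    size_t count = 0;                                           \
    size_t i = 0;                                               \
    assert (s != NULL);                                         \
    memset (&state, 0, sizeof state);                           \
    while (i < len)                                             \
      {                                                         \
        i += NEXT_CHAR (s + i, len - i, &state);                \
        count++;                                                \
      }                                                         \
    return count;                                               \
  }

DEFINE_COUNT_CHARS (count_chars_unibyte, next_char_unibyte)
DEFINE_COUNT_CHARS (count_chars_multibyte, next_char_multibyte)

/* Choose the variant once per call, based on the current locale.  */
size_t
count_chars (const char *s, size_t len)
{
  if (MB_CUR_MAX > 1)
    return count_chars_multibyte (s, len);
  else
    return count_chars_unibyte (s, len);
}
```

A caller would normally invoke setlocale (LC_ALL, "") first; without it
the program stays in the "C" locale, where MB_CUR_MAX is 1 and the
unibyte variant is chosen. The unibyte variant pays no mbrtowc cost and
the loop is not duplicated in the source, but the sketch still leans on
a macro, which is exactly the objection raised above.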

There has been no progress since then, because no one sees how to
satisfy all three of Jim's requirements simultaneously:

  - Good speed for the unibyte case.
  - No code duplication.
  - No macros.

