multibyte support (round 3)
From: Assaf Gordon
Subject: multibyte support (round 3)
Date: Mon, 19 Sep 2016 02:11:29 -0400
Hello,
Updated patch attached.
Improvements from last time (
http://lists.gnu.org/archive/html/coreutils/2016-09/msg00011.html ):
1. 'multibyte' and 'mbbuffer' are now in gl/ and behave more like gnulib modules.
Tests cover all items mentioned in Markus Kuhn's UTF-8 decoder page
(https://www.cl.cam.ac.uk/~mgk25/ucs/examples/UTF-8-test.txt).
2. cygwin/UTF-16 surrogates are handled transparently in 'mbbuffer'.
Applications under cygwin see 'ucs4_t' and don't need to worry about surrogates
(although wcwidth() will still present some problems). Tests ensure that parsing
under cygwin behaves as it does on other systems.
3. 'cut' supports multibyte '-c' and '-n -b' (but not multibyte '-d' yet).
Some tests included.
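
To illustrate item 2: a minimal sketch of how a pair of UTF-16 surrogates can be folded into a single 32-bit code point, so that callers only ever see full code points. This is not code from the patch. The helper names (is_high_surrogate, is_low_surrogate, combine_surrogates, utf16_to_ucs4) are hypothetical, plain uint32_t stands in for 'ucs4_t', and replacing unpaired surrogates with U+FFFD is an assumption of the sketch rather than documented 'mbbuffer' behavior.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helpers for illustration; none of these names come
   from the patch.  */
static int
is_high_surrogate (uint16_t u)
{
  return 0xD800 <= u && u <= 0xDBFF;
}

static int
is_low_surrogate (uint16_t u)
{
  return 0xDC00 <= u && u <= 0xDFFF;
}

/* Combine a valid high/low surrogate pair into one code point in the
   range U+10000..U+10FFFF.  */
static uint32_t
combine_surrogates (uint16_t hi, uint16_t lo)
{
  assert (is_high_surrogate (hi) && is_low_surrogate (lo));
  return 0x10000u + ((uint32_t) (hi - 0xD800) << 10)
         + (uint32_t) (lo - 0xDC00);
}

/* Decode N UTF-16 units from IN into OUT, which must have room for N
   entries.  Return the number of code points written.  Unpaired
   surrogates are replaced with U+FFFD (an assumption of this sketch).  */
static size_t
utf16_to_ucs4 (const uint16_t *in, size_t n, uint32_t *out)
{
  size_t j = 0;
  for (size_t i = 0; i < n; i++)
    {
      if (is_high_surrogate (in[i]) && i + 1 < n
          && is_low_surrogate (in[i + 1]))
        {
          out[j++] = combine_surrogates (in[i], in[i + 1]);
          i++;                  /* Skip the low surrogate just consumed.  */
        }
      else if (is_high_surrogate (in[i]) || is_low_surrogate (in[i]))
        out[j++] = 0xFFFD;      /* Unpaired surrogate.  */
      else
        out[j++] = in[i];       /* Ordinary BMP code point.  */
    }
  return j;
}
```

For example, the units D83D DE00 decode to the single code point U+1F600, which is the value an application would then process on any platform.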
Comments welcome,
- assaf
multibyte-2016-09-19.patch.xz
Description: Binary data