cut with multibyte support for delimiter
From: Sebastian Kisela
Subject: cut with multibyte support for delimiter
Date: Mon, 18 Sep 2017 16:25:56 +0200
Hi,
based on the discussion at:
https://lists.gnu.org/archive/html/coreutils/2017-08/msg00029.html
I implemented multibyte delimiter support for cut (cut -d'\unicode' -f ).
The delimiter is handled as a plain string (char *) in a separate, newly
added function, which avoids the compatibility issues of "wchar_t".
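To illustrate the general idea (this is not the code from the attached patch, only a minimal sketch): a multibyte delimiter can be matched as a byte string, because in valid UTF-8 text the encoding of one character never appears inside the encoding of another. The helper name find_field and its signature are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not taken from the patch.
   Locate field number WANT (1-based) in LINE, where fields are separated
   by the byte string DELIM (for example one multibyte UTF-8 character).
   On success, store the field length in *LEN and return a pointer to the
   start of the field; return NULL if the line has fewer fields or the
   arguments are unusable.  */
static const char *
find_field (const char *line, const char *delim, size_t want, size_t *len)
{
  size_t dlen = strlen (delim);
  if (dlen == 0 || want == 0)
    return NULL;

  const char *start = line;
  for (size_t n = 1; ; n++)
    {
      /* Byte-wise search; no conversion to wchar_t is needed.  */
      const char *end = strstr (start, delim);
      if (n == want)
        {
          *len = end ? (size_t) (end - start) : strlen (start);
          return start;
        }
      if (!end)
        return NULL;            /* Ran out of fields.  */
      start = end + dlen;       /* Skip the whole multibyte delimiter.  */
    }
}
```

With the euro sign (bytes E2 82 AC) as delimiter, asking for field 2 of "a€b€c" yields a pointer to "b" with length 1. A line that contains no delimiter at all is treated here as a single field.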
I have not added any sophisticated error checking of the delimiter
value, since a wrong value simply passes the input through to the output
"as-is". The only checks are that no more than one delimiter is given and
that the current locale is UTF-8.
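As a rough sketch of what these two checks could look like (again an illustration under my own assumptions, not the patch itself): the names utf8_seq_len, is_single_utf8_char, codeset_is_utf8 and locale_is_utf8 are hypothetical, and the accepted codeset spellings are an assumption, since nl_langinfo (CODESET) is not spelled identically on every system.

```c
#include <langinfo.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Length in bytes of the UTF-8 sequence announced by lead byte C,
   or 0 if C cannot start a sequence.  */
static size_t
utf8_seq_len (unsigned char c)
{
  if (c < 0x80)
    return 1;
  if (c >= 0xC2 && c <= 0xDF)
    return 2;
  if (c >= 0xE0 && c <= 0xEF)
    return 3;
  if (c >= 0xF0 && c <= 0xF4)
    return 4;
  return 0;
}

/* True if DELIM holds exactly one UTF-8 encoded character: a lead byte
   followed by the right number of continuation bytes and nothing else.
   This is a structural check only; it does not reject every ill-formed
   sequence (overlong forms or surrogates, for instance).  */
static bool
is_single_utf8_char (const char *delim)
{
  size_t n = utf8_seq_len ((unsigned char) delim[0]);
  if (n == 0 || strlen (delim) != n)
    return false;
  for (size_t i = 1; i < n; i++)
    if (((unsigned char) delim[i] & 0xC0) != 0x80)
      return false;
  return true;
}

/* True if CODESET names UTF-8.  The list of spellings is an assumption.  */
static bool
codeset_is_utf8 (const char *codeset)
{
  return (strcmp (codeset, "UTF-8") == 0 || strcmp (codeset, "utf-8") == 0
          || strcmp (codeset, "UTF8") == 0 || strcmp (codeset, "utf8") == 0);
}

/* Check the current locale; the caller must already have called
   setlocale (LC_ALL, "").  */
static bool
locale_is_utf8 (void)
{
  return codeset_is_utf8 (nl_langinfo (CODESET));
}
```

Here a delimiter such as "::" or a truncated sequence is rejected, while ":" and "€" are accepted.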
For now I decided to accept a multibyte delimiter only in UTF-8 locales,
as I agree with Assaf and Pádraig that having UTF-8 support is a better
option than the current state.
I added some tests, adapted from the existing ones in 'cut.pl'.
I will be thankful for any feedback.
Enjoy the week!
Sebastián.
Attachment: cut-mb-delim-support.tar.gz
Description: GNU Zip compressed data