coreutils

cut with multibyte support for delimiter


From: Sebastian Kisela
Subject: cut with multibyte support for delimiter
Date: Mon, 18 Sep 2017 16:25:56 +0200

Hi,

based on the discussion at:
https://lists.gnu.org/archive/html/coreutils/2017-08/msg00029.html

I implemented multibyte delimiter support for cut (cut -d'\unicode' -f).
The delimiter is handled as a string (char *) in a separate, newly added
function, to avoid compatibility issues with "wchar_t".
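
To illustrate the char * approach, here is a minimal sketch of splitting on a multibyte delimiter by treating it as a byte string. It is not taken from the attached patch: the function name find_field and its signature are hypothetical. It relies on the fact that UTF-8 is self-synchronizing, so a byte-wise strstr() for a valid UTF-8 character cannot match in the middle of another character in valid UTF-8 input.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not from the patch: locate field N (1-based) of
   the NUL-terminated LINE, where fields are separated by the byte
   string DELIM.  Store the field length in *LEN and return a pointer
   to its first byte, or return NULL if LINE has fewer than N fields.  */
static const char *
find_field (const char *line, const char *delim, size_t n, size_t *len)
{
  size_t dlen = strlen (delim);
  const char *start = line;

  if (n == 0 || dlen == 0)
    return NULL;

  for (size_t i = 1; ; i++)
    {
      /* Byte-wise search; safe for UTF-8 because the encoding is
         self-synchronizing.  */
      const char *end = strstr (start, delim);

      if (i == n)
        {
          *len = end ? (size_t) (end - start) : strlen (start);
          return start;
        }
      if (!end)
        return NULL;
      start = end + dlen;   /* Skip the whole multibyte delimiter.  */
    }
}
```

If the delimiter never occurs in the line, field 1 is the whole line, which corresponds to the input being passed through unchanged.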

I have not added any sophisticated error checking of the delimiter
value, since a wrong value simply leaves the input "as-is" on the
output. The only checks are that a single delimiter was given and that
the current locale is UTF-8.
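
The two checks could look roughly like the sketch below. This is an assumption about one possible approach, not the code in the attachment: all three helper names are hypothetical, the locale test uses POSIX nl_langinfo(CODESET), and the delimiter test only inspects the UTF-8 lead and continuation bytes (it does not reject overlong forms or surrogates).

```c
#include <langinfo.h>
#include <stdbool.h>
#include <stddef.h>
#include <strings.h>

/* Hypothetical helper: true if CODESET names the UTF-8 encoding.  */
static bool
codeset_is_utf8 (const char *codeset)
{
  return (strcasecmp (codeset, "UTF-8") == 0
          || strcasecmp (codeset, "utf8") == 0);
}

/* Hypothetical helper: true if the current locale uses UTF-8.
   Assumes the program already called setlocale (LC_ALL, "").  */
static bool
locale_is_utf8 (void)
{
  return codeset_is_utf8 (nl_langinfo (CODESET));
}

/* Hypothetical helper: true if S holds exactly one UTF-8 encoded
   character.  Only the lead byte and the continuation bytes are
   checked.  */
static bool
is_single_utf8_char (const char *s)
{
  const unsigned char *p = (const unsigned char *) s;
  size_t n;

  if (p[0] == 0)
    return false;                     /* Empty delimiter.  */
  if (p[0] < 0x80)
    n = 1;                            /* ASCII.  */
  else if ((p[0] & 0xE0) == 0xC0)
    n = 2;
  else if ((p[0] & 0xF0) == 0xE0)
    n = 3;
  else if ((p[0] & 0xF8) == 0xF0)
    n = 4;
  else
    return false;                     /* Invalid lead byte.  */

  /* A NUL byte fails this test, so we never read past the end.  */
  for (size_t i = 1; i < n; i++)
    if ((p[i] & 0xC0) != 0x80)
      return false;

  return p[n] == '\0';                /* Nothing may follow.  */
}
```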

For now I decided to accept only UTF-8 locales when dealing with a
multibyte delimiter, as I agree with Assaf and Pádraig that having UTF-8
support is a better option than the current state.

I added some tests, adapted from the existing ones in 'cut.pl'.

I will be thankful for any feedback.

Enjoy the week!
Sebastián.

Attachment: cut-mb-delim-support.tar.gz
Description: GNU Zip compressed data

