bug-coreutils

bug#54388: printf doesn't handle multi-byte values


From: Pádraig Brady
Subject: bug#54388: printf doesn't handle multi-byte values
Date: Mon, 14 Mar 2022 15:38:23 +0000
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:97.0) Gecko/20100101 Thunderbird/97.0

On 14/03/2022 03:27, Christoph Anton Mitterer wrote:
Hey Pádraig.

I just wanted to ask whether the following could be a bug in printf:

POSIX says [0] that, e.g.:
    printf '%d\n' \"3
should give the numeric value of the character, and that "in a locale
with multi-byte characters, the value of a character is intended to be
the value of the equivalent of the wchar_t representation of the
character".

In bash:
$ printf '%d\n' $'"\u2208'
8712

Here printf is bash's built-in, and it works as expected.


But using GNU coreutils' printf (version 8.32):
$ /usr/bin/printf '%d\n' $'"\u2208'
/usr/bin/printf: warning: ��: character(s) following character constant have been ignored
226


Do I have some wrong assumptions or should I report that as a bug?


Thanks,
Chris.


[0] https://pubs.opengroup.org/onlinepubs/9699919799/utilities/printf.html

This is a limitation of coreutils printf, which currently handles only
single-byte characters.
This email will open an issue in our bug tracker.

To summarize:
$ ord() { printf "0x%x\n" "'$1"; }  # bash's printf
$ ord 3
0x33
$ ord $'\u2208'
0x2208

$ ord() { env printf "0x%x\n" "'$1"; }  # coreutils' printf
$ ord 3
0x33
$ ord $'\u2208'
0xprintf: warning: ��: character(s) following character constant have been ignored
e2
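
The e2 (decimal 226) above is the first byte of the UTF-8 encoding of
U+2208, which is e2 88 88; the two trailing 88 bytes are what the warning
reports as ignored. Until printf handles multi-byte characters, the code
point can be recovered by decoding the bytes by hand. The following is
only a rough sketch: utf8_ord is a hypothetical helper name (not part of
coreutils or bash), it assumes the argument is UTF-8, and it does not
validate continuation bytes.

```shell
# utf8_ord: hypothetical helper, prints the code point of the first
# UTF-8 encoded character of $1 in hex. Assumes UTF-8 input and does
# not check that continuation bytes have the form 10xxxxxx.
# Note: b0, cp and n are plain (global) shell variables.
utf8_ord() {
    # Replace the positional parameters with the argument's bytes,
    # dumped by od as unsigned decimal numbers.
    set -- $(printf '%s' "$1" | od -An -v -tu1)
    [ "$#" -gt 0 ] || return 1          # empty argument

    b0=$1
    if   [ "$b0" -lt 128 ]; then cp=$b0;           n=0   # 0xxxxxxx
    elif [ "$b0" -ge 192 ] && [ "$b0" -lt 224 ]; then
                                 cp=$((b0 & 31)); n=1   # 110xxxxx
    elif [ "$b0" -ge 224 ] && [ "$b0" -lt 240 ]; then
                                 cp=$((b0 & 15)); n=2   # 1110xxxx
    elif [ "$b0" -ge 240 ] && [ "$b0" -lt 248 ]; then
                                 cp=$((b0 & 7));  n=3   # 11110xxx
    else
        return 1                        # not a valid lead byte
    fi

    # Fold in the low 6 bits of each continuation byte.
    while [ "$n" -gt 0 ]; do
        shift
        [ "$#" -gt 0 ] || return 1      # truncated sequence
        cp=$(( (cp << 6) | ($1 & 63) ))
        n=$((n - 1))
    done

    printf '0x%x\n' "$cp"
}

# The bytes of U+2208 are written in octal here so that the example
# does not depend on the shell's $'\u...' support or on the locale.
utf8_ord 3
utf8_ord "$(printf '\342\210\210')"
```

With the two calls at the end, this should print 0x33 and 0x2208, matching
what bash's built-in printf reports in a UTF-8 locale.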

cheers,
Pádraig




