Re: BUG? RFE? printf lacking unicode support in multiple areas

From: Linda Walsh
Subject: Re: BUG? RFE? printf lacking unicode support in multiple areas
Date: Fri, 20 May 2011 15:03:25 -0700
User-agent: Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv: Thunderbird/ Mnenhy/


Greg Wooledge wrote:
On Fri, May 20, 2011 at 12:31:31AM -0700, Linda Walsh wrote:
1) use of the \uXXXX and \UXXXXXXXX escape sequences
in the format string (16 and 32 bit Unicode values).

This isn't even a sentence.  What bash command did you execute, and
what did it do, and what did you expect it to do?
        Um...maybe what it does in 4.2?

Even if you had correctly used the $'...' syntax, $'\x3c\x20' is NOT
how you encode U+203C.
 Nor does it have anything to do with %lc,
Your information is invalid.

%lc uses wide chars ('wchar_t' or 'wint_t'). These are 16 bits on Windows and Cygwin, and 32 bits with glibc.

wchar_t is also defined as 'utf16' (as a type in the include header files
on linux).   That means from the page you so graciously point to:


one would use the UTF-16 value...which is..um...gee, let's see:
0x203c.  Gosh, what'ya know!

the UTF-8 encoding of U+203C is E2 80 BC.

Which has nothing to do with the data input taken by the %lc format.

If your terminal encoding is set to UTF-8, it SHOULD output UTF-8 -- a multibyte string is specified as the output.

address@hidden:/var/tmp/bash/bash-4.2$ a=$'\xe2\x80\xbc'; printf '%s\n' "$a"

Here the ? is the !! character being pasted across machines into my
vim window where I'm writing this email.  But trust me, it worked.
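Since the pasted character did not survive, a byte dump is a way to check the same thing without relying on the terminal or the mail client. This is only a sketch: hex_bytes is a made-up helper name, and the octal escapes \342\200\274 are the same three bytes as $'\xe2\x80\xbc', written that way so it also runs in a plain POSIX sh.

```shell
# Hypothetical helper (not part of bash): print the bytes of its
# argument as lowercase hex, with no separators.
hex_bytes() {
    printf '%s' "$1" | od -An -tx1 | tr -d ' \n'
}

# \342\200\274 (octal) is E2 80 BC (hex), the UTF-8 encoding of U+203C.
a=$(printf '\342\200\274')

hex_bytes "$a"
echo
# prints: e280bc
```

In a bash that understands \u in a UTF-8 locale, running hex_bytes on the output of printf '\u203c' would be expected to give the same three bytes.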

The gnu version of printf handles the \uXXXX and \UXXXXXXXX
version, but doesn't appear to handle the "%lc" format specifier.

What's that got to do with bash?

Gee, I dunno, maybe because it wasn't in my bash, and when I did a man of
printf, it showed me those formats, so I tried them with printf as my
first test?   Normally bash's builtin utils follow the same conventions
as the ones that are not builtin...but you think Bash following
such standards is unreasonable?

What does \u have to do with %lc?
Not much -- except that a wide char of 0x203c output using %lc
should output the same multi-byte char as \u203c.
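The arithmetic behind that expectation can be sketched in a few lines of shell. The function name utf8_hex3 is made up for illustration, and it only covers code points from U+0800 to U+FFFF (the three-byte UTF-8 range); it shows how the value 0x203c maps to bytes, not how any particular printf implements %lc or \u.

```shell
# Hypothetical helper: print the three-byte UTF-8 encoding (as hex) of a
# code point in the range U+0800..U+FFFF. Other ranges are not handled.
utf8_hex3() {
    cp=$(( $1 ))
    b1=$(( 0xE0 | (cp >> 12) ))           # 1110xxxx: top 4 bits
    b2=$(( 0x80 | ((cp >> 6) & 0x3F) ))   # 10xxxxxx: middle 6 bits
    b3=$(( 0x80 | (cp & 0x3F) ))          # 10xxxxxx: low 6 bits
    printf '%02x%02x%02x\n' "$b1" "$b2" "$b3"
}

utf8_hex3 0x203c
# prints: e280bc
```

So whichever way the value 0x203c is supplied, the multibyte output in a UTF-8 locale should be the bytes E2 80 BC.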

Did you get out of the wrong side of the bed? Your response drips with unnecessary hostility.
