Re: UTF-8 printf string formating problem

From: Chris Down
Subject: Re: UTF-8 printf string formating problem
Date: Sun, 6 Apr 2014 13:42:47 +0800
User-agent: Mutt/1.5.23 (2014-03-12)

Jan Novak writes:
> printf string format counts bytes instead of chars, which leads to
> broken output

According to POSIX, printf's field width control is strictly in bytes,
not characters.[0]

> field width:
>     An optional string of decimal digits to specify a minimum field
>     width.  For an output field, if the converted value has fewer
>     bytes than the field width, it shall be padded on the left (or
>     right, if the left-adjustment flag ( '-' ), described below, has
>     been given) to the field width.

By that definition, this is expected behaviour. You will also find this
behaviour in pretty much any POSIX-y tool that uses format strings
(mawk/gawk do it).
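
A minimal sketch of the effect, assuming a shell whose printf counts bytes
as the POSIX text describes (bash and dash do, as far as I know). The
variable names are only for illustration:

```shell
# "café" with the é spelled as its two UTF-8 bytes (octal 303 251),
# so the example does not depend on how this message is encoded.
s=$(printf 'caf\303\251')

# The value is 4 characters long, but 5 bytes long.
bytes=$(( $(printf '%s' "$s" | wc -c) ))
printf 'bytes: %d\n' "$bytes"

# Minimum field width 6, left-adjusted; brackets make the padding visible.
# A byte-counting printf adds 1 pad byte (6 - 5), not the 2 spaces
# that a character count (6 - 4) would call for.
padded=$(printf '[%-6s]' "$s")
ascii=$(printf '[%-6s]' "cafe")
printf '%s\n%s\n' "$padded" "$ascii"
```

On a terminal in a UTF-8 locale, the first bracketed line should then come
out one column narrower than the second, which is the misalignment being
reported.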

I don't have much of an opinion on whether this behaviour is right or
wrong in the context of bash, but if this behaviour is changed, I think
it should be done under another format character, rather than changing
%s (or changing behaviour when not in POSIX-compliance mode).

0: http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap05.html
