emacs-bug-tracker

From: GNU bug Tracking System
Subject: bug#47702: closed (wc man page: first you are talking about bytes, then you are talking about characters)
Date: Sun, 11 Apr 2021 15:51:02 +0000

Your message dated Sun, 11 Apr 2021 16:50:35 +0100
with message-id <e2727073-311a-5a02-e093-cd397a12611d@draigBrady.com>
and subject line Re: bug#47702: wc man page: first you are talking about bytes, 
then you are talking about characters
has caused the debbugs.gnu.org bug report #47702,
regarding wc man page: first you are talking about bytes, then you are talking 
about characters
to be marked as done.

(If you believe you have received this mail in error, please contact
help-debbugs@gnu.org.)


-- 
47702: http://debbugs.gnu.org/cgi/bugreport.cgi?bug=47702
GNU Bug Tracking System
Contact help-debbugs@gnu.org with problems
--- Begin Message ---
Subject: wc man page: first you are talking about bytes, then you are talking about characters
Date: Sun, 11 Apr 2021 09:42:57 +0800

Man wc says

       Print newline, word, and byte counts for each FILE, and a total line if
       more than one FILE is specified.  A word is a non-zero-length  sequence
       of characters delimited by white space.

first you are talking about bytes, then you are talking about
characters.

So for the latter, please say
characters (not bytes)
or
characters (same as bytes)
or just
bytes
Yes, even if explained in the INFO file.
Thanks.



--- End Message ---
--- Begin Message ---
Subject: Re: bug#47702: wc man page: first you are talking about bytes, then you are talking about characters
Date: Sun, 11 Apr 2021 16:50:35 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:84.0) Gecko/20100101 Thunderbird/84.0

On 11/04/2021 02:42, 積丹尼 Dan Jacobson wrote:
> Man wc says
>
>        Print newline, word, and byte counts for each FILE, and a total line if
>        more than one FILE is specified.  A word is a non-zero-length  sequence
>        of characters delimited by white space.
>
> first you are talking about bytes, then you are talking about
> characters.
>
> So for the latter, please say
> characters (not bytes)
> or
> characters (same as bytes)
> or just
> bytes
> Yes, even if explained in the INFO file.

You're right that this is under-specified,
in both the man page and the info file.
The word definition above really does mean characters (not bytes).
In fact, as a GNU extension, it means printable characters.
POSIX does not specify this, but one can confirm the GNU behavior like so:


$ printf '\xc3 \xc3' | LC_ALL=C wc --words --chars --bytes
      0       3       3
$ printf '\xc3 \xc3' | LC_ALL=C.utf8 wc --words --chars --bytes
      0       1       3
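
For comparison, here is a minimal sketch of the same check on input that
does contain printable characters. The helper name `counts` is made up
for illustration and is not part of coreutils. It assumes a wc that
accepts -w, -m and -c together, and the second call assumes a locale
named C.UTF-8 is installed. If it is not, wc will likely fall back to
byte semantics. wc prints the counts in the fixed order words, chars,
bytes.

```shell
#!/bin/sh
# counts LOCALE: read stdin and print "words chars bytes" on one line.
# Hypothetical helper for illustration only.
counts() {
    # Run wc under the requested locale, then squeeze the column
    # padding down to single spaces and drop the leading one.
    LC_ALL=$1 wc -w -m -c | tr -s ' ' | sed 's/^ //'
}

# Plain ASCII: every byte is one printable character, so the character
# and byte counts agree.
printf 'two words' | counts C        # prints "2 9 9"

# Octal escapes \303\251 are the UTF-8 encoding of a single character.
# In a UTF-8 locale the character count should come out lower than the
# byte count; the exact result depends on the installed locales.
printf '\303\251 x' | counts C.UTF-8
```

Octal escapes are used here rather than \x because POSIX printf only
guarantees the octal form.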

The info file was really quite under-specified in this regard.
I'll apply the attached to clarify things.
Marking this as done.

thanks!
Pádraig

Attachment: wc-clarify-counts.patch
Description: Text Data


--- End Message ---
