From: Richard Hansen
Subject: bug#55777: [PATCH] Improve documentation of `string-to-multibyte', `string-to-unibyte'
Date: Sun, 5 Jun 2022 22:00:35 -0400
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Thunderbird/91.9.1

On 6/5/22 01:37, Eli Zaretskii wrote:
> Could you please state what is confusing in the current wording?

* "Raw 8-bit bytes" isn't really defined. It's mentioned earlier in
the chapter -- the term is even in a @dfn{} -- but there's no
definition there.
* The term "raw 8-bit bytes" is misleading. It suggests binary data
(bytes with values 0-255) but it's actually meant to only cover
128-255.
* The term "raw 8-bit bytes" is not used consistently. Sometimes "8"
is spelled out as "eight", sometimes "raw" comes after "8-bit",
and sometimes it refers to all byte values 0-255 (see the first
sentence under `@cindex unibyte text`).
* It's not clear whether "raw 8-bit bytes" is meant to refer to
bytes with values 128-255, or to the *characters* that map to
those byte values.
* The following phrasing is weird: "The function assumes that
@var{string} includes ASCII characters and raw 8-bit bytes". The
purpose of "raw 8-bit bytes" is to cover non-ASCII byte values, so
by definition that assumption is always true. By saying "the
function assumes", the reader is left wondering about the cases
where that assumption is not true, which in turn causes the reader
to question whether "raw 8-bit bytes" fully covers non-ASCII byte
values, which in turn causes the reader to wonder how to handle
those non-covered values (whatever they are).
Maybe something like this:
    By definition, unibyte strings contain only @acronym{ASCII}
    characters (bytes with values 0-127) and raw 8-bit bytes
    (bytes with values 128-255); the latter are converted to their
    corresponding multibyte representations in the
    @code{eight-bit} character set (@pxref{Text Representations,
    codepoints}).
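
To make the byte-versus-character distinction concrete, here is a small sketch that could be evaluated in *scratch*. It is only my illustration and not part of the patch: the byte values 97 and 255 are arbitrary, and the results in the comments are what I would expect given that raw bytes 128-255 map to codepoints #x3FFF80 through #x3FFFFF.

```elisp
;; A unibyte string holding one ASCII byte (97, i.e. "a") and one
;; byte above 127 (255).  The values are arbitrary examples.
(let* ((uni (unibyte-string 97 255))
       (multi (string-to-multibyte uni)))
  (list (multibyte-string-p uni)         ; nil: still unibyte
        (multibyte-string-p multi)       ; t: now multibyte
        (aref uni 1)                     ; 255, the byte value itself
        (aref multi 0)                   ; 97, ASCII is unchanged
        (aref multi 1)                   ; #x3FFFFF, the raw-byte character
        (char-charset (aref multi 1))    ; eight-bit
        ;; Converting back recovers the original bytes.
        (equal (string-to-unibyte multi) uni)))
```

The point of the example is that 255 (a byte value) and #x3FFFFF (the character that represents that byte in a multibyte string) are two different things, which is exactly what the current wording blurs.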