bug#40407: [PATCH] slow ENCODE_FILE and DECODE_FILE
From: Eli Zaretskii
Subject: bug#40407: [PATCH] slow ENCODE_FILE and DECODE_FILE
Date: Sun, 05 Apr 2020 16:28:13 +0300
> From: Mattias Engdegård <mattiase@acm.org>
> Date: Sun, 5 Apr 2020 12:14:59 +0200
> Cc: 40407@debbugs.gnu.org
>
> > I think in the use case where we return a copy, we should make sure
> > the return value is unibyte when encoding and multibyte when decoding.
>
> I'm not necessarily opposed to the suggestion, but why not return a unibyte
> string in both cases, simplifying the code?
For compatibility with what happens now:
  (multibyte-string-p (decode-coding-string "abc" 'utf-8)) => t
> In addition, some operations (aref) are faster on unibyte. Either way, it's
> nothing that a caller could rely on, is there? (In particular when taking
> NOCOPY into account.)
That is true, of course, but many/most of our strings are multibyte
nowadays, even if they are ASCII. Suddenly getting a unibyte string
instead would be surprising, I think, even if no one should depend on
it not happening. (NOCOPY case is different: then it's the caller's
responsibility to deal with the issue.) So I'd rather we produced a
multibyte string when "decoding" by copying.
> +/* Whether a (unibyte) string only contains chars in the 0..127 range. */
One subtle point regarding this comment: I'd remove the "unibyte"
part, because (1) you apply this test to multibyte strings as well,
and (2) strings encoded in iso-2022 will look "pure-ASCII", but they
aren't. The latter subtlety doesn't interfere with the caller,
because iso-2022 is not ASCII-compatible, but it's something I'd
mention in the comment, lest someone use this function for some
other purpose.
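To illustrate the iso-2022 point, here is a minimal sketch in C. It is not the function from the patch: bytes_all_7bit is a hypothetical helper, and the sample byte arrays are assumptions chosen for illustration. The ISO-2022-JP bytes for "あ" (ESC $ B, 0x24 0x22, ESC ( B) all fall in the 0..127 range, so a byte-range test accepts them even though the text is not ASCII.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper, not the function from the patch: return true
   if every byte in BUF[0..LEN) is in the 0..127 range.  This says
   nothing about whether the bytes *mean* ASCII text.  */
static bool
bytes_all_7bit (const unsigned char *buf, size_t len)
{
  assert (buf != NULL || len == 0);
  for (size_t i = 0; i < len; i++)
    if (buf[i] & 0x80)
      return false;
  return true;
}

/* "あ" encoded in ISO-2022-JP: ESC $ B switches to JIS X 0208,
   0x24 0x22 is the character, ESC ( B switches back to ASCII.
   Every byte is below 128.  */
static const unsigned char iso2022jp_a[] =
  { 0x1B, 0x24, 0x42, 0x24, 0x22, 0x1B, 0x28, 0x42 };

/* "é" encoded in UTF-8: both bytes have the high bit set.  */
static const unsigned char utf8_e_acute[] = { 0xC3, 0xA9 };
```

With these definitions, bytes_all_7bit returns true for iso2022jp_a and false for utf8_e_acute, so the test is only a safe shortcut when the coding system involved is ASCII-compatible.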
The patch is OK otherwise. Thanks.