bug-gnu-emacs
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

bug#48324: 27.2; hexl-mode duplicates the UTF-8 BOM


From: Eli Zaretskii
Subject: bug#48324: 27.2; hexl-mode duplicates the UTF-8 BOM
Date: Wed, 12 May 2021 16:50:15 +0300

> From: Glenn Morris <rgm@gnu.org>
> Cc: Andreas Schwab <schwab@linux-m68k.org>,  48324@debbugs.gnu.org,  
> rdiezmail-emacs@yahoo.de,  larsi@gnus.org
> Date: Tue, 11 May 2021 16:37:51 -0400
> 
> Eli Zaretskii wrote:
> 
> > This should be now fixed on the master branch.
> 
> The change to encode-coding-char in f3f1947e5b5b causes
> test subr-string-limit-coding to fail. Ref eg
> https://hydra.nixos.org/build/142879118

Thanks, I fixed that.

The original test results seemed strange, to say the least: it's as if
we shoot first and draw the target later so that it fits.  E.g., how
can the last 4 bytes of encoding "foóá" with UTF-16 be
"\376\377\000\341", with the 2 first bytes coming from the BOM?

This actually reveals a design flaw in string-limit: we cannot simply
use encode-coding-char to encode the characters one by one.  I added a
FIXME comment to explain why, as I don't currently have any clever
ideas for how to implement it more correctly, except by iterations,
which is inelegant.  Ideas welcome.





reply via email to

[Prev in Thread] Current Thread [Next in Thread]