bug-gnu-emacs

bug#28179: Fwd: Re: bug#28179: Fix use of string-to-multibyte in ispell.el


From: Eli Zaretskii
Subject: bug#28179: Fwd: Re: bug#28179: Fix use of string-to-multibyte in ispell.el
Date: Thu, 24 Aug 2017 21:20:46 +0300

> Cc: 28179@debbugs.gnu.org
> From: Reuben Thomas <rrt@sc3d.org>
> Date: Thu, 24 Aug 2017 18:45:33 +0100
> 
> The reason I am asking again is because you first said:
> 
> > What if decode-coding-string returns a pure ASCII string, which is
> > therefore unibyte?
> 
> and then later you said:
> 
> > The way I meant it, it has to do with the internal flag marking a
> > string either unibyte or multibyte. Observe:
> >   (multibyte-string-p "abcd") => nil
> >
> > but
> >
> >   (multibyte-string-p (decode-coding-string "abcd" 'utf-8)) => t

That example may be conclusive for UTF-8, but is it conclusive for
_any_ encoding?  I don't know.  E.g., what about the ISO-2022 based
encodings, where all the bytes are (AFAIR) pure ASCII?
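
One way to probe this empirically (a rough sketch, not a proof) is to
decode the same pure-ASCII string under several coding systems and
look at the multibyte flag of each result.  The helper name
my-probe-decoded-multibyte and the particular list of coding systems
are assumptions made for illustration; they are not taken from
ispell.el, and the results may differ between Emacs versions.

```elisp
;; Hypothetical helper, for illustration only.
(defun my-probe-decoded-multibyte (string coding-systems)
  "Return an alist of (CODING-SYSTEM . FLAG) for STRING.
FLAG is non-nil if decoding STRING with CODING-SYSTEM yields a
string whose multibyte flag is set."
  (mapcar (lambda (cs)
            (cons cs (multibyte-string-p
                      (decode-coding-string string cs))))
          coding-systems))

;; Example call: one UTF-8, one ISO-2022 based, one 8-bit coding system.
(my-probe-decoded-multibyte "abcd" '(utf-8 iso-2022-jp iso-latin-1))
```

Such a probe only covers the coding systems and the Emacs build it
is run against, so it can refute observation 2 but not establish it
in general.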

> 1. As far as I can tell from the above (and my own confirmatory
> experiments and reading of the documentation), a pure ASCII string can
> be multibyte (it's a matter of the multibyte flag, not the number of
> bytes used to store each character).
> 
> 2. decode-coding-string always returns a multibyte string.

Can you show me why 2 is always correct?  It might be, I simply don't
know.  All I know is that in general, relying on plain-ASCII strings
always being multibyte in any given situation is risky; we were bitten
by that a few times.  But maybe it's not an issue in this case.  Which
is why I was asking you whether you have sufficient basis to believe
this to be so in this case.
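
If the code must not depend on that assumption, a defensive wrapper
is one option.  The following is a minimal sketch under that
assumption, not the change proposed for ispell.el; the name
my-decode-to-multibyte is hypothetical.  It decodes the string and
converts the result with string-to-multibyte only if the multibyte
flag is not already set.

```elisp
;; Hypothetical helper, for illustration only.
(defun my-decode-to-multibyte (string coding-system)
  "Decode STRING with CODING-SYSTEM and return a multibyte string.
If the decoded result is unibyte (for example, pure ASCII), convert
it with `string-to-multibyte'; otherwise return it unchanged."
  (let ((decoded (decode-coding-string string coding-system)))
    (if (multibyte-string-p decoded)
        decoded
      (string-to-multibyte decoded))))

;; Example call: the result has the multibyte flag set either way.
(multibyte-string-p (my-decode-to-multibyte "abcd" 'utf-8))
```

The explicit check costs little and makes the result independent of
how a given coding system treats pure-ASCII input.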

> Since these two observations seemed to mean that you contradicted
> yourself, I was checking whether in fact I had misunderstood (so that
> for example one of my two observations above is wrong), or if your
> original understanding was incomplete (so that in fact your question
> about decode-coding-string is therefore misguided, because it can return
> a pure ASCII unibyte string (in the coding sense) which is nonetheless a
> multibyte string (in the sense that multibyte-string-p on it returns t)).

I only used decode-coding-string because I remembered it as an easy
way of creating a multibyte ASCII string, when the coding-system is
UTF-8, that's all.  There was no contradiction in what I said, at
least not an intended one.




