Re: Use the Unicode replacement character for replacing unencodable characters into UTF-16
Tue, 18 Aug 2020 21:13:35 +0300
> From: Mattias Engdegård <email@example.com>
> Date: Tue, 18 Aug 2020 19:07:41 +0200
> Cc: firstname.lastname@example.org
> 18 aug. 2020 kl. 18.19 skrev Eli Zaretskii <email@example.com>:
> > Can you describe under which circumstances this default-character will
> > be used?
> It's what encoding into UTF-16 uses for characters that don't have a Unicode
> equivalent, such as raw bytes.
My reading is that this happens only for codepoints beyond 0x10ffff.
Raw bytes end up there, but I'm not sure they always end up there.
Characters that aren't unified also end up there.
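
To illustrate the rule described above, here is a minimal sketch in Python, not the actual Emacs encoder. It assumes that replacement happens only for codepoints above 0x10ffff, as I read the code. The names encode_utf16_units, REPLACEMENT and MAX_UNICODE are made up for this example, and 0x3FFF80 is used as a sample internal codepoint for a raw byte.

```python
REPLACEMENT = 0xFFFD    # U+FFFD REPLACEMENT CHARACTER
MAX_UNICODE = 0x10FFFF  # highest codepoint that UTF-16 can represent


def encode_utf16_units(codepoints, default=REPLACEMENT):
    """Convert integer codepoints to a list of UTF-16 code units.

    Hypothetical helper: any codepoint above MAX_UNICODE is replaced
    by DEFAULT before encoding.  Lone surrogates are not treated
    specially in this sketch.
    """
    units = []
    for cp in codepoints:
        if cp > MAX_UNICODE:
            cp = default              # no Unicode equivalent: substitute
        if cp < 0x10000:
            units.append(cp)          # BMP character: one code unit
        else:
            cp -= 0x10000             # supplementary plane: surrogate pair
            units.append(0xD800 + (cp >> 10))
            units.append(0xDC00 + (cp & 0x3FF))
    return units


# 'A' followed by a sample raw-byte codepoint beyond 0x10ffff.
print([hex(u) for u in encode_utf16_units([0x41, 0x3FFF80])])
```

Passing default=0x20 instead gives the space-replacement behavior under discussion, so the two alternatives can be compared side by side.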
> > The issue that bothers me is whether u+FFFD can appear in situations
> > where it cannot be displayed by Emacs, because then the result will be
> > more confusing than helping.
> Do you mean that on balance, all things considered, you prefer space as
> replacement character to U+FFFD?
I mean that if the situations that bother me do in fact exist (I'm not
sure they do), we should discuss them and see whether we care about them.