Re: Dynamic loading progress

From: Eli Zaretskii
Subject: Re: Dynamic loading progress
Date: Sun, 22 Nov 2015 21:50:57 +0200

> From: Philipp Stephani <address@hidden>
> Date: Sun, 22 Nov 2015 19:37:43 +0000
> Cc: address@hidden, address@hidden, address@hidden
> I think we shouldn't make the terminology more confusing. If we say
> "UTF-8", we should mean "UTF-8 as defined in the Unicode standard", not the
> Emacs extension of UTF-8. That's all.

I agree, and that's how I use "UTF-8".  The internal representation
used by Emacs is called "utf-8-emacs" or "emacs-internal".

>     We say that we accept valid UTF-8 encoded strings; anything else
>     might produce invalid UTF-8 on output.
> Couldn't we just say "it behaves as if encoding and decoding were done using
> the utf-8-unix coding system"? Because I think that's what this boils down to.

Not sure what you mean by "utf-8-unix", or why it would be better to
say that.  I think this makes the issue harder to understand, because
it involves a reference to the encoding/decoding stuff, something that
module authors might not be fluent with.
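
For module authors who want to stay on the safe side of that contract,
"valid UTF-8" in the sense of the Unicode standard can be checked before
a string is handed to Emacs. What follows is only a minimal sketch in C:
the helper name is_valid_utf8 is hypothetical and not part of the Emacs
module API. It rejects overlong forms, surrogates, and anything above
U+10FFFF.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper, NOT part of the Emacs module API.
   Return true if the LEN bytes at S are well-formed UTF-8 as defined
   by the Unicode standard: no overlong forms, no surrogates
   (U+D800..U+DFFF), and nothing above U+10FFFF.  */
static bool
is_valid_utf8 (const unsigned char *s, size_t len)
{
  size_t i = 0;

  while (i < len)
    {
      unsigned char c = s[i];
      size_t n;                 /* number of continuation bytes */
      unsigned char lo = 0x80;  /* allowed range of the first */
      unsigned char hi = 0xBF;  /* continuation byte */

      if (c <= 0x7F)
        {
          i++;                  /* plain ASCII */
          continue;
        }
      else if (c >= 0xC2 && c <= 0xDF)
        n = 1;
      else if (c == 0xE0)
        { n = 2; lo = 0xA0; }   /* exclude overlong 3-byte forms */
      else if (c >= 0xE1 && c <= 0xEC)
        n = 2;
      else if (c == 0xED)
        { n = 2; hi = 0x9F; }   /* exclude surrogates */
      else if (c >= 0xEE && c <= 0xEF)
        n = 2;
      else if (c == 0xF0)
        { n = 3; lo = 0x90; }   /* exclude overlong 4-byte forms */
      else if (c >= 0xF1 && c <= 0xF3)
        n = 3;
      else if (c == 0xF4)
        { n = 3; hi = 0x8F; }   /* exclude code points > U+10FFFF */
      else
        return false;           /* 0x80..0xC1 and 0xF5..0xFF never lead */

      if (len - i <= n)
        return false;           /* sequence is truncated */
      if (s[i + 1] < lo || s[i + 1] > hi)
        return false;
      for (size_t k = 2; k <= n; k++)
        if ((s[i + k] & 0xC0) != 0x80)
          return false;

      i += n + 1;
    }
  return true;
}
```

A module could call such a check on its own data and signal an error or
substitute a replacement itself, rather than rely on whatever Emacs does
with bytes that are not valid UTF-8.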

>     > No matter what we expect or tolerate, we need to state that.
>     No, we don't. When the callers violate the contract, they cannot
>     expect to know in detail what will happen. If they want to know, they
>     will have to read the source.
> So you want this to be unspecified or undefined behavior? That might be OK (we
> already have that in several places), but we still need to state what the
> contract is.

You can call it "undefined behavior" if you want.  Personally, I don't
think that's accurate: "undefined" means anything can happen, whereas
Emacs at least promises to output the original bytes unchanged, as
long as the text modifications didn't touch them.

>     > An Emacs string is a sequence of integers.
>     No, it's a sequence of bytes.
> From
> https://www.gnu.org/software/emacs/manual/html_node/elisp/String-Basics.html:
> "In Emacs Lisp, characters are simply integers ... A string is a fixed
> sequence of characters"

That's the _User_ manual, it simplifies things to avoid too much

> How a string is represented internally shouldn't be the concern of module
> authors.

Indeed.  But it does concern us, the developers of Emacs internals.

> No, I will definitely fix it.

Thank you.
