emacs-devel

Re: bug#23750: 25.0.95; bug in url-retrieve or json.el


From: Eli Zaretskii
Subject: Re: bug#23750: 25.0.95; bug in url-retrieve or json.el
Date: Wed, 28 Dec 2016 20:55:18 +0200

> From: Philipp Stephani <address@hidden>
> Date: Wed, 28 Dec 2016 18:45:43 +0000
> Cc: address@hidden, address@hidden, address@hidden, 
>       address@hidden
> 
>  > That has nothing to do with characters. A byte array is conceptually
>  > different from a character string.
> 
>  In Emacs, they are both implemented using very similar objects.
> 
> Yes, that's why I said "conceptually different". The concepts may be
> different, but the implementation might still be the same.

If the implementation is the same, then the concepts are not very
different to begin with, and the abstraction will sooner or later
leak into applications.
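
To illustrate the point about shared implementation, here is a minimal
sketch in Emacs Lisp. The variable names `my-text' and `my-bytes' are
hypothetical, and UTF-8 is only an assumed encoding; the sketch shows
that a unibyte and a multibyte string are both ordinary strings and
differ only in what their elements mean.

```elisp
;; Hypothetical example data; "\u00e9" is the character e-acute.
(defvar my-text "caf\u00e9"
  "A multibyte (character) string of four characters.")

(defvar my-bytes (encode-coding-string my-text 'utf-8)
  "A unibyte string holding the five UTF-8 octets of `my-text'.")

;; Both objects satisfy `stringp'; only the multibyte flag differs.
(stringp my-text)              ; t
(stringp my-bytes)             ; t
(multibyte-string-p my-text)   ; t
(multibyte-string-p my-bytes)  ; nil

;; The same functions accept both, but they count different things.
(length my-text)               ; 4 (characters)
(length my-bytes)              ; 5 (octets)
```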

>  Our experience is that we should keep use of unibyte strings in Lisp
>  application code to the absolute minimum, ideally zero. Once we
>  arrived at that conclusion, we've been living happily ever after.
>  This minor issue we are discussing here is certainly not worth
>  repeating past mistakes for which we paid plenty in sweat and blood.
> 
> If you want unibyte strings to represent octet streams, then unibyte
> strings must be usable in application code

They are usable, but using them requires knowledge and proficiency
that many Lisp developers do not have, and it also has some
unpleasant pitfalls.
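
One such pitfall can be sketched as follows. The function names are
hypothetical and UTF-8 is an assumed encoding. Mixing undecoded octets
with character text signals no error; the non-ASCII octets simply
survive as raw bytes inside the resulting multibyte string.

```elisp
(defun my-append-naive (bytes text)
  "Concatenate unibyte BYTES and multibyte TEXT without decoding.
Non-ASCII octets in BYTES end up as raw bytes, not as characters."
  (concat bytes text))

(defun my-append-decoded (bytes text)
  "Decode BYTES as UTF-8 (an assumption), then concatenate with TEXT."
  (concat (decode-coding-string bytes 'utf-8) text))

(let ((bytes (encode-coding-string "caf\u00e9" 'utf-8)) ; 5 octets
      (text "\u00e9"))                                  ; 1 character
  (list
   ;; Six elements: c, a, f, two raw bytes, and one real character.
   (length (my-append-naive bytes text))
   ;; Five characters, as intended.
   (length (my-append-decoded bytes text))
   ;; The octets are not `string=' to the text they encode.
   (string= bytes "caf\u00e9")))
```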

> because octet streams are a concept that exists in reality, and
> applications must be able to support them in some way. If you don't
> want unibyte strings, then you need to provide some different way to
> represent octet streams.

We use unibyte strings where we must, and otherwise prefer multibyte
ones.  In most cases the unibyte strings exist in Emacs internals, so
that Lisp applications will not have to deal with them.  This case is
one of the few exceptions.
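
A minimal sketch of that discipline, assuming a JSON payload sent as
UTF-8: text stays multibyte inside the application and is converted to
or from octets only at the network boundary. The `my-' functions are
hypothetical helpers, not part of url.el or json.el.

```elisp
(require 'json)

(defun my-json-request-body (object)
  "Return OBJECT serialized as JSON and encoded as UTF-8 octets.
The result is a unibyte string, to be handed to the network layer."
  (encode-coding-string (json-encode object) 'utf-8))

(defun my-json-response-text (body)
  "Decode the unibyte string BODY, assumed to be UTF-8, into text."
  (decode-coding-string body 'utf-8))

;; Only the value crossing the boundary is unibyte.
(let ((body (my-json-request-body '((name . "caf\u00e9")))))
  (list (multibyte-string-p body)        ; nil: these are octets
        (my-json-response-text body)))   ; back to character text
```

Because the encoded body is unibyte, its `length' is a count of
octets, not of characters.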

If you are still unconvinced and think that we need some separate
representation for byte arrays, consider this: when Emacs starts, it
takes some time until it bootstraps itself enough to learn how to
decode non-ASCII strings, such as file names.  Until then, all file
names are unibyte strings, and Emacs still must handle them correctly,
because otherwise it would be impossible to build or start it in a
directory whose name includes non-ASCII characters.

This and other similar subtleties are the reason why using anything
but a string for raw byte arrays is not a good idea, IMO and IME.


