Re: Emacs Lisp's future

From: David Kastrup
Subject: Re: Emacs Lisp's future
Date: Tue, 07 Oct 2014 20:56:31 +0200
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/24.4.50 (gnu/linux)

Mark H Weaver <address@hidden> writes:

> Relying on users to explicitly sanitize the result of decoding UTF-8
> to check for "raw bytes", and to explicitly check for "raw bytes"
> before encoding UTF-8 (as if that term didn't already have a
> well-known meaning that excludes arbitrary byte sequences) is a recipe
> for security holes.

You are calling application programmers "users" here and declaring them
incapable of designing their own applications.  Any application in need
of sanitizing will not stop in its requirements at UTF-8 sanitization.

You cannot successfully cater for clueless application programmers.  And
nobody says that GUILE should _crash_ when provided non-sanitized UTF-8.
It has to be able to deal with everything thrown at it.  And you want it
to _not_ do that by default.  That means that _any_ programmer wanting
to do his own verification will not be able to use _any_ module provided
by someone else unless it explicitly overrides the defaults, since
modules he has no control over will otherwise refuse to cooperate.

GUILE is an extension language and system.  It should _not_ do policing.
Every attempt at policing makes it impossible to design the policing
into the place where it makes sense.

Worse, it leads to sloppy code, since people then start to treat an
internal UTF-8 based encoding as identical to an external UTF-8
encoding, making it _impossible_ to design byte-transparent workflows.
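
To make "byte-transparent" concrete, here is a minimal sketch.  It is
neither GUILE nor Emacs code: it assumes Python 3 and uses that
language's `surrogateescape` error handler as one existing example of a
decoder that keeps undecodable bytes around instead of rejecting them.
The helper names are made up for this illustration.

```python
# Sample input: valid UTF-8 ("caf" + U+00E9) followed by two bytes
# that are not valid UTF-8.
RAW = b"caf\xc3\xa9 \xff\xfe tail"


def decode_strict(data):
    """Policing decoder: raises UnicodeDecodeError on invalid input."""
    return data.decode("utf-8")


def decode_transparent(data):
    """Permissive decoder: each undecodable byte 0x80..0xFF is kept as a
    lone surrogate U+DC80..U+DCFF instead of being rejected."""
    return data.decode("utf-8", errors="surrogateescape")


def encode_transparent(text):
    """Inverse of decode_transparent: escaped surrogates turn back into
    the original raw bytes."""
    return text.encode("utf-8", errors="surrogateescape")


def has_raw_bytes(text):
    """Application-level check, to be called only where the application
    decides that sanitizing makes sense."""
    return any(0xDC80 <= ord(ch) <= 0xDCFF for ch in text)


text = decode_transparent(RAW)
round_trip = encode_transparent(text)
```

With the permissive pair, `round_trip` is byte-for-byte equal to `RAW`,
and `has_raw_bytes(text)` lets the application decide where to reject
such input.  With `decode_strict`, the same data cannot pass through the
string layer at all.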

That is the current state of GUILE 2, and as an application programmer
I can testify that it is a huge headache.  Both in practice as well as

I am glad that Emacs started its history with a multibyte encoding
incompatible with any external encoding since that has given it lots of
impetus to get that distinction right.

With the "we don't want to cater for raw bytes by default" attitude,
you'll never reliably get away from the "our code will not deal with
raw bytes" situation you have now with regard to string manipulation.

It took Emacs years to get this into a really reliable and good state,
with many more active users of multibyte character sets than GUILE has.

David Kastrup
