Re: Emacs Lisp's future

From: Mark H Weaver
Subject: Re: Emacs Lisp's future
Date: Tue, 07 Oct 2014 14:36:45 -0400
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/24.3.94 (gnu/linux)

David Kastrup <address@hidden> writes:

> Mark H Weaver <address@hidden> writes:
>> Andreas Schwab <address@hidden> writes:
>>> Mark H Weaver <address@hidden> writes:
>>>> However, if the overlong sequence came from the network, and Emacs
>>>> propagates it unchanged to internal subsystems[*] (e.g. via command-line
>>>> arguments to subprocesses), that's not good.  It exposes another program
>>>> to invalid input -- a program that might not be designed for exposure to
>>>> possible attacks via overlong encodings.
>>> At least it doesn't make it worse (it is unchanged from the situation if
>>> you remove Emacs as a filter).
>> In the case of mere "filtering", you might be right in some cases.
>> However, the case I'm worried about is where some small piece of the
>> hostile input is extracted and passed as an argument to another program.
>> In cases like this it doesn't make sense to think of emacs as a
>> "filter", and you'd never be able to "remove" it.
>> It's like saying that a web application that passes unsanitized input to
>> an SQL query "doesn't make it worse", and that the situation is
>> unchanged from if you provided public access to the SQL database.
> If GUILE or Emacs is supposed to sanitize input, you tell it to sanitize
> input.  That's different from GUILE/Emacs deciding over your head what
> is good for your application.

I've already said more than once that I agree Guile and Emacs should
provide the *option* to handle invalid byte sequences transparently, if
explicitly requested to do so, and furthermore that this is appropriate
default behavior when editing files.

What I'm saying is that in most other cases, the codecs should be
strict, and therefore this should be the default behavior of the
underlying functions.  When users call an Emacs function to decode
UTF-8, it should report an error if that input isn't actually UTF-8.
Conversely, when encoding UTF-8, the output should be UTF-8 and not some
arbitrary byte sequence.
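
As a rough illustration of the strict-by-default behavior argued for above, here is a minimal sketch in Python rather than Emacs Lisp or Guile. Python's "surrogateescape" error handler is used only as an analogy for transparent "raw byte" handling; it is not the Emacs mechanism, and the helper names are made up for this example.

```python
# Overlong (and therefore invalid) two-byte encoding of "/" (U+002F).
OVERLONG_SLASH = b"\xc0\xaf"


def decode_strict(data: bytes) -> str:
    """Decode UTF-8, raising UnicodeDecodeError on any invalid sequence."""
    return data.decode("utf-8")  # errors="strict" is the default


def decode_lenient(data: bytes) -> str:
    """Decode UTF-8, smuggling invalid bytes through as lone surrogates."""
    return data.decode("utf-8", errors="surrogateescape")


def encode_strict(text: str) -> bytes:
    """Encode to UTF-8, raising UnicodeEncodeError on smuggled bytes."""
    return text.encode("utf-8")


def encode_lenient(text: str) -> bytes:
    """Encode to UTF-8, turning smuggled surrogates back into raw bytes."""
    return text.encode("utf-8", errors="surrogateescape")
```

In this sketch the lenient pair round-trips the invalid bytes unchanged, which suits an editor opening a file, while the strict pair refuses them in both directions, so a caller has to opt in to the transparent behavior.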

Relying on users to explicitly check the result of decoding UTF-8 for
"raw bytes", and to check for "raw bytes" again before encoding UTF-8
(as if the term "UTF-8" didn't already have a well-known meaning that
excludes arbitrary byte sequences), is a recipe for security holes.
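
To make the risk concrete, here is a hedged sketch of the scenario from earlier in the thread, where a small piece of hostile input is validated and then handed to another program. It is written in Python, again using "surrogateescape" as a stand-in for transparent raw-byte handling; `looks_safe`, `forward_lenient` and `forward_strict` are hypothetical names, not any real Emacs or Guile API.

```python
# "a", then an overlong (invalid) encoding of "/", then "b".
HOSTILE = b"a\xc0\xafb"


def looks_safe(name: str) -> bool:
    """Hypothetical validation: reject names containing a path separator."""
    return "/" not in name


def forward_lenient(data: bytes) -> bytes:
    """Validate and re-encode with transparent handling of invalid bytes."""
    text = data.decode("utf-8", errors="surrogateescape")
    if not looks_safe(text):
        raise ValueError("path separator in name")
    # The invalid bytes pass validation and come back out unchanged.
    return text.encode("utf-8", errors="surrogateescape")


def forward_strict(data: bytes) -> bytes:
    """Validate and re-encode with strict codecs."""
    text = data.decode("utf-8")  # raises UnicodeDecodeError on invalid input
    if not looks_safe(text):
        raise ValueError("path separator in name")
    return text.encode("utf-8")
```

With the lenient version, the check for "/" passes because the overlong sequence never decodes to a slash, and the same bytes are then forwarded to a program that may not be prepared for them. With the strict version, the hostile input is rejected at the decoding step, without the caller having to remember a separate "raw bytes" check.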

