From: Chicken Trac
Subject: [Chicken-janitors] #1182: utf8 egg silently accepts invalid byte sequences
Date: Fri, 27 Mar 2015 23:08:44 -0000

#1182: utf8 egg silently accepts invalid byte sequences
------------------------+---------------------------------------------------
 Reporter:  syn         |       Owner:  ashinn 
     Type:  defect      |      Status:  new    
 Priority:  major       |   Milestone:  someday
Component:  extensions  |     Version:  4.9.x  
 Keywords:  utf8        |  
------------------------+---------------------------------------------------
 I noticed that some procedures of the `utf8` egg silently accept invalid
 byte sequences. This might have security implications. Consider the
 following case (in the code snippets below, unprefixed procedures are the
 core versions and procedures from the `utf8` egg are prefixed with `utf8-`):

 {{{
 (define evil-quote
   (list->string (map integer->char '(#b11000000 #b10100111))))
 }}}

 This is an invalid (overlong) two-byte UTF-8 encoding of the `'`
 character. Now a program could perform a check like this to make sure a
 user-supplied string doesn't contain any quotes:

 {{{
 (unless (utf8-string-contains evil-quote "'") ...)
 }}}

 And then go ahead and write it character by character like this:

 {{{
 (utf8-string-for-each display evil-quote)
 }}}

 This would produce the actual `'` character, even though the check above
 did not find one. The same is true for any other procedure of the `utf8`
 egg that produces characters from strings, e.g. its `string-ref`,
 `string->list`, etc.
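
 To illustrate why the two bytes above come out as `'`, here is a minimal
 sketch of a two-byte decoding step that does not check for overlong forms.
 `naive-decode-2` is a hypothetical helper written for this ticket, not a
 procedure of the `utf8` egg, and the egg's actual decoder may be structured
 differently:

```scheme
;; Hypothetical helper, NOT part of the utf8 egg: decode a two-byte
;; sequence the naive way, without rejecting overlong encodings.
;; Assumes b1 is in #xC0..#xDF and b2 is in #x80..#xBF.
(define (naive-decode-2 b1 b2)
  (+ (* (- b1 #b11000000) 64)   ; low 5 bits of the lead byte, shifted left by 6
     (- b2 #b10000000)))        ; low 6 bits of the continuation byte

(naive-decode-2 #b11000000 #b10100111) ; => 39, which is (char->integer #\')
```

 The lead byte contributes no payload bits at all, so the result is just the
 low six bits of the continuation byte. A strict decoder would reject any
 two-byte sequence whose result is below 128.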

 Any other invalid byte sequence (such as stray continuation bytes) is also
 silently accepted.

 I'm not entirely sure what would be the wisest way to handle this. We
 could have these procedures signal an error or just mention this behavior
 in the documentation so that people know to perform validation on
 untrusted inputs.
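
 For the "signal an error" option, a rough sketch of what validation of
 untrusted input could look like follows. Both `valid-utf8?` and
 `assert-valid-utf8` are hypothetical names, not existing procedures of the
 `utf8` egg. The sketch assumes it is compiled against the core string
 procedures (so `string-ref` returns one byte per index, as in CHICKEN 4) and
 follows the well-formedness rules of RFC 3629:

```scheme
;; Hypothetical validator, NOT part of the utf8 egg. Uses only core
;; procedures and inspects the raw bytes of the string.
(define (valid-utf8? str)
  (let ((len (string-length str)))
    (define (byte i) (char->integer (string-ref str i)))
    (define (cont? i)                    ; is byte i a continuation byte?
      (and (< i len) (<= #x80 (byte i) #xBF)))
    (let loop ((i 0))
      (if (>= i len)
          #t
          (let ((b (byte i)))
            (cond
             ((< b #x80) (loop (+ i 1)))               ; ASCII
             ((<= #xC2 b #xDF)                         ; 2 bytes (C0/C1 are overlong)
              (and (cont? (+ i 1)) (loop (+ i 2))))
             ((<= #xE0 b #xEF)                         ; 3 bytes
              (and (cont? (+ i 1)) (cont? (+ i 2))
                   (not (and (= b #xE0) (< (byte (+ i 1)) #xA0)))   ; overlong
                   (not (and (= b #xED) (>= (byte (+ i 1)) #xA0)))  ; surrogates
                   (loop (+ i 3))))
             ((<= #xF0 b #xF4)                         ; 4 bytes
              (and (cont? (+ i 1)) (cont? (+ i 2)) (cont? (+ i 3))
                   (not (and (= b #xF0) (< (byte (+ i 1)) #x90)))   ; overlong
                   (not (and (= b #xF4) (>= (byte (+ i 1)) #x90)))  ; above U+10FFFF
                   (loop (+ i 4))))
             (else #f)))))))     ; stray continuation byte or invalid lead byte

;; Hypothetical wrapper: return the string unchanged or signal an error.
(define (assert-valid-utf8 str)
  (if (valid-utf8? str)
      str
      (error "invalid UTF-8 byte sequence" str)))

(define evil-quote
  (list->string (map integer->char '(#b11000000 #b10100111))))

(valid-utf8? "plain ascii") ; => #t
(valid-utf8? evil-quote)    ; => #f
```

 A program could call `assert-valid-utf8` once at the point where untrusted
 input enters, so that later checks such as the `utf8-string-contains` test
 above only ever see well-formed strings.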

-- 
Ticket URL: <http://bugs.call-cc.org/ticket/1182>
CHICKEN Scheme <http://www.call-with-current-continuation.org/>
CHICKEN Scheme is a compiler for the Scheme programming language.
