bug-gnu-libiconv



From: Tim Sweet
Subject: [bug-gnu-libiconv] [bug #66289] undefined behavior: 32 bit values stored in an `int` in the 32-bit charsets like UTF-32LE
Date: Fri, 4 Oct 2024 00:04:23 -0400 (EDT)

URL:
  <https://savannah.gnu.org/bugs/?66289>

                 Summary: undefined behavior: 32 bit values stored in an `int` in the 32-bit charsets like UTF-32LE
                   Group: libiconv
               Submitter: tsweet64
               Submitted: Fri 04 Oct 2024 04:04:19 AM UTC
                Category: Program
                Severity: 3 - Normal
              Item Group: None
                  Status: None
                 Privacy: Public
             Assigned to: None
             Open/Closed: Open
         Discussion Lock: Any


    _______________________________________________________

Follow-up Comments:


-------------------------------------------------------
Date: Fri 04 Oct 2024 04:04:19 AM UTC By: Tim Sweet <tsweet64>
I compiled libiconv with the Undefined Behavior Sanitizer (UBSan), using
`CC=gcc CFLAGS="-Os -g -fno-omit-frame-pointer -fsanitize=undefined"`. With
that build, all or most of the UTF-32 and UCS4 (32-bit) charsets trigger
undefined behavior when fed invalid input, and thus crash when UBSan is
enabled. More specifically, from what I've gathered, they seem to assume that
the size of `int` is predictable on all platforms, which could cause an
integer overflow on certain embedded systems with unusual but allowed `int`
sizes
(https://stackoverflow.com/questions/1231147/is-int-in-c-always-32-bit).

The following command can reproduce the bug:
echo -en 'ter\x91' | iconv -f UTF-32LE

With the output:
utf32le.h:30:59: runtime error: left shift of 145 by 24 places cannot be
represented in type 'int'
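
For illustration, the arithmetic behind that message can be checked without
triggering the overflow itself. This is only a minimal sketch, assuming a
platform where `int` is 32 bits; `shift_fits_in_int` is a hypothetical helper
and not part of libiconv. It performs the shift in `long long` and compares
the result against `INT_MAX`.

```c
#include <limits.h>

/* Hypothetical helper (not libiconv code).
   Returns 1 if (byte << shift) is representable in a signed int, else 0.
   The shift is done in long long, which is at least 64 bits wide, so the
   check itself cannot overflow for shift counts below 32. */
static int shift_fits_in_int(unsigned char byte, unsigned shift)
{
    long long wide = (long long)byte << shift;
    return wide <= INT_MAX;
}
```

With a 32-bit `int`, `shift_fits_in_int(0x91, 24)` returns 0, because
145 << 24 is 0x91000000 (2432696320), which is above `INT_MAX` (2147483647).
Any final byte of 0x80 or higher hits the same case, while
`shift_fits_in_int(0x72, 24)` returns 1.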

There, the 32-bit value is assembled in an `int`. My assumption is that
`uint32_t` is more appropriate. The same goes for all the other 32-bit
charsets (the other UTF-32 variants and the UCS4 ones).
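
As a rough sketch of what that could look like (an assumption on my part, not
the actual libiconv code or a tested patch): each byte is converted to
`uint32_t` before shifting, so the arithmetic is unsigned and cannot overflow.
The names `read_u32le` and `is_unicode_scalar` are hypothetical.

```c
#include <stdint.h>

/* Hypothetical helper: assemble four little-endian bytes into a 32-bit
   value. Each byte is widened to uint32_t before shifting, so no signed
   shift takes place. */
static uint32_t read_u32le(const unsigned char *s)
{
    return (uint32_t)s[0]
         | ((uint32_t)s[1] << 8)
         | ((uint32_t)s[2] << 16)
         | ((uint32_t)s[3] << 24);
}

/* Hypothetical helper: a Unicode scalar value is at most 0x10FFFF and is
   not a surrogate (0xD800..0xDFFF). */
static int is_unicode_scalar(uint32_t wc)
{
    return wc <= 0x10FFFFu && !(wc >= 0xD800u && wc <= 0xDFFFu);
}
```

For the reproducer input `'ter\x91'`, `read_u32le` yields 0x91726574, which
`is_unicode_scalar` rejects, so the invalid input could be reported as an
error without any undefined behavior on the way there.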








    _______________________________________________________

Reply to this item at:

  <https://savannah.gnu.org/bugs/?66289>

_______________________________________________
Message sent via Savannah
https://savannah.gnu.org/
