[Bug-wget] [bug #45236] Memory disclosure in wget using incomplete UTF-8

From:	anonymous
Subject:	[Bug-wget] [bug #45236] Memory disclosure in wget using incomplete UTF-8 sequences
Date:	Tue, 02 Jun 2015 08:36:28 +0000
User-agent:	Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Firefox/31.0 Iceweasel/31.7.0

URL:
  <http://savannah.gnu.org/bugs/?45236>

                 Summary: Memory disclosure in wget using incomplete UTF-8 sequences
                 Project: GNU Wget
            Submitted by: None
            Submitted on: Tue 02 Jun 2015 08:36:26 AM UTC
                Category: Protocol Issue
                Severity: 3 - Normal
                Priority: 5 - Normal
                  Status: None
                 Privacy: Private
             Assigned to: None
         Originator Name: Gustavo Grieco
        Originator Email: address@hidden
             Open/Closed: Open
         Discussion Lock: Any
                 Release: trunk
        Operating System: GNU/Linux
         Reproducibility: Every Time
           Fixed Release: None
         Planned Release: None
              Regression: None
           Work Required: None
          Patch Included: No

    _______________________________________________________

Details:

Hello,

We discovered a vulnerability in the way wget parses and processes
internationalized domain names through the GNU IDN library.
It affects systems using UTF-8 locales and allows reading bytes outside
allocated buffers by means of incomplete UTF-8 sequences.
The underlying cause was already reported in March
(https://bugzilla.redhat.com/show_bug.cgi?id=1197796),
but the GNU developers concerned have not yet decided whether to fix their
API or to require every affected program to validate its UTF-8 input.
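
To illustrate the second option (each program validating its own input), here
is a minimal Python sketch of a completeness check. It assumes the decoder
derives the sequence length from the lead byte using the older 1-6 byte UTF-8
scheme; the function names are hypothetical and this is not code from wget or
the IDN library. It only checks that sequences are complete, so it is not a
full UTF-8 validator (it does not reject overlong forms, for instance).

```python
def sequence_length(lead: int) -> int:
    """Length implied by a lead byte under the legacy 1-6 byte UTF-8 scheme.

    Returns 0 for bytes that cannot start a sequence.
    """
    if lead < 0x80:
        return 1
    if 0xC0 <= lead < 0xE0:
        return 2
    if 0xE0 <= lead < 0xF0:
        return 3
    if 0xF0 <= lead < 0xF8:
        return 4
    if 0xF8 <= lead < 0xFC:
        return 5
    if 0xFC <= lead < 0xFE:
        return 6
    return 0  # continuation byte, 0xFE or 0xFF


def is_well_formed(data: bytes) -> bool:
    """Return True only if every multi-byte sequence in data is complete."""
    i = 0
    while i < len(data):
        n = sequence_length(data[i])
        # Reject invalid lead bytes and sequences that run past the end.
        if n == 0 or i + n > len(data):
            return False
        # Every trailing byte must look like 10xxxxxx.
        if any((b & 0xC0) != 0x80 for b in data[i + 1:i + n]):
            return False
        i += n
    return True


print(is_well_formed(b"\xfc"))                            # False: lead byte only
print(is_well_formed("bücher.example".encode("utf-8")))   # True
```

In Python itself, `data.decode("utf-8")` in its default strict mode performs a
stricter check and raises `UnicodeDecodeError` on such input.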

As an example, we use wget from Ubuntu 14.04 (64-bit), but we know that
the latest git revision is affected, as are the versions shipped in Debian:

env -i CHARSET=UTF-8 valgrind /usr/bin/wget $(python -c "print '\xfc'")
==12139== Memcheck, a memory error detector
==12139== Copyright (C) 2002-2011, and GNU GPL'd, by Julian Seward et al.
==12139== Using Valgrind-3.7.0 and LibVEX; rerun with -h for copyright info
==12139== Command: /usr/bin/wget 
==12139== 
==12139== Invalid read of size 1
==12139==    at 0x578C207: stringprep_utf8_to_ucs4 (in
/usr/lib/x86_64-linux-gnu/libidn.so.11.6.8)
==12139==    by 0x578DC59: idna_to_ascii_8z (in
/usr/lib/x86_64-linux-gnu/libidn.so.11.6.8)
==12139==    by 0x42BE6C: ??? (in /usr/bin/wget)
==12139==    by 0x4277E1: ??? (in /usr/bin/wget)
==12139==    by 0x40507C: ??? (in /usr/bin/wget)
==12139==    by 0x5BE2EAC: (below main) (libc-start.c:244)
==12139==  Address 0x679b3d6 is 2 bytes after a block of size 4 alloc'd
==12139==    at 0x4C28BED: malloc (vg_replace_malloc.c:263)
==12139==    by 0x42F138: ??? (in /usr/bin/wget)
==12139==    by 0x428244: ??? (in /usr/bin/wget)
==12139==    by 0x427676: ??? (in /usr/bin/wget)
==12139==    by 0x40507C: ??? (in /usr/bin/wget)
==12139==    by 0x5BE2EAC: (below main) (libc-start.c:244)
==12139== 
==12139== Invalid read of size 1
==12139==    at 0x578C1BB: stringprep_utf8_to_ucs4 (in
/usr/lib/x86_64-linux-gnu/libidn.so.11.6.8)
==12139==    by 0x578DC59: idna_to_ascii_8z (in
/usr/lib/x86_64-linux-gnu/libidn.so.11.6.8)
==12139==    by 0x42BE6C: ??? (in /usr/bin/wget)
==12139==    by 0x4277E1: ??? (in /usr/bin/wget)
==12139==    by 0x40507C: ??? (in /usr/bin/wget)
==12139==    by 0x5BE2EAC: (below main) (libc-start.c:244)
==12139==  Address 0x679b3d4 is 0 bytes after a block of size 4 alloc'd
==12139==    at 0x4C28BED: malloc (vg_replace_malloc.c:263)
==12139==    by 0x42F138: ??? (in /usr/bin/wget)
==12139==    by 0x428244: ??? (in /usr/bin/wget)
==12139==    by 0x427676: ??? (in /usr/bin/wget)
==12139==    by 0x40507C: ??? (in /usr/bin/wget)
==12139==    by 0x5BE2EAC: (below main) (libc-start.c:244)
==12139== 
--2015-06-02 09:44:43--  http://xn--mz306e/
Resolving \370\243\200\200\200 (xn--mz306e)... failed: Name or service not
known.
wget: unable to resolve host address `xn--mz306e'
--2015-06-02 09:44:44--  http:///
Resolving  ()... failed: Name or service not known.
wget: unable to resolve host address `'

In this example wget prints an internationalized domain name that encodes the
bytes adjacent to the heap buffer holding the domain. Running the example
without valgrind will most likely produce a different domain on every execution.

Interestingly, these buffers are usually followed by a null byte, but this
vulnerability lets us skip over some of the nulls at the end of the string
and continue reading.
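
The following toy model in Python sketches how a terminating null can be
skipped. It assumes a decoder that trusts the length implied by each lead byte
and only looks for the terminator at sequence boundaries; `memory` is a
made-up stand-in for the heap, and none of this is the library's actual code.

```python
def sequence_length(lead: int) -> int:
    """Bytes to skip for a lead byte, without checking what follows it."""
    if lead < 0xC0:
        return 1
    if lead < 0xE0:
        return 2
    if lead < 0xF0:
        return 3
    if lead < 0xF8:
        return 4
    if lead < 0xFC:
        return 5
    if lead < 0xFE:
        return 6
    return 1


def naive_walk(memory: bytes, start: int):
    """Walk sequences from start, stopping only at a NUL on a sequence boundary."""
    spans = []
    i = start
    while i < len(memory) and memory[i] != 0:
        n = sequence_length(memory[i])
        spans.append(memory[i:i + n])  # bytes consumed as one "character"
        i += n
    return spans, i


# Hypothetical layout: the 2-byte string "\xfc\0" followed by unrelated data.
memory = b"\xfc\x00" + b"SECRET\x00"
spans, end = naive_walk(memory, 0)
print(spans)  # [b'\xfc\x00SECR', b'E', b'T']
print(end)    # 8
```

The first sequence swallows the NUL at offset 1 together with four bytes of the
neighbouring data, and the walk only stops at the next NUL it happens to land on.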

This bug was co-discovered with one of my colleagues here at VERIMAG, Josselin
Feist.

Regards,
Gus.




    _______________________________________________________

Reply to this item at:

  <http://savannah.gnu.org/bugs/?45236>

_______________________________________________
  Message sent via/by Savannah
  http://savannah.gnu.org/

Current Thread

[Bug-wget] [bug #45236] Memory disclosure in wget using incomplete UTF-8 sequences, anonymous <=
- Re: [Bug-wget] [bug #45236] Memory disclosure in wget using incomplete UTF-8 sequences, Ander Juaristi, 2015/06/02
  - Re: [Bug-wget] [bug #45236] Memory disclosure in wget using incomplete UTF-8 sequences, Ángel González, 2015/06/02
