wget-dev
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

wget2 | Buffer overflow in `wget_iri_clone` (#687)


From: @gleurent
Subject: wget2 | Buffer overflow in `wget_iri_clone` (#687)
Date: Thu, 19 Dec 2024 14:22:47 +0000


Gaëtan Leurent created an issue: https://gitlab.com/gnuwget/wget2/-/issues/687



There is a bug when `wget_iri_clone` is called after `wget_iri_set_scheme`.

As far as I understand, the problem is that `wget_iri_clone` makes two 
assumptions about the `wget_iri` structure, that are true when the structure is 
initialized by `wget_iri_parse`, but become false after calling 
`wget_iri_set_scheme`:
- the `iri->uri` field points just after the end of the structure;
- the structure is followed by a string of size `strlen(iri->uri)` and by extra 
memory of size `iri->msize`.

However, `wget_iri_set_scheme` breaks those assumptions:
- it allocate `iri->uri` to a different memory location;
- it can change the length of `iri->uri`.

Therefore, when `wget_iri_clone` is called after `wget_iri_set_scheme`, there 
are two issues:
- on [line 723](libwget/iri.c#L723), `memcpy` reads `strlen(iri->uri) + 
iri->msize + 1` bytes at the location `iri->uri`, but the string only has 
`strlen(iri->uri) + 1` bytes, and the remaining `iri->msize` bytes are in a 
different location;
- if the length of `iri->uri` has changed, there is no way to reliably find the 
location of the extra `iri->msize` bytes.

There doesn't seem to be a clean way to fix this while keeping the current 
memory layout for the structure and for the extra strings after the structure.

I'm attaching a merge request that switches the order of the two extra strings 
after the `wget_iri` structure.  This allows to clone the `wget_iri` structure 
by copying `iri->msize` bytes after the structure, and then dealing with the 
zero-terminated string `iri->uri`.  It should be safe if no other function 
makes assumptions about the memory layout of fields after the structure.

A cleaner solution would probably be to modify the `wget_iri` structure to 
include more information about the extra fields, and strict rules about where 
they are allocated, but this requires more knowledge about the `wget2` code :-)

I'm marking the issue as confidential because buffer overflows potentially lead 
to security issues.  In theory, an attacker could create a malicious website so 
that mirroring it with wget would lead to remote code execution, but I have no 
idea how realistic this would be.

-- 
Reply to this email directly or view it on GitLab: 
https://gitlab.com/gnuwget/wget2/-/issues/687
You're receiving this email because of your account on gitlab.com.




reply via email to

[Prev in Thread] Current Thread [Next in Thread]