GNU Libc setenv() leaks.

From: Karl Hegbloom
Subject: GNU Libc setenv() leaks.
Date: Fri, 13 May 2005 15:21:58 -0700

[ Cc to bug-glibc; Follow up from cs333b only to cs333b unless relevant;
edit the headers please. ]

The GNU Libc v2.3.2 setenv() appears to be leaky.  I wanted to have a
look at its implementation, so I:

 cd ~/src
 apt-src install glibc
 cd glibc-[TAB]
   (Install what's missing)
 debian/rules configure_libc

After that runs, you can "cd build-dir/glibc-[TAB]" and find the source.

setenv() is in sysdeps/generic/setenv.c

What I find there is that every time setenv() is called with replace=1,
the old string is not freed.  unsetenv() does not free them either.  The
thing is that if the __environ is the initial one, coming from the
sys_execve(), and you call setenv() with a new variable that's not
already present in the environment, the Libc allocates a new envp tuple
with one more entry available, and copies the pointers from the original
__environ to the new one.  It allocates a new string, using malloc(),
copies your var and value into it, and then adds the new string to the
new envp slot.  Finally, it stores the location of the new envp into a
global variable static to setenv.c, and also to the global symbol
__environ.  It tests whether __environ == its own last_environ to
decide whether to malloc a new one or use realloc on one it already
owns.
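
To make the vector-growing logic described above concrete, here is a
simplified model of it.  This is not the glibc source: toy_environ,
last_environ and toy_setenv_new() are stand-in names of my own, it only
handles adding a variable that is not yet present, and it ignores
locking.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

static char **toy_environ;   /* stand-in for __environ (hypothetical) */
static char **last_environ;  /* the vector this module allocated, if any */

/* Append "name=value" to toy_environ.  Returns 0 on success, -1 on
   allocation failure.  Does not handle replacing an existing name. */
int toy_setenv_new(const char *name, const char *value)
{
    assert(name != NULL && value != NULL);

    /* Count the entries currently in the vector. */
    size_t count = 0;
    if (toy_environ != NULL)
        while (toy_environ[count] != NULL)
            count++;

    /* Build the new "name=value" string with malloc(). */
    size_t nlen = strlen(name), vlen = strlen(value);
    char *entry = malloc(nlen + 1 + vlen + 1);
    if (entry == NULL)
        return -1;
    memcpy(entry, name, nlen);
    entry[nlen] = '=';
    memcpy(entry + nlen + 1, value, vlen + 1);  /* copies the NUL too */

    char **vec;
    if (toy_environ != NULL && toy_environ == last_environ) {
        /* We own this vector already, so it can simply be grown. */
        vec = realloc(last_environ, (count + 2) * sizeof *vec);
        if (vec == NULL) { free(entry); return -1; }
    } else {
        /* The vector came from elsewhere (e.g. exec): copy the pointers
           into a fresh one and leave the original alone. */
        vec = malloc((count + 2) * sizeof *vec);
        if (vec == NULL) { free(entry); return -1; }
        if (count > 0)
            memcpy(vec, toy_environ, count * sizeof *vec);
    }

    vec[count] = entry;
    vec[count + 1] = NULL;
    toy_environ = last_environ = vec;
    return 0;
}
```

Note that nothing in this model records that `entry` came from malloc(),
which is the bookkeeping gap the rest of this message is about.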

It does not, however, keep track of which env vars have been allocated
as new strings with malloc().  It just throws them away when it replaces
a variable, and also when clearenv() is called.  So heavy use of
setenv() will leak malloc() heap.  Since the strings are probably small
and varied in size, the leaked blocks may also fragment the heap a fair
amount.
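
A small way to exercise this, assuming a POSIX setenv(): the helper
below (churn_env() is just a name I made up) overwrites one variable
many times with a different value each time.  Run under a heap checker
such as valgrind, a libc that behaves as described above should show the
discarded strings as lost; I have not measured how other versions
behave.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L   /* for setenv() */
#endif
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: overwrite `name` `iterations` times, each time
   with a fresh value "value-<i>".  Returns 0 on success, or -1 as soon
   as a setenv() call fails. */
int churn_env(const char *name, int iterations)
{
    char value[32];

    assert(name != NULL);
    for (int i = 0; i < iterations; i++) {
        snprintf(value, sizeof value, "value-%d", i);
        if (setenv(name, value, 1) != 0)   /* replace = 1 */
            return -1;
    }
    return 0;
}
```

For example, call churn_env("SETENV_LEAK_DEMO", 100000) from main() and
compare the heap usage before and after.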

My initial prescription idea was to create a malloc() related function
that can, given a pointer, (a) determine whether it points into an area
managed by malloc(), and (b) return either that pointer or the starting
address of the block it lies in, for use by free().

Given that ability, it would be easy for putenv() to free the old
strings when they point into malloc() area...

The problem then would be that a client program can hand a string to
putenv(), and that string is used as-is.  putenv() cannot blithely free
that string without the client code's knowledge or permission.
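
The aliasing is easy to see in a sketch.  The function and buffer names
below are my own, and the behaviour shown is what POSIX specifies for
putenv() (the string itself becomes part of the environment); a libc
that copies the string instead would make the function return 0.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 600   /* for putenv() */
#endif
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Caller-owned storage; must outlive its presence in the environment. */
static char demo_entry[] = "PUTENV_DEMO=before";

/* Hypothetical demo: returns 1 if the environment sees an edit made
   directly to the caller's buffer, 0 if it does not, -1 on error. */
int putenv_aliases_caller_buffer(void)
{
    const size_t prefix = sizeof "PUTENV_DEMO=" - 1;

    if (putenv(demo_entry) != 0)
        return -1;

    /* Edit the caller's buffer in place: "before" -> "after!"
       (same length, so the terminating NUL stays where it was). */
    memcpy(demo_entry + prefix, "after!", 6);

    const char *seen = getenv("PUTENV_DEMO");
    assert(seen != NULL);
    return strcmp(seen, "after!") == 0;
}
```

Since the environment points straight at demo_entry, a free() of that
pointer by Libc would be a bug here: the buffer is static, not malloc()
memory.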

Making a full deep-copy of the envp is not a good solution either, since
it would use memory, and suffer from the same malloc() interactions with
client code mentioned previously.

Perhaps a tagged-malloc is needed?  But how would it ensure that no
client code is using the same tag for memory areas it allocates?  That
could, depending on the implementation, also add complexity to and slow
the malloc() / free() bookkeeping machinery.

The only solution I can see involves having the setenv() related
functions manage a block of memory for storing strings.  Perhaps
machinery similar to that used for the asprintf() would be useful for
that?  It would need to keep track of the start of the block, and its
length, and then test pointers to see if they lie within that region.
If they do, then the string belongs to Libc.  Otherwise, it does not.
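
A rough sketch of that ownership test, under assumptions of my own: one
fixed-size block, bump allocation with no reuse of space, and made-up
names (env_arena_store(), env_arena_owns()).  A real version would need
to grow the block and reclaim replaced strings.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ENV_ARENA_SIZE 4096          /* arbitrary size for the sketch */

static char env_arena[ENV_ARENA_SIZE];
static size_t env_arena_used;

/* Copy "name=value" into the arena.  Returns the stored string, or
   NULL if the arena has no room left. */
char *env_arena_store(const char *name, const char *value)
{
    assert(name != NULL && value != NULL);

    size_t nlen = strlen(name), vlen = strlen(value);
    size_t need = nlen + 1 + vlen + 1;
    if (need > ENV_ARENA_SIZE - env_arena_used)
        return NULL;

    char *entry = env_arena + env_arena_used;
    memcpy(entry, name, nlen);
    entry[nlen] = '=';
    memcpy(entry + nlen + 1, value, vlen + 1);  /* copies the NUL too */
    env_arena_used += need;
    return entry;
}

/* Nonzero if p points into the arena, i.e. the string belongs to us.
   The comparison goes through uintptr_t because relational operators
   on pointers into unrelated objects are undefined in C. */
int env_arena_owns(const char *p)
{
    uintptr_t addr = (uintptr_t)p;
    uintptr_t start = (uintptr_t)env_arena;
    return addr >= start && addr < start + ENV_ARENA_SIZE;
}
```

With something like this, a string handed in through putenv() fails the
env_arena_owns() test and is left alone, while a string stored by
setenv() passes it and its space can be reclaimed.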

What I'm wondering now, and have not double checked on as of sending
this, is how Bash manages the environment.  If it's using setenv()
related functions, then it's leaky!  It probably uses its own data
structures for that though, since it attaches attributes such as
'export', a type, and 'local' to them, plus it has array variables.

At the very least, the man page and "info" should reflect the fact that
excessive use of the setenv() functions will cause memory leakage.

Karl Hegbloom <address@hidden>
