bug-gawk
From: green fox
Subject: [bug-gawk] gawk 4.x series attempts to mmap/allocate 32GB of memory and fails when using printf("%c") supplied with a large floating-point value.
Date: Thu, 10 Jul 2014 05:34:41 +0900

The gawk 4.x series attempts to allocate (via mmap) 32GB of memory and fails
when printf("%c") is supplied with a large floating-point value.

Known affected : 4.1.0 - 4.1.60
Known safe     : 3.1.8
Arch reproduced: x86_64 , AMD
Code to reproduce bug:
  gawk 'BEGIN{printf("%c",sprintf("%c",(0xffffff00+255)));}'|xxd
Result:
  gawk: cmd. line:1: (FILENAME=- FNR=1) fatal: format_tree: obuf:
can't allocate 34359738368 bytes of memory (Cannot allocate memory)
Expected output:
  gawk should not mmap and reallocate (or try to do so and fail) 32GB
worth of memory.
  We are only making a printf/sprintf call; there should be neither room
nor need for such an allocation.

Trace:

bash-4.2# whereis gawk
gawk: /bin/gawk /usr/bin/gawk /usr/X11R6/bin/gawk /usr/bin/X11/gawk
/usr/X11/bin/gawk /usr/man/man1/gawk.1.gz
/usr/share/man/man1/gawk.1.gz /usr/X11/man/man1/gawk.1.gz
bash-4.2# file /bin/gawk
/bin/gawk: symbolic link to `/usr/local/bin/gawk-4.1.0'
bash-4.2# file /usr/local/bin/gawk-4.1.0
/usr/local/bin/gawk-4.1.0: ELF 64-bit LSB executable, x86-64, version
1 (SYSV), dynamically linked (uses shared libs), not stripped
bash-4.2# gawk --version
GNU Awk 4.1.0, API: 1.0 (GNU MPFR 3.1.0, GNU MP 5.0.5)

bash-4.2# awk 'BEGIN{printf("%c",sprintf("%c",(0xffffff00+127)));}'|xxd
0000000: 7f                                       .
bash-4.2# awk 'BEGIN{printf("%c",sprintf("%c",(0xffffff00+128)));}'|xxd
awk: cmd. line:1: (FILENAME=- FNR=1) fatal: format_tree: obuf: can't
allocate 34359738368 bytes of memory (Cannot allocate memory)
 ...
bash-4.2# awk 'BEGIN{printf("%c",sprintf("%c",(0xffffff00+254)));}'|xxd
awk: cmd. line:1: (FILENAME=- FNR=1) fatal: format_tree: obuf: can't
allocate 34359738368 bytes of memory (Cannot allocate memory)
bash-4.2# awk 'BEGIN{printf("%c",sprintf("%c",(0xffffff00+255)));}'|xxd
awk: cmd. line:1: (FILENAME=- FNR=1) fatal: format_tree: obuf: can't
allocate 34359738368 bytes of memory (Cannot allocate memory)
bash-4.2# awk 'BEGIN{printf("%c",sprintf("%c",(0xffffff00+256)));}'|xxd
0000000: 00                                       .

So why it is trying to allocate 32GB of memory is a mystery to me...
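
A quick sanity check on the number in the error message may help here. The sketch below only does arithmetic on the byte count quoted in the report; the variable names are made up for illustration and say nothing about gawk's internals.

```shell
# Size reported in gawk's fatal error message (copied from the report).
requested=34359738368

# Express it in GiB and check that it is an exact power of two.
gib=$((requested / (1024 * 1024 * 1024)))
echo "requested = ${gib} GiB"

if [ "$requested" -eq $((1 << 35)) ]; then
    echo "requested == 2^35"
fi
```

The request is exactly 2^35 bytes, which is what one would expect to see if a buffer size were being doubled repeatedly rather than computed from the data.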

bash-4.2# strace awk 'BEGIN{printf("%c",sprintf("%c",(0xffffff00+255)));}'|xxd
Full strace output is located at [http://pastebin.com/VtSdGLWK]

[...skipping the uninteresting stuff...]

rt_sigaction(SIGFPE, {0x4565d0, [FPE], SA_RESTORER|SA_RESTART,
0x7f1b87775b30}, {SIG_DFL, [], 0}, 8) = 0
rt_sigaction(SIGBUS, {0x4565d0, [BUS], SA_RESTORER|SA_RESTART,
0x7f1b87775b30}, {SIG_DFL, [], 0}, 8) = 0
rt_sigaction(SIGSEGV, {0x4565d0, [SEGV], SA_RESTORER|SA_RESTART,
0x7f1b87775b30}, {SIG_DFL, [], 0}, 8) = 0
fstat(0, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 1), ...}) = 0
fstat(1, {st_mode=S_IFIFO|0600, st_size=0, ...}) = 0
fstat(2, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 1), ...}) = 0
getgroups(0, NULL)                      = 14
getgroups(14, [0, 1, 2, 3, 4, 6, 7, 10, 11, 17, 18, 19, 93, 215]) = 14
ioctl(1, SNDCTL_TMR_TIMEBASE or TCGETS, 0x7fffd0794310) = -1 EINVAL
(Invalid argument)
brk(0x16f7000)                          = 0x16f7000
brk(0x16ef000)                          = 0x16ef000
mmap(NULL, 266240, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS,
-1, 0) = 0x7f1b88a71000
brk(0x16df000)                          = 0x16df000
mremap(0x7f1b88a71000, 266240, 528384, MREMAP_MAYMOVE) = 0x7f1b889f0000
mremap(0x7f1b889f0000, 528384, 1052672, MREMAP_MAYMOVE) = 0x7f1b8763e000
mremap(0x7f1b8763e000, 1052672, 2101248, MREMAP_MAYMOVE) = 0x7f1b8743d000
mremap(0x7f1b8743d000, 2101248, 4198400, MREMAP_MAYMOVE) = 0x7f1b8703c000
mremap(0x7f1b8703c000, 4198400, 8392704, MREMAP_MAYMOVE) = 0x7f1b8683b000
mremap(0x7f1b8683b000, 8392704, 16781312, MREMAP_MAYMOVE) = 0x7f1b8583a000
mremap(0x7f1b8583a000, 16781312, 33558528, MREMAP_MAYMOVE) = 0x7f1b83839000
mremap(0x7f1b83839000, 33558528, 67112960, MREMAP_MAYMOVE) = 0x7f1b7f838000
mremap(0x7f1b7f838000, 67112960, 134221824, MREMAP_MAYMOVE) = 0x7f1b77837000
mremap(0x7f1b77837000, 134221824, 268439552, MREMAP_MAYMOVE) = 0x7f1b67836000
mremap(0x7f1b67836000, 268439552, 536875008, MREMAP_MAYMOVE) = 0x7f1b47835000
mremap(0x7f1b47835000, 536875008, 1073745920, MREMAP_MAYMOVE) = 0x7f1b07834000
mremap(0x7f1b07834000, 1073745920, 2147487744, MREMAP_MAYMOVE) = 0x7f1a87833000
mremap(0x7f1a87833000, 2147487744, 4294971392, MREMAP_MAYMOVE) = 0x7f1987832000
mremap(0x7f1987832000, 4294971392, 8589938688, MREMAP_MAYMOVE) = 0x7f1787831000
mremap(0x7f1787831000, 8589938688, 17179873280, MREMAP_MAYMOVE) = 0x7f1387830000
mremap(0x7f1387830000, 17179873280, 34359742464, MREMAP_MAYMOVE) = -1
EFAULT (Bad address)
mmap(NULL, 34359742464, PROT_READ|PROT_WRITE,
MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = -1 ENOMEM (Cannot allocate memory)
brk(0x8016df000)                        = 0x16df000
mmap(NULL, 34359873536, PROT_READ|PROT_WRITE,
MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = -1 ENOMEM (Cannot allocate memory)
open("/sys/devices/system/cpu/online", O_RDONLY|O_CLOEXEC) = 3
read(3, "0-1\n", 8192)                  = 4
close(3)                                = 0
mmap(NULL, 134217728, PROT_NONE,
MAP_PRIVATE|MAP_ANONYMOUS|MAP_NORESERVE, -1, 0) = 0x7f037f80e000
munmap(0x7f037f80e000, 8331264)         = 0
munmap(0x7f0384000000, 58777600)        = 0
mprotect(0x7f0380000000, 135168, PROT_READ|PROT_WRITE) = 0
mmap(NULL, 34359742464, PROT_READ|PROT_WRITE,
MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = -1 ENOMEM (Cannot allocate memory)
open("/usr/share/locale/en_US.UTF-8/LC_MESSAGES/libc.mo", O_RDONLY) =
-1 ENOENT (No such file or directory)
open("/usr/share/locale/en_US.utf8/LC_MESSAGES/libc.mo", O_RDONLY) =
-1 ENOENT (No such file or directory)
open("/usr/share/locale/en_US/LC_MESSAGES/libc.mo", O_RDONLY) = -1
ENOENT (No such file or directory)
open("/usr/share/locale/en.UTF-8/LC_MESSAGES/libc.mo", O_RDONLY) = -1
ENOENT (No such file or directory)
open("/usr/share/locale/en.utf8/LC_MESSAGES/libc.mo", O_RDONLY) = -1
ENOENT (No such file or directory)
open("/usr/share/locale/en/LC_MESSAGES/libc.mo", O_RDONLY) = -1 ENOENT
(No such file or directory)
open("/usr/local/share/locale/en_US.UTF-8/LC_MESSAGES/gawk.mo",
O_RDONLY) = -1 ENOENT (No such file or directory)
open("/usr/local/share/locale/en_US.utf8/LC_MESSAGES/gawk.mo",
O_RDONLY) = -1 ENOENT (No such file or directory)
open("/usr/local/share/locale/en_US/LC_MESSAGES/gawk.mo", O_RDONLY) =
-1 ENOENT (No such file or directory)
open("/usr/local/share/locale/en.UTF-8/LC_MESSAGES/gawk.mo", O_RDONLY)
= -1 ENOENT (No such file or directory)
open("/usr/local/share/locale/en.utf8/LC_MESSAGES/gawk.mo", O_RDONLY)
= -1 ENOENT (No such file or directory)
open("/usr/local/share/locale/en/LC_MESSAGES/gawk.mo", O_RDONLY) = -1
ENOENT (No such file or directory)
write(2, "awk: ", 5)                    = 5
write(2, "cmd. line:", 10)              = 10
write(2, "1: ", 3)                      = 3
write(2, "fatal: ", 7)                  = 7
write(2, "format_tree: obuf: can't allocat"..., 86) = 86
write(2, "\n", 1)                       = 1
exit_group(2)                           = ?

Hmm...
The initial size passed to mmap() is 266240 == 0x41000.
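
The sizes in the mmap()/mremap() calls above follow a simple pattern, which the sketch below reproduces: each one is a power of two plus 4096. The pattern is read straight off the trace; why the extra 4096 bytes is there (allocator overhead or page rounding, perhaps) is only a guess, and mremap_sizes is a helper name invented for this example.

```shell
# Print the size sequence 2^k + 4096 for k = 18..35.
# The 4096-byte constant is read off the trace; its cause is an assumption.
mremap_sizes() {
    k=18
    while [ "$k" -le 35 ]; do
        echo $(( (1 << k) + 4096 ))
        k=$((k + 1))
    done
}

mremap_sizes | head -n 1   # initial mmap() size in the trace
mremap_sizes | tail -n 1   # size of the mremap() call that fails
```

The first value printed is 266240 and the last is 34359742464, matching the first mmap() and the failing mremap() in the trace. In other words the buffer doubles 17 times, from 256KB up to the 32GB request that cannot be satisfied.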

Let's try the latest and greatest:
 ...
git clone git://git.sv.gnu.org/gawk.git ./gawk/ --depth=1 && cd gawk
&& ./bootstrap.sh && ./configure --prefix=/usr && make
 ...
bash-4.2# gawk-4.1.60 'BEGIN{printf("%c",sprintf("%c",(0xffffff00+255)));}'|xxd
gawk-4.1.60: cmd. line:1: fatal: format_tree: obuf: can't allocate
34359738368 bytes of memory (Cannot allocate memory)

strace gawk-4.1.60 'BEGIN{printf("%c",sprintf("%c",(0xffffff00+255)));}'|xxd

[...skipping the uninteresting stuff...]

rt_sigaction(SIGFPE, {0x457990, [FPE], SA_RESTORER|SA_RESTART,
0x7f6530ff0b30}, {SIG_DFL, [], 0}, 8) = 0
rt_sigaction(SIGBUS, {0x457990, [BUS], SA_RESTORER|SA_RESTART,
0x7f6530ff0b30}, {SIG_DFL, [], 0}, 8) = 0
rt_sigaction(SIGPIPE, {SIG_IGN, [PIPE], SA_RESTORER|SA_RESTART,
0x7f6530ff0b30}, {SIG_DFL, [], 0}, 8) = 0
rt_sigaction(SIGSEGV, {0x457990, [SEGV], SA_RESTORER|SA_RESTART,
0x7f6530ff0b30}, {SIG_DFL, [], 0}, 8) = 0
fstat(0, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 1), ...}) = 0
fstat(1, {st_mode=S_IFIFO|0600, st_size=0, ...}) = 0
fstat(2, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 1), ...}) = 0
getgroups(0, NULL)                      = 14
getgroups(14, [0, 1, 2, 3, 4, 6, 7, 10, 11, 17, 18, 19, 93, 215]) = 14
ioctl(1, SNDCTL_TMR_TIMEBASE or TCGETS, 0x7fffaf4dcdc0) = -1 EINVAL
(Invalid argument)
brk(0x14c4000)                          = 0x14c4000
brk(0x14bc000)                          = 0x14bc000
mmap(NULL, 266240, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS,
-1, 0) = 0x7f65322ec000
brk(0x14ac000)                          = 0x14ac000
mremap(0x7f65322ec000, 266240, 528384, MREMAP_MAYMOVE) = 0x7f653226b000
mremap(0x7f653226b000, 528384, 1052672, MREMAP_MAYMOVE) = 0x7f6530eb9000
mremap(0x7f6530eb9000, 1052672, 2101248, MREMAP_MAYMOVE) = 0x7f6530cb8000
mremap(0x7f6530cb8000, 2101248, 4198400, MREMAP_MAYMOVE) = 0x7f65308b7000
mremap(0x7f65308b7000, 4198400, 8392704, MREMAP_MAYMOVE) = 0x7f65300b6000
mremap(0x7f65300b6000, 8392704, 16781312, MREMAP_MAYMOVE) = 0x7f652f0b5000
mremap(0x7f652f0b5000, 16781312, 33558528, MREMAP_MAYMOVE) = 0x7f652d0b4000
mremap(0x7f652d0b4000, 33558528, 67112960, MREMAP_MAYMOVE) = 0x7f65290b3000
mremap(0x7f65290b3000, 67112960, 134221824, MREMAP_MAYMOVE) = 0x7f65210b2000
mremap(0x7f65210b2000, 134221824, 268439552, MREMAP_MAYMOVE) = 0x7f65110b1000
mremap(0x7f65110b1000, 268439552, 536875008, MREMAP_MAYMOVE) = 0x7f64f10b0000
mremap(0x7f64f10b0000, 536875008, 1073745920, MREMAP_MAYMOVE) = 0x7f64b10af000
mremap(0x7f64b10af000, 1073745920, 2147487744, MREMAP_MAYMOVE) = 0x7f64310ae000
mremap(0x7f64310ae000, 2147487744, 4294971392, MREMAP_MAYMOVE) = 0x7f63310ad000
mremap(0x7f63310ad000, 4294971392, 8589938688, MREMAP_MAYMOVE) = 0x7f61310ac000
mremap(0x7f61310ac000, 8589938688, 17179873280, MREMAP_MAYMOVE) = 0x7f5d310ab000
mremap(0x7f5d310ab000, 17179873280, 34359742464, MREMAP_MAYMOVE) = -1
EFAULT (Bad address)
mmap(NULL, 34359742464, PROT_READ|PROT_WRITE,
MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = -1 ENOMEM (Cannot allocate memory)
brk(0x8014ac000)                        = 0x14ac000
mmap(NULL, 34359873536, PROT_READ|PROT_WRITE,
MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = -1 ENOMEM (Cannot allocate memory)
open("/sys/devices/system/cpu/online", O_RDONLY|O_CLOEXEC) = 3
read(3, "0-1\n", 8192)                  = 4
close(3)                                = 0
mmap(NULL, 134217728, PROT_NONE,
MAP_PRIVATE|MAP_ANONYMOUS|MAP_NORESERVE, -1, 0) = 0x7f4d29089000
munmap(0x7f4d29089000, 49770496)        = 0
munmap(0x7f4d30000000, 17338368)        = 0
mprotect(0x7f4d2c000000, 135168, PROT_READ|PROT_WRITE) = 0
mmap(NULL, 34359742464, PROT_READ|PROT_WRITE,
MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = -1 ENOMEM (Cannot allocate memory)

[...open of /usr/share/locale/ and other uninteresting stuff...]

So this has been going on throughout the 4.x series of gawk, including the
latest (4.1.60 as of 2014/Jul/10).

bash-4.2# gawk-3.1.8 'BEGIN{printf("%c",sprintf("%c",(0xffffff00+255)));}'|xxd
0000000: ff                                       .

pgawk 3.1.8 gives us a look at what it used to be like.

bash-4.2# gawk-4.1.60 -pprofile.txt 'BEGIN{printf "%c", sprintf("%c",
4.29496e+09 + 256)}'|xxd
gawk-4.1.60: cmd. line:1: fatal: format_tree: obuf: can't allocate
34359738368 bytes of memory (Cannot allocate memory)
bash-4.2# gawk-4.1.60 -pprofile.txt 'BEGIN{printf "%c", sprintf("%c",
4.29497e+09 + 256)}'|xxd
0000000: e0ae 90                                  ...

bash-4.2# gawk-3.1.8 -W profile=profile.txt 'BEGIN{printf "%c",
sprintf("%c", 4.29496e+09 + 256)}'|xxd
0000000: 80                                       .
bash-4.2# gawk-3.1.8 -W profile=profile.txt 'BEGIN{printf "%c",
sprintf("%c", 4.29497e+09 + 256)}'|xxd
0000000: 90                                       .

So we have a problem handling floating-point input to sprintf.
This bug/feature can be traced back to 3.1.8, where it was used to dump
binary data in the range 128-255 (0x80-0xff).
bash-4.2# gawk-3.1.8 'BEGIN{printf("%c",sprintf("%c",(0xd800+255)));}'|xxd
0000000: ff                                       .
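
The outputs above line up with plain integer truncation, which the following sketch illustrates. It assumes the argument is reduced modulo 2^32 (and, for 3.1.8, modulo 2^8); that is inferred from the observed bytes, not from the gawk source, and truncate_value is a hypothetical helper.

```shell
# Reduce a value modulo 2^32 and modulo 2^8, printing both in hex.
# Truncation to 32 bits is an assumption inferred from the outputs above.
truncate_value() {
    printf '%x %02x\n' $(( $1 & 0xffffffff )) $(( $1 & 0xff ))
}

truncate_value 4294970256              # 4.29497e+09 + 256
truncate_value 4294960256              # 4.29496e+09 + 256
truncate_value $((0xffffff00 + 255))
truncate_value $((0xffffff00 + 256))
```

This prints "b90 90", "ffffe480 80", "ffffffff ff" and "0 00". The low bytes 90, 80 and ff are what gawk 3.1.8 emits, and e0ae90 from gawk 4.1.60 is the UTF-8 encoding of U+0B90. The two inputs that make 4.x blow up are the ones whose 32-bit value is far outside the Unicode range.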

 As a side note, the rest of the world outside America needs text
processing as well.
 Unable to get around gawk's LANG limitation,
 many have used various workarounds to dump raw binary (to create a
valid character in an encoding other than UTF-8).
  sprintf("%c",(0xd800+255))
 was just one such hack to allow writing binary data.
 It is well known that the standards do not allow such things.
 Back then, it was a bug/improvised technique that allowed non-English text
to be processed.

 While sorting out this printf("%c") bug, can we ask for the capability to
write out binary data?
 In particular, printf("%c",128) or above fails to do so: it emits
0xc280, which is valid UTF-8 but not
 what was asked for.
 There is no __safe-and-sane__ way to write data in the range
0x80-0xff _without_ it being converted to UTF-8 at the moment.
 Or, even better, gawk_locale_set("C") or something like that, so we
can change the locale
 used from within the script...
 LANG=C gawk '{whatever}' is not that great...
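
For reference, the 0xc280 mentioned above is just the standard two-byte UTF-8 encoding of code point 128. The sketch below shows the bit arithmetic in plain shell; utf8_two_bytes is a name made up for this example and only covers code points 0x80 through 0x7ff.

```shell
# Encode a code point in the range 0x80..0x7ff as two UTF-8 bytes (hex).
# First byte:  110xxxxx (top 5 bits), second byte: 10xxxxxx (low 6 bits).
utf8_two_bytes() {
    cp=$1
    printf '%02x%02x\n' $(( 0xc0 | (cp >> 6) )) $(( 0x80 | (cp & 0x3f) ))
}

utf8_two_bytes 128
utf8_two_bytes 255
```

This prints c280 and c3bf, so every value in 0x80-0xff comes out as two bytes in a UTF-8 locale instead of the single raw byte that was wanted. Later gawk releases also document a -b (--characters-as-bytes) option; whether it helps on the affected builds has not been checked here.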


 GreenFox


