Records longer than INT_MAX mishandled
From: Miguel Pineiro Jr.
Subject: Records longer than INT_MAX mishandled
Date: Mon, 03 May 2021 21:22:27 -0400
User-agent: Cyrus-JMAP/3.5.0-alpha0-403-gbc3c488b23-fm-20210419.005-gbc3c488b
Hello, gawk devs.
gawk mishandles records longer than INT_MAX when get_a_record stuffs their
size_t length in an int (io.c:4081: `retval = recm.len`).
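To illustrate the narrowing, here is a minimal sketch, not gawk code. The helper name length_as_int is hypothetical and only mimics the shape of `retval = recm.len`. Converting an out-of-range size_t to int is implementation-defined in C; on the usual two's-complement platforms with a 32-bit int, INT_MAX + 1 comes out negative.

```c
#include <limits.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for the assignment `retval = recm.len`:
 * a size_t record length is squeezed into an int return value.
 */
static int length_as_int(size_t len)
{
    /* Out-of-range values convert in an implementation-defined way;
       common compilers simply keep the low-order bits. */
    return (int) len;
}
```

With this helper, a length of INT_MAX survives the conversion, while (size_t) INT_MAX + 1 typically turns into a negative number.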
All of the following examples are paired, first a success using a record of
length INT_MAX, then a failure using INT_MAX + 1.
In the main i/o loop, records vanish when their corrupted length is negative,
since inrec doesn't consider a negative value a valid record.
$ gawk 'BEGIN {printf("%2147483647s\n", "a")}' | gawk 'END {print NR}'
1
$ gawk 'BEGIN {printf("%2147483648s\n", "a")}' | gawk 'END {print NR}'
0
In getline (do_getline/do_getline_redir), if the corrupted length is equal to
EOF, it will trigger a silent bypass of the rest of the file. More likely, some
other value will mislead buffer memory management routines and crash gawk.
This bare getline fails fatally in set_record's buffer resizing loop, when it
gives up trying to accommodate what it thinks is a humongous record
(field.c:284: `cnt >= databuf_size` promotes a negative int cnt to unsigned
long).
$ gawk 'BEGIN {printf("\n%2147483647s\n", "a")}' | gawk '{getline} END {print NR}'
2
$ gawk 'BEGIN {printf("\n%2147483648s\n", "a")}' | gawk '{getline} END {print NR}'
gawk: cmd. line:1: (FILENAME=- FNR=2) fatal: input record too large
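The comparison behind that failure can be sketched in isolation. This is not the code from field.c; the function looks_too_big is a hypothetical name that only reproduces the mixed signed/unsigned comparison `cnt >= databuf_size`, assuming size_t is at least as wide as int.

```c
#include <limits.h>
#include <stddef.h>

/*
 * Hypothetical reduction of the test `cnt >= databuf_size`.
 * Because databuf_size is a size_t, cnt is converted to size_t
 * before the comparison, so a negative cnt becomes a huge value.
 */
static int looks_too_big(int cnt, size_t databuf_size)
{
    return cnt >= databuf_size;
}
```

A small positive cnt compares as expected, but any negative cnt appears larger than every realistic buffer size, which is consistent with the resizing loop growing the buffer until it gives up.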
This getline var dies in make_string (make_str_node) from a corrupted
allocation request:
$ gawk 'BEGIN {printf("\n%2147483647s\n", "a")}' | gawk '{getline var} END {print NR}'
2
$ gawk 'BEGIN {printf("\n%2147483648s\n", "a")}' | gawk '{getline var} END {print NR}'
gawk: cmd. line:1: (FILENAME=- FNR=2) fatal: node.c:415:make_str_node:
r->stptr: cannot allocate -2147483647 bytes of memory: Cannot allocate memory
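The number in that message is consistent with a corrupted length of INT_MIN plus one extra byte. The following is a sketch of the arithmetic only, under two assumptions that the message suggests but that are not confirmed here: that the allocation asks for the length plus one byte (for a terminating NUL), and that the request is reported through a signed long. The name request_as_long is hypothetical.

```c
#include <limits.h>
#include <stddef.h>

/*
 * Hypothetical reconstruction of the reported allocation size.
 * cnt is the corrupted int length handed on as a size_t.
 */
static long request_as_long(int cnt)
{
    size_t len = (size_t) cnt;   /* a negative cnt wraps to a huge value */
    size_t request = len + 1;    /* assumed: one extra byte for a NUL */

    /* Assumed: the diagnostic shows the request as a signed long.
       The conversion is implementation-defined for huge values. */
    return (long) request;
}
```

On a typical platform with a 32-bit int, feeding INT_MIN through this helper gives -2147483647, the figure shown in the fatal error.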
If INT_MAX is deemed sufficient, despite the use of capacious size_t i/o
buffers, here's a diff.
diff --git a/io.c b/io.c
index 91c94d9b..4e777d75 100644
--- a/io.c
+++ b/io.c
@@ -4026,6 +4026,9 @@ get_a_record(char **out, /* pointer to pointer to data */
iop->dataend += iop->count;
}
+ if (recm.len > INT_MAX)
+ fatal(_("input record length too large to return"));
+
/* set record, RT, return right value */
/*