[bug-gawk] Minor Gawk bug
From: Chris Lauderdale
Subject: [bug-gawk] Minor Gawk bug
Date: Fri, 22 Apr 2016 12:34:34 -0400
I've tested gawk versions 3.0.0, 3.1.0, 4.0.0, 4.0.1, and 4.1.3 on Linux, and
only the last one gives me an internal error for this, so the bug is at least
recent. AFAICT it's triggered when one uses a regex range match that's more or
less incompatible with the current character set (per LC_CTYPE); in my case,
LC_CTYPE=C.UTF-8 by default. It goes away with LC_CTYPE=C, with option -b, or
for just [\x80], so I'd guess it's trying to un-UTF8 the escapes in order to
determine what the range "means" and failing. Minimal reproducer:
BEGIN { "" ~ /[\x80-\xBF]/ }
It arguably ought to be an error (though that would tie program validity to the
current locale, the thought of which makes my skin crawl), but it definitely
shouldn't be an abort.
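
The cases described above can be checked side by side with a small script. This is only a sketch: it assumes gawk is on PATH and that a C.UTF-8 locale is installed, the run_case helper is a hypothetical name introduced here, and it sets LC_ALL rather than LC_CTYPE so that an LC_ALL already present in the environment cannot mask the setting. It prints whatever exit status each run produces and makes no claim about the values.

```shell
#!/bin/sh
# Sketch: run the reproducer under the configurations mentioned in the report.
# Assumptions: gawk is on PATH; the C.UTF-8 locale exists on this system.

range_prog='BEGIN { "" ~ /[\x80-\xBF]/ }'   # the reported reproducer
single_prog='BEGIN { "" ~ /[\x80]/ }'       # single escape, reported as fine

# run_case LABEL LOCALE PROGRAM [GAWK_OPTIONS...]
# Hypothetical helper: runs gawk once and reports its exit status.
run_case() {
    label=$1; loc=$2; prog=$3; shift 3
    LC_ALL=$loc gawk "$@" "$prog" >/dev/null 2>&1
    echo "$label: exit status $?"
}

run_case "range, C.UTF-8"     C.UTF-8 "$range_prog"
run_case "range, C"           C       "$range_prog"
run_case "range, C.UTF-8 -b"  C.UTF-8 "$range_prog" -b
run_case "single, C.UTF-8"    C.UTF-8 "$single_prog"
```

On an affected build, only the first line would be expected to show a non-zero status; the other three correspond to the workarounds listed in the report.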