Re: gawk mbstate_t problem on hppa2.0w-hp-hpux11.11
From: Michael Elizabeth Chastain
Subject: Re: gawk mbstate_t problem on hppa2.0w-hp-hpux11.11
Date: Tue, 9 Dec 2003 23:31:14 -0500 (EST)
[Let's try the right mailing list this time. Oops.]
Hi Stepan,
In my last email I outlined two proposals:
(A) Require both mbrtowc and mbstate_t
    (A1) Add some documentation to README_d/README.hpux which says that the
         user can turn on _XOPEN_SOURCE=500 if they want.
    (A2) If _XOPEN_SOURCE=500 is on, then both mbrtowc and mbstate_t are
         available. Enable multi-byte support.
    (A3) If _XOPEN_SOURCE=500 is off, then mbrtowc is available, but
         mbstate_t is not. Disable multi-byte support.
(B) Require mbrtowc, but mbstate_t is optional
    (B1) Add some documentation to README_d/README.hpux which says that the
         user can turn on _XOPEN_SOURCE=500 if they want.
    (B2) If _XOPEN_SOURCE=500 is on, then both mbrtowc and mbstate_t are
         available. Enable multi-byte support.
    (B3) If _XOPEN_SOURCE=500 is off, then mbrtowc is available, but
         mbstate_t is not. Enable multi-byte support without using
         mbstate_t. Then, whenever an mbstate_t pointer is needed,
         provide "0" as the pointer value.
GNU Readline actually does (B). I liked (B) at first, but after spending
some time playing with it I decided that I don't like it any more.
The problem is that gawk really wants to use mbstate_t. If there is no
mbstate_t on the platform, and I make a fake one, then I can get gawk to
compile. But a bunch of logic will not run as designed.
Look at strncasecmpmbs for example:
int
strncasecmpmbs(const char *s1, mbstate_t mbs1,
               const char *s2, mbstate_t mbs2, size_t n)
{
    ...
    for (...) {
        ...
        mbclen1 = mbrtowc(&wc1, s1 + i1, n - i1, &mbs1);
        ...
        mbclen2 = mbrtowc(&wc2, s2 + i2, n - i2, &mbs2);
        ...
    }
    ...
}
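This code will not work properly if the &mbs1 and &mbs2 arguments are
replaced by null pointers. Both calls to mbrtowc would then share
mbrtowc's single built-in mbstate_t, so the conversion states of the
two strings would interfere with each other.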
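
To make the point concrete, here is a minimal sketch of interleaved decoding
where each string keeps its own state. It is not gawk's strncasecmpmbs; the
names casecmp_two_states and next_lower are made up for illustration, and the
fallback on a decoding error (treat the byte as a character) is just a choice
made for this sketch.

```c
#include <stddef.h>
#include <string.h>
#include <wchar.h>
#include <wctype.h>

/* Hypothetical helper: decode one character from s using the caller's own
   state. On a decoding error, reset that state and treat the next byte as
   a character (a simplification for this sketch). Returns the lowercased
   character and stores the number of bytes consumed in *len. */
static wint_t
next_lower(const char *s, size_t avail, mbstate_t *mbs, size_t *len)
{
    wchar_t wc = L'\0';

    *len = mbrtowc(&wc, s, avail, mbs);
    if (*len == (size_t) -1 || *len == (size_t) -2) {
        memset(mbs, 0, sizeof *mbs);    /* back to the initial state */
        wc = (wchar_t) (unsigned char) *s;
        *len = 1;
    }
    return towlower((wint_t) wc);
}

/* Hypothetical comparison: looks at up to n bytes of each string and
   decodes s1 and s2 in lockstep, each with a separate mbstate_t. */
int
casecmp_two_states(const char *s1, const char *s2, size_t n)
{
    mbstate_t mbs1, mbs2;
    size_t i1 = 0, i2 = 0;

    memset(&mbs1, 0, sizeof mbs1);
    memset(&mbs2, 0, sizeof mbs2);

    while (i1 < n && i2 < n) {
        size_t len1, len2;
        wint_t c1 = next_lower(s1 + i1, n - i1, &mbs1, &len1);
        wint_t c2 = next_lower(s2 + i2, n - i2, &mbs2, &len2);

        if (c1 != c2)
            return c1 < c2 ? -1 : 1;
        if (len1 == 0)      /* both strings ended at the terminating null */
            return 0;
        i1 += len1;
        i2 += len2;
    }
    return 0;
}
```

If both mbrtowc calls were given a null state pointer instead, a partially
decoded character from s1 would be left in the shared internal state when the
next call starts on s2.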
I still believe it's legal under the Single UNIX Specification v3 for a platform to
define mbrtowc and not define mbstate_t. And it's a fact that some
platforms actually do that, whether it's legal or not. And I think that
if an application processes only one string at a time and does not need
mbstate_t, then it can run on a platform like that. Such an application
can test HAVE_MBRTOWC and explicitly call mbrtowc(..., ..., ..., 0).
But gawk is not such an application.
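
For contrast, a minimal sketch of the kind of application that can get by
without mbstate_t: it walks one string at a time and passes 0 as the state
pointer, so mbrtowc uses its internal state. The function name
count_chars_one_string is made up for illustration, and a real program would
wrap the call in a HAVE_MBRTOWC test.

```c
#include <stddef.h>
#include <wchar.h>

/* Hypothetical helper: counts the characters in the first n bytes of s.
   The state pointer is 0, so mbrtowc falls back on its own internal
   state; no mbstate_t object is declared anywhere. */
size_t
count_chars_one_string(const char *s, size_t n)
{
    size_t i = 0, count = 0;

    while (i < n) {
        size_t len = mbrtowc(NULL, s + i, n - i, 0);

        if (len == 0)                   /* reached a null character */
            break;
        if (len == (size_t) -1 || len == (size_t) -2)
            break;                      /* invalid or incomplete: stop here */
        i += len;
        count++;
    }
    return count;
}
```

This only holds up because a single string is decoded from start to finish
before anything else touches the internal state.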
So my plan is to change this code:
#if defined(HAVE_MBRLEN) && defined(HAVE_MBRTOWC) && defined(HAVE_WCHAR_H) \
    && defined(HAVE_CTYPE_H)
/* We can handle multibyte strings. */
#define MBS_SUPPORT
#include <wchar.h>
#include <wctype.h>
#endif
To this:
#if defined(HAVE_MBRLEN) && defined(HAVE_MBRTOWC) && defined(HAVE_MBSTATE_T) \
    && defined(HAVE_WCHAR_H) && defined(HAVE_CTYPE_H)
...
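
As a rough sketch of how the stricter guard plays out for callers, assuming
the elided remainder of the block stays as it was: code that needs mbstate_t
is compiled only under MBS_SUPPORT, and everything else takes a single-byte
path. The function first_char_len is made up for illustration and is not part
of gawk.

```c
#include <stddef.h>
#include <string.h>

#if defined(HAVE_MBRLEN) && defined(HAVE_MBRTOWC) && defined(HAVE_MBSTATE_T) \
    && defined(HAVE_WCHAR_H) && defined(HAVE_CTYPE_H)
/* We can handle multibyte strings. */
#define MBS_SUPPORT
#include <wchar.h>
#include <wctype.h>
#endif

/* Hypothetical helper: length in bytes of the character at the start of s,
   looking at no more than n bytes. */
size_t
first_char_len(const char *s, size_t n)
{
    if (n == 0)
        return 0;
#ifdef MBS_SUPPORT
    {
        mbstate_t mbs;
        size_t len;

        memset(&mbs, 0, sizeof mbs);    /* start in the initial state */
        len = mbrlen(s, n, &mbs);
        if (len != 0 && len != (size_t) -1 && len != (size_t) -2)
            return len;
    }
#endif
    /* Single-byte path: no multibyte support, or nothing decodable. */
    return 1;
}
```

On a platform where mbstate_t is missing, HAVE_MBSTATE_T is undefined, the
mbstate_t declaration is never seen by the compiler, and the function simply
treats every byte as one character.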
Then I have to test on HP-UX 11 with a variety of compilers, both with
and without -D_XOPEN_SOURCE=500, and then write some documentation.
How does that sound?
Michael C