From: Ralph Corderoy
Subject: Re: [Nmh-workers] mhshow/test-charset failures in nmh-1.7 (`Can't convert ?us-ascii to UTF-8')
Date: Thu, 23 Nov 2017 16:22:41 +0000

Hi Leonardo,

> After finding that having the `libiconv' package installed made a
> difference I first checked whether the several nmh binaries were linked
> against the GNU iconv(3) or the NetBSD iconv(3), and in both cases they
> are correctly linked to the NetBSD iconv(3).

So NetBSD has two iconv implementations available, and both supply a
library and iconv(1)?  Can both packages be installed at the same time?
It sounds like it.  And nmh correctly picks the "native" NetBSD library.
Which package provides the iconv in $PATH when both are installed?  And
is /usr/pkg/bin/iconv the other one?
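
One way to answer the PATH question is to list every iconv the shell could find, in lookup order. This is a minimal sketch; `list_on_path' is a hypothetical helper name, not something from nmh or its test suite.

```shell
# List every executable called "$1" found on PATH, in lookup order.
# The first line printed is the one an unqualified command would run.
# The body runs in a subshell so the IFS and "set -f" changes do not leak.
list_on_path() (
    name=$1
    IFS=:
    set -f    # do not glob-expand PATH components
    for dir in $PATH; do
        # An empty PATH component means the current directory.
        [ -n "$dir" ] || dir=.
        if [ -f "$dir/$name" ] && [ -x "$dir/$name" ]; then
            printf '%s\n' "$dir/$name"
        fi
    done
)
```

Running `list_on_path iconv' with both packages installed should show whether /usr/bin/iconv or /usr/pkg/bin/iconv comes first.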

>     #### For unknown reasons, the parameter values checks fail on the
>     #### FreeBSD10 buildbot.  It doesn't support EBCDIC-US, which is used
>     #### by the checks, so check for that.  Though that doesn't seem to be
>     #### the reason.
>     printf '\xe4' | iconv -f EBCDIC-US -t UTF-8 >/dev/null 2>&1  ||
>         skip_param_value_checks=1

So with your original report, this test passed, skip_param_value_checks
remained 0, and thus the failing test was later run.
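
The probe above can be tried against a specific iconv binary rather than whichever one PATH selects. The following is a sketch only: `iconv_has_ebcdic_us' and its optional argument are hypothetical and not part of test-charset. It uses the octal escape `\344', which is the form POSIX printf specifies; `\xe4' is an extension that not every shell's printf accepts.

```shell
# Return 0 if the given iconv command can convert EBCDIC-US to UTF-8.
# "$1" is the command to probe (hypothetical parameter); it defaults to
# whatever "iconv" resolves to on PATH.
iconv_has_ebcdic_us() {
    # The pipeline's exit status is that of the iconv command.
    printf '\344' | "${1:-iconv}" -f EBCDIC-US -t UTF-8 >/dev/null 2>&1
}

# Same decision as the quoted test fragment, for the iconv on PATH.
skip_param_value_checks=0
iconv_has_ebcdic_us || skip_param_value_checks=1
```

Calling `iconv_has_ebcdic_us /usr/bin/iconv' and `iconv_has_ebcdic_us /usr/pkg/bin/iconv' in turn would show which implementation leads to the checks being skipped.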

> And, with NetBSD iconv(1) I have:
>
>  % printf '\xe4' | /usr/bin/iconv -f EBCDIC-US -t UTF-8
>  U

Good.  So that's what the above iconv test used because...

> ...while with iconv(1) provided by the `libiconv' package:
>
>  % printf '\xe4' | /usr/pkg/bin/iconv -f EBCDIC-US -t UTF-8
>  /usr/pkg/bin/iconv: conversion from EBCDIC-US unsupported
>  /usr/pkg/bin/iconv: try '/usr/pkg/bin/iconv -l' to get the list of supported encodings
>  % echo $?
>  1

> So, if GNU iconv(1) is available, `$skip_param_value_checks' is
> set to 1.

Yes, on your platform, if it's the iconv chosen by the user's PATH.

> I'm now curious what happens on other platforms, apart from FreeBSD
> and NetBSD with the `libiconv' package installed.  Just checking the
> exit status of:
>
>  $ printf '\xe4' | iconv -f EBCDIC-US -t UTF-8
>
> will probably be enough.

I don't quite understand the question.  Here on Arch Linux,
ICONV_ENABLED is 1, so the `printf | iconv' check does get run, and it
succeeds, so the last two tests don't get skipped.  That's with
iconv(1) from glibc 2.26.

> If the exit status is 0 and then, in test-charset context
> `$skip_param_value_checks' is 0, what happens if you try (this is
> stolen entirely from 'replacement character in parameter value' test
> in test-charset):
>
>  $ printf "Subject: invalid parameter value charset\nMIME-Version: 1.0\nContent-Type: text/plain; charset*=invalid''%%0Dus-ascii\n" | \
>  mhshow -file - | cat

The test passes here, so I get the expected output.  (At the command
line I get slightly different, but that's my ~/.mh_profile, etc.,
kicking in.)

    start_test 'replacement character in parameter value'
    #### The output of this test doesn't show it, but it covers the
    #### noiconv: portion of get_param_value().
    cat > $msgfile <<'EOF'
    Subject: invalid parameter value charset
    MIME-Version: 1.0
    Content-Type: text/plain; charset*=invalid''%0Dus-ascii
    EOF

    cat > $expected <<EOF
    [ Message inbox:12 ]
    Subject: invalid parameter value charset

    MIME-Version: 1.0

    [ part  - text/plain -   0B  ]
    EOF

> Here, I have:
>
> | Subject: invalid parameter value charset
> | 
> | mhshow: Can't convert ?us-ascii to UTF-8
> | mhshow: unable to convert character set from ?us-ascii, continuing...
> | [ part  - text/plain -   0B  ]

It seems reasonable that `?us-ascii', with a U+003F question mark at
the start, is an invalid source charset.  Yet here mhshow calls
iconv_open(3) with it and the call succeeds.  If I change
content_charset()'s

    ret_charset = get_param(ct->c_ctinfo.ci_first_pm, "charset", '?', 0);

to use an `x' instead then I get a similar mhshow error to yours, but
with `xus-ascii'.  So what's special about the question mark to glibc's
iconv_open() that gives `?' a free ride?

I also find this works, oddly.

    $ printf '\344' |
    > iconv -f EBCDIC-US -t '???us-as???cii???'; printf \\n
    U
    $
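
To find out which other characters get the same treatment as `?', a small probe can try several prefixes in turn.  This is only a sketch: `probe_charset_prefix' is a hypothetical helper, and the results depend entirely on the iconv implementation being probed, so no expected output is given.

```shell
# For each candidate character, report whether the iconv command accepts
# "<char>us-ascii" as a source charset name.
# "$1" is the iconv command to probe (hypothetical parameter);
# the remaining arguments are the characters to try.
probe_charset_prefix() {
    cmd=${1:-iconv}
    [ $# -gt 0 ] && shift
    for c in "$@"; do
        # A plain ASCII byte is valid input if the name is accepted at all.
        if printf 'U' | "$cmd" -f "${c}us-ascii" -t UTF-8 >/dev/null 2>&1; then
            printf '%s accepted\n' "${c}us-ascii"
        else
            printf '%s rejected\n' "${c}us-ascii"
        fi
    done
}
```

For example, `probe_charset_prefix iconv '?' x '!' '#'' prints one line per candidate, which makes it easy to compare glibc's iconv with the NetBSD and libiconv ones.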

I've run out of answers at this point and will do a bit more digging,
unless someone else here already knows.  Was the `?' replacement
character chosen deliberately in content_charset() to exploit this?

-- 
Cheers, Ralph.
https://plus.google.com/+RalphCorderoy


