Re: test results differents between the perl and XS parsers
From: Patrice Dumas
Subject: Re: test results differents between the perl and XS parsers
Date: Tue, 17 Nov 2020 12:50:55 +0100
On Tue, Nov 17, 2020 at 07:27:45AM +0000, Gavin Smith wrote:
> On Tue, Nov 17, 2020 at 12:18:31AM +0100, Patrice Dumas wrote:
> > +++ t/results/indices/encoding_index_latin1.pl.new 2020-11-17 00:00:54.879434507 +0100
> > @@ -158,7 +158,7 @@
> > 'contents' => [
> > {
> > 'parent' => {},
> > - 'text' => "\x{e9} \x{e9}"
> > + 'text' => 'é é'
> > }
> > ],
> > 'extra' => {
> >
> > and same for encoding_index_latin1_enable_encoding,
> > encoding_index_utf8 and other similar tests.
> >
> > It seems like it is the only case of accented commands in parsed text.
> > Any idea on what's going on?
>
> The two strings appear to be the same string. The question is, why are
> they output differently? I don't know, and I will look into it when I
> have time. Things to look at include whether the string is stored internally
> as UTF-8 or Latin-1, and locale settings when the string is output.
I had a look at the test: the input string is "\x{e9} \x{e9}",
irrespective of the encoding. It is possible that this test is
artificial and only works correctly with the perl parser. When I have
time, I'll look at using an input file for the test instead (if one
doesn't already exist), to get something closer to actual processing.
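
To illustrate the "same string, different output" point, here is a minimal
sketch. It assumes the difference comes from Perl's internal UTF-8 flag,
which has not been confirmed for the two parsers; the variable names are
made up for the example. It builds the same two-character text twice, once
stored as bytes and once upgraded to internal UTF-8, and checks that the
two compare equal even though the flag differs.

```perl
use strict;
use warnings;
use Data::Dumper;

$Data::Dumper::Terse = 1;    # dump the value only, without "$VAR1 = "

# Hypothetical stand-ins for what each parser might hand back:
# the same text, e-acute space e-acute, built twice.
my $bytes = "\xe9 \xe9";     # stored internally one byte per character
my $chars = "\xe9 \xe9";
utf8::upgrade($chars);       # same characters, now stored internally as UTF-8

# Perl compares strings character by character, so these are equal.
my $same = ($bytes eq $chars) ? 1 : 0;

# Only the internal UTF-8 flag differs between the two.
my $flag_bytes = utf8::is_utf8($bytes) ? 1 : 0;
my $flag_chars = utf8::is_utf8($chars) ? 1 : 0;

print "equal: $same\n";
print "flag (byte-stored): $flag_bytes\n";
print "flag (upgraded):    $flag_chars\n";

# How each one is rendered can depend on the flag and on which
# Data::Dumper implementation (XS or pure Perl) is in use.
print Dumper($bytes);
print Dumper($chars);
```

If the two parsers return strings that differ only in this flag, the
results files could differ in the way shown in the diff while the parsed
text is actually identical.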
--
Pat