Re: [PATCH v2] decodetree: Open files with encoding='utf-8'


From: Yonggang Luo
Subject: Re: [PATCH v2] decodetree: Open files with encoding='utf-8'
Date: Fri, 8 Jan 2021 21:41:29 -0800



On Fri, Jan 8, 2021 at 10:58 AM Eduardo Habkost <ehabkost@redhat.com> wrote:
>
> On Fri, Jan 08, 2021 at 07:09:52PM +0100, Philippe Mathieu-Daudé wrote:
> > When decodetree.py was added in commit 568ae7efae7, QEMU was
> > using Python 2 which happily reads UTF-8 files in text mode.
> > Python 3 requires either UTF-8 locale or an explicit encoding
> > passed to open(). Now that Python 3 is required, explicit
> > UTF-8 encoding for decodetree source files.
> >
> > To avoid further problems with the user locale, also explicit
> > UTF-8 encoding for the generated C files.
> >
> > Explicit both input/output are plain text by using the 't' mode.
>
> I believe the 't' is unnecessary.  But it's harmless and makes it
> more explicit.
>
> >
> > This fixes:
> >
> >   $ /usr/bin/python3 scripts/decodetree.py test.decode
> >   Traceback (most recent call last):
> >     File "scripts/decodetree.py", line 1397, in <module>
> >       main()
> >     File "scripts/decodetree.py", line 1308, in main
> >       parse_file(f, toppat)
> >     File "scripts/decodetree.py", line 994, in parse_file
> >       for line in f:
> >     File "/usr/lib/python3.6/encodings/ascii.py", line 26, in decode
> >       return codecs.ascii_decode(input, self.errors)[0]
> >   UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 80:
> >   ordinal not in range(128)
> >
> > Reported-by: Peter Maydell <peter.maydell@linaro.org>
> > Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
>
> Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
>
> However:
>
> > ---
> > v2: utf-8 output too (Peter)
> >     explicit default text mode.
> > ---
> >  scripts/decodetree.py | 4 ++--
> >  1 file changed, 2 insertions(+), 2 deletions(-)
> >
> > diff --git a/scripts/decodetree.py b/scripts/decodetree.py
> > index 47aa9caf6d1..d3857066cfc 100644
> > --- a/scripts/decodetree.py
> > +++ b/scripts/decodetree.py
> > @@ -1304,7 +1304,7 @@ def main():
> >
> >      for filename in args:
> >          input_file = filename
> > -        f = open(filename, 'r')
> > +        f = open(filename, 'rt', encoding='utf-8')
> >          parse_file(f, toppat)
> >          f.close()
> >
> > @@ -1324,7 +1324,7 @@ def main():
> >          prop_size(stree)
> >
> >      if output_file:
> > -        output_fd = open(output_file, 'w')
> > +        output_fd = open(output_file, 'wt', encoding='utf-8')

I misunderstood the cause; this is a better way.

> >      else:
> >          output_fd = sys.stdout
>
> This will still use the user locale encoding for sys.stdout.  Can
> be solved with:
>
>     output_fd = io.TextIOWrapper(sys.stdout.buffer, encoding='utf-8')

For output to a console/terminal, I suggest using
   sys.stdout = io.TextIOWrapper(sys.stdout.buffer, encoding=sys.stdout.encoding, errors="ignore")
so that the script still won't fail when the console/terminal encoding cannot
represent a character from the decodetree file. That failure cannot be fixed by other means.
errors="ignore" is important: in my experience, there are even characters that cannot be
represented in UTF-8.
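
To illustrate the difference, here is a minimal sketch. It uses io.BytesIO as a stand-in for sys.stdout.buffer and assumes a terminal whose encoding is ASCII. make_writer is a hypothetical helper for this example only, not something in decodetree.py. The lone-surrogate case at the end is my guess at one kind of character that UTF-8 cannot encode.

```python
import io


def make_writer(buffer, encoding, errors="ignore"):
    """Wrap a binary stream in a text layer (hypothetical helper)."""
    return io.TextIOWrapper(buffer, encoding=encoding, errors=errors)


# Stand-in for sys.stdout.buffer on a terminal that only understands ASCII.
raw = io.BytesIO()
out = make_writer(raw, encoding="ascii")
out.write("/* Philippe Mathieu-Daudé */")
out.flush()
lenient_bytes = raw.getvalue()  # the unencodable character is dropped

# With the default "strict" handler the same text raises instead.
strict = make_writer(io.BytesIO(), encoding="ascii", errors="strict")
try:
    strict.write("é")
    strict.flush()
    strict_failed = False
except UnicodeEncodeError:
    strict_failed = True

# Assumption: a lone surrogate is one example of a Python str character
# that even UTF-8 cannot encode; "ignore" drops it as well.
surrogate_bytes = "a\udcc3b".encode("utf-8", errors="ignore")
```

The trade-off is that errors="ignore" loses characters silently, so it only makes sense for terminal output, not for the generated C file.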

 
>
> (Based on a suggestion from Yonggang Luo)
>
> --
> Eduardo
>


--
Yours sincerely,
Yonggang Luo (罗勇刚)
