Re: [PATCH] decodetree: Open files with encoding='utf-8'
From: Eduardo Habkost
Subject: Re: [PATCH] decodetree: Open files with encoding='utf-8'
Date: Fri, 8 Jan 2021 11:43:55 -0500
On Sat, Jan 09, 2021 at 12:13:31AM +0800, 罗勇刚(Yonggang Luo) wrote:
> On Sat, Jan 9, 2021 at 12:05 AM Peter Maydell <peter.maydell@linaro.org> wrote:
> >
> > On Fri, 8 Jan 2021 at 15:16, Philippe Mathieu-Daudé <f4bug@amsat.org> wrote:
> > >
> > > When decodetree.py was added in commit 568ae7efae7, QEMU was
> > > using Python 2 which happily reads UTF-8 files in text mode.
> > > Python 3 requires either UTF-8 locale or an explicit encoding
> > > passed to open(). Now that Python 3 is required, pass an explicit
> > > UTF-8 encoding when opening decodetree sources.
> > >
> > > This fixes:
> > >
> > > $ /usr/bin/python3 scripts/decodetree.py test.decode
> > > Traceback (most recent call last):
> > >   File "scripts/decodetree.py", line 1397, in <module>
> > >     main()
> > >   File "scripts/decodetree.py", line 1308, in main
> > >     parse_file(f, toppat)
> > >   File "scripts/decodetree.py", line 994, in parse_file
> > >     for line in f:
> > >   File "/usr/lib/python3.6/encodings/ascii.py", line 26, in decode
> > >     return codecs.ascii_decode(input, self.errors)[0]
> > > UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 80: ordinal not in range(128)
> > >
> > > Reported-by: Peter Maydell <peter.maydell@linaro.org>
> > > Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
> > > ---
> > > scripts/decodetree.py | 2 +-
> > > 1 file changed, 1 insertion(+), 1 deletion(-)
> > >
> > > diff --git a/scripts/decodetree.py b/scripts/decodetree.py
> > > index 47aa9caf6d1..fa40903cff1 100644
> > > --- a/scripts/decodetree.py
> > > +++ b/scripts/decodetree.py
> > > @@ -1304,7 +1304,7 @@ def main():
> > >
> > >      for filename in args:
> > >          input_file = filename
> > > -        f = open(filename, 'r')
> > > +        f = open(filename, 'r', encoding='utf-8')
> > >          parse_file(f, toppat)
> > >          f.close()
> >
> > Should we also be opening the output file explicitly as
> > utf-8 ? (How do we say "write to sys.stdout as utf-8" for
> > the case where we're doing that?)
>
> Can be done with
> ```
> sys.stdout = io.TextIOWrapper(sys.stdout.buffer, encoding="utf8",
> errors="ignore")
> ```
In the specific case of decodetree, assigning that
`io.TextIOWrapper` object to `output_fd` is enough, and it is
less hacky than overwriting `sys.stdout`.
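
A rough sketch of what I mean, not a tested patch. `open_output` and
its `stdout_buffer` parameter are hypothetical names used only for
illustration; the only name taken from the script is `output_fd`.
I also left out `errors="ignore"` here, on the assumption that we
would rather see an encoding failure than silently drop characters.

```python
import io
import sys


def open_output(path=None, stdout_buffer=None):
    """Return a text stream that always encodes its output as UTF-8.

    Hypothetical helper for illustration; decodetree.py has no such function.
    """
    if path is not None:
        # Regular output file: pass the encoding explicitly, as for the inputs.
        return open(path, 'w', encoding='utf-8')
    if stdout_buffer is None:
        # Looked up at call time so the real stdout is used by default.
        stdout_buffer = sys.stdout.buffer
    # Wrap the underlying binary stream; sys.stdout itself is left untouched.
    return io.TextIOWrapper(stdout_buffer, encoding='utf-8')


# Possible use inside main() (assumption about how it would be wired in):
#     output_fd = open_output(output_file)
#     ...
#     output_fd.flush()
```

The wrapper keeps its own buffer, so an explicit `flush()` before
exiting is a safe habit when writing to stdout this way.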
--
Eduardo