From: G. Branden Robinson
Subject: [bug #55334] [preconv] should not blindly use libuchardet on an unseekable stream
Date: Wed, 2 Mar 2022 14:24:17 -0500 (EST)
Update of bug #55334 (project groff):
Status: In Progress => Fixed
Assigned to: gbranden => bgarrigues
Open/Closed: Open => Closed
Summary: preconv fails when built with libuchardet on MS-Windows
=> [preconv] should not blindly use libuchardet on an unseekable stream
_______________________________________________________
Follow-up Comment #21:
[comment #14:]
> As a side note, if I run preconv with no arguments, I have to enter CTRL-D
> on a line by itself _twice_ before it will exit. Is that a bug?
According to a git bisection...
This turns out to be the same issue as in the OP, or more precisely,
Bertrand's fix of 18 October 2020 resolved it as well.
I think this is due to nested reads from the same stream: preconv reads it,
and then uchardet reads it, and because the stream isn't seekable, you need
two EOFs to get out.
I'm therefore going to regard this issue as fixed.
I'll open a new ticket for the seekable stream detection problem; I have a
patch pending for it.
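For illustration only, one common POSIX way to test whether a stream can be
repositioned is to ask lseek(2) for the current offset, which fails on pipes,
FIFOs, and sockets. This sketch is not the pending patch. The name
is_seekable is hypothetical, and the isatty() check is an added assumption,
because lseek() behavior on terminal devices is implementation-defined. It
does not cover MS-Windows.

```cpp
#include <cassert>
#include <stdio.h>      // FILE, fileno (POSIX)
#include <sys/types.h>  // off_t
#include <unistd.h>     // lseek, isatty

// Hypothetical helper, not taken from groff: report whether the stream
// underlying fp supports repositioning.
static bool is_seekable(FILE *fp)
{
  assert(fp != nullptr);
  int fd = fileno(fp);
  // Assumption: treat terminals as unseekable regardless of what
  // lseek() reports for them.
  if (fd < 0 || isatty(fd))
    return false;
  // A zero-offset SEEK_CUR query does not move the file position;
  // it returns -1 on pipes, FIFOs, and sockets.
  return lseek(fd, 0, SEEK_CUR) != static_cast<off_t>(-1);
}
```

A caller could consult is_seekable(stdin) before handing the stream to any
code that needs to rewind it.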
Bug #59291, also spawned off of this ticket, requests that uchardet be run
even on unseekable streams, and that's a heavier lift. In the meantime, tools
like sponge(1) from Joey Hess's "moreutils" project are available, at least on
some systems.
Resolving and retitling.
commit c30821d767974c0a5e7220c0e5a65dc963785dc5
Author: Bertrand Garrigues <bertrand.garrigues@laposte.net>
Date:   Sun Oct 18 00:53:32 2020 +0200

    preconv: don't use libuchardet if input is stdin

    * src/preproc/preconv/preconv.cpp (do_file): don't call
    detect_file_encoding if input file is "-"

    This fixes the failure on MS-Windows described #55334, however this
    does not fix the encoding detection with uchardet if the input is
    stdin (the user would have to pass with -D the correct encoding as
    explained in the man page).
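
The shape of that change can be sketched as follows. This is not the code
from preconv.cpp: pick_encoding, its parameters, and the Detector callback
are hypothetical names, and the precedence shown (detection first, then a
default standing in for the -D value) is a simplifying assumption.

```cpp
#include <functional>
#include <string>

// Hypothetical stand-in for an encoding detector: takes a file name and
// returns an encoding name, or an empty string if detection fails.
using Detector = std::function<std::string(const std::string &)>;

// Hypothetical sketch: skip detection entirely when the input is "-"
// (standard input), and fall back to the caller-supplied default.
static std::string pick_encoding(const std::string &filename,
                                 const std::string &default_encoding,
                                 const Detector &detect)
{
  if (filename != "-") {
    std::string detected = detect(filename);
    if (!detected.empty())
      return detected;
  }
  return default_encoding;
}
```

With this arrangement the detector never touches standard input, so the
stream is read only once, by the converter itself.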
_______________________________________________________
Reply to this item at:
<https://savannah.gnu.org/bugs/?55334>