texinfo-commits

From: Patrice Dumas
Date: Sun, 23 Oct 2022 16:46:29 -0400 (EDT)

branch: master
commit c4951e6dccad2c4f9b837e2f4c52914ef67b124f
Author: Patrice Dumas <pertusus@free.fr>
AuthorDate: Sun Oct 23 22:30:11 2022 +0200

    * tp/tests/run_parser_all.sh,
    tp/maintain/copy_change_file_name_encoding.pl: consider that
    recoding of file names to Latin1, as needed by some tests, is not
    reliable on Windows, as the non-ASCII character may be stored
    in the filesystem as a Unicode code point converted from the current
    codepage, and not from the Latin1 encoding expected by the Perl code.
    Add comments to explain that.  Report from Eli Zaretskii.
---
 ChangeLog                                     | 10 ++++++++++
 tp/maintain/copy_change_file_name_encoding.pl |  6 ++++++
 tp/tests/run_parser_all.sh                    |  7 +++++++
 3 files changed, 23 insertions(+)

diff --git a/ChangeLog b/ChangeLog
index 92c062354d..4c791be540 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,13 @@
+2022-10-23  Patrice Dumas  <pertusus@free.fr>
+
+       * tp/tests/run_parser_all.sh,
+       tp/maintain/copy_change_file_name_encoding.pl: consider that
+       recoding of file names to Latin1, as needed by some tests, is not
+       reliable on Windows, as the non-ASCII character may be stored
+       in the filesystem as a Unicode code point converted from the current
+       codepage, and not from the Latin1 encoding expected by the Perl code.
+       Add comments to explain that.  Report from Eli Zaretskii.
+
 2022-10-23  Patrice Dumas  <pertusus@free.fr>
 
        * configure.ac (HOST_IS_WINDOWS_VARIABLE), tp/defs.in: add variable
diff --git a/tp/maintain/copy_change_file_name_encoding.pl b/tp/maintain/copy_change_file_name_encoding.pl
index fdc5b589dd..f919f443e9 100755
--- a/tp/maintain/copy_change_file_name_encoding.pl
+++ b/tp/maintain/copy_change_file_name_encoding.pl
@@ -61,6 +61,12 @@ $converted_dest_path =~ s/latin/latîn/;
 my $dest_path_in_utf8 = Encode::encode('UTF-8', $converted_dest_path);
 # use another variable, since from_to argument is converted in-place
 my $dest_path_in_to_encoding = $dest_path_in_utf8;
+# NOTE: on Windows, when Perl uses the char API and not the wchar_t API,
+# the file name written to the filesystem may not correspond to î, as
+# it depends on the codepage.  If the codepage is not Latin1, Windows will
+# treat \xEE, output by Perl for î if $to is ISO-8859-1, as the \xEE
+# character of the current codepage, and convert that character to UTF-16
+# to store it on the filesystem.
 my $succeeded = from_to($dest_path_in_to_encoding, 'UTF-8', $to);
 
 if (not defined($succeeded)) {
diff --git a/tp/tests/run_parser_all.sh b/tp/tests/run_parser_all.sh
index cfd24aebfe..a5ec97d169 100755
--- a/tp/tests/run_parser_all.sh
+++ b/tp/tests/run_parser_all.sh
@@ -208,6 +208,13 @@ no_recoded_file_names=yes
 if sed 1q input_file_names_recoded_stamp.txt | grep 'OK' >/dev/null; then
   no_recoded_file_names=no
 fi
+# On Windows the recoding of file names is not reliable, as the file name may
+# be stored as UTF-16, with the non-ASCII character interpreted according to
+# the user codepage and not according to the encoding that would have been
+# expected from the Perl code (Latin1 in the current case).
+if test "z$HOST_IS_WINDOWS_VARIABLE" = 'zyes' ; then
+  no_recoded_file_names=yes
+fi
 
 one_test=no
 if test -n "$1"; then
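
The gating logic added to run_parser_all.sh can be sketched in isolation as
follows.  This is only an illustration, not code from the commit: the function
names skip_recoded_file_names, latin1_byte_hex and utf8_bytes_hex are
hypothetical, and the stamp line is passed as an argument instead of being read
from input_file_names_recoded_stamp.txt.  The last two helpers only print the
bytes involved: the single byte \xEE that Perl outputs for î in ISO-8859-1, and
the two bytes used for the same character in UTF-8.

```shell
#!/bin/sh
# Hypothetical helper (not in run_parser_all.sh): decide whether tests that
# depend on recoded file names should be skipped.
#   $1: value of HOST_IS_WINDOWS_VARIABLE ("yes" on Windows hosts)
#   $2: first line of the stamp file written by the recoding script
skip_recoded_file_names () {
  host_is_windows=$1
  stamp_first_line=$2
  skip=yes
  # The recoding script reported success, so the names are usable in principle.
  case $stamp_first_line in
    *OK*) skip=no ;;
  esac
  # On Windows the name stored on disk depends on the active codepage,
  # so the success stamp is not trusted.
  if test "z$host_is_windows" = 'zyes'; then
    skip=yes
  fi
  echo "$skip"
}

# Byte output for î when the target encoding is ISO-8859-1 (octal 356 = \xEE).
latin1_byte_hex () { printf '\356' | od -An -tx1 | tr -d ' \n'; }

# Bytes used for î in UTF-8 (\xC3 \xAE).
utf8_bytes_hex () { printf '\303\256' | od -An -tx1 | tr -d ' \n'; }
```

For example, skip_recoded_file_names yes 'OK' prints yes, so the tests are
skipped on Windows even when the recoding itself succeeded.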


