From: Thomas Schmitt
Subject: Re: [Libcdio-devel] How tolerant to be towards CD-TEXT character set mislabeling ?
Date: Mon, 29 Apr 2019 16:08:07 +0200

Hi,

Serge Pouliquen wrote:
> I got these lines :
> --DEBUG: CD-TEXT character set: code=1 , name=ASCII , chosen=ISO-8859-1
> --DEBUG: CD-TEXT character set: code=0 , name=ISO-8859-1 , chosen=ISO-8859-1

So in future we could ask users with CD-TEXT character set problems
to enable debugging messages rather than to squint at and interpret
cdrskin text pack dumps.

Serge, if you are adventurous, then simulate a wrong code with

          /* determine encoding */
+         /* <<< only for a test of the default case <<< */
+         blocksize.charcode = 123;
          switch (blocksize.charcode){
            case CDTEXT_CHARCODE_ISO_8859_1:

and try it once with a CD to see whether it properly shouts and then
still goes on with the usual output.

-----------------------------------------------------------------------

Meanwhile, Rocky and Leon, please review this change candidate (I assume
Leon was "greenleon" as mentioned in the commit from 2012-03-05). It has
become a bit fatter than the original one-liner. But it also closes a
loophole for SIGSEGV: with an unknown character code, charset can no
longer pass on as NULL.

--- lib/driver/cdtext.c 2018-06-14 17:26:07.742400554 +0200
+++ lib/driver/cdtext.bug53929.c        2019-04-29 16:00:03.324955834 +0200
@@ -713,17 +713,33 @@ cdtext_data_init(cdtext_t *p_cdtext, uin
         /* determine encoding */
         switch (blocksize.charcode){
           case CDTEXT_CHARCODE_ISO_8859_1:
-            /* default */
             charset = (char *) "ISO-8859-1";
             break;
           case CDTEXT_CHARCODE_ASCII:
-            charset = (char *) "ASCII";
+            /* ASCII is a subset of ISO-8859-1. Some CDs announce it but then
+             * have 8-bit characters in their text. Trying ISO-8859-1 gives
+             * more hope for a readable result than telling iconv to be picky.
+             */
+            charset = (char *) "ISO-8859-1";
             break;
           case CDTEXT_CHARCODE_SHIFT_JIS:
             charset = (char *) "SHIFT_JIS";
             break;
+          default:
+            /* Do not let charset pass here as NULL */
+            cdio_warn("CD-TEXT: Unknown character set code %u.\n",
+                      (unsigned int) blocksize.charcode);
+            charset = (char *) "ISO-8859-1";
         }

+        cdio_debug("CD-TEXT character set: code=%u , name=%s , chosen=%s\n",
+                   (unsigned int) blocksize.charcode,
+                   blocksize.charcode == 0 ? "ISO-8859-1" :
+                   blocksize.charcode == 1 ? "ASCII" :
+                   blocksize.charcode == 0x80 ? "SHIFT_JIS" :
+                   "",
+                   charset);
+
         /* set track numbers */
         p_cdtext->block[i_block].first_track = blocksize.i_first_track;
         p_cdtext->block[i_block].last_track = blocksize.i_last_track;

---------------------------------------------------------------------------
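For readers who want to play with the selection logic outside of libcdio,
here is a minimal standalone sketch of what the patch does. The code values
0, 1 and 0x80 are taken from the debug message in the patch. All names
starting with sketch_ or SKETCH_ are made up for this illustration and are
not part of libcdio.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for libcdio's CDTEXT_CHARCODE_* constants.
 * The numeric values are those used in the debug message of the patch. */
enum {
  SKETCH_CHARCODE_ISO_8859_1 = 0x00,
  SKETCH_CHARCODE_ASCII      = 0x01,
  SKETCH_CHARCODE_SHIFT_JIS  = 0x80
};

/* Name which the character code announces, or "" if the code is unknown. */
const char *sketch_announced_name(unsigned int charcode)
{
  switch (charcode) {
    case SKETCH_CHARCODE_ISO_8859_1: return "ISO-8859-1";
    case SKETCH_CHARCODE_ASCII:      return "ASCII";
    case SKETCH_CHARCODE_SHIFT_JIS:  return "SHIFT_JIS";
    default:                         return "";
  }
}

/* Charset name to hand to the converter. Never returns NULL. */
const char *sketch_chosen_charset(unsigned int charcode)
{
  const char *charset;

  switch (charcode) {
    case SKETCH_CHARCODE_SHIFT_JIS:
      charset = "SHIFT_JIS";
      break;
    case SKETCH_CHARCODE_ISO_8859_1:
    case SKETCH_CHARCODE_ASCII:      /* tolerated as its superset */
    default:                         /* unknown code: fall back, no NULL */
      charset = "ISO-8859-1";
      break;
  }
  assert(charset != NULL);
  return charset;
}
```

Printing sketch_announced_name() and sketch_chosen_charset() side by side
for a given code gives the same name/chosen pair as the debug line. The
warning about unknown codes is left out of the sketch.
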

But for now the question is still open whether my proposal to read ASCII
as ISO-8859-1 could yield unexpected problems.
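To make the trade-off more tangible, here is a small sketch which does not
use iconv at all. It assumes that the text gets converted to UTF-8 and
shows two things: a strict ASCII check which rejects any byte with the high
bit set, and an ISO-8859-1 conversion which cannot reject any byte because
each byte value is the Unicode code point of the same value. The function
names are hypothetical and only serve this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Returns 1 if all bytes are 7-bit, i.e. the text is valid strict ASCII. */
int sketch_is_strict_ascii(const unsigned char *text, size_t len)
{
  size_t i;

  for (i = 0; i < len; i++)
    if (text[i] >= 0x80)
      return 0;
  return 1;
}

/* Converts ISO-8859-1 bytes to NUL-terminated UTF-8.
 * Returns the number of bytes written without the NUL,
 * or (size_t) -1 if the output buffer is too small. */
size_t sketch_latin1_to_utf8(const unsigned char *in, size_t len,
                             unsigned char *out, size_t out_size)
{
  size_t i, o = 0;

  assert(in != NULL && out != NULL);
  if (out_size == 0)
    return (size_t) -1;
  for (i = 0; i < len; i++) {
    unsigned char c = in[i];

    if (c < 0x80) {
      /* 7-bit byte: copied unchanged, same as with ASCII */
      if (o + 1 >= out_size)
        return (size_t) -1;
      out[o++] = c;
    } else {
      /* 8-bit byte: code point U+0080..U+00FF, two UTF-8 bytes */
      if (o + 2 >= out_size)
        return (size_t) -1;
      out[o++] = (unsigned char) (0xC0 | (c >> 6));
      out[o++] = (unsigned char) (0x80 | (c & 0x3F));
    }
  }
  out[o] = 0;
  return o;
}
```

For pure 7-bit input the output is identical to the input, so a correctly
labeled ASCII disc is not affected. Bytes 0x80 to 0x9F come out as C1
control characters. That is the range where a text which was actually
written in CP1252 would still look wrong.
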

(CP1252 was an example of what happens to my theories if not all relevant
 facts have made their way into my mind.)


Have a nice day:)

Thomas



