guile-commits
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[Guile-commits] GNU Guile branch, master, updated. release_1-9-14-172-ge


From: Ludovic Courtès
Subject: [Guile-commits] GNU Guile branch, master, updated. release_1-9-14-172-geed98cb
Date: Wed, 02 Feb 2011 17:34:46 +0000

This is an automated email from the git hooks/post-receive script. It was
generated because a ref change was pushed to the repository containing
the project "GNU Guile".

http://git.savannah.gnu.org/cgit/guile.git/commit/?id=eed98cbc9274bdcd8aba7261325f658e00fa3663

The branch, master has been updated
       via  eed98cbc9274bdcd8aba7261325f658e00fa3663 (commit)
       via  6851d3be80bef46993e192b0f4ea708248f97921 (commit)
       via  7174bc08ddd6cc64f2ea321910042cf2233d586a (commit)
       via  b1e76e8f2c8997f011ca3a68ddfa2e448867108c (commit)
       via  b8fff11ed9bfdfe96dc86d1db97ab58286b5f8b5 (commit)
       via  c62da8f891569aa717f4f8557c7e87650ec92fb1 (commit)
       via  d6cf96974ea21b2f0deb1908baefb3dd2b704331 (commit)
       via  2d5ef30918df6534d84cf6c93d8a640fd0d5387d (commit)
      from  e1b6a3f150209ddc14e0bcf7ed770823978518a5 (commit)

Those revisions listed above that are new to this repository have
not appeared on any other notification email; so we list those
revisions in full, below.

- Log -----------------------------------------------------------------
commit eed98cbc9274bdcd8aba7261325f658e00fa3663
Author: Ludovic Courtès <address@hidden>
Date:   Wed Feb 2 18:00:49 2011 +0100

    R6RS: Have `put-char', `put-string', etc. raise an `&i/o-encoding-error'.
    
    * module/rnrs/io/ports.scm (&i/o-encoding): New error condition type.
      (with-i/o-encoding-error): New macro.
      (put-char, put-datum, put-string): Use it.
    
    * test-suite/tests/r6rs-ports.test ("8.2.6  Input and output
      ports")["transcoded-port, output [error handling mode = raise]"]: New
      test.

commit 6851d3be80bef46993e192b0f4ea708248f97921
Author: Ludovic Courtès <address@hidden>
Date:   Wed Feb 2 17:38:03 2011 +0100

    Change `scm_encoding_error' to pass the port and faulty character.
    
    * libguile/strings.c (scm_encoding_error): Remove the `from', `to', and
      `string_or_bv' parameters; add `port' and `chr'.
      (scm_to_stringn): Update accordingly.
    
    * libguile/strings.h (scm_encoding_error): Update accordingly.
    
    * libguile/ports.c (scm_ungetc): Update accordingly.
    
    * libguile/print.c (iprin1, scm_write_char): Update accordingly.
    
    * test-suite/tests/encoding-escapes.test ("display output
      errors")["ultima", "Rashomon"]: Check the arguments of
      `encoding-error'.
      ["tekniko"]: New test.
    
    * test-suite/tests/ports.test ("string ports")["wrong encoding"]: Adjust
      to new `encoding-error' arguments.

commit 7174bc08ddd6cc64f2ea321910042cf2233d586a
Author: Ludovic Courtès <address@hidden>
Date:   Wed Feb 2 17:33:12 2011 +0100

    Upon port encoding error, always write as much as possible.
    
    * libguile/print.c (display_string): Upon error, always write the
      OUTPUT_LEN bytes of output, regardless of the conversion strategy.

commit b1e76e8f2c8997f011ca3a68ddfa2e448867108c
Author: Ludovic Courtès <address@hidden>
Date:   Wed Feb 2 16:19:48 2011 +0100

    R6RS: Have `get-char', `get-line', etc. raise an `&i/o-decoding-error'.
    
    * module/rnrs/io/ports.scm (&i/o-decoding): New error condition type.
      (with-i/o-decoding-error): New macro.
      (get-char, get-datum, get-line, get-string-all, lookahead-char): Use
      it.
    
    * test-suite/tests/r6rs-ports.test ("8.2.6  Input and output
      ports")["transcoded-port [error handling mode = raise]"]: Use `guard'
      and `i/o-decoding-error?'.

commit b8fff11ed9bfdfe96dc86d1db97ab58286b5f8b5
Author: Ludovic Courtès <address@hidden>
Date:   Wed Feb 2 16:17:48 2011 +0100

    Avoid circular dependency between (rnrs base) and (rnrs exceptions).
    
    * module/rnrs/base.scm (raise): Define as a macro instead of a
      procedure.

commit c62da8f891569aa717f4f8557c7e87650ec92fb1
Author: Ludovic Courtès <address@hidden>
Date:   Wed Feb 2 15:52:56 2011 +0100

    Have `read-char' & co. throw to `decoding-error'.
    
    * libguile/ports.c (scm_read_char): Mention `decoding-error' in the
      docstring.
      (get_codepoint): Change to return an error code; add `codepoint'
      output parameter.  Don't raise an error from here.
      (scm_getc): Raise an error with `scm_decoding_error' if
      `get_codepoint' returns an error.
      (scm_peek_char): Likewise.  Update docstring.
    
    * libguile/strings.c (scm_decoding_error_key): New variable.
      (scm_decoding_error): New function.
      (scm_from_stringn): Use `scm_decoding_error' instead of
      `scm_encoding_error'.
    
    * libguile/strings.h (scm_decoding_error): New declaration.
    
    * test-suite/tests/ports.test ("string ports")["read-char, wrong
      encoding, error"]: Change to expect `decoding-error'.  Make sure PORT
      points past the error.
      ["read-char, wrong encoding, escape"]: Likewise.
      ["peek-char, wrong encoding, error"]: New test.
    
    * test-suite/tests/r6rs-ports.test ("7.2.11 Binary
      Output")["put-bytevector with wrong-encoding string port"]: Change to
      expect `decoding-error'.
      ("8.2.6  Input and output ports")["transcoded-port [error handling
      mode = raise]"]: Likewise.
    
    * test-suite/tests/rdelim.test ("read-line")["decoding error", "decoding
      error, substitute"]: New tests.
    
    * doc/ref/api-io.texi (Reading): Update documentation of `read-char' and
      `peek-char'.
      (Line/Delimited): Update documentation of `read-line'.

commit d6cf96974ea21b2f0deb1908baefb3dd2b704331
Author: Ludovic Courtès <address@hidden>
Date:   Wed Feb 2 15:13:07 2011 +0100

    Use `#ifdef HAVE_...', not `#if'.
    
    * test-suite/standalone/test-round.c (test_scm_c_round): Use `#ifdef
      HAVE_FESETROUND', not `#if'.

commit 2d5ef30918df6534d84cf6c93d8a640fd0d5387d
Author: Ludovic Courtès <address@hidden>
Date:   Wed Feb 2 15:12:26 2011 +0100

    Fix typo.
    
    * libguile/srfi-1.c (scm_srfi1_concatenate_x): Fix `FUNC_NAME'.

-----------------------------------------------------------------------

Summary of changes:
 doc/ref/api-io.texi                    |   16 ++++-
 libguile/ports.c                       |  124 +++++++++++++++++---------------
 libguile/print.c                       |   26 +++----
 libguile/srfi-1.c                      |    6 +-
 libguile/strings.c                     |   49 +++++++++----
 libguile/strings.h                     |    6 +-
 module/rnrs/base.scm                   |    9 ++-
 module/rnrs/io/ports.scm               |   82 +++++++++++++++++-----
 test-suite/standalone/test-round.c     |    4 +-
 test-suite/tests/encoding-escapes.test |   50 ++++++++++---
 test-suite/tests/ports.test            |   87 +++++++++++++++++-----
 test-suite/tests/r6rs-ports.test       |   35 ++++++----
 test-suite/tests/rdelim.test           |   25 ++++++-
 13 files changed, 357 insertions(+), 162 deletions(-)

diff --git a/doc/ref/api-io.texi b/doc/ref/api-io.texi
index 47dc8fc..e02daeb 100644
--- a/doc/ref/api-io.texi
+++ b/doc/ref/api-io.texi
@@ -1,7 +1,7 @@
 @c -*-texinfo-*-
 @c This is part of the GNU Guile Reference Manual.
address@hidden Copyright (C)  1996, 1997, 2000, 2001, 2002, 2003, 2004, 2007, 
2009, 2010
address@hidden   Free Software Foundation, Inc.
address@hidden Copyright (C)  1996, 1997, 2000, 2001, 2002, 2003, 2004, 2007, 
2009,
address@hidden   2010, 2011  Free Software Foundation, Inc.
 @c See the file guile.texi for copying conditions.
 
 @node Input and Output
@@ -210,6 +210,10 @@ interactive port that has no ready characters.
 Return the next character available from @var{port}, updating
 @var{port} to point to the following character.  If no more
 characters are available, the end-of-file object is returned.
+
+When @var{port}'s data cannot be decoded according to its
+character encoding, a @code{decoding-error} is raised and
address@hidden points past the erroneous byte sequence.
 @end deffn
 
 @deftypefn {C Function} size_t scm_c_read (SCM port, void *buffer, size_t size)
@@ -238,6 +242,11 @@ return the value returned by the preceding call to
 @code{peek-char}.  In particular, a call to @code{peek-char} on
 an interactive port will hang waiting for input whenever a call
 to @code{read-char} would have hung.
+
+As for @code{read-char}, a @code{decoding-error} may be raised
+if such a situation occurs.  However, unlike with @code{read-char},
address@hidden still points at the beginning of the erroneous byte
+sequence when the error is raised.
 @end deffn
 
 @deffn {Scheme Procedure} unread-char cobj [port]
@@ -513,6 +522,9 @@ Push the terminating delimiter (if any) back on to the port.
 Return a pair containing the string read from the port and the
 terminating delimiter or end-of-file object.
 @end table
+
+Like @code{read-char}, this procedure can throw to @code{decoding-error}
+(@pxref{Reading, @code{read-char}}).
 @end deffn
 
 @c begin (scm-doc-string "rdelim.scm" "read-line!")
diff --git a/libguile/ports.c b/libguile/ports.c
index 1cfcba0..270071c 100644
--- a/libguile/ports.c
+++ b/libguile/ports.c
@@ -1029,7 +1029,11 @@ SCM_DEFINE (scm_read_char, "read-char", 0, 1, 0,
            (SCM port),
            "Return the next character available from @var{port}, updating\n"
            "@var{port} to point to the following character.  If no more\n"
-           "characters are available, the end-of-file object is returned.")
+           "characters are available, the end-of-file object is returned.\n"
+           "\n"
+           "When @var{port}'s data cannot be decoded according to its\n"
+           "character encoding, a @code{decoding-error} is raised and\n"
+           "@var{port} points past the erroneous byte sequence.\n")
 #define FUNC_NAME s_scm_read_char
 {
   scm_t_wchar c;
@@ -1108,17 +1112,16 @@ utf8_to_codepoint (const scm_t_uint8 *utf8_buf, size_t 
size)
   return codepoint;
 }
 
-/* Read a codepoint from PORT and return it.  Fill BUF with the byte
-   representation of the codepoint in PORT's encoding, and set *LEN to
-   the length in bytes of that representation.  Raise an error on
-   failure.  */
-static scm_t_wchar
-get_codepoint (SCM port, char buf[SCM_MBCHAR_BUF_SIZE], size_t *len)
-#define FUNC_NAME "scm_getc"
+/* Read a codepoint from PORT and return it in *CODEPOINT.  Fill BUF
+   with the byte representation of the codepoint in PORT's encoding, and
+   set *LEN to the length in bytes of that representation.  Return 0 on
+   success and an errno value on error.  */
+static int
+get_codepoint (SCM port, scm_t_wchar *codepoint,
+              char buf[SCM_MBCHAR_BUF_SIZE], size_t *len)
 {
   int err, byte_read;
   size_t bytes_consumed, output_size;
-  scm_t_wchar codepoint;
   char *output;
   scm_t_uint8 utf8_buf[SCM_MBCHAR_BUF_SIZE];
   scm_t_port *pt = SCM_PTAB_ENTRY (port);
@@ -1140,7 +1143,11 @@ get_codepoint (SCM port, char buf[SCM_MBCHAR_BUF_SIZE], 
size_t *len)
       if (byte_read == EOF)
        {
          if (bytes_consumed == 0)
-           return (scm_t_wchar) EOF;
+           {
+             *codepoint = (scm_t_wchar) EOF;
+             *len = 0;
+             return 0;
+           }
          else
            continue;
        }
@@ -1164,53 +1171,52 @@ get_codepoint (SCM port, char buf[SCM_MBCHAR_BUF_SIZE], 
size_t *len)
        output_size = sizeof (utf8_buf) - output_left;
     }
 
-  if (err != 0)
+  if (SCM_UNLIKELY (err != 0))
     {
       /* Reset the `iconv' state.  */
       iconv (pt->input_cd, NULL, NULL, NULL, NULL);
 
       if (pt->ilseq_handler == SCM_ICONVEH_QUESTION_MARK)
-       codepoint = '?';
-      else
-       /* Fail when the strategy is SCM_ICONVEH_ERROR or
-          SCM_ICONVEH_ESCAPE_SEQUENCE (the latter doesn't make sense
-          for input encoding errors.)  */
-       goto failure;
+       {
+         *codepoint = '?';
+         err = 0;
+       }
+
+      /* Fail when the strategy is SCM_ICONVEH_ERROR or
+        SCM_ICONVEH_ESCAPE_SEQUENCE (the latter doesn't make sense for
+        input encoding errors.)  */
     }
   else
     /* Convert the UTF8_BUF sequence to a Unicode code point.  */
-    codepoint = utf8_to_codepoint (utf8_buf, output_size);
+    *codepoint = utf8_to_codepoint (utf8_buf, output_size);
 
-  update_port_lf (codepoint, port);
+  if (SCM_LIKELY (err == 0))
+    update_port_lf (*codepoint, port);
 
   *len = bytes_consumed;
 
-  return codepoint;
-
- failure:
-  {
-    SCM bv;
-
-    bv = scm_c_make_bytevector (bytes_consumed);
-    memcpy (SCM_BYTEVECTOR_CONTENTS (bv), buf, bytes_consumed);
-    scm_encoding_error (FUNC_NAME, err, "input decoding error",
-                       pt->encoding, "UTF-8", bv);
-  }
-
-  /* Never gets here.  */
-  return 0;
+  return err;
 }
-#undef FUNC_NAME
 
 /* Read a codepoint from PORT and return it.  */
 scm_t_wchar
 scm_getc (SCM port)
+#define FUNC_NAME "scm_getc"
 {
+  int err;
   size_t len;
+  scm_t_wchar codepoint;
   char buf[SCM_MBCHAR_BUF_SIZE];
 
-  return get_codepoint (port, buf, &len);
+  err = get_codepoint (port, &codepoint, buf, &len);
+  if (SCM_UNLIKELY (err != 0))
+    /* At this point PORT should point past the invalid encoding, as per
+       R6RS-lib Section 8.2.4.  */
+    scm_decoding_error (FUNC_NAME, err, "input decoding error", port);
+
+  return codepoint;
 }
+#undef FUNC_NAME
 
 /* this should only be called when the read buffer is empty.  it
    tries to refill the read buffer.  it returns the first char from
@@ -1565,15 +1571,9 @@ scm_ungetc (scm_t_wchar c, SCM port)
                                 result_buf, &len);
 
   if (SCM_UNLIKELY (result == NULL || len == 0))
-    {
-      SCM chr;
-
-      chr = scm_integer_to_char (scm_from_uint32 (c));
-      scm_encoding_error (FUNC_NAME, errno,
-                         "conversion to port encoding failed",
-                         "UTF-32", encoding,
-                         scm_string (scm_list_1 (chr)));
-    }
+    scm_encoding_error (FUNC_NAME, errno,
+                       "conversion to port encoding failed",
+                       SCM_BOOL_F, SCM_MAKE_CHAR (c));
 
   for (i = len - 1; i >= 0; i--)
     scm_unget_byte (result[i], port);
@@ -1623,13 +1623,19 @@ SCM_DEFINE (scm_peek_char, "peek-char", 0, 1, 0,
            "return the value returned by the preceding call to\n"
            "@code{peek-char}.  In particular, a call to @code{peek-char} on\n"
            "an interactive port will hang waiting for input whenever a call\n"
-           "to @code{read-char} would have hung.")
+           "to @code{read-char} would have hung.\n"
+           "\n"
+           "As for @code{read-char}, a @code{decoding-error} may be raised\n"
+           "if such a situation occurs.  However, unlike with 
@code{read-char},\n"
+           "@var{port} still points at the beginning of the erroneous byte\n"
+           "sequence when the error is raised.\n")
 #define FUNC_NAME s_scm_peek_char
 {
+  int err;
   SCM result;
   scm_t_wchar c;
   char bytes[SCM_MBCHAR_BUF_SIZE];
-  long column, line;
+  long column, line, i;
   size_t len;
 
   if (SCM_UNBNDP (port))
@@ -1639,21 +1645,25 @@ SCM_DEFINE (scm_peek_char, "peek-char", 0, 1, 0,
   column = SCM_COL (port);
   line = SCM_LINUM (port);
 
-  c = get_codepoint (port, bytes, &len);
-  if (c == EOF)
-    result = SCM_EOF_VAL;
-  else
-    {
-      long i;
+  err = get_codepoint (port, &c, bytes, &len);
+
+  for (i = len - 1; i >= 0; i--)
+    scm_unget_byte (bytes[i], port);
 
-      result = SCM_MAKE_CHAR (c);
+  SCM_COL (port) = column;
+  SCM_LINUM (port) = line;
 
-      for (i = len - 1; i >= 0; i--)
-       scm_unget_byte (bytes[i], port);
+  if (SCM_UNLIKELY (err != 0))
+    {
+      scm_decoding_error (FUNC_NAME, err, "input decoding error", port);
 
-      SCM_COL (port) = column;
-      SCM_LINUM (port) = line;
+      /* Shouldn't happen since `catch' always aborts to prompt.  */
+      result = SCM_BOOL_F;
     }
+  else if (c == EOF)
+    result = SCM_EOF_VAL;
+  else
+    result = SCM_MAKE_CHAR (c);
 
   return result;
 }
diff --git a/libguile/print.c b/libguile/print.c
index 679327a..59b1093 100644
--- a/libguile/print.c
+++ b/libguile/print.c
@@ -484,8 +484,7 @@ iprin1 (SCM exp, SCM port, scm_print_state *pstate)
                                      scm_i_get_conversion_strategy (port)))
                scm_encoding_error (__func__, errno,
                                    "cannot convert to output locale",
-                                   "UTF-32", scm_i_get_port_encoding (port),
-                                   scm_string (scm_list_1 (exp)));
+                                   port, exp);
            }
        }
       else if (SCM_IFLAGP (exp)
@@ -570,11 +569,9 @@ iprin1 (SCM exp, SCM port, scm_print_state *pstate)
                                        len, port,
                                        scm_i_get_conversion_strategy (port));
              if (SCM_UNLIKELY (printed < len))
-               /* FIXME: Provide the error location.  */
                scm_encoding_error (__func__, errno,
                                    "cannot convert to output locale",
-                                   "UTF-32", scm_i_get_port_encoding (port),
-                                   exp);
+                                   port, scm_c_string_ref (exp, printed));
            }
 
           scm_remember_upto_here_1 (exp);
@@ -868,20 +865,20 @@ display_string (const void *str, int narrow_p,
          /* Reset the `iconv' state.  */
          iconv (pt->output_cd, NULL, NULL, NULL, NULL);
 
+         /* Print the OUTPUT_LEN bytes successfully converted.  */
+         scm_lfwrite (encoded_output, output_len, port);
+
+         /* See how many input codepoints these OUTPUT_LEN bytes
+            corresponds to.  */
+         codepoints_read = offsets[input - utf8_buf] - printed;
+         printed += codepoints_read;
+
          if (errno == EILSEQ &&
              strategy != SCM_FAILED_CONVERSION_ERROR)
            {
              /* Conversion failed somewhere in INPUT and we want to
                 escape or substitute the offending input character.  */
 
-             /* Print the OUTPUT_LEN bytes successfully converted.  */
-             scm_lfwrite (encoded_output, output_len, port);
-
-             /* See how many input codepoints these OUTPUT_LEN bytes
-                corresponds to.  */
-             codepoints_read = offsets[input - utf8_buf] - printed;
-             printed += codepoints_read;
-
              if (strategy == SCM_FAILED_CONVERSION_ESCAPE_SEQUENCE)
                {
                  scm_t_wchar ch;
@@ -1388,8 +1385,7 @@ SCM_DEFINE (scm_write_char, "write-char", 1, 1, 0,
                          scm_i_get_conversion_strategy (port)))
     scm_encoding_error (__func__, errno,
                        "cannot convert to output locale",
-                       "UTF-32", scm_i_get_port_encoding (port),
-                       scm_string (scm_list_1 (chr)));
+                       port, chr);
 
   return SCM_UNSPECIFIED;
 }
diff --git a/libguile/srfi-1.c b/libguile/srfi-1.c
index e2a9c93..7931138 100644
--- a/libguile/srfi-1.c
+++ b/libguile/srfi-1.c
@@ -1,7 +1,7 @@
 /* srfi-1.c --- SRFI-1 procedures for Guile
  *
- *     Copyright (C) 1995, 1996, 1997, 2000, 2001, 2002, 2003, 2005, 2006, 
2008, 2009, 2010
- *     Free Software Foundation, Inc.
+ * Copyright (C) 1995, 1996, 1997, 2000, 2001, 2002, 2003, 2005, 2006,
+ *   2008, 2009, 2010, 2011 Free Software Foundation, Inc.
  *
  * This library is free software; you can redistribute it and/or
  * modify it under the terms of the GNU Lesser General Public License
@@ -188,7 +188,7 @@ SCM_DEFINE (scm_srfi1_concatenate_x, "concatenate!", 1, 0, 
0,
            "have a limit on the number of arguments a function takes, which\n"
            "the @code{apply} might exceed.  In Guile there is no such\n"
            "limit.")
-#define FUNC_NAME s_scm_srfi1_concatenate
+#define FUNC_NAME s_scm_srfi1_concatenate_x
 {
   SCM_VALIDATE_LIST (SCM_ARG1, lstlst);
   return scm_append_x (lstlst);
diff --git a/libguile/strings.c b/libguile/strings.c
index 41998a9..b13cb78 100644
--- a/libguile/strings.c
+++ b/libguile/strings.c
@@ -1407,27 +1407,41 @@ SCM_DEFINE (scm_string_append, "string-append", 0, 0, 1,
 
 
 
-/* Conversion to/from other encodings.  */
+/* Charset conversion error handling.  */
 
 SCM_SYMBOL (scm_encoding_error_key, "encoding-error");
+SCM_SYMBOL (scm_decoding_error_key, "decoding-error");
+
+/* Raise an exception informing that character CHR could not be written
+   to PORT in its current encoding.  */
 void
 scm_encoding_error (const char *subr, int err, const char *message,
-                   const char *from, const char *to, SCM string_or_bv)
+                   SCM port, SCM chr)
 {
-  /* Raise an exception that conveys all the information needed to debug the
-     problem.  Only perform locale conversions that are safe; in particular,
-     don't try to display STRING_OR_BV when it's a string since converting it 
to
-     the output locale may fail.  */
   scm_throw (scm_encoding_error_key,
             scm_list_n (scm_from_locale_string (subr),
                         scm_from_locale_string (message),
                         scm_from_int (err),
-                        scm_from_locale_string (from),
-                        scm_from_locale_string (to),
-                        string_or_bv,
+                        port, chr,
                         SCM_UNDEFINED));
 }
 
+/* Raise an exception informing of an encoding error on PORT.  This
+   means that a character could not be written in PORT's encoding.  */
+void
+scm_decoding_error (const char *subr, int err, const char *message, SCM port)
+{
+  scm_throw (scm_decoding_error_key,
+            scm_list_n (scm_from_locale_string (subr),
+                        scm_from_locale_string (message),
+                        scm_from_int (err),
+                        port,
+                        SCM_UNDEFINED));
+}
+
+
+/* String conversion to/from C.  */
+
 SCM
 scm_from_stringn (const char *str, size_t len, const char *encoding,
                   scm_t_string_failed_conversion_handler handler)
@@ -1473,9 +1487,8 @@ scm_from_stringn (const char *str, size_t len, const char 
*encoding,
       memcpy (buf, str, len);
       bv = scm_c_take_bytevector (buf, len);
 
-      scm_encoding_error (__func__, errno,
-                         "input locale conversion error",
-                         encoding, "UTF-32", bv);
+      scm_decoding_error (__func__, errno,
+                         "input locale conversion error", bv);
     }
 
   i = 0;
@@ -1921,8 +1934,10 @@ scm_to_stringn (SCM str, size_t *lenp, const char 
*encoding,
 
       if (ret != 0)
         scm_encoding_error (__func__, errno,
-                           "cannot convert to output locale",
-                           "ISO-8859-1", enc, str);
+                           "cannot convert narrow string to output locale",
+                           SCM_BOOL_F,
+                           /* FIXME: Faulty character unknown.  */
+                           SCM_BOOL_F);
     }
   else
     {
@@ -1934,8 +1949,10 @@ scm_to_stringn (SCM str, size_t *lenp, const char 
*encoding,
                                   NULL, &len);
       if (buf == NULL)
         scm_encoding_error (__func__, errno,
-                           "cannot convert to output locale",
-                           "UTF-32", enc, str);
+                           "cannot convert wide string to output locale",
+                           SCM_BOOL_F,
+                           /* FIXME: Faulty character unknown.  */
+                           SCM_BOOL_F);
     }
   if (handler == SCM_FAILED_CONVERSION_ESCAPE_SEQUENCE)
     {
diff --git a/libguile/strings.h b/libguile/strings.h
index b9e901b..ed3a067 100644
--- a/libguile/strings.h
+++ b/libguile/strings.h
@@ -211,9 +211,9 @@ SCM_INTERNAL int scm_i_try_narrow_string (SCM str);
 SCM_INTERNAL SCM scm_i_symbol_substring (SCM sym, size_t start, size_t end);
 SCM_INTERNAL scm_t_wchar scm_i_symbol_ref (SCM sym, size_t x);
 SCM_INTERNAL void scm_encoding_error (const char *subr, int err,
-                                     const char *message,
-                                     const char *from, const char *to,
-                                     SCM string_or_bv);
+                                     const char *message, SCM port, SCM chr);
+SCM_INTERNAL void scm_decoding_error (const char *subr, int err,
+                                     const char *message, SCM port);
 
 /* internal utility functions. */
 
diff --git a/module/rnrs/base.scm b/module/rnrs/base.scm
index c81ded1..4c9c51b 100644
--- a/module/rnrs/base.scm
+++ b/module/rnrs/base.scm
@@ -123,8 +123,13 @@
  (define (vector-map proc . vecs)
    (list->vector (apply map (cons proc (map vector->list vecs)))))
 
- (define raise
-   (@ (rnrs exceptions) raise))
+ (define-syntax raise
+   ;; Resolve the real `raise' lazily to avoid a circular dependency
+   ;; between `(rnrs base)' and `(rnrs exceptions)'.
+   (syntax-rules ()
+     ((_ c)
+      ((@ (rnrs exceptions) raise) c))))
+
  (define condition
    (@ (rnrs conditions) condition))
  (define make-error
diff --git a/module/rnrs/io/ports.scm b/module/rnrs/io/ports.scm
index 15d62bd..a5815c8 100644
--- a/module/rnrs/io/ports.scm
+++ b/module/rnrs/io/ports.scm
@@ -93,7 +93,11 @@
           &i/o-file-does-not-exist i/o-file-does-not-exist-error?
           make-i/o-file-does-not-exist-error
           &i/o-port i/o-port-error? make-i/o-port-error
-          i/o-error-port)
+          i/o-error-port
+          &i/o-decoding-error i/o-decoding-error?
+          make-i/o-decoding-error
+          &i/o-encoding-error i/o-encoding-error?
+          make-i/o-encoding-error i/o-encoding-error-char)
   (import (only (rnrs base) assertion-violation)
           (rnrs enums)
           (rnrs records syntactic)
@@ -314,39 +318,83 @@ return the characters accumulated in that port."
 (define (flush-output-port port)
   (force-output port))
 
+
+;;;
+;;; Textual output.
+;;;
+
+(define-condition-type &i/o-encoding &i/o-port
+  make-i/o-encoding-error i/o-encoding-error?
+  (char i/o-encoding-error-char))
+
+(define-syntax with-i/o-encoding-error
+  (syntax-rules ()
+    "Convert Guile throws to `encoding-error' to `&i/o-encoding-error'."
+    ((_ body ...)
+     ;; XXX: This is heavyweight for small functions like `put-char'.
+     (with-throw-handler 'encoding-error
+       (lambda ()
+         (begin body ...))
+       (lambda (key subr message errno port chr)
+         (raise (make-i/o-encoding-error port chr)))))))
+
 (define (put-char port char)
-  (write-char char port))
+  (with-i/o-encoding-error (write-char char port)))
 
 (define (put-datum port datum)
-  (write datum port))
+  (with-i/o-encoding-error (write datum port)))
 
 (define* (put-string port s #:optional start count)
-  (cond ((not (string? s))
-         (assertion-violation 'put-string "expected string" s))
-        ((and start count)
-         (display (substring/shared s start (+ start count)) port))
-        (start
-         (display (substring/shared s start (string-length s)) port))
-        (else
-         (display s port))))
+  (with-i/o-encoding-error
+   (cond ((not (string? s))
+          (assertion-violation 'put-string "expected string" s))
+         ((and start count)
+          (display (substring/shared s start (+ start count)) port))
+         (start
+          (display (substring/shared s start (string-length s)) port))
+         (else
+          (display s port)))))
+
+
+;;;
+;;; Textual input.
+;;;
+
+(define-condition-type &i/o-decoding &i/o-port
+  make-i/o-decoding-error i/o-decoding-error?)
+
+(define-syntax with-i/o-decoding-error
+  (syntax-rules ()
+    "Convert Guile throws to `decoding-error' to `&i/o-decoding-error'."
+    ((_ body ...)
+     ;; XXX: This is heavyweight for small functions like `get-char' and
+     ;; `lookahead-char'.
+     (with-throw-handler 'decoding-error
+       (lambda ()
+         (begin body ...))
+       (lambda (key subr message errno port)
+         (raise (make-i/o-decoding-error port)))))))
 
 (define (get-char port)
-  (read-char port))
+  (with-i/o-decoding-error (read-char port)))
 
 (define (get-datum port)
-  (read port))
+  (with-i/o-decoding-error (read port)))
 
 (define (get-line port)
-  (read-line port 'trim))
+  (with-i/o-decoding-error (read-line port 'trim)))
 
 (define (get-string-all port)
-  (read-delimited "" port 'concat))
+  (with-i/o-decoding-error (read-delimited "" port 'concat)))
 
 (define (lookahead-char port)
-  (peek-char port))
-
+  (with-i/o-decoding-error (peek-char port)))
 
 
+;;;
+;;; Standard ports.
+;;;
+
 (define (standard-input-port)
   (dup->inport 0))
 
diff --git a/test-suite/standalone/test-round.c 
b/test-suite/standalone/test-round.c
index 862e7d0..f1458af 100644
--- a/test-suite/standalone/test-round.c
+++ b/test-suite/standalone/test-round.c
@@ -1,4 +1,4 @@
-/* Copyright (C) 2004, 2006, 2008, 2009 Free Software Foundation, Inc.
+/* Copyright (C) 2004, 2006, 2008, 2009, 2011 Free Software Foundation, Inc.
  *
  * This library is free software; you can redistribute it and/or
  * modify it under the terms of the GNU Lesser General Public License
@@ -71,7 +71,7 @@ test_scm_c_round ()
          table.  */
       if (i != 0)
         {
-#if HAVE_FESETROUND
+#ifdef HAVE_FESETROUND
           fesetround (modes[i]);
 #endif
         }
diff --git a/test-suite/tests/encoding-escapes.test 
b/test-suite/tests/encoding-escapes.test
index ee3fba1..bfffcd2 100644
--- a/test-suite/tests/encoding-escapes.test
+++ b/test-suite/tests/encoding-escapes.test
@@ -67,19 +67,45 @@
 
 (with-test-prefix "display output errors"
 
-  (pass-if-exception "ultima"
-                    exception:encoding-error
-                    (let ((pt (open-output-string)))
-                      (set-port-encoding! pt "ASCII")
-                      (set-port-conversion-strategy! pt 'error)
-                      (display s1 pt)))
-
-  (pass-if-exception "Rashomon"
-                    exception:encoding-error
-                    (let ((pt (open-output-string)))
-                      (set-port-encoding! pt "ASCII")
-                      (set-port-conversion-strategy! pt 'error)
-                      (display s4 pt))))
+  (pass-if "ultima"
+    (let ((pt (open-output-string)))
+      (set-port-encoding! pt "ASCII")
+      (set-port-conversion-strategy! pt 'error)
+      (catch 'encoding-error
+        (lambda ()
+          (display s1 pt)
+          #f)
+        (lambda (key subr message errno port chr)
+          (and (eq? port pt)
+               (char=? chr (string-ref s1 0))
+               (string=? (get-output-string pt) ""))))))
+
+  (pass-if "Rashomon"
+    (let ((pt (open-output-string)))
+      (set-port-encoding! pt "ASCII")
+      (set-port-conversion-strategy! pt 'error)
+      (catch 'encoding-error
+        (lambda ()
+          (display s4 pt)
+          #f)
+        (lambda (key subr message errno port chr)
+          (and (eq? port pt)
+               (char=? chr (string-ref s4 0))
+               (string=? (get-output-string pt) ""))))))
+
+  (pass-if "tekniko"
+    (let ((pt (open-output-string)))
+      (set-port-encoding! pt "ASCII")
+      (set-port-conversion-strategy! pt 'error)
+      (catch 'encoding-error
+        (lambda ()
+          ;; This time encoding should fail on the 3rd character.
+          (display "teĥniko" pt)
+          #f)
+        (lambda (key subr message errno port chr)
+          (and (eq? port pt)
+               (char=? chr #\Ä¥)
+               (string=? "te" (get-output-string pt))))))))
 
 ;; Check that questions marks or substitutions appear when the conversion
 ;; mode is substitute
diff --git a/test-suite/tests/ports.test b/test-suite/tests/ports.test
index 8d3f672..3104ca1 100644
--- a/test-suite/tests/ports.test
+++ b/test-suite/tests/ports.test
@@ -420,10 +420,8 @@
             (with-output-to-string
               (lambda ()
                 (display str)))))
-        (lambda (key subr message errno from to faulty-str)
-          (and (eq? faulty-str str)
-               (string=? from "UTF-32")
-               (string=? to "ISO-8859-1")
+        (lambda (key subr message errno port chr)
+          (and (eq? chr #\ĉ)
                (string? (strerror errno)))))))
 
   (pass-if "wrong encoding, substitute"
@@ -464,29 +462,80 @@
            (= (port-line p) 0)
            (= (port-column p) 0))))
 
-  (pass-if-exception "read-char, wrong encoding, error"
-    exception:encoding-error
+  (pass-if "read-char, wrong encoding, error"
     (let ((p (with-fluids ((%default-port-encoding "UTF-8"))
-               (open-bytevector-input-port #vu8(255 1 2 3)))))
-      (set-port-conversion-strategy! p 'error)
-      (read-char p)
-      #t))
-
-  (pass-if-exception "read-char, wrong encoding, escape"
-    exception:encoding-error
-    ;; `escape' should behave like `error'.
+               (open-bytevector-input-port #vu8(255 65 66 67)))))
+      (catch 'decoding-error
+        (lambda ()
+          (set-port-conversion-strategy! p 'error)
+          (read-char p)
+          #f)
+        (lambda (key subr message err port)
+          (and (eq? port p)
+
+               ;; PORT should point past the error.
+               (equal? '(#\A #\B #\C)
+                       (list (read-char port)
+                             (read-char port)
+                             (read-char port)))
+
+               (eof-object? (read-char port)))))))
+
+  (pass-if "read-char, wrong encoding, escape"
+    ;; `escape' should behave exactly like `error'.
     (let ((p (with-fluids ((%default-port-encoding "UTF-8"))
-               (open-bytevector-input-port #vu8(255 1 2 3)))))
-      (set-port-conversion-strategy! p 'escape)
-      (read-char p)
-      #t))
+               (open-bytevector-input-port #vu8(255 65 66 67)))))
+      (catch 'decoding-error
+        (lambda ()
+          (set-port-conversion-strategy! p 'escape)
+          (read-char p)
+          #f)
+        (lambda (key subr message err port)
+          (and (eq? port p)
+
+               ;; PORT should point past the error.
+               (equal? '(#\A #\B #\C)
+                       (list (read-char port)
+                             (read-char port)
+                             (read-char port)))
+
+               (eof-object? (read-char port)))))))
 
   (pass-if "read-char, wrong encoding, substitute"
     (let ((p (with-fluids ((%default-port-encoding "UTF-8"))
                (open-bytevector-input-port #vu8(255 206 187 206 188)))))
       (set-port-conversion-strategy! p 'substitute)
       (equal? (list (read-char p) (read-char p) (read-char p))
-              '(#\? #\λ #\μ)))))
+              '(#\? #\λ #\μ))))
+
+  (pass-if "peek-char, wrong encoding, error"
+    (let-syntax ((decoding-error?
+                  (syntax-rules ()
+                    ((_ port exp)
+                     (catch 'decoding-error
+                       (lambda ()
+                         (pk 'exp exp)
+                         #f)
+                       (lambda (key subr message errno p)
+                         (eq? p port)))))))
+      (let ((p (with-fluids ((%default-port-encoding "UTF-8"))
+                 (open-bytevector-input-port #vu8(255 65 66 67)))))
+        (set-port-conversion-strategy! p 'error)
+
+        ;; `peek-char' should repeatedly raise an error.
+        (and (decoding-error? p (peek-char p))
+             (decoding-error? p (peek-char p))
+             (decoding-error? p (peek-char p))
+
+             ;; Move past the error.
+             (decoding-error? p (read-char p))
+
+             ;; Finish happily.
+             (equal? '(#\A #\B #\C)
+                     (list (read-char p)
+                           (read-char p)
+                           (read-char p)))
+             (eof-object? (read-char p)))))))
 
 (with-test-prefix "call-with-output-string"
 
diff --git a/test-suite/tests/r6rs-ports.test b/test-suite/tests/r6rs-ports.test
index 410e9d1..df056a4 100644
--- a/test-suite/tests/r6rs-ports.test
+++ b/test-suite/tests/r6rs-ports.test
@@ -22,6 +22,7 @@
   #:use-module (srfi srfi-1)
   #:use-module (srfi srfi-11)
   #:use-module (rnrs io ports)
+  #:use-module (rnrs exceptions)
   #:use-module (rnrs bytevectors))
 
 ;;; All these tests assume Guile 1.8's port system, where characters are
@@ -262,16 +263,14 @@
   (pass-if "put-bytevector with wrong-encoding string port"
     (let* ((str "hello, world")
            (bv  (string->utf16 str)))
-      (catch 'encoding-error
+      (catch 'decoding-error
         (lambda ()
           (with-fluids ((%default-port-encoding "UTF-32"))
             (call-with-output-string
               (lambda (port)
                 (put-bytevector port bv)))))
-        (lambda (key subr message errno from to faulty-bv)
-          (and (bytevector=? faulty-bv bv)
-               (string=? to "UTF-32")
-               (string? (strerror errno))))))))
+        (lambda (key subr message errno port)
+          (string? (strerror errno)))))))
 
 
 (with-test-prefix "7.2.7 Input Ports"
@@ -544,21 +543,31 @@
                                 (error-handling-mode raise)))
            (b  (open-bytevector-input-port #vu8(255 2 1)))
            (tp (transcoded-port b t)))
-      ;; FIXME: Should be (guard (c ((i/o-decoding-error? c) #t)) ...).
-      (catch 'encoding-error
-        (lambda ()
-          (get-line tp)
-          #f)
-        (lambda _
-          #t))))
+      (guard (c ((i/o-decoding-error? c)
+                 (eq? (i/o-error-port c) tp)))
+        (get-line tp))))
 
   (pass-if "transcoded-port [error handling mode = replace]"
     (let* ((t  (make-transcoder (utf-8-codec) (native-eol-style)
                                 (error-handling-mode replace)))
            (b  (open-bytevector-input-port #vu8(255 1 2 3 103 110 117)))
            (tp (transcoded-port b t)))
-      (string-suffix? "gnu" (get-line tp)))))
+      (string-suffix? "gnu" (get-line tp))))
+
+  (pass-if "transcoded-port, output [error handling mode = raise]"
+    (let-values (((p get)
+                  (open-bytevector-output-port)))
+      (let* ((t  (make-transcoder (latin-1-codec) (native-eol-style)
+                                  (error-handling-mode raise)))
+             (tp (transcoded-port p t)))
+        (guard (c ((i/o-encoding-error? c)
+                   (and (eq? (i/o-error-port c) tp)
+                        (char=? (i/o-encoding-error-char c) #\λ)
+                        (bytevector=? (get) (string->utf8 "The letter ")))))
+          (put-string tp "The letter λ cannot be represented in Latin-1.")
+          #f)))))
 
 ;;; Local Variables:
 ;;; mode: scheme
+;;; eval: (put 'guard 'scheme-indent-function 1)
 ;;; End:
diff --git a/test-suite/tests/rdelim.test b/test-suite/tests/rdelim.test
index ec7e5b4..dd81dc9 100644
--- a/test-suite/tests/rdelim.test
+++ b/test-suite/tests/rdelim.test
@@ -19,6 +19,7 @@
 
 (define-module (test-suite test-rdelim)
   #:use-module (ice-9 rdelim)
+  #:use-module ((rnrs io ports) #:select (open-bytevector-input-port))
   #:use-module (test-suite lib))
 
 (with-fluids ((%default-port-encoding "UTF-8"))
@@ -63,11 +64,33 @@
                            (read-line p 'split)))
              (eof-object? (read-line p)))))
 
-    (pass-if  "two Greek lines, trim"
+    (pass-if "two Greek lines, trim"
       (let* ((s "λαμβδα\nμυ\n")
              (p (open-input-string s)))
         (and (equal? (string-tokenize s)
                      (list (read-line p) (read-line p)))
+             (eof-object? (read-line p)))))
+
+    (pass-if "decoding error"
+      (let ((p (with-fluids ((%default-port-encoding "UTF-8"))
+                 (open-bytevector-input-port #vu8(65 255 66 67 68)))))
+        (set-port-conversion-strategy! p 'error)
+        (catch 'decoding-error
+          (lambda ()
+            (read-line p)
+            #f)
+          (lambda (key subr message err port)
+            (and (eq? port p)
+
+                 ;; PORT should now point past the error.
+                 (string=? (read-line p) "BCD")
+                 (eof-object? (read-line p)))))))
+
+    (pass-if "decoding error, substitute"
+      (let ((p (with-fluids ((%default-port-encoding "UTF-8"))
+                 (open-bytevector-input-port #vu8(65 255 66 67 68)))))
+        (set-port-conversion-strategy! p 'substitute)
+        (and (string=? (read-line p) "A?BCD")
              (eof-object? (read-line p)))))))
 
 ;;; Local Variables:


hooks/post-receive
-- 
GNU Guile



reply via email to

[Prev in Thread] Current Thread [Next in Thread]