poke-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[PATCH v2] libpoke: Enable octal and hex \-sequence in string literals


From: Mohammad-Reza Nabipoor
Subject: [PATCH v2] libpoke: Enable octal and hex \-sequence in string literals
Date: Tue, 3 Nov 2020 02:38:15 +0330

2020-11-03  Mohammad-Reza Nabipoor  <m.nabipoor@yahoo.com>

        * libpoke/pkl-trans.c (pkl_trans1_ps_string): Add support for octal
        and hexadecimal escape sequences.
        * doc/poke.texi (String Literals): Document new escape sequences.
        * testsuite/poke.pkl/strings-esc-1.pk: New test.
        * testsuite/poke.pkl/strings-esc-2.pk: Likewise.
        * testsuite/poke.pkl/strings-esc-diag-1.pk: Likewise.
        * testsuite/poke.pkl/strings-esc-diag-2.pk: Likewise.
        * testsuite/poke.pkl/strings-esc-diag-3.pk: Likewise.
        * testsuite/poke.pkl/strings-esc-diag-4.pk: Likewise.
        * testsuite/poke.pkl/string-diag-1.pk: `\x` is now a valid sequence;
        use `\z` as an invalid escape sequence.
        * testsuite/Makefile.am (EXTRA_DIST): Add new tests.
---

Hi, Jose.

Thanks for your answers and comments.

Changes from version 1:
  - Added documentation in `poke.texi`.
  - Updateed tests to add more normal characters.
  - Changed interpretation of octal escape sequences.
    Old behavior: `"\400" == " 0"` (space character + 0 character).
        Because 0o400 does not fit inside a byte it treats the last digit
        as a separate character (not part of the octal number).
        This behavior was incompatible with C compilers (gcc emits a warning
        and clang emits an error). I droped this "feature" in favor of an
        error.
    New behavior: `"\400"` (error: octal escape sequence out of range).

Now two more questions :)
  - Is documentation understandable?
  - Do you like the new behavior regarding octal escape sequences?


Regards,
Mohammad-Reza


 ChangeLog                                |  15 +++
 doc/poke.texi                            |  11 ++
 libpoke/pkl-trans.c                      | 126 ++++++++++++++++++-----
 testsuite/Makefile.am                    |   6 ++
 testsuite/poke.pkl/string-diag-1.pk      |   2 +-
 testsuite/poke.pkl/strings-esc-1.pk      |  25 +++++
 testsuite/poke.pkl/strings-esc-2.pk      |  25 +++++
 testsuite/poke.pkl/strings-esc-diag-1.pk |   3 +
 testsuite/poke.pkl/strings-esc-diag-2.pk |   3 +
 testsuite/poke.pkl/strings-esc-diag-3.pk |   5 +
 testsuite/poke.pkl/strings-esc-diag-4.pk |   5 +
 11 files changed, 200 insertions(+), 26 deletions(-)
 create mode 100644 testsuite/poke.pkl/strings-esc-1.pk
 create mode 100644 testsuite/poke.pkl/strings-esc-2.pk
 create mode 100644 testsuite/poke.pkl/strings-esc-diag-1.pk
 create mode 100644 testsuite/poke.pkl/strings-esc-diag-2.pk
 create mode 100644 testsuite/poke.pkl/strings-esc-diag-3.pk
 create mode 100644 testsuite/poke.pkl/strings-esc-diag-4.pk

diff --git a/ChangeLog b/ChangeLog
index 9098ddd2..9cf32a73 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,18 @@
+2020-11-03  Mohammad-Reza Nabipoor  <m.nabipoor@yahoo.com>
+
+       * libpoke/pkl-trans.c (pkl_trans1_ps_string): Add support for octal
+       and hexadecimal escape sequences.
+       * doc/poke.texi (String Literals): Document new escape sequences.
+       * testsuite/poke.pkl/strings-esc-1.pk: New test.
+       * testsuite/poke.pkl/strings-esc-2.pk: Likewise.
+       * testsuite/poke.pkl/strings-esc-diag-1.pk: Likewise.
+       * testsuite/poke.pkl/strings-esc-diag-2.pk: Likewise.
+       * testsuite/poke.pkl/strings-esc-diag-3.pk: Likewise.
+       * testsuite/poke.pkl/strings-esc-diag-4.pk: Likewise.
+       * testsuite/poke.pkl/string-diag-1.pk: `\x` is now a valid sequence;
+       use `\z` as an invalid escape sequence.
+       * testsuite/Makefile.am (EXTRA_DIST): Add new tests.
+
 2020-11-01  Egeyar Bagcioglu  <egeyar@gmail.com>
 
        * libpoke/ios-dev-file.h: Include errno.h.
diff --git a/doc/poke.texi b/doc/poke.texi
index fb3aad87..16d81f5c 100644
--- a/doc/poke.texi
+++ b/doc/poke.texi
@@ -6133,8 +6133,19 @@ Denotes an horizontal tab.
 Denotes a backlash @code{\} character.
 @item \"
 Denotes a double-quote @code{"} character.
+@item \o
+@item \oo
+@item \ooo
+Denotes an octal number (@code{o} represents an octal digit).
+The number should be less than 256 (0o400).
+@item \xX
+@item \xXX
+Denotes a hexadecimal number (@code{X} represents a hexadecimal digit).
 @end table
 
+Due to the fact that strings in Poke are @code{NULL}-terminated, inserting
+a @code{NULL} character using escape sequence leads to compilation error.
+
 @node String Types
 @subsection String Types
 
diff --git a/libpoke/pkl-trans.c b/libpoke/pkl-trans.c
index 18366ca0..6f414e8a 100644
--- a/libpoke/pkl-trans.c
+++ b/libpoke/pkl-trans.c
@@ -20,6 +20,7 @@
 
 #include <gettext.h>
 #define _(str) gettext (str)
+#include <ctype.h>
 #include <stdio.h>
 #include <string.h>
 #include <xalloc.h>
@@ -304,6 +305,10 @@ PKL_PHASE_BEGIN_HANDLER (pkl_trans1_ps_string)
   size_t string_length, i;
   bool found_backslash = false;
 
+#define ISODIGIT(c) ((unsigned)(c) - '0' < 8) // is octal digit
+#define XDIGIT(x) \
+  ((unsigned)(x) - '0' < 10 ? (x) - '0' : ((x) | 0x20) - 'a' + 10)
+
   /* Please keep this code in sync with the string printer in
      pvm-val.c:pvm_print_val.  */
 
@@ -311,27 +316,45 @@ PKL_PHASE_BEGIN_HANDLER (pkl_trans1_ps_string)
      \-expansion, and report errors in the contents of the string.  */
   for (p = string_pointer, string_length = 0; *p != '\0'; ++p)
     {
-      if (p[0] == '\\')
+      string_length++;
+      if (*p != '\\')
+        continue;
+
+      found_backslash = true;
+      ++p;
+
+      if (ISODIGIT (p[0]))
+        {
+          if (ISODIGIT (p[1]))
+            p += ISODIGIT (p[2]) ? 2 : 1; // reason: 0377 == 0xff
+          continue;
+        }
+
+      switch (*p)
         {
-          switch (p[1])
+        case '\\':
+        case 'n':
+        case 't':
+        case '"':
+          break;
+        case 'x':
+          ++p;
+          if (!isxdigit (p[0]))
             {
-            case '\\':
-            case 'n':
-            case 't':
-            case '"':
-              string_length++;
-              break;
-            default:
               PKL_ERROR (PKL_AST_LOC (string),
-                         "invalid \\%c sequence in string", p[1]);
+                         _ ("\\x used with no following hex digits"));
               PKL_TRANS_PAYLOAD->errors++;
               PKL_PASS_ERROR;
             }
-          p++;
-          found_backslash = true;
+          if (isxdigit (p[1]))
+            ++p;
+          break;
+        default:
+          PKL_ERROR (PKL_AST_LOC (string),
+                     _ ("invalid \\%c sequence in string"), *p);
+          PKL_TRANS_PAYLOAD->errors++;
+          PKL_PASS_ERROR;
         }
-      else
-        string_length++;
     }
 
   if (!found_backslash)
@@ -342,24 +365,77 @@ PKL_PHASE_BEGIN_HANDLER (pkl_trans1_ps_string)
 
   for (p = string_pointer, i = 0; *p != '\0'; ++p, ++i)
     {
-      if (p[0] == '\\')
+      if (*p != '\\')
+        {
+          new_string_pointer[i] = *p;
+          continue;
+        }
+      ++p;
+
+      // octal escape sequence
+      if (ISODIGIT (p[0]))
         {
-          switch (p[1])
+          unsigned int num = p[0] - '0';
+
+          if (ISODIGIT (p[1]))
             {
-            case '\\': new_string_pointer[i] = '\\'; break;
-            case 'n':  new_string_pointer[i] = '\n'; break;
-            case 't':  new_string_pointer[i] = '\t'; break;
-            case '"':  new_string_pointer[i] = '"'; break;
-            default:
-              assert (0);
+              ++p;
+              num = (num << 3) | (p[0] - '0');
+              if (ISODIGIT (p[1]))
+                {
+                  ++p;
+                  num = (num << 3) | (p[0] - '0');
+                }
             }
-          p++;
+          if (num == '\0')
+            {
+              PKL_ERROR (PKL_AST_LOC (string),
+                         _ ("string literal cannot contain NULL character"));
+              PKL_TRANS_PAYLOAD->errors++;
+              PKL_PASS_ERROR;
+            }
+          else if (num > 255)
+            {
+              PKL_ERROR (PKL_AST_LOC (string),
+                         _ ("octal escape sequence out of range"));
+              PKL_TRANS_PAYLOAD->errors++;
+              PKL_PASS_ERROR;
+            }
+          new_string_pointer[i] = num;
+          continue;
+        }
+
+      switch (*p)
+        {
+        case '\\': new_string_pointer[i] = '\\'; break;
+        case 'n': new_string_pointer[i] = '\n'; break;
+        case 't': new_string_pointer[i] = '\t'; break;
+        case '"': new_string_pointer[i] = '"'; break;
+        case 'x':
+          ++p;
+          new_string_pointer[i] = XDIGIT (p[0]);
+          if (isxdigit (p[1]))
+            {
+              new_string_pointer[i] = (XDIGIT (p[0]) << 4) | XDIGIT (p[1]);
+              ++p;
+            }
+          if (new_string_pointer[i] == '\0')
+            {
+              PKL_ERROR (PKL_AST_LOC (string),
+                         _ ("string literal cannot contain NULL character"));
+              PKL_TRANS_PAYLOAD->errors++;
+              PKL_PASS_ERROR;
+            }
+          break;
+        default:
+          assert (0);
         }
-      else
-        new_string_pointer[i] = p[0];
     }
   new_string_pointer[i] = '\0';
 
+#undef ISODIGIT
+#undef XDIGIT
+
   free (string_pointer);
   PKL_AST_STRING_POINTER (string) = new_string_pointer;
 }
diff --git a/testsuite/Makefile.am b/testsuite/Makefile.am
index 5ba9900a..7b0a7c7c 100644
--- a/testsuite/Makefile.am
+++ b/testsuite/Makefile.am
@@ -1370,6 +1370,12 @@ EXTRA_DIST = \
   poke.pkl/strings-2.pk \
   poke.pkl/strings-3.pk \
   poke.pkl/strings-4.pk \
+  poke.pkl/strings-esc-1.pk \
+  poke.pkl/strings-esc-2.pk \
+  poke.pkl/strings-esc-diag-1.pk \
+  poke.pkl/strings-esc-diag-2.pk \
+  poke.pkl/strings-esc-diag-3.pk \
+  poke.pkl/strings-esc-diag-4.pk \
   poke.pkl/strings-index-1.pk \
   poke.pkl/strings-index-2.pk \
   poke.pkl/strings-index-diag-1.pk \
diff --git a/testsuite/poke.pkl/string-diag-1.pk 
b/testsuite/poke.pkl/string-diag-1.pk
index 877bf6ad..1daf5c4e 100644
--- a/testsuite/poke.pkl/string-diag-1.pk
+++ b/testsuite/poke.pkl/string-diag-1.pk
@@ -2,4 +2,4 @@
 
 /* Invalid \-sequence in string.  */
 
-defvar s = "foo\xbar"; /* { dg-error "" } */
+defvar s = "foo\zbar"; /* { dg-error "" } */
diff --git a/testsuite/poke.pkl/strings-esc-1.pk 
b/testsuite/poke.pkl/strings-esc-1.pk
new file mode 100644
index 00000000..35dd6a11
--- /dev/null
+++ b/testsuite/poke.pkl/strings-esc-1.pk
@@ -0,0 +1,25 @@
+/* { dg-do run } */
+
+/* Octal escape sequence */
+
+defvar s = "\1\12\1234;\12==\n";
+
+/* { dg-command { .set obase 8 } } */
+
+/* { dg-command { s'length } } */
+/* { dg-output "0o11UL" } */
+
+/* { dg-command {s[0]} } */
+/* { dg-output "\n0o1UB" } */
+
+/* { dg-command {s[1]} } */
+/* { dg-output "\n0o12UB" } */
+
+/* { dg-command {s[2]} } */
+/* { dg-output "\n0o123UB" } */
+
+/* { dg-command {s[3]} } */
+/* { dg-output "\n0o64UB" } */
+
+/* { dg-command {s[4:]} } */
+/* { dg-output "\n\";\\\\n==\\\\n\"" } */
diff --git a/testsuite/poke.pkl/strings-esc-2.pk 
b/testsuite/poke.pkl/strings-esc-2.pk
new file mode 100644
index 00000000..35696ea0
--- /dev/null
+++ b/testsuite/poke.pkl/strings-esc-2.pk
@@ -0,0 +1,25 @@
+/* { dg-do run } */
+
+/* Hexadecimal escape sequence */
+
+defvar s = "He\x1llo \x12\x123, World!";
+
+/* { dg-command {.set obase 16} } */
+
+/* { dg-command {s'length} } */
+/* { dg-output "0x12UL" } */
+
+/* { dg-command {s[2]} } */
+/* { dg-output "\n0x1UB" } */
+
+/* { dg-command {s[7]} } */
+/* { dg-output "\n0x12UB" } */
+
+/* { dg-command {s[8]} } */
+/* { dg-output "\n0x12UB" } */
+
+/* { dg-command {s[9]} } */
+/* { dg-output "\n0x33UB" } */
+
+/* { dg-command {s[0:1] + s[3:5] + s[10:]} } */
+/* { dg-output "\n\"Hello, World!\"" } */
diff --git a/testsuite/poke.pkl/strings-esc-diag-1.pk 
b/testsuite/poke.pkl/strings-esc-diag-1.pk
new file mode 100644
index 00000000..c51cbccd
--- /dev/null
+++ b/testsuite/poke.pkl/strings-esc-diag-1.pk
@@ -0,0 +1,3 @@
+/* { dg-do compile } */
+
+defvar s = "a\0b"; /* { dg-error "" } */
diff --git a/testsuite/poke.pkl/strings-esc-diag-2.pk 
b/testsuite/poke.pkl/strings-esc-diag-2.pk
new file mode 100644
index 00000000..f4934e36
--- /dev/null
+++ b/testsuite/poke.pkl/strings-esc-diag-2.pk
@@ -0,0 +1,3 @@
+/* { dg-do compile } */
+
+defvar s = "x\x0z"; /* { dg-error "" } */
diff --git a/testsuite/poke.pkl/strings-esc-diag-3.pk 
b/testsuite/poke.pkl/strings-esc-diag-3.pk
new file mode 100644
index 00000000..96f5a513
--- /dev/null
+++ b/testsuite/poke.pkl/strings-esc-diag-3.pk
@@ -0,0 +1,5 @@
+/* { dg-do compile } */
+
+/* Invalid \-sequence */
+
+defvar s = "a\8b"; /* { dg-error "" } */
diff --git a/testsuite/poke.pkl/strings-esc-diag-4.pk 
b/testsuite/poke.pkl/strings-esc-diag-4.pk
new file mode 100644
index 00000000..241f301e
--- /dev/null
+++ b/testsuite/poke.pkl/strings-esc-diag-4.pk
@@ -0,0 +1,5 @@
+/* { dg-do compile } */
+
+/* Invalid octal \-sequence */
+
+defvar s = "\400"; /* { dg-error "" } */
-- 
2.28.0



reply via email to

[Prev in Thread] Current Thread [Next in Thread]