groff-commit
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[groff] 03/22: [troff]: Fix Savannah #60977.


From: G. Branden Robinson
Subject: [groff] 03/22: [troff]: Fix Savannah #60977.
Date: Tue, 27 Jul 2021 22:34:21 -0400 (EDT)

gbranden pushed a commit to branch master
in repository groff.

commit e876d4bfd193abb9a7d1fb6e76519349bded482a
Author: G. Branden Robinson <g.branden.robinson@gmail.com>
AuthorDate: Tue Jul 27 16:51:21 2021 +1000

    [troff]: Fix Savannah #60977.
    
    [troff]: Avoid using sprintf() with user-controlled format string.
    
    * src/preproc/html/pre-html.cpp (makeFileName): Add comment noting need
      for implementation synchrony between this function, which generates
      \O5 escape sequences, and troff's suppress_node::tprint member
      function, which interprets them.
    
    * src/roff/troff/node.cpp: Rename 2 static globals for clarity.
      - `last_image_filename` -> `image_filename`
      - `last_image_id` -> `subimage_counter`
    
      (suppress_node::tprint): Set up the buffer for image file name to be
      written using a constant rather than an embedded literal.
      Unconditionally initialize the buffer with a string terminator, so
      there is no chance of a read from uninitialized storage.  Drop unused
      code involving `tem`.  Drop stale comments.  Clarify comment: an
      `image_filename` doesn't _always_ contain a format string, only
      sometimes.  Replace use of `sprintf` with manual construction of a new
      image filename string.  There are two cases, one where a format string
      {presently "%d"} is present, and one where it is not.  If it is
      present, locate it {`percent`}.  This means a limited/bounded image
      ("subimage") is being processed; increment the subimage counter.
      Write a new image file name preserving the parts before and after "%d"
      (the "prefix" and "suffix", and replacing only the middle, using
      `sprintf` with the subimage counter and the (string literal) format.
      Be mindful of string bounds and memory allocation, issuing diagnostics
      or aborting as necessary.  If the image file name does _not_ contain a
      format string, but needs only to be copied, do that (`strcpy`), again
      instead of using `sprintf`.
    
    Fixes <https://savannah.gnu.org/bugs/?60977>.
    
    Implementation note:
    * Why all these standard C library string manipulations?  (1) I'm more
      familiar with C than C++.  (2) The inbound type I'm dealing with,
      {last,}image_filename, is a char*, not a C++ string type.  (3) groff
      does not use a standard C++ library string type, but its own
      implementation[1]; groff is _old_).  (4) That custom string class
      does not have any `printf`-style format methods that I can see.
      Incidentally, groff's own `i_to_a` library function[2] doesn't handle
      long ints, which `size_t` is, so you can't put longs into diagnostic
      messages among other places.  (It looks easy to change `i_to_a` and
      `ui_to_a` to support longs.  You can then watch the build explode into
      a thousand pieces with numeric overflows in troff.)  (5) I did attempt
      to handle the strings efficiently, never calling strcat() since I
      always knew where the end of a string I wanted to append to was[3].
    
    [1] src/include/stringclass.h
    [2] src/libs/libgroff/itoa.c
    [3] https://symas.com/the-sad-state-of-c-strings/
---
 ChangeLog                     | 34 +++++++++++++++++++++
 src/preproc/html/pre-html.cpp |  1 +
 src/roff/troff/node.cpp       | 71 +++++++++++++++++++++++++++++++++----------
 3 files changed, 90 insertions(+), 16 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index 034ad19..5197bf2 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,37 @@
+2021-07-27  G. Branden Robinson <g.branden.robinson@gmail.com>
+
+       [troff]: Avoid using sprintf() with user-controlled format
+       string.
+
+       * src/preproc/html/pre-html.cpp (makeFileName): Add comment
+       noting need for implementation synchrony between this function,
+       which generates \O5 escape sequences, and troff's
+       suppress_node::tprint member function, which interprets them.
+       * src/roff/troff/node.cpp: Rename 2 static globals for clarity.
+         - `last_image_filename` -> `image_filename`
+         - `last_image_id` -> `subimage_counter`
+       (suppress_node::tprint): Set up the buffer for image file name
+       to be written using a constant rather than an embedded literal.
+       Unconditionally initialize the buffer with a string terminator,
+       so there is no chance of a read from unitialized storage.  Drop
+       unused code involving `tem`.  Drop stale comments.  Clarify
+       comment: an `image_filename` doesn't _always_ contain a format
+       string, only sometimes.  Replace use of `sprintf` with manual
+       construction of a new image filename string.  There are two
+       cases, one where a format string {presently "%d"} is present,
+       and one where it is not.  If it is present, locate it
+       {`percent`}.  This means a limited/bounded image ("subimage") is
+       being processed; increment the subimage counter.  Write a new
+       image file name preserving the parts before and after "%d" (the
+       "prefix" and "suffix", and replacing only the middle, using
+       `sprintf` with the subimage counter and the (string literal)
+       format.  Be mindful of string bounds and memory allocation,
+       issuing diagnostics or aborting as necessary.  If the image file
+       name does _not_ contain a format string, but needs only to be
+       copied, do that (`strcpy`), again instead of using `sprintf`.
+
+       Fixes <https://savannah.gnu.org/bugs/?60977>.
+
 2021-07-26  G. Branden Robinson <g.branden.robinson@gmail.com>
 
        [grohtml]: Fix Savannah #60971.
diff --git a/src/preproc/html/pre-html.cpp b/src/preproc/html/pre-html.cpp
index 88fcca1..501e90c 100644
--- a/src/preproc/html/pre-html.cpp
+++ b/src/preproc/html/pre-html.cpp
@@ -544,6 +544,7 @@ static void makeFileName(void)
   if (image_template == NULL)
     sys_fatal("malloc");
   strcpy(image_template, macroset_template);
+  // Keep this format string synced with troff:suppress_node::tprint().
   strcat(image_template, "%d");
 }
 
diff --git a/src/roff/troff/node.cpp b/src/roff/troff/node.cpp
index f2fc6da..86b054e 100644
--- a/src/roff/troff/node.cpp
+++ b/src/roff/troff/node.cpp
@@ -4053,8 +4053,8 @@ void suppress_node::put(troff_output_file *out, const 
char *s)
  */
 
 static char last_position = 0;
-static const char *last_image_filename = 0;
-static int last_image_id = 0;
+static const char *image_filename = 0;
+static int subimage_counter = 0;
 
 inline int min(int a, int b)
 {
@@ -4086,23 +4086,62 @@ void suppress_node::tprint(troff_output_file *out)
   if (is_on == 2) {
     // remember position and filename
     last_position = position;
-    char *tem = (char *)last_image_filename;
-    last_image_filename = strsave(filename.contents());
-    if (tem)
-      free(tem);
-    last_image_id = image_id;
-    // printf("start of image and page = %d\n", current_page);
+    image_filename = strsave(filename.contents());
   }
   else {
-    // now check whether the suppress node requires us to issue limits.
+    // Now check whether the suppress node requires us to issue limits.
     if (emit_limits) {
-      char name[8192];
-      // remember that the filename will contain a %d in which the
-      // last_image_id is placed
-      if (last_image_filename == (char *) 0)
-       *name = '\0';
-      else
-       sprintf(name, last_image_filename, last_image_id);
+      const size_t namebuflen = 8192;
+      char name[namebuflen] = { '\0' };
+      // Jump through a flaming hoop to avoid a "format nonliteral"
+      // warning from blindly using sprintf...and avoid trouble from
+      // mischievous image stems.
+      //
+      // Keep this format string synced with pre-html:makeFileName().
+      const char format[] = "%d";
+      const size_t format_len = strlen(format);
+      const char *percent_position = strstr(image_filename, format);
+      if (percent_position) {
+       subimage_counter++;
+       assert(sizeof subimage_counter <= 8);
+       // A 64-bit signed int produces up to 19 decimal digits.
+       char *subimage_number = (char *)malloc(20); // 19 digits + \0
+       if (0 == subimage_number)
+         fatal("memory allocation failure");
+       // Replace the %d in the filename with this number.
+       size_t enough = strlen(image_filename) + 19 - format_len;
+       char *new_name = (char *)malloc(enough);
+       if (0 == new_name)
+         fatal("memory allocation failure");
+       ptrdiff_t prefix_length = percent_position - image_filename;
+       strncpy(new_name, image_filename, prefix_length);
+       sprintf(subimage_number, "%d", subimage_counter);
+       size_t number_length = strlen(subimage_number);
+       strcpy(new_name + prefix_length, subimage_number);
+       // Skip over the format in the source string.
+       const char *suffix_src = image_filename + prefix_length
+         + format_len;
+       char *suffix_dst = new_name + prefix_length + number_length;
+       strcpy(suffix_dst, suffix_src);
+       // Ensure the new string fits with room for a terminal '\0'.
+       const size_t len = strlen(new_name);
+       if (len > (namebuflen - 1))
+         error("constructed file name in suppressed output escape is"
+               " too long (>= %1 bytes); skipping image",
+               (int)namebuflen);
+       else
+         strncpy(name, new_name, (namebuflen - 1));
+       free(new_name);
+       free(subimage_number);
+      }
+      else {
+       const size_t len = strlen(image_filename);
+       if (len > (namebuflen - 1))
+         error("file name in suppressed output escape is too long"
+               " (>= %1 bytes); skipping image", (int)namebuflen);
+       else
+         strcpy(name, image_filename);
+      }
       if (is_html) {
        switch (last_position) {
        case 'c':



reply via email to

[Prev in Thread] Current Thread [Next in Thread]