emacs-27 b07e3b1: Improve format-spec documentation (bug#41571)

From: Basil L. Contovounesios
Subject: emacs-27 b07e3b1: Improve format-spec documentation (bug#41571)
Date: Tue, 2 Jun 2020 15:55:15 -0400 (EDT)

branch: emacs-27
commit b07e3b1d97e73c5cf0cd60edf4838b555530bbf0
Author: Basil L. Contovounesios <contovob@tcd.ie>
Commit: Basil L. Contovounesios <contovob@tcd.ie>

    Improve format-spec documentation (bug#41571)
    * doc/lispref/text.texi (Interpolated Strings): Move from here...
    * doc/lispref/strings.texi (Custom Format Strings): ...to here,
    renaming the node and clarifying the documentation.
    (Formatting Strings): End node with sentence referring to the next
    * lisp/format-spec.el (format-spec): Clarify docstring.
 doc/lispref/strings.texi | 176 +++++++++++++++++++++++++++++++++++++++++++++++
 doc/lispref/text.texi    |  64 -----------------
 lisp/format-spec.el      |  49 ++++++++-----
 3 files changed, 206 insertions(+), 83 deletions(-)

diff --git a/doc/lispref/strings.texi b/doc/lispref/strings.texi
index 70c3b3c..4a7bda5 100644
--- a/doc/lispref/strings.texi
+++ b/doc/lispref/strings.texi
@@ -28,6 +28,7 @@ keyboard character events.
 * Text Comparison::           Comparing characters or strings.
 * String Conversion::         Converting to and from characters and strings.
 * Formatting Strings::        @code{format}: Emacs's analogue of @code{printf}.
+* Custom Format Strings::     Formatting custom @code{format} specifications.
 * Case Conversion::           Case conversion functions.
 * Case Tables::               Customizing case conversion.
 @end menu
@@ -1122,6 +1123,181 @@ may be problematic; for example, @samp{%d} and @samp{%g} can mishandle
 NaNs and can lose precision and type, and @samp{#x%x} and @samp{#o%o}
 can mishandle negative integers.  @xref{Input Functions}.
+The functions described in this section accept a fixed set of
+specification characters.  The next section describes a function
+@code{format-spec} which can accept custom specification characters,
+such as @samp{%a} or @samp{%z}.
+@node Custom Format Strings
+@section Custom Format Strings
+@cindex custom format string
+@cindex custom @samp{%}-sequence in format
+Sometimes it is useful to allow users and Lisp programs alike to
+control how certain text is generated via custom format control
+strings.  For example, a format string could control how to display
+someone's forename, surname, and email address.  Using the function
+@code{format} described in the previous section, the format string
+could be something like @w{@code{"%s %s <%s>"}}.  This approach
+quickly becomes impractical, however, as it can be unclear which
+specification character corresponds to which piece of information.
+A more convenient format string for such cases would be something like
+@w{@code{"%f %l <%e>"}}, where each specification character carries
+more semantic information and can easily be rearranged relative to
+other specification characters, making such format strings more easily
+customizable by the user.
+The function @code{format-spec} described in this section performs a
+similar function to @code{format}, except it operates on format
+control strings that use arbitrary specification characters.
+@defun format-spec template spec-alist &optional only-present
+This function returns a string produced from the format string
+@var{template} according to conversions specified in @var{spec-alist},
+which is an alist (@pxref{Association Lists}) of the form
+@w{@code{(@var{letter} . @var{replacement})}}.  Each specification
+@code{%@var{letter}} in @var{template} will be replaced by
+@var{replacement} when formatting the resulting string.
+The characters in @var{template}, other than the format
+specifications, are copied directly into the output, including their
+text properties, if any.  Any text properties of the format
+specifications are copied to their replacements.
+Using an alist to specify conversions gives rise to some useful
+properties:
+
+@itemize @bullet
+@item
+If @var{spec-alist} contains more unique @var{letter} keys than there
+are unique specification characters in @var{template}, the unused keys
+are simply ignored.
+
+@item
+If @var{spec-alist} contains more than one association with the same
+@var{letter}, the closest one to the start of the list is used.
+
+@item
+If @var{template} contains the same specification character more than
+once, then the same @var{replacement} found in @var{spec-alist} is
+used as a basis for all of that character's substitutions.
+
+@item
+The order of specifications in @var{template} need not correspond to
+the order of associations in @var{spec-alist}.
+@end itemize
+The optional argument @var{only-present} indicates how to handle
+specification characters in @var{template} that are not found in
+@var{spec-alist}.  If it is @code{nil} or omitted, the function
+signals an error.  Otherwise, those format specifications and any
+occurrences of @samp{%%} in @var{template} are left verbatim in the
+output, including their text properties, if any.
+@end defun
+The syntax of format specifications accepted by @code{format-spec} is
+similar, but not identical, to that accepted by @code{format}.  In
+both cases, a format specification is a sequence of characters
+beginning with @samp{%} and ending with an alphabetic letter such as
+@samp{s}.
+
+Unlike @code{format}, which assigns specific meanings to a fixed set
+of specification characters, @code{format-spec} accepts arbitrary
+specification characters and treats them all equally.  For example:
+
+@example
+@group
+(setq my-site-info
+      (list (cons ?s system-name)
+            (cons ?t (symbol-name system-type))
+            (cons ?c system-configuration)
+            (cons ?v emacs-version)
+            (cons ?e invocation-name)
+            (cons ?p (number-to-string (emacs-pid)))
+            (cons ?a user-mail-address)
+            (cons ?n user-full-name)))
+(format-spec "%e %v (%c)" my-site-info)
+     @result{} "emacs 27.1 (x86_64-pc-linux-gnu)"
+(format-spec "%n <%a>" my-site-info)
+     @result{} "Emacs Developers <emacs-devel@@gnu.org>"
+@end group
+@end example
+A format specification can include any number of the following flag
+characters immediately after the @samp{%} to modify aspects of the
+substitution.
+
+@table @samp
+@item 0
+This flag causes any padding specified by the width to consist of
+@samp{0} characters instead of spaces.
+@item -
+This flag causes any padding specified by the width to be inserted on
+the right rather than the left.
+@item <
+This flag causes the substitution to be truncated on the left to the
+given width, if specified.
+@item >
+This flag causes the substitution to be truncated on the right to the
+given width, if specified.
+@item ^
+This flag converts the substituted text to upper case (@pxref{Case
+Conversion}).
+
+@item _
+This flag converts the substituted text to lower case (@pxref{Case
+Conversion}).
+@end table
+The result of using contradictory flags (for instance, both upper and
+lower case) is undefined.
+As is the case with @code{format}, a format specification can include
+a width, which is a decimal number that appears after any flags.  If a
+substitution contains fewer characters than its specified width, it is
+padded on the left:
+
+@example
+@group
+(format-spec "%8a is padded on the left with spaces"
+             '((?a . "alpha")))
+     @result{} "   alpha is padded on the left with spaces"
+@end group
+@end example
+Here is a more complicated example that combines several
+aforementioned features:
+
+@example
+@group
+(setq my-battery-info
+      (list (cons ?p "73")      ; Percentage
+            (cons ?L "Battery") ; Status
+            (cons ?t "2:23")    ; Remaining time
+            (cons ?c "24330")   ; Capacity
+            (cons ?r "10.6")))  ; Rate of discharge
+(format-spec "%>^-3L : %3p%% (%05t left)" my-battery-info)
+     @result{} "BAT :  73% (02:23 left)"
+(format-spec "%>^-3L : %3p%% (%05t left)"
+             (cons (cons ?L "AC")
+                   my-battery-info))
+     @result{} "AC  :  73% (02:23 left)"
+@end group
+@end example
+As the examples in this section illustrate, @code{format-spec} is
+often used for selectively formatting an assortment of different
+pieces of information.  This is useful in programs that provide
+user-customizable format strings, as the user can choose to format
+with a regular syntax and in any desired order only a subset of the
+information that the program makes available.
 @node Case Conversion
 @section Case Conversion in Lisp
 @cindex upper case
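
The alist properties listed in the new Custom Format Strings node can be tried out with a short sketch like the one below.  The variable name and the contact data are hypothetical, chosen only for illustration; they are not part of this commit.

```elisp
(require 'format-spec)

;; Hypothetical contact data, not taken from the commit.
(defvar my-contact-spec
  '((?f . "Ada")
    (?l . "Lovelace")
    (?e . "ada@example.org")
    (?f . "Shadowed")   ; duplicate key: the earlier ?f entry wins
    (?x . "unused"))    ; key not used by the templates below: ignored
  "Example alist mapping specification characters to replacements.")

;; Template order need not follow the alist order.
(format-spec "%l, %f <%e>" my-contact-spec)
;; => "Lovelace, Ada <ada@example.org>"

;; A repeated specification character reuses the same replacement.
(format-spec "%f %f %l" my-contact-spec)
;; => "Ada Ada Lovelace"
```

Because lookup goes through the alist, a caller can shadow one entry by consing a new association onto the front, as the battery example in the patch does with `?L`.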
diff --git a/doc/lispref/text.texi b/doc/lispref/text.texi
index de436fa..a14867e 100644
--- a/doc/lispref/text.texi
+++ b/doc/lispref/text.texi
@@ -58,7 +58,6 @@ the character after point.
                        of another buffer.
 * Decompression::    Dealing with compressed data.
 * Base 64::          Conversion to or from base 64 encoding.
-* Interpolated Strings:: Formatting Customizable Strings.
 * Checksum/Hash::    Computing cryptographic hashes.
 * GnuTLS Cryptography:: Cryptographic algorithms imported from GnuTLS.
 * Parsing HTML/XML:: Parsing HTML and XML.
@@ -4662,69 +4661,6 @@ If optional argument @var{base64url} is non-@code{nil}, then padding
 is optional, and the URL variant of base 64 encoding is used.
 @end defun
-@node Interpolated Strings
-@section Formatting Customizable Strings
-It is, in some circumstances, useful to present users with a string to
-be customized that can then be expanded programmatically.  For
-instance, @code{erc-header-line-format} is @code{"%n on %t (%m,%l)
-%o"}, and each of those characters after the percent signs are
-expanded when the header line is computed.  To do this, the
-@code{format-spec} function is used:
-@defun format-spec format specification &optional only-present
-@var{format} is the format specification string as in the example
-above.  @var{specification} is an alist that has elements where the
-@code{car} is a character and the @code{cdr} is the substitution.
-If @var{only-present} is @code{nil}, errors will be signaled if a
-format character has been used that's not present in
-@var{specification}.  If it's non-@code{nil}, that format
-specification is left verbatim in the result.
-@end defun
-Here's a trivial example:
-
-@example
-(format-spec "su - %u %l"
-             `((?u . ,(user-login-name))
-               (?l . "ls")))
-     @result{} "su - foo ls"
-@end example
-In addition to allowing padding/limiting to a certain length, the
-following modifiers can be used:
-@table @asis
-@item @samp{0}
-Pad with zeros instead of the default spaces.
-@item @samp{-}
-Pad to the right.
-@item @samp{^}
-Use upper case.
-@item @samp{_}
-Use lower case.
-@item @samp{<}
-If the length needs to be limited, remove characters from the left.
-@item @samp{>}
-Same as previous, but remove characters from the right.
-@end table
-If contradictory modifiers are used (for instance, both upper and
-lower case), then what happens is undefined.
-As an example, @samp{"%<010b"} means ``insert the @samp{b} expansion,
-but pad with leading zeros if it's less than ten characters, and if
-it's more than ten characters, shorten by removing characters from the
-left.''
 @node Checksum/Hash
 @section Checksum/Hash
 @cindex MD5 checksum
diff --git a/lisp/format-spec.el b/lisp/format-spec.el
index f418cea..9278bd7 100644
--- a/lisp/format-spec.el
+++ b/lisp/format-spec.el
@@ -29,35 +29,46 @@
 (defun format-spec (format specification &optional only-present)
   "Return a string based on FORMAT and SPECIFICATION.
-FORMAT is a string containing `format'-like specs like \"su - %u %k\",
-while SPECIFICATION is an alist mapping from format spec characters
-to values.
+FORMAT is a string containing `format'-like specs like \"su - %u %k\".
+SPECIFICATION is an alist mapping format specification characters
+to their substitutions.
 For instance:
   (format-spec \"su - %u %l\"
-               `((?u . ,(user-login-name))
+               \\=`((?u . ,(user-login-name))
                  (?l . \"ls\")))
-Each format spec can have modifiers, where \"%<010b\" means \"if
-the expansion is shorter than ten characters, zero-pad it, and if
-it's longer, chop off characters from the left side\".
+Each %-spec may contain optional flag and width modifiers, as
+follows:
-The following modifiers are allowed:
+  %<flags><width>character
-* 0: Use zero-padding.
-* -: Pad to the right.
-* ^: Upper-case the expansion.
-* _: Lower-case the expansion.
-* <: Limit the length by removing chars from the left.
-* >: Limit the length by removing chars from the right.
+The following flags are allowed:
-Any text properties on a %-spec itself are propagated to the text
-that it generates.
+* 0: Pad to the width, if given, with zeros instead of spaces.
+* -: Pad to the width, if given, on the right instead of the left.
+* <: Truncate to the width, if given, on the left.
+* >: Truncate to the width, if given, on the right.
+* ^: Convert to upper case.
+* _: Convert to lower case.
-If ONLY-PRESENT, format spec characters not present in
-SPECIFICATION are ignored, and the \"%\" characters are left
-where they are, including \"%%\" strings."
+The width modifier behaves like the corresponding one in `format'
+when applied to %s.
+For example, \"%<010b\" means \"substitute into the output the
+value associated with ?b in SPECIFICATION, either padding it with
+leading zeros or truncating leading characters until it's ten
+characters wide\".
+Any text properties of FORMAT are copied to the result, with any
+text properties of a %-spec itself copied to its substitution.
+ONLY-PRESENT indicates how to handle %-spec characters not
+present in SPECIFICATION.  If it is nil or omitted, emit an
+error; otherwise leave those %-specs and any occurrences of
+\"%%\" in FORMAT verbatim in the result, including their text
+properties, if any."
   (with-temp-buffer
     (insert format)
     (goto-char (point-min))
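
The flag, width and ONLY-PRESENT behaviour described in the revised docstring can be exercised with a sketch along these lines.  The helper `my-format-spec-demo` is hypothetical and exists only to keep the calls short; the results shown follow from the documented semantics.

```elisp
(require 'format-spec)

;; Hypothetical helper, used only to keep the examples short.
(defun my-format-spec-demo (template value &optional only-present)
  "Format TEMPLATE with ?a bound to VALUE; pass ONLY-PRESENT through."
  (format-spec template (list (cons ?a value)) only-present))

;; Width pads on the left by default; `-' pads on the right, `0' uses zeros.
(my-format-spec-demo "%5a|%-5a|%05a" "42")     ; => "   42|42   |00042"
;; `<' and `>' truncate to the width on the left and right respectively.
(my-format-spec-demo "%<3a|%>3a" "abcdef")     ; => "def|abc"
;; `^' and `_' convert case.
(my-format-spec-demo "%^a|%_a" "MiXed")        ; => "MIXED|mixed"
;; The docstring's "%<010b" case, here with ?a: pad or truncate to ten.
(my-format-spec-demo "%<010a" "7")             ; => "0000000007"
(my-format-spec-demo "%<010a" "abcdefghijkl")  ; => "cdefghijkl"

;; With ONLY-PRESENT non-nil, an unknown %-spec is left verbatim...
(my-format-spec-demo "%a and %b" "x" t)        ; => "x and %b"
;; ...whereas with it nil or omitted, an error is signaled.
(condition-case nil
    (my-format-spec-demo "%a and %b" "x")
  (error 'signaled))                           ; => signaled
```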

