guile-devel

doc regexp-substitute polish


From: Kevin Ryde
Subject: doc regexp-substitute polish
Date: Sat, 11 Dec 2004 11:22:48 +1100
User-agent: Gnus/5.110003 (No Gnus v0.3) Emacs/21.3 (gnu/linux)

A bit of polish for regexp-substitute and regexp-substitute/global.
In particular I was looking for a nice simple example of search and
replace for regexp-substitute/global.



   Regular expressions are commonly used to find patterns in one string
and replace them with the contents of another string.  The following
functions are convenient ways to do this.

 -- Scheme Procedure: regexp-substitute port match [item...]
     Write to PORT selected parts of the match structure MATCH.  Or if
     PORT is `#f' then form a string from those parts and return that.

     Each ITEM specifies a part to be written, and may be one of the
     following,

        * A string.  String arguments are written out verbatim.

        * An integer.  The submatch with that number is written
          (`match:substring').  Zero is the entire match.

        * The symbol `pre'.  The portion of the matched string preceding
          the regexp match is written (`match:prefix').

        * The symbol `post'.  The portion of the matched string
          following the regexp match is written (`match:suffix').

     For example, changing a match and retaining the text before and
     after,

          (regexp-substitute #f (string-match "[0-9]+" "number 25 is good")
                             'pre "37" 'post)
          => "number 37 is good"

     Or matching a YYYYMMDD format date such as `20020429' and
     re-ordering and hyphenating the fields,

          (define date-regex "([0-9][0-9][0-9][0-9])([0-9][0-9])([0-9][0-9])")
          (define s "Date 20020429 12am.")
          (regexp-substitute #f (string-match date-regex s)
                             'pre 2 "-" 3 "-" 1 'post " (" 0 ")")
          => "Date 04-29-2002 12am. (20020429)"
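
The `pre' and `post' items correspond to `match:prefix' and
`match:suffix', and `string-match' returns `#f' when nothing matches, so
a caller normally checks the match before substituting.  A minimal
sketch of that pattern follows; the helper name `replace-first-number'
is hypothetical and only for illustration.

```scheme
(use-modules (ice-9 regex))

;; Hypothetical helper: replace the first run of digits in STR with NEW.
;; STR is returned unchanged when string-match finds nothing (#f).
(define (replace-first-number str new)
  (let ((m (string-match "[0-9]+" str)))
    (if m
        (regexp-substitute #f m 'pre new 'post)
        str)))

(define replaced  (replace-first-number "number 25 is good" "37"))
(define untouched (replace-first-number "no digits here" "37"))
;; replaced  => "number 37 is good"
;; untouched => "no digits here"

;; The same string built by hand from the match accessors.
(define m (string-match "[0-9]+" "number 25 is good"))
(define by-hand (string-append (match:prefix m) "37" (match:suffix m)))
;; by-hand => "number 37 is good"
```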

 -- Scheme Procedure: regexp-substitute/global port regexp target
          [item...]
     Write to PORT selected parts of matches of REGEXP in TARGET.  If
     PORT is `#f' then form a string from those parts and return that.
     REGEXP can be a string or a compiled regex.

     This is similar to `regexp-substitute', but allows global
     substitutions on TARGET.  Each ITEM behaves as per
     `regexp-substitute', with the following differences,

        * A function.  Called as `(ITEM match)' with the match
          structure for the REGEXP match, it should return a string to
          be written to PORT.

        * The symbol `post'.  This doesn't output anything, but instead
          causes `regexp-substitute/global' to recurse on the unmatched
          portion of TARGET.

          This _must_ be supplied to perform a global search and
          replace on TARGET; without it `regexp-substitute/global'
          returns after a single match and output.

     For example, to collapse runs of tabs and spaces to a single hyphen
     each,

          (regexp-substitute/global #f "[ \t]+" "this   is   the text"
                                    'pre "-" 'post)
          => "this-is-the-text"

     Or using a function to reverse the letters in each word,

          (regexp-substitute/global #f "[a-z]+" "to do and not-do"
            'pre (lambda (m) (string-reverse (match:substring m))) 'post)
          => "ot od dna ton-od"
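
A function item can also compute its replacement from the matched text
rather than just rearranging it.  As a small sketch (the input string
and the doubling rule are made up for illustration), each number in a
string is doubled:

```scheme
(use-modules (ice-9 regex))

;; The function item receives the match structure and must return a
;; string, so the number is converted back with number->string.
(define doubled
  (regexp-substitute/global
   #f "[0-9]+" "3 apples and 12 pears"
   'pre
   (lambda (m)
     (number->string (* 2 (string->number (match:substring m)))))
   'post))
;; doubled => "6 apples and 24 pears"
```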

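     Without the `post' symbol, just one regexp match is made.  Nor
     does `post' have to come last; once the rest of TARGET has been
     processed, any items after it are written.  For example the
     following is the date example from `regexp-substitute' above,
     where S holds a single date, without the need for the separate
     `string-match' call.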

          (define date-regex "([0-9][0-9][0-9][0-9])([0-9][0-9])([0-9][0-9])")
          (define s "Date 20020429 12am.")
          (regexp-substitute/global #f date-regex s
                                    'pre 2 "-" 3 "-" 1 'post " (" 0 ")")
          => "Date 04-29-2002 12am. (20020429)"
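
To make the effect of `post' concrete, here is a sketch contrasting a
call with and without it, using a regexp compiled ahead of time with
`make-regexp' and also writing to a port instead of returning a string.
The variable names are only illustrative, and the result shown for the
call without `post' is what I would expect from the description above
(one match processed, nothing after it written).

```scheme
(use-modules (ice-9 regex))

;; REGEXP may be a compiled regexp as well as a string.
(define blanks (make-regexp "[ \t]+"))
(define s "this   is   the text")

;; With 'post every run of blanks is replaced.
(define all (regexp-substitute/global #f blanks s 'pre "-" 'post))
;; all => "this-is-the-text"

;; Without 'post only the first match is handled, and the text
;; after that match is not written at all.
(define first-only (regexp-substitute/global #f blanks s 'pre "-"))
;; first-only => "this-"

;; Passing a port writes the output there instead of returning it.
(regexp-substitute/global (current-output-port) blanks s 'pre "-" 'post)
(newline)
```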




