Re: [PATCH v3] docs/match: pattern matcher example makeover

guile-devel

From:	Maxime Devos
Subject:	Re: [PATCH v3] docs/match: pattern matcher example makeover
Date:	Wed, 1 Feb 2023 17:40:23 +0100
User-agent:	Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.6.0



On 01-02-2023 14:09, Blake Shaw wrote:

 [...]
-
style: clean-up newlines
--
It appears that while the PDF needs additional newlines
to be presentable, these appear to have a negative effect
on the presentation of the texinfo doc.

I don't know how to fix this, but from looking at the PDF,
it appears that the strategy until now has been to privilege
texinfo at the expense of PDF readability (the PDF is more
or less "squished together")

So in that regard, these edits make my past edits more in sync
with past Guile docs.

IIRC, Texinfo has a @iftex @endif construct or such. You could use this
to define a @pdf-newline macro, to only insert newlines in the PDF (TeX
is used for the PDF).
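For reference, the conditional is spelled @iftex ... @end iftex in
Texinfo. A minimal sketch of PDF-only vertical space without defining
a macro; the amount of space (@sp 1) is only an assumption about what
the PDF needs:

```texinfo
@c Only the TeX-based output (DVI/PDF) processes this block;
@c Info and HTML output skip it entirely.
@iftex
@sp 1
@end iftex
```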

-
examples: replace with didactic ex. that can copied & pasted
--
The existing example can't be copied and pasted.

This example both fixes the past one and improves on its relation
to the text.

-
style: switch to "Indiana style", bracketing lets and clauses
--
After spending much time looking at the examples in black & white
to edit the texinfo document, it occurred to me just how much the
brackets improve legibility. Therefore, I have decided to adopt
the "Indiana" style of using brackets, used by Kent Dybvig, Dan
Friedman and Will Byrd at Indiana University.

Currently the docs use this style in some places but not in others.

Considering some are color blind, and that few will have rigged
their texinfo configuration to use rainbow-delimiters in the while
reading documentation, I think this should be considered a general
accessibility improvement.


IME, (( )) is quite readable (and I don't use rainbow delimiters).
That might largely be 'due to experience', though.  While I would
expect ([ ] [ ]) to be unconventional for many Guilers, it should be
readable too though, so I suppose it could be good to just change the
convention, then.

You are currently making the manual more inconsistent by using this (for
Guile) mostly non-standard notation though; IIRC the manual mostly does
(( )) and not ([ ]). Yet, in the review of the v1, you mentioned

No, I'm not, I'm being totally boring and normal in this regard because 
collectively authored documentation is something you should never adopt 
non-standard writing notation in the course of authoring, just to one up 
someone on a mailing list

To be honest, it's this kind of attitude that has resulted in the current docs 
that so many people find utterly incomprehensible. The core point of my talk 
that what makes Info Guile so hard to read is the lack of stylistic 
consistency. Editors and editing exist for a very good reason.

, which is very much against non-standard notation and for consistency.
As such, I propose:


  a) Before (or after) this patch, change everything in the manual to
     "Indiana style", for consistency.  If you go for 'after this
     patch', I mean immediately afterwards, because Guile contributors
     tend to come and go, and delaying things tends to become never
     doing things.

  b) or: do it in non-Indiana style (likely not the option you will
     take, but it would be more stylistically consistent than the
     current version of the patch ...)

  c) or: don't adjust everything in the manual to Indiana style yet,
     but also make it a rule that the manual (and Guile code in Guile
     proper, I guess) does Indiana style, and that all current
     deviations from Indiana style are old style to be updated in the
     future.

     If this were Guix, you could make this a rule by adding it
     to the "Contributing" section.  Guile does not appear to have
     such a section, but "1.8 Typographical Conventions" might be a good
     place.

Additionally, changing the parenthesis convention in Guile is not just a
change to the 'match' documentation, but the subject line only mentions
'match'. While Indiana style seems a good thing to me at first sight
now that you mention the benefits, it needs a separate e-mail thread such
that people interested in ()/[] stuff but not in 'match' stuff will have
an opportunity to respond.

indentation: make consistent according to rule defined below

If a new paragraph opens onto a new topic, it should naturally
indent (i.e, no indentation markup is required)

If a new paragraph is a continuation of the current subject,
the markup @noident should be applied

markup: replace @var with @code unless @var is a @defn argument

The way that it renders in texinfo means that it renders @vars
in uppercase, the way that is conventionally done for definition
arguments.

I'm not too familiar with Texinfo PDF output but I'll take your word for
it. However, this is not the case at least for HTML output, as you can
see at
<https://www.gnu.org/software/texinfo/manual/texinfo/html_node/_0040var.html>,
for HTML documentation it remains lowercase.

Therefore I've changed all @vars to @code unless @var is a @defn
argument

I'm missing what you mean with the 'Therefore'. How does this relate to
your previous paragraph (I don't get what your point is about
'definition arguments')? Do you mean that uppercase @var is bad and that
it should be lowercase instead? If so, it would be better to modify
Texinfo itself to let @var not change the case, then every manual in
Texinfo would benefit instead of only the Guile manual.

Also, you could ask the Texinfo people if there is a reason for
uppercase @var; maybe they determined that it is more readable to more
people (I'm just speculating, I don't know the reason)? -- Presumably
there's some good reason (or maybe not, I don't know, but you could ask
them first).

Otherwise, if you make this Guile-specific change, you would create
stylistic inconsistencies between projects using Texinfo. More
specifically, you are creating stylistic inconsistencies between GNU
projects.

Additionally, you are not merely removing the uppercasing thing, you are
also removing the 'slanted' thing -- the result of @var is slanted
typewriter, the result of @code is merely typewriter, which makes it
slightly harder to distinguish metavariables from other code.

You are also only making this stylistic change in the documentation of
'match'; the remainder of the manual still has the old @var. If you
change things, it would be better to change things for the whole manual.
I think you can do this by redefining the @var macro to whatever you
want in the prelude (at least that can be done in TeX).

-
remove: paragraph that referred to a since removed example
--

fix: uncomment @xref{sxml-match}
---
  doc/ref/match.texi | 252 ++++++++++++++++++++++++++++++---------------
  1 file changed, 167 insertions(+), 85 deletions(-)

diff --git a/doc/ref/match.texi b/doc/ref/match.texi
index f5ea43118..4e657b976 100644
--- a/doc/ref/match.texi
+++ b/doc/ref/match.texi
@@ -23,71 +23,142 @@ The @code{(ice-9 match)} module provides a @dfn{pattern 
matcher},
  written by Alex Shinn, and compatible with Andrew K. Wright's pattern
  matcher found in many Scheme implementations.

-@cindex pattern variable

-A pattern matcher can match an object against several patterns and
-extract the elements that make it up.  Patterns can represent any Scheme
-object: lists, strings, symbols, records, etc.  They can optionally contain
-@dfn{pattern variables}.  When a matching pattern is found, an
-expression associated with the pattern is evaluated, optionally with all
-pattern variables bound to the corresponding elements of the object:
+@noindent A pattern matcher does precisely what the name implies: it
+matches some arbitrary pattern, and returns some result accordingly.

Again, as I mentioned previously, in the general case it matches
arbitrary patterns (plural) and returns results (plural) -- the 'match'
construct is not as limited as you are implying it to be here.
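To illustrate the 'plural' point, here is a small sketch. The procedure
name 'classify' and its tags are made up for illustration and are not
part of the patch. Several patterns are tried in order, and a clause
body may return more than one value:

```scheme
(use-modules (ice-9 match))

;; Hypothetical helper: each clause has its own pattern, and each
;; body returns two values (a tag and some payload).
(define (classify obj)
  (match obj
    (()        (values 'empty '()))
    ((x)       (values 'one   x))
    ((x . xs)  (values 'many  xs))
    (_         (values 'other obj))))

;; Collect both returned values into a list for display.
(call-with-values (lambda () (classify '(a b c))) list)
;; => (many (b c))
```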

@example

-(let ((l '(hello (world))))
-  (match l           ;; <- the input object
-    (('hello (who))  ;; <- the pattern
-     who)))          ;; <- the expression evaluated upon matching
-@result{} world
+(define (english-base-ten->number name)
+  (match name
+    ('zero   0)
+    ('one    1)
+    ('two    2)
+    ('three  3)
+    ('four   4)
+    ('five   5)
+    ('six    6)
+    ('seven  7)
+    ('eight  8)
+    ('nine   9)))
+
+(english-base-ten->number 'six)
+@result{} 6


My previous comment still applies:

  This is a suboptimal example; this would be better done with 'case'.
  I propose replacing it with another example, or adding a note that
  one would normally use 'case' for this.

What is the reason for not doing something akin to that?
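For comparison, a sketch of what the 'case' version could look like.
The 'else' clause and its error message are my own addition, not
something from the patch:

```scheme
;; Same mapping as in the patch, written with 'case'.  Each clause
;; lists the symbols it accepts; no pattern matcher is needed.
(define (english-base-ten->number name)
  (case name
    ((zero)  0)
    ((one)   1)
    ((two)   2)
    ((three) 3)
    ((four)  4)
    ((five)  5)
    ((six)   6)
    ((seven) 7)
    ((eight) 8)
    ((nine)  9)
    ;; Assumption: signal an error for anything else.
    (else (error "not an English digit name:" name))))

(english-base-ten->number 'six)
;; => 6
```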

+
+(apply + (map english-base-ten->number '(one two three four)))
+@result{} 10
  @end example

-In this example, list @var{l} matches the pattern @code{('hello (who))},

-because it is a two-element list whose first element is the symbol
-@code{hello} and whose second element is a one-element list.  Here
-@var{who} is a pattern variable.  @code{match}, the pattern matcher,
-locally binds @var{who} to the value contained in this one-element
-list---i.e., the symbol @code{world}.  An error would be raised if
-@var{l} did not match the pattern.
+@page
+@cindex pattern variable
+@noindent Pattern matchers may contain @dfn{pattern variables},
+local bindings to all elements that match a pattern.

'Pattern matchers' -> 'pattern' would be more precise here, as it more
precisely states _where_ the pattern variable is. E.g. if you say
'pattern', it's certainly not the 'ns' in (match ns ...). If you say
'pattern matcher' (*), then 'pattern matcher' might mean 'match' itself,
or (match ns ...); the former does not contain a pattern variable, the
latter likely does but less is stated about _where_ the pattern variable
is, purely going by your sentence it might be the 'match' which is
incorrect.

(*) While the original text defined 'pattern matcher=match', that part
doesn't contain any pattern variables, and in your new text the notion
of 'pattern matcher' is not exactly defined but rather described, and
not as some kind of precise characterisation.

-The same object can be matched against a simpler pattern:
+@example
+(let re ([ns '(one two three four 9)] [total 0])

The Scheme convention would be to write 'loop' instead of 're' when
using named-let, and something like 'rest' instead of 'ns'. The exact
word for the loop argument varies a lot, but two letters that don't
appear to mean anything are to be avoided.

+  (match ns
+    [(e) (+ total (english-base-ten->number e))]
+    [(e . es)
+     (re es (+ total (english-base-ten->number e)))]))


I tried running your example, and it doesn't work:

(define (english-base-ten->number name)
  (match name
    ('zero   0)
    ('one    1)
    ('two    2)
    ('three  3)
    ('four   4)
    ('five   5)
    ('six    6)
    ('seven  7)
    ('eight  8)
    ('nine   9)))
(let re ([ns '(one two three four 9)] [total 0])
  (match ns
    [(e) (+ total (english-base-ten->number e))]
    [(e . es)
     (re es (+ total (english-base-ten->number e)))]))
ice-9/boot-9.scm:1685:16: In procedure raise-exception:
Throw to key `match-error' with args `("match" "no matching pattern" 9)'.

Entering a new prompt.  Type `,bt' for a backtrace or `,q' to continue.

I think you need to replace (one two three four 9) by (one two three
four nine). As you mentioned yourself (in other words), examples in the
manual should actually work as-is.
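A sketch of how the corrected example could look, with 'nine' instead
of 9 and the conventional 'loop' / 'rest' names. The wrapper procedure
'sum-names' is hypothetical and only there to make the expression easy
to call; plain parentheses are used here pending the outcome of the
bracket discussion:

```scheme
(use-modules (ice-9 match))

(define (english-base-ten->number name)
  (match name
    ('zero   0)
    ('one    1)
    ('two    2)
    ('three  3)
    ('four   4)
    ('five   5)
    ('six    6)
    ('seven  7)
    ('eight  8)
    ('nine   9)))

;; Hypothetical wrapper around the named let from the patch.
;; Like the original, it raises a match-error on an empty list.
(define (sum-names names)
  (let loop ((rest names) (total 0))
    (match rest
      ((e) (+ total (english-base-ten->number e)))
      ((e . es)
       (loop es (+ total (english-base-ten->number e)))))))

(sum-names '(one two three four nine))
;; => 19
```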

-@example
-(let ((l '(hello (world))))
-  (match l
-    ((x y)
-     (values x y))))
-@result{} hello
-@result{} (world)
+@result{} 19
  @end example

-Here pattern @code{(x y)} matches any two-element list, regardless of

-the types of these elements.  Pattern variables @var{x} and @var{y} are
-bound to, respectively, the first and second element of @var{l}.
-
-Patterns can be composed, and nested.  For instance, @code{...}
+@noindent In this example, the list @code{ns} matches the pattern
+@code{(e . es)}, where the pattern variable @code{e} corresponds
+to the metaphoical "car" of @code{ns} and the pattern variable @code{es}
+corresponds to the "cdr" of @code{ns}.


Typo: metaphoical -> metaphorical.

Also: metaphorical -> literal. -- e is literally the car of ns (or
‘corresponds to the car of ns in a literal way’ if you go for a
variable/value distinction); there is nothing figurative here. I would
just drop the metaphorical/literal word. Also, "car" -> `car' and "cdr"
-> `cdr' -- the manual currently consistently uses the quotation style
‘car’ / ‘pair?’, ‘SCM’, ..., not "car". For example, in 5.4.1 Dynamic
Types, there is the paragraph:

In order to implement standard Scheme functions like ‘pair?’ and
‘string?’ and provide garbage collection, the representation of every
value must contain enough information to accurately determine its type
at run time.

'Function' -> 'Procedure'. You are introducing a stylistic
inconsistency here. In Guile, the C things are called 'Functions', and
the Scheme things are called 'Procedures'. To some degree, this ‘in
Scheme it's called a procedure’ also holds for other Schemes IIUC.

Actually, while some GCs do require runtime type information (RTI), RTI
is not needed for garbage collection. Guile uses Boehm-GC for garbage
collection. Being a conservative garbage collector, it doesn't need any
type information. It works a little better if you do give it some type
information, and Guile does give it some information in some cases, but
it's not required.

This information is therefore incorrect and needs to be removed, but the
bits about predicates seem fine to me.

Often, Scheme systems also use this information to
determine whether a program has attempted to apply an operation to an
inappropriately typed value (such as taking the ‘car’ of a string).

IIUC, in Texinfo, we write `stuff' instead of ‘stuff’, and it will get
turned into ‘stuff’. I dunno why this is still done in the Guile manual
as UTF-8 is an established thing, but I have used ‘’ in Guix stuff in
the past and people changed it into `'.

Additionally, doing "git grep -F "car" doc/ref/*.texi", it appears that
the manual doesn't actually quote car and cdr -- instead it writes car
and cdr unquoted, or writes @code{car} / @code{cdr} which happens to be
turned into a quoted ‘car’ / ‘cdr’ in the .info documentation by Texinfo.

I think you can guess what I would be saying about stylistic consistency
here.

+
+@noindent A tail call @code{re} is then initiated

‘A tail call @code{re} is then initiated’ -> ‘A tail call to @code{re}
is then initiated’ -- @code{re} is a variable reference, not a tail call.
The tail call is @code{(re es (+ total ...))}.


More simply, you could write ‘The procedure @var{re} is then tail-called’.

+and we "cdr" down the
+list by recurring on the tail @code{es}, applying our matcher
+@code{english-base-ten->number} to each element of @code{ns} until
+only a single element @code{(e)} remains, causing the @code{total}
+to be computed.  In modern Scheme programming it is common to use
+@code{match} in place of the more verbose but familiar combination
+of @code{cond}, @code{car} and @code{cdr}, so it's important to
+understand how these idioms translate.
+
+Patterns can be composed and nested.  For instance, @code{...}
  (ellipsis) means that the previous pattern may be matched zero or more
  times in a list:
  @example
-(match lst
-  (((heads tails ...) ...)
-   heads))
+(match '((a.0 b.0 c.0 ((1.0 2.0 3.0) x.0 y.0 z.0))
+         (a.1 b.1 c.1 ((1.1 2.1 3.1) x.1 y.1 z.1)))
+  [((heads ... ((tails ...) . rest)) ...)
+   (begin
+    (format #t "heads: ~a ~%" heads)
+    (format #t "tails: ~a ~%" tails)
+    (format #t "rest:  ~a ~%" rest))])
+@result{}
+heads: ((a.0 b.0 c.0) (a.1 b.1 c.1))
+tails: ((1.0 2.0 3.0) (1.1 2.1 3.1))
+rest:  ((x.0 y.0 z.0) (x.1 y.1 z.1))
  @end example

-@noindent

-This expression returns the first element of each list within @var{lst}.
-For proper lists of proper lists, it is equivalent to @code{(map car
-lst)}.  However, it performs additional checks to make sure that
-@var{lst} and the lists therein are proper lists, as prescribed by the
-pattern, raising an error if they are not.
-
-Compared to hand-written code, pattern matching noticeably improves
-clarity and conciseness---no need to resort to series of @code{car} and
-@code{cdr} calls when matching lists, for instance.  It also improves
-robustness, by making sure the input @emph{completely} matches the
-pattern---conversely, hand-written code often trades robustness for
-conciseness.  And of course, @code{match} is a macro, and the code it
-expands to is just as efficient as equivalent hand-written code.
-
-The pattern matcher is defined as follows:
+@noindent A pattern matcher can match an object against several
+patterns and extract the elements that make it up.
+
+@example
+(match '((l1 . r1) (l2 . r2) (l3 . r3))
+  [((left . right) ...)
+   (list left right)])
+
+@result{} ((l1 l2 l3) (r1 r2 r3))
+@end example
+
+@example
+(match '((1 . (a . b)) (2 . (c . d)) (3 . (e . f)))
+  [((key . (left . right)) ...)
+   (fold-right acons '() key right )])
+
+@result{} ((1 . b) (2 . d) (3 . f))
+@end example
+
+@example
+(match '(((a b c) e f g) 1 2 3)
+  [(((head ...) . rest) tails ...)
+   (acons tails head rest )])
+
+@result {} (((1 2 3) a b c) e f g)
+@end example
+
+Patterns can represent any Scheme object: lists, strings, symbols,
+records, etc.
+
+@noindent When a matching pattern is found, an expression is evaluated
+with pattern variables bound to the corresponding elements of the object.
+
+@example
+(let re ([m #(a "b" c "d" e "f" g)])
+   (match m
+     [(or (e) #(e)) e]
+     [(or #(e1 e2 es ...)
+          (e1 e2 es ...))
+      (cons (cons e1 e2)
+           (re es))]))
+
+@result{} ((a . "b") (c . "d") (e . "f") . g)
+@end example
+
+@example
+(let re ([m '(a b c d e f g h i)])
+   (match m
+     [(e) e]
+     [(e1 e2 es ...)
+      (acons e1 e2 (re es))]))
+
+@result{} ((a . b) (c . d) (e . f) (g . h) . i)
+@end example
+
+@noindent Compared to hand-written code, pattern matching noticeably
+improves clarity and conciseness---no need to resort to series of
+@code{car} and @code{cdr} calls when matching lists, for instance.
+It also improves robustness, by making sure the input @emph{completely}
+matches the pattern---conversely, hand-written code often trades
+robustness for conciseness.  And of course, @code{match} is a macro,
+and the code it expands to is just as efficient as equivalent
+hand-written code.
+
+@noindent We define @code{match} as follows: @*


Why did you change this from

     The pattern matcher is defined as follows:

? While the 'we' / 'our' / ... construct is pretty convenient, IMO it is
better avoided as long as the avoidance doesn't lead to awkward
constructions.

  @deffn {Scheme Syntax} match exp clause1 clause2 @dots{}
  Match object @var{exp} against the patterns in @var{clause1}
@@ -96,9 +167,9 @@ value produced by the first matching clause.  If no clause 
matches,
  throw an exception with key @code{match-error}.

Each clause has the form @code{(pattern body1 body2 @dots{})}. Each

-@var{pattern} must follow the syntax described below.  Each body is an
+@code{pattern} must follow the syntax described below.  Each body is an
  arbitrary Scheme expression, possibly referring to pattern variables of
-@var{pattern}.
+@code{pattern}.
  @end deffn

@c FIXME: Document other forms:

@@ -114,7 +185,7 @@ arbitrary Scheme expression, possibly referring to pattern 
variables of
  @c
  @c clause ::= (pat body) | (pat => exp)

-The syntax and interpretation of patterns is as follows:

+@noindent @* The pattern language is specified as follows: @*

The stuff below still defines the interpretation, not only the
language/grammar. The change 'syntax -> language' seems fine to me, but
why remove 'interpretation'?

Additionally, I personally would go for interpretation->semantics, but
maybe that's too obscure for a general audience.


[...]

  @deffn {Scheme Syntax} match-lambda* clause1 clause2 @dots{}

@@ -264,11 +335,10 @@ and can also be used for recursive functions which match 
on their
  arguments as in @code{match-lambda*}.

@example

-(match-let (((x y) (list 1 2))
-            ((a b) (list 3 4)))
-  (list a b x y))
-@result{}
-(3 4 1 2)
+(match-let ([(x y ...) (list 1 2 3)]
+            [(a b ...) (list 3 4 5)])
+  (list x a y b))
+@result{} (1 3 (2 3) (4 5))
  @end example
  @end deffn

@@ -287,22 +357,34 @@ Similar to @code{match-let}, but analogously to @code{let*}, match and

  bind the variables in sequence, with preceding match variables in scope.

@example

-(match-let* (((x y) (list 1 2))
-             ((a b) (list x 4)))
-  (list a b x y))
+(match-let* ([(x . y) (list 1 2 3)]
+             [(a . b) (list x 4 y)])
+  (list a b))
  @equiv{}

The old example was simpler and still fully demonstrated 'match-let*',
why the change (besides [])?
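For a side-by-side view, here is what both versions evaluate to. The
variable names 'old-result' and 'new-result' are mine, and the new
example is written with plain parentheses here:

```scheme
(use-modules (ice-9 match))

;; Old manual example: the second pattern's expression refers to x,
;; which is what distinguishes match-let* from match-let.
(define old-result
  (match-let* (((x y) (list 1 2))
               ((a b) (list x 4)))
    (list a b x y)))
;; old-result => (1 4 1 2)

;; Example from the v3 patch: dotted patterns, so y and b are bound
;; to the remaining list tails.
(define new-result
  (match-let* (((x . y) (list 1 2 3))
               ((a . b) (list x 4 y)))
    (list a b)))
;; new-result => (1 (4 (2 3)))
```

Both show the sequential scoping; the new one additionally requires
the reader to follow the dotted-pair patterns.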

[...]

+@example

+(define wrap '(((((unnest arbitrary nestings))))))
+
+(let unwrap ([peel wrap])
+  (match-let* ([([core ...]) peel]
+              [(wrapper ...) core])
+    (if (> (length wrapper) 1)
+       wrapper
+       (unwrap wrapper))))
+
+@result{} (unnest arbitrary nestings)
+@end example
+


(Not saying anything about this example TBC.)

Greetings,
Maxime.

