Re: regexp-split for Guile

From: Mark H Weaver
Subject: Re: regexp-split for Guile
Date: Sat, 20 Oct 2012 10:16:49 -0400
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/24.2 (gnu/linux)

I wrote:
>     (regexp-split " +" "  foo  bar  baz  " #:limit 3 #:trim 'both)
>       => ("foo" "bar" "baz")
>     (regexp-split " +" "  foo  bar  baz  " #:limit 2 #:trim 'both)
>       => ("foo" "bar")

Sorry, that last example is wrong, of course, but both examples raise an
interesting question about how #:limit and #:trim should interact.  To
my mind, the top example above is correct: its last element should be
"baz", not "baz  ".

I guess I'd prefer to think of #:trim as trimming *before* splitting,
instead of trimming empty elements *after* splitting, so:

     (regexp-split " +" "  foo  bar  baz  " #:limit 3 #:trim 'both)
       => ("foo" "bar" "baz")
     (regexp-split " +" "  foo  bar  baz  " #:limit 2 #:trim 'both)
       => ("foo" "bar  baz")

Note also that if you trim empty elements *after* splitting, then
there's a bad interaction with #:limit if you trim the left side.

     (regexp-split " +" "  foo  bar  baz  " #:limit 3 #:trim 'both)

If we first split, taking into account the limit, we get:

     ("" "foo" "bar  baz  ")

and then we trim empty elements from both ends to get the final result:

       => ("foo" "bar  baz")

which seems wrong: I asked for #:limit 3 but got only two elements.
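
To make the interaction concrete, here is a rough sketch of the
split-then-trim order.  Both 'split-with-limit' and 'trim-empty-ends'
are hypothetical helpers written only for this illustration; they are
not the proposed 'regexp-split', and the sketch assumes the pattern
never matches the empty string.

```scheme
(use-modules (ice-9 regex))

;; Hypothetical helper: split STR on matches of PATTERN into at most
;; LIMIT fields.  The last field keeps the unsplit remainder.
(define (split-with-limit pattern str limit)
  (let ((rx (make-regexp pattern)))
    (let loop ((start 0) (count 1) (acc '()))
      (let ((m (and (< count limit) (regexp-exec rx str start))))
        (if m
            (loop (match:end m)
                  (+ count 1)
                  (cons (substring str start (match:start m)) acc))
            (reverse (cons (substring str start) acc)))))))

;; Hypothetical helper: drop empty strings from both ends of FIELDS.
(define (trim-empty-ends fields)
  (define (drop-leading lst)
    (cond ((null? lst) '())
          ((string-null? (car lst)) (drop-leading (cdr lst)))
          (else lst)))
  (reverse (drop-leading (reverse (drop-leading fields)))))

;; Split first, honouring the limit, then trim the empty end fields.
(define fields
  (trim-empty-ends (split-with-limit " +" "  foo  bar  baz  " 3)))

(display (length fields))
(newline)
```

The leading empty field uses up one of the three slots before it is
discarded, so fewer fields come back than the limit allows.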

Honestly, this question makes me wonder if the proposed 'regexp-split'
is too complicated.  If you want to trim whitespace, how about using
'string-trim-right' or 'string-trim-both' before splitting?  That seems
more likely to do what I would expect.
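
For comparison, a minimal sketch of the trim-before-split order, using
Guile's 'string-trim-both' followed by a plain limited split.  Again,
'split-with-limit' is a hypothetical stand-in that takes the limit as a
positional argument; it is not the proposed 'regexp-split', and it
assumes the pattern never matches the empty string.

```scheme
(use-modules (ice-9 regex))

;; Hypothetical helper: split STR on matches of PATTERN into at most
;; LIMIT fields.  The last field keeps the unsplit remainder.
(define (split-with-limit pattern str limit)
  (let ((rx (make-regexp pattern)))
    (let loop ((start 0) (count 1) (acc '()))
      (let ((m (and (< count limit) (regexp-exec rx str start))))
        (if m
            (loop (match:end m)
                  (+ count 1)
                  (cons (substring str start (match:start m)) acc))
            (reverse (cons (substring str start) acc)))))))

;; Trim whitespace from the input first, then split.
(define trimmed (string-trim-both "  foo  bar  baz  "))

(write (split-with-limit " +" trimmed 3))   ; ("foo" "bar" "baz")
(newline)
(write (split-with-limit " +" trimmed 2))   ; ("foo" "bar  baz")
(newline)
```

Because the surrounding whitespace is gone before any splitting
happens, no empty end fields are produced and the limit counts only
real fields.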

What do you think?

