Re: Another issue with thingatpt

From: Piet van Oostrum
Subject: Re: Another issue with thingatpt
Date: Fri, 29 Dec 2006 22:23:55 +0100
User-agent: Gnus/5.11 (Gnus v5.11) Emacs/22.0.92 (darwin)

>>>>> Bob Rogers <address@hidden> (BR) wrote:

>BR>    From: Werner LEMBERG <address@hidden>
>BR>    Date: Wed, 27 Dec 2006 11:50:42 +0100 (CET)

>BR>    Here's another problematic URL:

>BR>      http://mousai.kanji.zinbun.kyoto-u.ac.jp/ids-find?components=&U+20207;

>BR>    thingatpt ignores the final `;'.

>BR>        Werner

>BR> According to RFC3986 (aka STD066), this is wrong; ";" is legitimate
>BR> anywhere in a path or query part, including the end.  So are "." and
>BR> ",", but thing-at-point-url-path-regexp also refuses to match these
>BR> characters at the end of the string.  Doing (ffap-string-at-point 'url)
>BR> drops these characters plus ":", "!", and (questionably) "?".

>BR>    It may not be possible to find a tradeoff between RFC compliance and
>BR> parsing dwimmery that would satisfy everybody.  Since stripping off
>BR> trailing punctuation is useful behavior (ISTR it's worked this way for a
>BR> while now), I would recommend against changing it now.  However, a case
>BR> could be made for making thing-at-point and ffap-string-at-point
>BR> consistent.  Perhaps "!:;.," would be best?  This is just the union of
>BR> the two sets but without the dubious inclusion of "?".

I think the way to reconcile these would be to make it customizable. For
example, have a string variable that contains the punctuation characters to
be included at the end, or a regexp.
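
To make the idea concrete, here is a minimal, language-neutral sketch in Python rather than Emacs Lisp. It expresses the setting from the other side, as the set of trailing characters to strip, and uses Bob's proposed "!:;.," as the default. The names URL_TRAILING_PUNCTUATION and strip_trailing_punctuation are hypothetical; nothing like them exists in thingatpt or ffap.

```python
# Hypothetical user option: characters treated as sentence punctuation
# rather than as part of the URL when they end the match.
# "!:;.," is the set proposed earlier in this thread; "?" is left out.
URL_TRAILING_PUNCTUATION = "!:;.,"


def strip_trailing_punctuation(url, strip_chars=URL_TRAILING_PUNCTUATION):
    """Return url without any trailing run of characters from strip_chars.

    An empty strip_chars leaves the URL untouched, which corresponds to
    strict RFC 3986 behaviour.
    """
    return url.rstrip(strip_chars)


werner_url = ("http://mousai.kanji.zinbun.kyoto-u.ac.jp/"
              "ids-find?components=&U+20207;")

# Default setting: the final ";" is dropped, as thingatpt does today.
default_result = strip_trailing_punctuation(werner_url)

# Customized setting without ";": the URL is kept whole.
custom_result = strip_trailing_punctuation(werner_url, strip_chars="!:.,")
```

With a setting like this, someone who often visits URLs ending in ";" could remove that one character from the set without affecting anybody else's defaults.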

By the way, thing-at-point-url-path-regexp also disallows ":" inside a URL.
Colons would have to be allowed in order to accept IPv6 addresses.
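
A small sketch of the effect, again in Python. The two patterns are simplified stand-ins written for this illustration, not the actual thing-at-point-url-path-regexp, and the address is from the IPv6 documentation prefix 2001:db8::/32.

```python
import re

# Simplified stand-in patterns (assumptions, not the real thingatpt regexp).
# The first character class omits ":", the second one allows it.
WITHOUT_COLON = re.compile(r"https?://[A-Za-z0-9\[\]./_~%-]+")
WITH_COLON = re.compile(r"https?://[A-Za-z0-9\[\]./_~%:-]+")

# URL with an IPv6 literal host and an explicit port.
ipv6_url = "http://[2001:db8::1]:8080/index.html"

# Without ":" the match stops at the first colon inside the address.
truncated = WITHOUT_COLON.match(ipv6_url).group(0)

# With ":" allowed, the whole URL is matched.
complete = WITH_COLON.match(ipv6_url).group(0)
```

Here truncated ends at "http://[2001", while complete is the full URL. The same applies to an explicit port on an ordinary host name.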

-- 
Piet van Oostrum <address@hidden>
URL: http://www.cs.uu.nl/~piet [PGP 8DAE142BE17999C4]
Private email: address@hidden