emacs-devel

Re: mail-extract-address-components extract modified full name


From: Simon Josefsson
Subject: Re: mail-extract-address-components extract modified full name
Date: Tue, 27 Jul 2004 11:29:14 +0200
User-agent: Gnus/5.110003 (No Gnus v0.3) Emacs/21.3.50 (gnu/linux)

Katsumi Yamaoka <address@hidden> writes:

>>>>>> In <address@hidden> Katsumi Yamaoka wrote:
>
>> mail-header-parse-address doesn't support non-ASCII characters,
>> so a function like gnus-extract-address-components is also
>> needed.
>
> In my opinion, we need a much simpler function to replace the
> present mail-extract-address-components.

Good idea.  I have disliked mail-extr* for a long time.  It is both
complicated and not standards compliant.  If we are not going for
standards compliance, it should be possible to do something simple,
like the approach you propose.  XEmacs users have even reported
Latin-2 problems with the current implementation (Emacs does not have
those problems, but it suggests the implementation could be
improved).

> The required feature is only to parse the following four patterns:
>
> ADDRESS
> <ADDRESS>
> ADDRESS (NAME)
> NAME <ADDRESS>
>
> They may, of course, be combined, separated by commas.  The NAME
> portion may contain non-ASCII characters.  The NAME and ADDRESS
> portions should never be modified.

Quotation is what makes things complicated.  Consider:

"foo \"baz bar" <address@hidden>
"foo <address@hidden> bar" <address@hidden>
"foo '"<address@hidden>'" bar" <address@hidden>

The standard permits even weirder things, but we probably don't have
to support those.
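
To illustrate the second example above, here is a minimal sketch of
why "find the first <...>" is not enough once the name is quoted.  The
helper names and the example.org addresses are placeholders for
illustration only; neither function exists in Gnus or mail-extr, and
anchoring on the final <...> is just a heuristic, not standards
compliant parsing (it ignores, for instance, a trailing comment).

```elisp
;; Hypothetical helpers for illustration; not part of Gnus or mail-extr.

(defun my-first-angle-address (string)
  "Return the text inside the first <...> in STRING, or nil."
  (when (string-match "<\\([^<>]*\\)>" string)
    (match-string 1 string)))

(defun my-last-angle-address (string)
  "Return the text inside a <...> that ends STRING, or nil.
Only trailing spaces and tabs are allowed after the closing bracket."
  (when (string-match "<\\([^<>]*\\)>[ \t]*\\'" string)
    (match-string 1 string)))

;; The quoted name contains something that looks like an address.
(defvar my-tricky-from "\"foo <a@example.org> bar\" <b@example.org>")

(my-first-angle-address my-tricky-from)   ; => "a@example.org" (wrong)
(my-last-angle-address my-tricky-from)    ; => "b@example.org"
```

A real implementation would still have to decide what to do with
backslash escapes and nested quotes inside the name.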

Don't get me wrong: this is not a criticism of your idea, just
something to keep in mind.

If you, or someone else, would like to implement the above idea, I
will try to assist and write a self-test suite for it.  Then we can
detect regressions in the future.  A big problem with mail-extr* is
that you don't know how a small change might affect practical use.
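
Such a suite could start out table-driven.  The sketch below assumes
ERT (which ships with newer Emacs versions) and uses a hypothetical
stand-in parser, `my-parse-address', that handles only the four simple
patterns quoted above.  It ignores quoting and comma-separated lists
entirely, and the example.org addresses are placeholders.

```elisp
(require 'ert)

;; Hypothetical stand-in for the function under test.  It handles only
;; ADDRESS, <ADDRESS>, ADDRESS (NAME) and NAME <ADDRESS>, and returns
;; (NAME ADDRESS) without modifying either part.
(defun my-parse-address (string)
  "Parse STRING into a list (NAME ADDRESS); NAME may be nil."
  (cond
   ;; NAME <ADDRESS> and <ADDRESS>
   ((string-match
     "\\`[ \t]*\\(.*?\\)[ \t]*<\\([^<>]*\\)>[ \t]*\\'" string)
    (let ((name (match-string 1 string)))
      (list (if (string= name "") nil name)
            (match-string 2 string))))
   ;; ADDRESS (NAME)
   ((string-match
     "\\`[ \t]*\\([^ \t()]+\\)[ \t]*(\\(.*\\))[ \t]*\\'" string)
    (list (match-string 2 string) (match-string 1 string)))
   ;; Bare ADDRESS: strip surrounding whitespace only.
   ((string-match "\\`[ \t]*\\(.*?\\)[ \t]*\\'" string)
    (list nil (match-string 1 string)))))

;; Each entry is (INPUT . EXPECTED-RESULT).
(defconst my-address-test-cases
  '(("foo@example.org"           . (nil "foo@example.org"))
    ("<foo@example.org>"         . (nil "foo@example.org"))
    ("foo@example.org (Foo Bar)" . ("Foo Bar" "foo@example.org"))
    ("Foo Bar <foo@example.org>" . ("Foo Bar" "foo@example.org"))
    ;; Non-ASCII names must come back unchanged.
    ("Jörg Müller <jm@example.org>" . ("Jörg Müller" "jm@example.org"))))

(ert-deftest my-parse-address-simple-patterns ()
  "Every simple pattern parses to the expected (NAME ADDRESS)."
  (dolist (tc my-address-test-cases)
    (should (equal (my-parse-address (car tc)) (cdr tc)))))
```

The test can be run interactively with M-x ert; new cases (the quoted
examples above, for instance) are then one line each in the table.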

Btw, I assume you are familiar with g-e-a-c.  It is rather simple,
perhaps too simple.

(defun gnus-extract-address-components (from)
  "Extract address components from a From header.
Given an RFC-822 address FROM, extract full name and canonical address.
Returns a list of the form (FULL-NAME CANONICAL-ADDRESS).  Much more simple
solution than `mail-extract-address-components', which works much better, but
is slower."
  (let (name address)
    ;; First find the address - the thing with the @ in it.  This may
    ;; not be accurate in mail addresses, but does the trick most of
    ;; the time in news messages.
    (when (string-match "\\b[^@ \t<>]+[!@][^@ \t<>]+\\b" from)
      (setq address (substring from (match-beginning 0) (match-end 0))))
    ;; Then we check whether the "name <address>" format is used.
    (and address
         ;; Linear white space is not required.
         (string-match (concat "[ \t]*<" (regexp-quote address) ">") from)
         (and (setq name (substring from 0 (match-beginning 0)))
              ;; Strip any quotes from the name.
              (string-match "^\".*\"$" name)
              (setq name (substring name 1 (1- (match-end 0))))))
    ;; If not, then "address (name)" is used.
    (or name
        (and (string-match "(.+)" from)
             (setq name (substring from (1+ (match-beginning 0))
                                   (1- (match-end 0)))))
        (and (string-match "()" from)
             (setq name address))
        ;; XOVER might not support folded From headers.
        (and (string-match "(.*" from)
             (setq name (substring from (1+ (match-beginning 0))
                                   (match-end 0)))))
    (list (if (string= name "") nil name) (or address from))))




