emacs-devel

Re: Extending the ecomplete.el data store.


From: Karl Fogel
Subject: Re: Extending the ecomplete.el data store.
Date: Sun, 04 Feb 2018 17:54:13 -0600
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/27.0.50 (gnu/linux)

Stefan Monnier <address@hidden> writes:
>FWIW, I recently installed a completion-table for ecomplete together
>with an ecomplete completion-at-point-function for message.el, which
>together let you use the ecomplete database for TAB completion as well
>as for company-mode ("tooltip-like").

That's good to know, thanks.  All the more reason to centralize on a unified 
database of email addresses, containing all the information anyone might want, 
and have packages build functionality around that.

>>   (KEY          ; string: downcased email addr
>>     ((VARIANT   ; string: case-preserving address w/ real name
>>        (TYPE                                     ; symbol: `mail', etc
>>          ('last-sent  LAST_TIME_SENT_TO)          ; int: seconds since epoch
>>          ('last-recv  LAST_TIME_RECEIVED_FROM)    ; int: seconds since epoch
>>          ('sent-count SENT_COUNT)                 ; int: total times sent
>>          ('recv-count RECEIVED_COUNT)             ; int: total times received
>>        )
>>        ...further TYPEs could go here...
>>      )
>>      ...further VARIANTs here...
>>     )
>>     ...[reserved, in case we ever need something other than VARIANTs]...
>>   )
>
>Can you show an example where the presence of multiple variants lets you
>do something you can't do with the single variant?
>I'm not sure I understand the benefits.

Sure.  It's common to have these kinds of variants for one email address (note 
how subtle case variations can even appear in only the address portion):

  "Wutherington, Joanna - NYC" <address@hidden>
  "Wutherington, Joanna" <address@hidden>
  "JOANNA WUTHERINGTON" <address@hidden>
  "Joanna Wutherington" <address@hidden>
  "Joanna Wutherington" <address@hidden>
  "joanna wutherington" <address@hidden>
  "J. Wutherington" <address@hidden>
  "Joanna W." <address@hidden>
  "Joanna W" <address@hidden>
  "address@hidden" <address@hidden>
  ... etc, etc ...

(I've seen all of those variants before, and some addresses show up in my 
completion database with a significant number of those variants.  Oh, and 
sometimes they have double quotes and sometimes they don't.  Fun.)
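
To make the shape concrete, here is a minimal sketch of how such variants could be collected under one downcased key, following the KEY / VARIANT / TYPE layout quoted above.  It is written in Python purely for illustration; it is not Emacs Lisp and not the actual code of ecomplete or Mailaprop.  The function name `record`, the dict layout, and the example.org addresses are all hypothetical.

```python
from email.utils import parseaddr

def record(db, header_value, direction, when, type_="mail"):
    """Record one sighting of an address (hypothetical helper).

    db:           dict mapping downcased address -> {variant -> {type -> stats}}
    header_value: the full, case-preserving form, e.g. '"Name" <addr>'
    direction:    "sent" or "recv"
    when:         int, seconds since epoch
    """
    _name, addr = parseaddr(header_value)
    key = addr.lower()                      # KEY: downcased email address
    variants = db.setdefault(key, {})       # all VARIANTs seen for that key
    stats = variants.setdefault(header_value, {}).setdefault(
        type_,
        {"last-sent": 0, "last-recv": 0, "sent-count": 0, "recv-count": 0},
    )
    stats["last-" + direction] = max(stats["last-" + direction], when)
    stats[direction + "-count"] += 1
    return db

# Two case variants of the same (made-up) address end up under one key.
db = {}
record(db, '"Joanna Wutherington" <JWutherington@example.org>', "recv", 100)
record(db, '"JOANNA WUTHERINGTON" <jwutherington@example.org>', "recv", 200)
record(db, '"Joanna Wutherington" <JWutherington@example.org>', "sent", 300)
```

Nothing is thrown away here: every distinct form stays in the store with its own counters, which is the property the proposed format is meant to guarantee.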

There are many factors that can cause this kind of variation.  For example, 
when one is sending mail to that recipient, one might compose the email this 
way...

  "Joanna Wutherington" <address@hidden>

...even though one has never actually received mail from them with that exact 
form of the address.  Maybe one copied-and-pasted the address from other 
sources, or whatever.  The point is, that might be the route by which that 
particular form gets into the completion database.

So the question is, when completing an address, which variant does the user 
want?

Mailaprop tries to figure out the "best" variant of a given address, and assign 
that variant a higher score than any of the other variants, so that the "best" 
one shows up higher in the completion list than any of those others.

The algorithm Mailaprop uses for determining "best" is not important here.  The 
point is just that in order to have an algorithm at all, the inputs have to be 
available.  Thus, the reason to have a format that preserves all these variants 
is so that packages (like ecomplete and mailaprop) can have enough information 
to try out interesting algorithms for autofill behavior.

As far as I know, ecomplete just always remembers the most-recently-seen 
variant.  That probably works well for most cases, but there will be times when

  "Joanna Wutherington" <address@hidden>

gets replaced by (say)

  "address@hidden" <address@hidden>

...yet the user would almost certainly prefer the former.  Mailaprop preserves 
all the variants and scores them in order to avoid that situation.
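
As a rough illustration of the difference, the sketch below contrasts a "most recently seen wins" rule with a scored choice over all stored variants.  It is Python for illustration only, and the scoring rule (sent count weighted double, ties broken by recency) is an arbitrary placeholder, not Mailaprop's actual algorithm.  The function names, the weights, and the example.org address are hypothetical, and every variant is assumed to carry a `mail` entry.

```python
def most_recent(variants, type_="mail"):
    """Pick the variant with the latest sent-or-received timestamp."""
    def last_seen(item):
        stats = item[1][type_]
        return max(stats["last-sent"], stats["last-recv"])
    return max(variants.items(), key=last_seen)[0]

def best_scored(variants, type_="mail"):
    """Pick the variant with the highest toy score (placeholder rule)."""
    def score(item):
        stats = item[1][type_]
        usage = 2 * stats["sent-count"] + stats["recv-count"]
        recency = max(stats["last-sent"], stats["last-recv"])
        return (usage, recency)             # recency only breaks ties
    return max(variants.items(), key=score)[0]

variants = {
    '"Joanna Wutherington" <jw@example.org>': {
        "mail": {"last-sent": 500, "last-recv": 400,
                 "sent-count": 12, "recv-count": 9}},
    '"jw@example.org" <jw@example.org>': {
        "mail": {"last-sent": 0, "last-recv": 900,
                 "sent-count": 0, "recv-count": 1}},
}

most_recent(variants)   # returns '"jw@example.org" <jw@example.org>'
best_scored(variants)   # returns '"Joanna Wutherington" <jw@example.org>'
```

Whatever rule one prefers, it can only be written if the per-variant timestamps and counts are kept in the database in the first place.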

>AFAICT the main difference here compared to the ecompleterc format is
>that we impose the notion of "sending" and "receiving", whereas the
>ecompleterc could conceivably be used for things fundamentally unrelated
>to sending/receiving messages (e.g. completion of file names, say).

Ah, yes -- so could mailaprop, come to think of it.  However, the TYPE 
indicator can govern what's in the variant list.  In other words, we can adjust 
the proposal so that the inner format is just for when TYPE == `mail'.  The 
inner format for other types has yet to be determined, because we don't know 
yet what kind of inputs they'll need to make good autofill behavior possible.
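
One way to picture "TYPE governs the inner format" is a small table keyed by type, where only `mail` is defined so far.  This is a sketch in Python for illustration; the names `INNER_FORMATS` and `check_entry` and the three status strings are hypothetical and not part of any existing package.

```python
# Fields of the inner format for TYPE == mail, as in the proposal above.
MAIL_FIELDS = {"last-sent", "last-recv", "sent-count", "recv-count"}

# Other types would be added here once their inputs are known.
INNER_FORMATS = {"mail": MAIL_FIELDS}

def check_entry(type_, stats):
    """Classify one (TYPE, stats) pair against the known inner formats."""
    fields = INNER_FORMATS.get(type_)
    if fields is None:
        return "unknown-type"   # format not decided yet; leave the data alone
    return "ok" if set(stats) == fields else "malformed"
```

A reader that meets a type it does not know can then skip it without touching the data, which leaves room for later types.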

>I haven't used ecomplete very much so far, but I've noticed some issues
>which I think are linked to having multiple Emacs sessions use it at the
>same time.  I haven't investigated enough to be sure, but in any case
>it's a use case that should be kept in mind.

That's probably related only to how ecomplete generates its database; I don't 
think it affects the format of the database.  I.e., if one wants to be able to 
"splice new things in" to the database in memory, and write the database out at 
the end of the session, that's not significantly harder with the proposed new 
format than with the old one, and any multiple-session or cross-machine 
synchronization/conflict problems are the same in both cases.

>[ And along vaguely related lines, I'd really like it if the ecompleterc
>  database could be somehow shared between my different machines.
>  E.g. by arranging for git-merge to "do-the-right-thing" on it, or by
>  storing (a copy of) it in IMAP.  ]

I haven't thought much about that, because I solve that problem out-of-band 
right now: my mailaprop database is under version control and gets 
automatically sync'd across all the machines I work on (and the same would be 
true of .ecompleterc if I were using that).  I agree it would be a good thing 
if Emacs solved that automagically, as long as it were truly reliable.

Best regards,
-Karl


