Re: [Nmh-workers] The curse of m_getfld()


From: Paul Vixie
Subject: Re: [Nmh-workers] The curse of m_getfld()
Date: Thu, 26 Jan 2012 20:03:28 +0000
User-agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:9.0) Gecko/20111222 Thunderbird/9.0.1

On 1/26/2012 6:06 PM, Ken Hornstein wrote:
>> my thought is, fire photon torpedoes. m_getfld was the wrong approach
>> when it was new but it worked well enough (especially on slower older
>> machines). we should not compromise on readability in order to keep the
>> couple hundred places that call m_getfld from having to change.
> By my count it's actually only 77 calls to m_getfld().  So it's not
> insurmountable by any means.

looks like i'm not off by even one order of magnitude: close enough for
software work :-).

> I was thinking that m_getfld() would actually have a couple arguments
> added to it in terms of "here's the mime type of whatever data you
> have".  Internally there would be a lot of rototilling.

imagine me backing toward the door, making the sign of the cross at you
as i do so.

> But okay,
> since you've thought about this more than I have ... what is _your_
> vision of an API?  

i've just looked at m_getfld.c, for the first time since 1994 or so.
this was good code in the pdp11 era.

this is a stateful iterator which does character level processing from
an underlying stdio FILE object. its caller must know internal details
of the state machine. opaqueness is nowhere attempted. it digs into the
underlying FILE object to effect multi-character "ungetc" which is not
supported by POSIX stdio, and it also returns pointers into the
underlying FILE object's buffer to avoid character copying, all with
#ifdef's for LINUX_STDIO which presumably works differently. it tries
hard to give the compiler hints to use the vax MATCH3 instruction for
substring searching. its API and its implementation make UTF-8
impossible and by the time this thing has returned it is not possible
for the caller to perform any I18N processing on fields like "subject"
that can have same. its indentation has several off-by-one shifts.

those of you who knew me in 1990 know that i used to write code like
this; it was an art; m_getfld.c is high art.

> If you'd be willing to sketch something out that sure would be great.


my sketch would look a lot like "man 3 db" so please start there. the
idea would be to call some function that returned a heap object that had
some function pointers (for example, iterating through headers or mime
components) and had some private data (like pointers to the underlying
storage media, such as a POSIX "FILE" object). it would be high level
enough that if someone wanted to reimplement it on top of an imap
connection or a postgres connection then the callers would not have to
know about it. the implementation would beat the holy crap out of the
heap, returning pointerballs that were created with strdup and asprintf.
the caller would be obligated to call some kind of "free" like function
after consuming data from each iteration, and another "destroy" function
for the message object itself. we would feel free to call "strstr",
"strchr", "strcasecmp", etc. we would not use any "register" variables.
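
A minimal sketch of what such an object could look like, assuming a db(3)-style handle with function pointers and private data. Every name here (MSG, MSGFIELD, msg_open_file, the next/free_field/destroy methods and their return codes) is hypothetical and not part of nmh; the file backend only reads header fields and folds continuation lines, and it leaves the FILE owned by the caller.

```c
#define _POSIX_C_SOURCE 200809L   /* getline, strdup */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <strings.h>              /* strcasecmp, for callers */

/* One header field; both strings are heap copies owned by the caller. */
typedef struct msgfield { char *name, *value; } MSGFIELD;

/* Hypothetical db(3)-style handle: methods plus opaque private data. */
typedef struct msg MSG;
struct msg {
    int  (*next)(MSG *, MSGFIELD **);  /* 1 = field, 0 = end of headers, -1 = error */
    void (*free_field)(MSGFIELD *);
    void (*destroy)(MSG *);            /* frees the handle, not the FILE */
    void *priv;                        /* backend state, hidden from callers */
};

/* Private state of the (assumed) stdio backend. */
struct file_priv { FILE *fp; char *pending; /* one line read ahead */ };

/* Return the next line without its line ending, or NULL at EOF. */
static char *read_line(struct file_priv *p)
{
    char *line = p->pending;
    size_t cap = 0;
    if (line != NULL) { p->pending = NULL; return line; }
    if (getline(&line, &cap, p->fp) < 0) { free(line); return NULL; }
    line[strcspn(line, "\r\n")] = '\0';
    return line;
}

static int file_next(MSG *m, MSGFIELD **out)
{
    struct file_priv *p = m->priv;
    char *line = read_line(p), *colon, *more;
    MSGFIELD *f;

    if (line == NULL || *line == '\0') { free(line); return 0; }
    if ((colon = strchr(line, ':')) == NULL) { free(line); return -1; }
    if ((f = malloc(sizeof *f)) == NULL) { free(line); return -1; }
    *colon++ = '\0';
    colon += strspn(colon, " \t");     /* skip blanks after the colon */
    f->name = strdup(line);
    f->value = strdup(colon);
    free(line);
    if (f->name == NULL || f->value == NULL) {
        free(f->name); free(f->value); free(f);
        return -1;
    }
    /* Fold continuation lines (those starting with space or tab). */
    while ((more = read_line(p)) != NULL && (*more == ' ' || *more == '\t')) {
        char *joined = realloc(f->value, strlen(f->value) + strlen(more) + 1);
        if (joined == NULL) { free(more); more = NULL; break; }
        strcat(joined, more);
        f->value = joined;
        free(more);
    }
    p->pending = more;                 /* first line of the next field, if any */
    *out = f;
    return 1;
}

static void file_free_field(MSGFIELD *f)
{
    if (f != NULL) { free(f->name); free(f->value); free(f); }
}

static void file_destroy(MSG *m)
{
    struct file_priv *p = m->priv;
    free(p->pending);
    free(p);
    free(m);
}

/* Constructor for the stdio backend; an IMAP or SQL backend would fill
 * in the same function pointers with its own implementations. */
MSG *msg_open_file(FILE *fp)
{
    MSG *m = malloc(sizeof *m);
    struct file_priv *p = malloc(sizeof *p);
    if (m == NULL || p == NULL) { free(m); free(p); return NULL; }
    p->fp = fp;
    p->pending = NULL;
    m->next = file_next;
    m->free_field = file_free_field;
    m->destroy = file_destroy;
    m->priv = p;
    return m;
}
```

A caller would loop on m->next(m, &f), compare f->name with strcasecmp, call m->free_field(f) after each iteration, and finish with m->destroy(m). Nothing in that loop reveals whether the bytes came from a FILE or from some other store.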

once we had this for message objects we'd clean it up from the lessons
learned and then use the same approach for folder objects and mailstore
objects. folder objects are hard, since MH message numbers are more
permanent than IMAP message numbers but less permanent than IMAP UID's.
but the new higher-level "folder" object's interface would be defined
with an eye toward supporting non-MH mail stores (such as IMAP or SQL).

when complete, the only part of the MH code base that knew about
opendir() or fopen() would be behind an API.

note that others (bernstein for example, and horton) have attempted
this. some user interfaces do include an abstract API to make them mail
store independent. i observe that: none of these API's has taken the
world by storm; none of them support the "middle level" of message
number permanence required by (and loved by!) MH; none of them are even
as modern as "db(3)" in their presentation of the object level
interface. so, i think this field is still green for us.

i'd be happy to go further with this if my vision is attractive to
folks. i am not willing to do all of the work or even the lion's share
of the work myself -- nor will i do anything more at all if the "core
team" doesn't give a thumbs up.

paul


