Index: ChangeLog
===================================================================
RCS file: /cvs/mailutils/ChangeLog,v
retrieving revision 1.80
diff -u -r1.80 ChangeLog
--- ChangeLog 2001/04/10 05:18:47 1.80
+++ ChangeLog 2001/04/13 21:11:11
@@ -1,3 +1,22 @@
+2001-04-13 Sam Roberts
+
+ * doc/address.texi: updated docs, now they match the parse822.
+ * mailbox/parse822.c: small tweaks to the new parser, the changes
+ made during the tidying over the last month were:
+ - use C comments only.
+ - don't use C++ reserved words.
+ - fix is_digit() to be like the other is functions
+ - Changed return codes to:
+ . no mem (ENOMEM)
+ . function wasn't called correctly, usually a missing
+ argument (EINVAL)
+ . invalid syntax found during parsing (ENOENT)
+ . success == 0
+ - const-corrected the APIs
+ - removed unnecessary (in C) casts.
+ - mailbox_t* removed in favor of address_t.
+ - fix handful of memory leaks detected by Alain.
+
 2001-04-10 Alain Magloire
 * pop3d/retr.c (pop3_retr): Typo.
Index: doc/address.texi
===================================================================
RCS file: /cvs/mailutils/doc/address.texi,v
retrieving revision 1.2
diff -u -r1.2 address.texi
--- doc/address.texi 2001/03/09 05:45:48 1.2
+++ doc/address.texi 2001/04/13 21:11:12
@@ -1,12 +1,28 @@
 @code{#include }
address@hidden {Data Type} address_t
-The @code{address_t} object is used to hold information about a parsed
-RFC822 address list, and is an opaque
-data structure to the user. Functions are provided to retrieve information
-about a address in the address list.
+
+The internet address format is defined in RFC 822. RFC 822 is in the
+process of being updated, and will soon be superceeded by a new RFC
+that makes some corrections and clarifications. References to RFC 822
+here apply equally to the new RFC.
+
+The RFC 822 format is more flexible than many people realize, here
+is a quick summary of the syntax this parser implements, see
+RFC 822 for the details. "[]" pairs mean "optional", "/" means "one or
+the other", and double-quoted characters are literals.
+
address@hidden
+address-list = address ["," address-list]
+address = mailbox / group
+mailbox = addr-spec ["(" personal ")"] /
+ [personal] "<" [route] addr-spec ">"
+addr-spec = local-part "@" domain
+group = phrase ":" mailbox-list ";"
+
+mailbox-list = mailbox ["," mailbox-list]
address@hidden example
-Several address functions have a set of common arguments, which are described
-here to avoid repetition.
+Several address functions have a set of common arguments with consistent
+semantics, these are described here to avoid repetition.
 Since an address-list may contain multiple addresses, they are accessed by a @strong{one-based} index number, @var{no}. The index is one-based
@@ -22,37 +38,43 @@
 In this case, if @var{n} is provided address@hidden is assigned the length of the component string.
-Comments:
address@hidden ADDRESSENOMEM
address@hidden ENOMEM
+Not enough memory to allocate resources.
address@hidden macro
-What happens if @var{no} is past the end of the list? I think you
-just get zero-length output, but it should be an error, maybe EINVAL.
address@hidden ADDRESSEPARSE
address@hidden ENOENT
+Invalid RFC822 syntax, parsing failed.
address@hidden macro
+
address@hidden ADDRESSENOENT
address@hidden ENOENT
+The index @var{no} is outside of the range of available addresses.
address@hidden macro
-Musings:
address@hidden ADDRESSEINVAL
address@hidden EINVAL
+Invalid usage, usually a required argument was @code{nul}.
address@hidden macro
-Two problems are domain literals, and non-ascii characters. I
-think domain literals should be parsed from [127.0.0.1] into
-the more commonly groked 127.0.0.1 when somebody does a get_domain()
-on one, and that if somebody wants to provide one as a domain, they
-can.
-
-Non ascii chars are uglier, perhaps a warning? Perhaps a non-strict
-switch that makes it an error, and otherwise if they want umlauts
-and utf-8 in the strings, just allow it? The machinery to encode
-and decode header fields according to the MIME spec doesn't really
-belong here, it's a layer on top of rfc822.
address@hidden {Data Type} address_t
+The @code{address_t} object is used to hold information about a parsed
+RFC822 address list, and is an opaque
+data structure to the user. Functions are provided to retrieve information
+about an address in the address list.
 @end deftp
 @deftypefun int address_create (address_t address@hidden, const char address@hidden)
 This function allocates and initializes @var{addr} by parsing the
-RFC822 address-list @var{string}. Parsing is best effort, if the
address@hidden isn't a valid RFC822 syntax list of addresses, then
-the results are undefined.
+RFC822 address-list @var{string}.
 The return value is @code{0} on success and a code number on error conditions:
 @table @code
address@hidden ENOMEM
-Not enough memory to allocate resources.
address@hidden
address@hidden
address@hidden
 @end table
 @end deftypefun
@@ -69,20 +91,23 @@
 The return value is @code{0} on success and a code number on error conditions:
 @table @code
address@hidden EINVAL
address@hidden is NULL.
address@hidden
address@hidden
 @end table
 @end deftypefun
 @deftypefun int address_get_personal (address_t address@hidden, size_t @var{no}, char* @var{buf}, size_t @var{len}, size_t* @var{n})
 Acesses the personal phrase describing the @var{no}th email address. This
-phrase is optional, so may not be present.
+personal is optional, so may not be present. If it is not present, but
+there is an RFC822 comment after the address, that comment will be
+returned as the personal phrase, as this is a common usage of the comment
+even though it is not defined in the internet mail standard.
 The return value is @code{0} on success and a code number on error conditions:
 @table @code
address@hidden EINVAL
address@hidden is NULL.
address@hidden
address@hidden
 @end table
 @end deftypefun
@@ -90,42 +115,89 @@
 @deftypefun int address_get_comments (address_t address@hidden, size_t @var{no}, char* @var{buf}, size_t @var{len}, size_t* @var{n})
 Acesses the comments extracted while parsing the @var{no}th email address.
-These comments have no defined meaning, but in the absence of the personal
-descriptive phrase, may describe the email address.
+These comments have no defined meaning, and are not currently collected.
 The return value is @code{0} on success and a code number on error conditions:
 @table @code
address@hidden EINVAL
address@hidden is NULL.
address@hidden
address@hidden
 @end table
 @end deftypefun
address@hidden int address_to_string (address_t address@hidden, char* @var{buf}, size_t @var{len}, size_t* @var{n})
address@hidden int address_get_email (address_t address@hidden, size_t @var{no}, char* @var{buf}, size_t @var{len}, size_t* @var{n})
-Returns the entire address list as a single RFC822 formatted address
-list.
+Acesses the email addr-spec extracted while
+parsing the @var{no}th email address.
 The return value is @code{0} on success and a code number on error conditions:
 @table @code
address@hidden EINVAL
address@hidden is NULL.
address@hidden
address@hidden
address@hidden table
address@hidden deftypefun
+
address@hidden int address_get_local_part (address_t address@hidden, size_t @var{no}, char* @var{buf}, size_t @var{len}, size_t* @var{n})
+
+Acesses the local-part of an email addr-spec extracted while
+parsing the @var{no}th email address.
+
+The return value is @code{0} on success and a code number on error conditions:
address@hidden @code
address@hidden
address@hidden
 @end table
 @end deftypefun
address@hidden int address_get_domain (address_t address@hidden, size_t @var{no}, char* @var{buf}, size_t @var{len}, size_t* @var{n})
+
+Acesses the domain of an email addr-spec extracted while
+parsing the @var{no}th email address.
address@hidden int address_get_count (address_t address@hidden, size_t* @var{no})
+The return value is @code{0} on success and a code number on error conditions:
address@hidden @code
address@hidden
address@hidden
address@hidden table
address@hidden deftypefun
-Returns the number of addresses in the address list.
address@hidden int address_get_route (address_t address@hidden, size_t @var{no}, char* @var{buf}, size_t @var{len}, size_t* @var{n})
+Acesses the route of an email addr-spec extracted while
+parsing the @var{no}th email address. This is a rarely used RFC822 address
+syntax, but is legal in SMTP as well. The entire route is returned as
+a string, those wishing to parse it should look at .
+
 The return value is @code{0} on success and a code number on error conditions:
 @table @code
address@hidden EINVAL
address@hidden is NULL.
address@hidden
address@hidden
 @end table
 @end deftypefun
address@hidden int address_to_string (address_t address@hidden, char* @var{buf}, size_t @var{len}, size_t* @var{n})
+Returns the entire address list as a single RFC822 formatted address
+list.
+
+The return value is @code{0} on success and a code number on error conditions:
address@hidden @code
address@hidden
address@hidden
address@hidden table
address@hidden deftypefun
+
+
address@hidden int address_get_count (address_t @var{addr}, size_t* @var{count})
+
+Returns a count of the addresses in the address list.
+
+If @var{addr} is @code{nul}, the count is @code{0}. If @var{count} is
+not @code{nul}, the count will be written to address@hidden
+
+The return value is @code{0}.
address@hidden deftypefun
+
 @section Example
 @example
 #include
@@ -158,17 +230,17 @@
 printf(" personal '%s'\n", buf);
- address_get_comments(address, no, buf, sizeof(buf), 0);
+ address_get_local_part(address, no, buf, sizeof(buf), 0);
- printf(" comments '%s'\n", buf);
+ printf(" local_part '%s'\n", buf);
- address_get_email(address, no, buf, sizeof(buf), 0);
+ address_get_domain(address, no, buf, sizeof(buf), 0);
- printf(" email '%s'\n", buf);
+ printf(" domain '%s'\n", buf);
- address_to_string(address, buf, sizeof(buf), 0);
+ address_get_email(address, no, buf, sizeof(buf), 0);
- printf(" to_string '%s'\n", buf);
+ printf(" email '%s'\n", buf);
 @}
 @}
Index: mailbox/parse822.c
===================================================================
RCS file: /cvs/mailutils/mailbox/parse822.c,v
retrieving revision 1.3
diff -u -r1.3 parse822.c
--- mailbox/parse822.c 2001/04/10 05:13:51 1.3
+++ mailbox/parse822.c 2001/04/13 21:11:14
@@ -15,39 +15,28 @@
 along with this program; if not, write to the Free Software
 Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA. */
-/* vi:sw=4:ts=8 */
 /*
-Things to (maybe) do:
+Things to consider:
- - groups used to return the number of addresses, now it returns
- success... but doesn't create an _address for 'foo:;'. Should
- it create one with just a personal?
+ - A group should create an address node with a group, accessable
+ with address_get_group().
-x - C comments only.
-x - no C++ reserved words.
-x - fix is_digit() to be like the other is functions
+ - Make domain optional in addr-spec, for parsing address lists
+ provided to local mail utilities.
- - what should return codes be, possible errors are:
- . no mem (ENOMEM)
- . function wasn't called correctly, usually a missing argument (EINVAL)
- . invalid syntax found during parsing (ENOENT)
+ - Should we do best effort parsing, so parsing "address@hidden, foo@"
+ gets one address, or just say it is or it isn't in RFC format?
+ Right now we're strict, we'll see how it goes.
- All functions should return ==0 on success, and errno on failure.
+ - quote local-part when generating email field of address_t.
-x - const-correct the APIs
-x - "new = (char*) realloc()", cast not needed
-x - mailbox_t* nuked in favor of address_t
-x - fix handful of memory leaks detected by Alain
+ - parse field names and bodies?
+ - parse dates?
+ - parse Received: field?
- - test for memory leaks, so I don't have to rely on Alains sharp eyes
+ - test for memory leaks on malloc failure
 - fix the realloc, try a struct _string { char* b, size_t sz };
-x - where does parse822.h go?
- - parse field names and bodies
- - parse dates (pull from Mail++)
- - parse Received: field
-x - check RFC again, can groups be nested? No!
- - should we do best effort parsing, so parsing "address@hidden, foo@"
- gets one address, or just say it is or it isn't in RFC format?
+
 */
 #ifdef HAVE_CONFIG_H
@@ -536,40 +525,6 @@
 return rc;
 }
-/* FIXME: Delete this one. adddress.c do the work now. */
-#if 0
-int address_create0 (address_t* a, const char* s)
-{
- /* 'a' must exist, and can't already have been initialized
- */
-
- int status = 0;
-
- if(!a || *a) {
- return EINVAL;
- }
-
- status = parse822_address_list(a, (char*) s);
-
- if(status == EOK) {
- if(!*a) {
- /* there was a group that got parsed correctly, but had
- * no addresses...
- */
- return EPARSE;
- }
- (*a)->addr = strdup(s);
-
- if(!(*a)->addr) {
- address_destroy(a);
- return ENOMEM;
- }
- }
-
- return status;
-}
-#endif
-
 int parse822_address_list(address_t* a, const char* s)
 {
 /* address-list = #(address) */
@@ -708,8 +663,6 @@
 /* -> addr-spec */
 if((rc = parse822_addr_spec(p, e, a)) == EOK) {
- /*int rc = EOK; */
-
 parse822_skip_ws(p, e);
 /* yuck. */
@@ -733,7 +686,6 @@
 /* -> phrase route-addr */
 {
 char* phrase = 0;
- /*int rc; */
 rc = parse822_phrase(p, e, &phrase);
@@ -1161,3 +1113,4 @@
 return 1;
 }
 #endif
+
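
The return-code convention listed in the ChangeLog entry (0 on success, EINVAL for an incorrect call, ENOENT for invalid syntax) can be tried out with a small stand-alone C sketch. This is not code from the patch: split_addr_spec() is a hypothetical helper, it covers only the bare `addr-spec = local-part "@" domain` production from the grammar summary, and it makes no attempt at the quoting, comments, routes or domain literals that parse822.c has to handle. Returning EINVAL for a too-small buffer is a choice made for this sketch, not something the patch specifies.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, NOT part of mailutils: split a bare addr-spec
 * ("local-part@domain") into its two halves.  Quoted local-parts,
 * comments, routes and domain literals are deliberately not handled.
 *
 * Returns 0 on success, EINVAL if called incorrectly, ENOENT if the
 * input is not a valid addr-spec under these simplified rules. */
int split_addr_spec(const char *spec,
                    char *local, size_t local_len,
                    char *domain, size_t domain_len)
{
    const char *at;
    size_t nlocal, ndomain;

    if (!spec || !local || !domain)
        return EINVAL;              /* missing argument */

    at = strchr(spec, '@');
    if (!at                         /* no "@" at all */
        || at == spec               /* empty local-part */
        || at[1] == '\0'            /* empty domain, e.g. "foo@" */
        || strchr(at + 1, '@'))     /* a second "@" */
        return ENOENT;              /* invalid syntax */

    nlocal = (size_t)(at - spec);
    ndomain = strlen(at + 1);
    if (nlocal >= local_len || ndomain >= domain_len)
        return EINVAL;              /* sketch's choice: buffers too small */

    memcpy(local, spec, nlocal);
    local[nlocal] = '\0';
    memcpy(domain, at + 1, ndomain + 1);    /* copies the trailing NUL too */
    return 0;
}
```

Under these assumptions an input such as "foo@" is rejected with ENOENT rather than partially accepted, which is in the spirit of the strict (not best-effort) parsing that the parse822.c comment describes.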