spamass-milt-list
Re: subject is deleted instead of rewritten


From: Ronan Waide
Subject: Re: subject is deleted instead of rewritten
Date: Thu, 9 Jan 2003 23:43:00 +0000

On January 9, address@hidden said:
> Arg. I don't want to have to write an RFC822 parser. :)  Too bad we
> can't just inject the output of spamc back into sendmail somehow and
> have it parse the thing for us.

Oh, it doesn't have to be a full parser. The only rules you need are:

* The header goes from the start of the message to the first
  completely empty line
* A header is some us-ascii characters at the start of a line followed
  by a colon
* Header contents are everything between the colon - optionally less a
  leading space - and the start of the next header.

and optionally

* If there's more than one header with the same name, I dunno. Merge
  the headers? I've not read the spec on this issue.
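For what it's worth, the rules above could be sketched roughly as below. This is only an illustration, not code from spamass-milter: parse_headers is a made-up name, field names are lower-cased so lookups are case-insensitive, and repeated headers are simply kept as separate entries in a multimap rather than merged, which sidesteps the duplicate question instead of answering it. It assumes a C++11 compiler.

```cpp
#include <algorithm>
#include <cctype>
#include <map>
#include <string>
#include <utility>

// Hypothetical helper (not part of spamass-milter): split a raw message
// into a multimap of lower-cased field name -> field contents.
std::multimap<std::string, std::string>
parse_headers(const std::string& message)
{
  std::multimap<std::string, std::string> headers;
  std::string name, value;
  bool have_field = false;
  std::string::size_type pos = 0;

  while (pos < message.size()) {
    // Pull out one line, without its line terminator.
    std::string::size_type eol = message.find('\n', pos);
    if (eol == std::string::npos) eol = message.size();
    std::string line = message.substr(pos, eol - pos);
    pos = eol + 1;
    if (!line.empty() && line.back() == '\r') line.pop_back();

    // Rule 1: the first completely empty line ends the header.
    if (line.empty()) break;

    // A line starting with whitespace continues the previous field.
    if (line[0] == ' ' || line[0] == '\t') {
      if (have_field) value += "\n" + line;
      continue;
    }

    // Rule 2: a field name runs from the start of the line to a colon.
    std::string::size_type colon = line.find(':');
    if (colon == std::string::npos) continue; // not a header line; skip it

    // A new field begins, so store the one collected so far.
    if (have_field) headers.insert(std::make_pair(name, value));
    name = line.substr(0, colon);
    std::transform(name.begin(), name.end(), name.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });

    // Rule 3: contents follow the colon, less one optional leading space.
    value = line.substr(colon + 1);
    if (!value.empty() && value[0] == ' ') value.erase(0, 1);
    have_field = true;
  }

  if (have_field) headers.insert(std::make_pair(name, value));
  return headers;
}
```

Wrapped lines are stored with their line break and leading whitespace intact, so the values still need the same cleanup as any other folded header.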

I was going to play with Perl's Sendmail::Milter, but that wants me to
have a threaded Perl built before it'll install. Dammit.

Anyway. Here's some reworked code (now you have three versions: mine,
Niki's, and yours!) for the retrieve_field function with lots of
comments and some slightly more obvious variable naming. Note the
final comments on this code: the return value really should be cleaned
up before being passed back. This is why I'd prefer the other solution
I suggested, which is to build a hash of headers when you're receiving
the message from sendmail.

Cheers,
Waider.

// retrieve the content of a specific field in the header
// and return it.
string
retrieve_field(const string& header, const string& field)
{
  // Find the field
  string::size_type field_start = string::npos;
  string::size_type field_end = string::npos;
  string::size_type idx = 0;

  while( field_start == string::npos ) {
        idx = find_nocase( header, field + string(":"), idx );

        // no match
        if ( idx == string::npos ) {
          debug( 3, "r_f: field not found" );
          return string( "" );
        }

        // The string we've found needs to be either at the start of the
        // headers string, or immediately following a "\n"
        if ( idx != 0 && header[ idx - 1 ] != '\n' ) {
          idx++; // so we don't get stuck in an infinite loop
          continue; // loop around again
        }

        // genuine match: record it, which also ends the loop
        field_start = idx;
  }

  // The field's contents start just after the field name and its
  // colon. Ideally, there's a space, but it's possible that there isn't.
  field_start += field.length() + 1;
  if ( field_start < header.length() && header[ field_start ] == ' ' ) {
        field_start++;
  }

  // See if there's anything left, to shortcut the rest of the
  // function.
  if ( field_start >= header.length() ) {
        return string( "" );
  }

  // The field continues to the end of line. If the start of the next
  // line is whitespace, then the field continues.
  idx = field_start;
  while( field_end == string::npos ) {
        idx = header.find( '\n', idx );

        // if we don't find a "\n", or it's at the end of the headers then
        // gobble everything to the end of the headers
        if ( idx == string::npos ) {
          field_end = header.length();
        } else if ( idx == header.length() - 1 ) {
          field_end = idx; // stop just before the final "\n"
        } else {
          // check the next character: a wrapped line starts with a
          // space or a tab
          if ( header[ idx + 1 ] == ' ' || header[ idx + 1 ] == '\t' ) {
                idx++; // whitespace found, so wrap to the next line
          } else {
                field_end = idx;
          }
        }
  }

  // possible cleanups:
  //  trim the trailing \n
  //  remove the whitespace picked up when a header wraps - this is
  //    actually a requirement rather than a possible cleanup.
  return header.substr( field_start, field_end - field_start );
}
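
The whitespace cleanup mentioned in the final comments might look something like the following. It is a sketch only: unfold is a hypothetical name, and squashing each line break plus the whitespace after it into a single space is a simplifying assumption rather than a careful reading of the spec.

```cpp
#include <string>

// Hypothetical helper: undo header wrapping in a field value.
// Each line break, together with the spaces or tabs that start the
// continuation line, becomes a single space. A trailing line break is
// dropped.
std::string
unfold(const std::string& value)
{
  std::string out;
  std::string::size_type i = 0;

  while (i < value.size()) {
    char c = value[i];
    bool crlf = (c == '\r' && i + 1 < value.size() && value[i + 1] == '\n');

    if (c == '\n' || crlf) {
      // Skip the line break itself.
      i += crlf ? 2 : 1;
      // Skip the whitespace that marks the continuation line.
      while (i < value.size() && (value[i] == ' ' || value[i] == '\t')) {
        ++i;
      }
      // Join the two pieces with one space, unless nothing precedes
      // or follows the break.
      if (i < value.size() && !out.empty()) {
        out += ' ';
      }
    } else {
      out += c;
      ++i;
    }
  }
  return out;
}
```

Something like unfold( retrieve_field( header, "Subject" ) ) would then hand back the subject on one line.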

-- 
address@hidden / Yes, it /is/ very personal of me.

"I hadn't counted on falling in love this early in the plot." - dhalgren



