[bug-mailutils] Parsing 'Received:' header into components
From: Robby Villegas
Subject: [bug-mailutils] Parsing 'Received:' header into components
Date: Sat, 19 Mar 2005 01:20:26 -0600
I want to (try valiantly to) decompose the value of a Received line
into useful components. I was about to embark on writing a parser
using the functions in parse822.h, but I figure someone else must have
attacked the Received line and written the code. I don't want to
reinvent this wheel!
Take the following Received header as an example:
Received: from somewhere.com (mailhub.somewhere.com [10.20.30.40])
    by home1.somewhere.com (8.12.11/8.12.11) with ESMTP id j2CKbxYX012911
    for <address@hidden>; Sat, 12 Mar 2005 14:37:59 -0600
I think my top level groups will be prefixed by these five words:
{from, by, via, with, for}
since they seem to give overall routing information.
The sixth element will be date/time.
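A minimal sketch of that top-level split, written in Python rather than against parse822.h. The function name split_received and the tokenizing regex are assumptions made for illustration, not existing mailutils code. It treats each non-nested parenthesized comment as a single token, so a keyword inside a comment does not start a new group, and it takes everything after the last semicolon as the date/time.

```python
import re

# The five routing keywords proposed above.
KEYWORDS = ("from", "by", "via", "with", "for")

def split_received(value):
    """Split a Received value into (groups, date_string).

    groups is a list of token lists, each starting with its keyword.
    Hypothetical helper; tokens before the first keyword are dropped.
    """
    # Unfold continuation lines by collapsing all whitespace runs.
    value = " ".join(value.split())
    # The date/time follows the last semicolon, if there is one.
    body, sep, date = value.rpartition(";")
    if not sep:
        body, date = value, ""
    groups = []
    # A token is either a (non-nested) comment or a run of non-space text.
    for token in re.findall(r"\([^()]*\)|[^\s()]+", body):
        if token.lower() in KEYWORDS:
            groups.append([token.lower()])
        elif groups:
            groups[-1].append(token)
    return groups, date.strip()

example = """from somewhere.com (mailhub.somewhere.com [10.20.30.40])
    by home1.somewhere.com (8.12.11/8.12.11) with ESMTP id j2CKbxYX012911
    for <address@hidden>; Sat, 12 Mar 2005 14:37:59 -0600"""

groups, date = split_received(example)
for group in groups:
    print(group)
print(date)
```

Nested comments, semicolons inside comments, and hostnames that happen to equal one of the keywords are not handled; those would be among the heuristics to add for the remaining cases.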
Within each group, I want to parse the usual pieces as much as I can.
To illustrate, here are three of those groups broken into nested lists:
with ESMTP id j2CKbxYX012911 ->
{ESMTP, id, j2CKbxYX012911}
Sat, 12 Mar 2005 14:37:59 -0600 ->
{ {Sat, 12, Mar, 2005}, {{14, 37, 59}, -0600} }
from somewhere.com (mailhub.somewhere.com [10.20.30.40]) ->
{ from, {somewhere, com}, {{mailhub, somewhere, com}, {10, 20, 30, 40}} }
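The three breakdowns above could be sketched in Python roughly as follows. The names parse_from, parse_date, and FROM_RE are hypothetical, and the regex assumes the common "from HELO (rdns [ip])" layout, which real headers do not always follow. The date is handed to email.utils.parsedate_to_datetime from the standard library instead of being split by hand.

```python
import re
from email.utils import parsedate_to_datetime

# Assumed layout: from HELO (rdns [ip]); the comment and its parts are optional.
FROM_RE = re.compile(
    r"from\s+(?P<helo>\S+)"
    r"(?:\s+\((?P<rdns>[^\s\[\]()]+)?\s*(?:\[(?P<ip>[^\]]+)\])?[^)]*\))?",
    re.IGNORECASE,
)

def _dotted(text):
    """Split a dotted name or IPv4 address into its labels, or return None."""
    return text.split(".") if text else None

def parse_from(clause):
    """Destructure a 'from' group into HELO name, reverse-DNS name, and IP."""
    match = FROM_RE.match(clause.strip())
    if match is None:
        return None
    return {
        "helo": _dotted(match.group("helo")),
        "rdns": _dotted(match.group("rdns")),
        "ip": _dotted(match.group("ip")),
    }

def parse_date(text):
    """Parse the date/time part into a timezone-aware datetime."""
    return parsedate_to_datetime(text)

print("with ESMTP id j2CKbxYX012911".split())
print(parse_date("Sat, 12 Mar 2005 14:37:59 -0600").isoformat())
print(parse_from("from somewhere.com (mailhub.somewhere.com [10.20.30.40])"))
```

The IP labels are kept as strings so that an IPv6 literal would not raise an error, although splitting one on dots is not meaningful; a fuller version would need to branch on the address family.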
I gather from a smidgen of reading and some spam I've looked at that
parsing this field is troublesome. The structure may not be as
regular as I outlined above, especially when the header comes from spammers.
But if I can destructure 95% of cases with a small number of
heuristics, that would be very useful.
Do you know of existing code to do this, or should I embark on it?
Thanks,
Robby Villegas
P.S. Here's a less standard piece in a comment which is interesting:
(version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NOT); ->
{{version, TLSv1/SSLv3}, {cipher, DHE-RSA-AES256-SHA}, {bits, 256},
{verify, NOT}}
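For a comment of that key=value shape, a small Python sketch is enough. The name parse_comment_params is hypothetical, and the code assumes the values contain no spaces.

```python
def parse_comment_params(comment):
    """Turn '(k1=v1 k2=v2 ...);' into a list of (key, value) pairs.

    Hypothetical helper; words without an '=' are skipped.
    """
    # Drop surrounding whitespace and a trailing semicolon, if present.
    inner = comment.strip().rstrip(";").strip()
    # Drop the enclosing parentheses of the comment.
    if inner.startswith("(") and inner.endswith(")"):
        inner = inner[1:-1]
    pairs = []
    for word in inner.split():
        key, sep, value = word.partition("=")
        if sep:
            pairs.append((key, value))
    return pairs

print(parse_comment_params(
    "(version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NOT);"
))
```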