From: Robert Mitchell
Subject: Re: [Classpath-inetlib] Problems in gnu.inet.util.LineInputStream and gnu.inet.util.CRLFInputStream
Date: Wed, 06 Apr 2005 14:54:53 -0500

It looks to be working. The test case I was using involved a very long email message (something over 70,000 bytes and probably over 1000 lines) which I needed to read using "getInputStream" as part of my processing. Because of the way getInputStream works, it was reading the entire message and re-parsing the headers to find the body. The inefficient CRLFInputStream was probably doing about 1000 * 70000 / 2 (35 million) byte copies just to read the first line, and close to that for each of the approximately 6 lines in the header. The result was that it was taking close to a minute to parse the header. The new code is much more efficient.

Thanks,
Bob Mitchell

>>> Chris Burdess <address@hidden> 4/6/2005 2:10 PM >>>
Robert Mitchell wrote:
> One way around the problems with mbox, etc. is to filter them to add
> the CR to the end of line sequence.

That's true, although our recent discussion shows some of the inherent difficulties and inefficiencies processing multi-character delimiters, therefore I feel that normalisation to a single delimiter prior to processing should yield better results.

> All that aside, I do not think it would be worth changing the
> architecture unless the current implementation is considered
> incompatible with the JavaMail specification. I think this is an area
> where the specification is incomplete, although you might argue that
> the references to the Internet mail RFC's requires CRLF endings for
> javax.mail.internet implementations at a minimum.

That should be the case for e.g. InternetHeaders parsing: if there are other cases where CRLFs are not normalised, please let us know.

I have submitted a new version of CRLFInputStream now, and tested it with the special case where the first CR is at the end of the buffer. This doesn't seem to result in a significant performance change from the previous version in my tests, if you have different results I'd be interested in seeing them.
--
Chris Burdess
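
For readers following the thread, here is a minimal sketch of normalising CRLF to a single LF delimiter in one pass, with no buffer shifting. It is not the gnu.inet.util.CRLFInputStream code discussed above: the class name CrlfToLfSketch and its one-byte lookahead are assumptions made for illustration only. Because the lookahead byte is held in a field rather than in the read buffer, a CR that happens to fall at the end of an underlying buffer needs no special handling.

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical illustration, not the inetlib implementation.
class CrlfToLfSketch extends FilterInputStream {
    private static final int NONE = -2;   // marker: no lookahead byte held
    private int pending = NONE;           // byte read ahead after a CR (may be -1 for EOF)

    CrlfToLfSketch(InputStream in) {
        super(in);
    }

    @Override
    public int read() throws IOException {
        int b;
        if (pending != NONE) {            // consume the held lookahead byte first
            b = pending;
            pending = NONE;
        } else {
            b = in.read();
        }
        if (b == '\r') {
            int next = in.read();
            if (next == '\n') {
                return '\n';              // CRLF collapses to a single LF
            }
            pending = next;               // lone CR: keep it, hold the following byte
        }
        return b;
    }

    @Override
    public int read(byte[] buf, int off, int len) throws IOException {
        if (len == 0) {
            return 0;
        }
        int n = 0;
        while (n < len) {                 // each byte is examined once, never copied again
            int b = read();
            if (b == -1) {
                break;
            }
            buf[off + n++] = (byte) b;
        }
        return n == 0 ? -1 : n;
    }

    @Override
    public boolean markSupported() {
        return false;                     // the lookahead state is not saved by mark/reset
    }

    public static void main(String[] args) throws IOException {
        byte[] data = "Subject: test\r\n\r\nbody\r\n".getBytes("US-ASCII");
        InputStream in = new CrlfToLfSketch(new ByteArrayInputStream(data));
        int b;
        while ((b = in.read()) != -1) {
            System.out.print(b == '\n' ? "<LF>\n" : String.valueOf((char) b));
        }
    }
}
```

Since this sketch reads the underlying stream one byte at a time, it would normally be wrapped around a java.io.BufferedInputStream when reading from a socket or file.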