classpathx-javamail

[Classpathx-javamail] Re: [jonas-team] GNU Mail, AXIS and CR LF or why


From: Fernando Nasser
Subject: [Classpathx-javamail] Re: [jonas-team] GNU Mail, AXIS and CR LF or why (possibly) some multipart AXIS tests fail
Date: Tue, 31 May 2005 11:54:11 -0400
User-agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.6) Gecko/20040113

For the record, this has been fixed. I sent Chris a patch and Chris has sent me an alternative one for testing. Both patches fix the problem.

I don't know if Chris has committed his patch yet.

Unfortunately, although we can now send the message, the AXIS implementation of javax.xml.soap.MessageFactory.createMessage() still fails when using the Classpathx Mail mailapi.jar file.

It is a problem in obtaining the contents of the Parts in MimeMultipart messages (they come up empty or missing a large chunk from the beginning). This is currently under investigation.

Regards to all,
Fernando




Fernando Nasser wrote:
Florent and I have been tracking down a problem that causes 10 tests to fail when GNU Mail replaces the Sun Reference Implementation of JavaMail. All these failures are related to handling multipart messages, and this is the single thing that is preventing us from confirming this change.

With a test program we could narrow down the problem.  Here are some facts:

1) Replacing the GNU Mail classes in client.jar with the Sun JavaMail classes eliminates the problem. I was able to determine that replacing only mailapi.jar is enough, and Florent went even further, replacing just the GNU Mail classes that use GNU Inetlib (a low-level network library) with the corresponding classes from JavaMail (thus avoiding the use of Inetlib also solves the problem).

Here is how I am testing the fail / no-fail cases at the moment:

java -jar /usr/share/jonas/lib/client.jar output/clients/client.jar

uses a 4.4 client.jar that contains GNU Mail (+ Inetlib)  ==>  FAIL

java -jar jonas-client.jar output/clients/client.jar

uses a 4.3.5 client.jar that contains Sun JavaMail RI.  ==>  PASS


Corollary: Sun RI JavaMail and GNU Mail must be sending different things through the HTTP messages. The Sun one is the (de facto) correct one.



2) I found out that the error occurs because an AXIS routine fails to identify the proper boundaries of the multipart message and runs out of the InputStream (obtained from the HttpServletRequest). Here is the address of the failure:

org.apache.axis.attachments.MultiPartRelatedInputStream.<init>(Ljava.lang.String;Ljava.io.InputStream;)V(MultiPartRelatedInputStream.java:336)

This constructor is called as a consequence of the use of javax.xml.soap.MessageFactory.createMessage( with arguments ); which is used to "internalize" the data received back into a SOAPMessage object. This is the textbook example of how this should be done.


If you look at this (very fragile) code you will notice one very important comment:

// after the first boundary each boundary will have a cr lf at the beginning since after the data in any part there
// is a cr lf added to put the boundary at the begining of a line.

And so it is coded: after the first marker is found, a CR LF is added to the start of the text ("boundary") being searched for.


Corollary: If a boundary marker that is not the first one does not start with CR LF it will not compare "equal" and will not be found by AXIS MultiPartRelatedInputStream constructor.
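
As a rough illustration of this corollary, here is a minimal Java sketch. It is not the AXIS code: the class BoundarySearchSketch, its methods, and the shortened boundary string are made up for this example. It only mimics what the quoted comment describes, matching the first marker bare and every later marker with a leading CR LF.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only; this is NOT the AXIS source.
class BoundarySearchSketch {

    // Index of the first occurrence of needle in haystack at or after 'from', or -1.
    static int indexOf(byte[] haystack, byte[] needle, int from) {
        outer:
        for (int i = Math.max(from, 0); i <= haystack.length - needle.length; i++) {
            for (int j = 0; j < needle.length; j++) {
                if (haystack[i + j] != needle[j]) {
                    continue outer;
                }
            }
            return i;
        }
        return -1;
    }

    // Counts markers as the quoted comment describes: the first marker is
    // matched as-is, every later one only when it is preceded by CR LF.
    static int countBoundaries(byte[] body, String boundary) {
        byte[] first = ("--" + boundary).getBytes(StandardCharsets.US_ASCII);
        byte[] later = ("\r\n--" + boundary).getBytes(StandardCharsets.US_ASCII);
        int pos = indexOf(body, first, 0);
        if (pos < 0) {
            return 0;
        }
        int count = 1;
        pos += first.length;
        while ((pos = indexOf(body, later, pos)) >= 0) {
            count++;
            pos += later.length;
        }
        return count;
    }

    public static void main(String[] args) {
        String b = "----=_Part_0_1"; // made-up boundary, shortened for the example
        // Part headers terminated with CR LF.
        String crlf = "--" + b + "\r\nContent-Type: text/xml\r\n\r\n<a/>\r\n--" + b
                + "\r\nContent-Type: text/html\r\n\r\n<html/>";
        // Part headers terminated with CR only; the second marker follows a lone CR.
        String crOnly = "--" + b + "\r\nContent-Type: text/xml\r\r<a/>\r--" + b
                + "\r\nContent-Type: text/html\r\r<html/>";
        System.out.println(countBoundaries(crlf.getBytes(StandardCharsets.US_ASCII), b));
        System.out.println(countBoundaries(crOnly.getBytes(StandardCharsets.US_ASCII), b));
    }
}
```

With the CR LF body the sketch counts 2 markers; with the CR-only body it counts 1, because the second marker is preceded by a lone CR and never compares equal.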



3) Thanks to Vadim, I learned how to use Ethereal and could go back in time to some previous life when I used to do such things frequently.


After some time looking and after learning some features of the tool, I concluded that, except for the random numbers and the point where the packets were split, the messages sent by Sun JavaMail RI and GNU Mail were almost identical. A-L-M-O-S-T.


For some reason, perhaps some feature of Inetlib, perhaps because the wrong Inetlib function is being called, the message generated when GNU Mail is used does not have the LF on the headers of the parts, only the CR.
As "headers" is an overloaded term in the SOAP context, here is an example:


From Sun JavaMail RI:

^M
------=_Part_0_69510262.1116534622553^M
Content-Type: text/xml; charset=UTF-8^M
Content-Transfer-Encoding: binary^M
Content-Id: <334C15FC1CBBA36BFF64CF55A4016EF2>^M
^M
<soapenv:Envelope soapenv:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/" xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:SOAP-ENC="http://schemas.xmlsoap.org/soap/encoding/"><soapenv:Body/></soapenv:Envelope>
^M
------=_Part_0_69510262.1116534622553^M
Content-Type: text/html^M
Content-Transfer-Encoding: binary^M
Content-Id: <D8DF3BE596E312ED354472AC67B554C2>^M
^M



<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">^M
<html>^M
(...)

Note that all "boundaries" (------=_Part_0_69510262.1116534622553) start on a new line, following a CR LF.

Here is a fragment in hex:

0040  81 34 f4 7f 0d 0a 2d 2d  2d 2d 2d 2d 3d 5f 50 61   .4....-- ----=_Pa

Note the "0d 0a" before the '------=_Pa'



In the GNU Mail case, the LFs are gone:

^M



------=_Part_0_70279660.1116534635530^M
Content-Type: text/xml; charset=UTF-8^MContent-Transfer-Encoding: binary^MContent-Id: <46B3E0073E572CF16B063980ECEA7587>^M^M<soapenv:Envelope soapenv:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/" xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:SOAP-ENC="http://schemas.xmlsoap.org/soap/encoding/"><soapenv:Body/></soapenv:Envelope>
^M------=_Part_0_70279660.1116534635530^M
Content-Type: text/html^MContent-Transfer-Encoding: binary^MContent-Id: <195BFE4772082B5EEAC9D1126DAA6DCE>^M^M<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">^M
<html>^M
(...)

If you look closely, you will notice that '------=_Part_0_...' does not start on a new line. Consequently, AXIS will not find it; it will just pass over it as if it were never there.


Here is a hex fragment:

0040  81 35 26 fd 0d 2d 2d 2d  2d 2d 2d 3d 5f 50 61 72   .5&..--- ---=_Par

As you see, no '0a', only '0d'.
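
For anyone who wants to check a capture without reading hex by eye, a small scanner along these lines should do. It is only a sketch: the class BareCrScanner and its method are invented for this example, and the two byte arrays are just the first eight bytes of the hex rows quoted above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for inspecting captured bytes; not part of any library.
class BareCrScanner {

    // Returns the offset of every CR (0x0d) that is not immediately followed
    // by LF (0x0a). A CR in the very last position is reported too, since its
    // LF may simply lie outside the captured bytes.
    static List<Integer> bareCrOffsets(byte[] data) {
        List<Integer> offsets = new ArrayList<>();
        for (int i = 0; i < data.length; i++) {
            if (data[i] == 0x0d && (i + 1 == data.length || data[i + 1] != 0x0a)) {
                offsets.add(i);
            }
        }
        return offsets;
    }

    public static void main(String[] args) {
        // First eight bytes of the two rows at offset 0040 quoted above.
        byte[] sun = {(byte) 0x81, 0x34, (byte) 0xf4, 0x7f, 0x0d, 0x0a, 0x2d, 0x2d};
        byte[] gnu = {(byte) 0x81, 0x35, 0x26, (byte) 0xfd, 0x0d, 0x2d, 0x2d, 0x2d};
        System.out.println("Sun RI bare CRs:   " + bareCrOffsets(sun)); // prints []
        System.out.println("GNU Mail bare CRs: " + bareCrOffsets(gnu)); // prints [4]
    }
}
```

Run over a whole captured message, any non-empty result points at a line that ends in CR without LF.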



CONCLUSION:

I really don't know whether this assumption (that "boundary" markers always start on a new line or, equivalently, that every marker after the first is always preceded by a CR LF) is something an AXIS developer arrived at by reverse engineering the messages sent by the Sun Reference Implementation, or whether it is written in some standard (WS-I Basic Profile 1.0? JavaMail Spec on multipart messages? RFC for MIME? RFC for Mail? HTTP?).

Perhaps someone was trying to guard against the chance that the text itself contains an example of a marker (with exactly the same random numbers?), but such an example marker could just as well sit at the left margin. I really don't know why the code doesn't just keep looking for the marker text itself (starting at the '-----').

On the other hand, I don't think Classpathx Mail should add CR LF in all other places and not in the Part headers. I believe that was not intentional.


The next step is to see where in Classpathx Mail these headers are emitted and make sure they are emitted with the proper CR LF sequence (at least for consistency).
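
To make "the proper CR LF sequence" concrete, here is one possible shape for it: a filter stream that turns a bare CR or a bare LF into CR LF. This is a sketch under my own assumptions, not the Classpathx Mail code and not either of the patches; the class name CrlfNormalizingOutputStream and the normalize() helper are made up. It would only be safe for header lines, since rewriting a part body sent with Content-Transfer-Encoding: binary would corrupt it.

```java
import java.io.ByteArrayOutputStream;
import java.io.FilterOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;

// Hypothetical sketch; NOT the actual Classpathx Mail fix.
class CrlfNormalizingOutputStream extends FilterOutputStream {

    private int last = -1; // last byte seen by write(int), or -1 if none

    CrlfNormalizingOutputStream(OutputStream out) {
        super(out);
    }

    @Override
    public void write(int b) throws IOException {
        b &= 0xff;
        if (b == '\n') {
            if (last != '\r') {
                out.write('\r'); // bare LF: supply the missing CR
            }
            out.write('\n');
        } else {
            if (last == '\r') {
                out.write('\n'); // previous CR was bare: supply the missing LF
            }
            out.write(b);
        }
        last = b;
    }

    // FilterOutputStream.write(byte[], int, int) calls write(int) per byte,
    // so array writes are normalized as well.

    @Override
    public void close() throws IOException {
        if (last == '\r') {
            out.write('\n'); // complete a trailing bare CR
        }
        last = -1;
        super.close();
    }

    // Convenience wrapper for the example: normalizes a whole byte array.
    static byte[] normalize(byte[] in) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (CrlfNormalizingOutputStream o = new CrlfNormalizingOutputStream(sink)) {
            o.write(in, 0, in.length);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink.toByteArray();
    }
}
```

For example, normalize() applied to the bytes of "Content-Type: text/html\rContent-Transfer-Encoding: binary\r\r" yields the same two header lines and the empty line, each terminated by CR LF.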


We should also file a bug against AXIS to get rid of this CR LF requirement in the search, and to provide a decent diagnostic message and perhaps some logging, as the code is currently bare.



Regards to all,
Fernando








------------------------------------------------------------------------



--
Fernando Nasser
Red Hat Canada Ltd.                     E-Mail:  address@hidden
2323 Yonge Street, Suite #300
Toronto, Ontario   M4P 2C9



