From: Fernando Nasser
Subject: [Classpathx-javamail] GNU Mail, AXIS and CR LF or why (possibly) some multipart AXIS tests fail
Date: Thu, 19 May 2005 23:57:35 -0400
User-agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.6) Gecko/20040113
Florent and I have been tracking down a problem that causes 10 tests to
fail when GNU Mail replaces the Sun Reference Implementation of
JavaMail. All these failures are related to handling multipart messages,
and this is the single thing that is preventing us from confirming this change.
With a test program we could narrow down the problem. Here are some facts:
1) Replacing the GNU Mail classes in client.jar with the Sun JavaMail
classes eliminates the problem. I was able to determine that replacing
only mailapi.jar is enough, and Florent went even further: just replacing the
GNU Mail classes that use GNU Inetlib (a low-level network library)
with the corresponding classes from JavaMail (thus avoiding the use of
Inetlib) also solves the problem.
Here is how I am testing the fail / no-fail cases at the moment:
  java -jar /usr/share/jonas/lib/client.jar output/clients/client.jar
where /usr/share/jonas/lib/client.jar is a 4.4 client.jar that contains GNU Mail (+ Inetlib) ==> FAIL
  java -jar jonas-client.jar output/clients/client.jar
where jonas-client.jar is a 4.3.5 client.jar that contains the Sun JavaMail RI ==> PASS
Corollary: Sun RI JavaMail and GNU Mail must be sending different
things through the HTTP messages. The Sun one is the (de facto) correct one.
2) I found out that the error occurs because an AXIS routine fails to
identify the proper boundaries of the multipart message and runs out of
the InputStream (obtained from the HttpServletRequest). Here is the
address of the failure:
org.apache.axis.attachments.MultiPartRelatedInputStream.<init>(Ljava.lang.String;Ljava.io.InputStream;)V(MultiPartRelatedInputStream.java:336)
This constructor is called as a consequence of the use of
javax.xml.soap.MessageFactory.createMessage( with arguments ); which is
used to "internalize" the data received back into a SOAPMessage object.
This is the textbook example of how this should be done.
If you look at this (very fragile) code you will notice one very
important comment:
// after the first boundary each boundary will have a cr lf at the
// beginning since after the data in any part there
// is a cr lf added to put the boundary at the begining of a line.
And so it is coded: after the first marker is found, a CR LF is added to
the start of the text ("boundary") being searched.
Corollary: If a boundary marker that is not the first one does not
start with CR LF, it will not compare "equal" and will not be found by
the AXIS MultiPartRelatedInputStream constructor.
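The corollary can be reproduced with a minimal sketch. This is not the AXIS code: the class BoundarySearch, its methods and the shortened boundary string are all made up for illustration, and the search is a naive byte comparison that only mirrors what the comment describes (CR LF is prepended to the boundary text for every boundary after the first).

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only; not the AXIS MultiPartRelatedInputStream code.
class BoundarySearch {

    // Naive search: index of the first occurrence of needle at or after 'from', or -1.
    static int indexOf(byte[] haystack, byte[] needle, int from) {
        outer:
        for (int i = Math.max(from, 0); i <= haystack.length - needle.length; i++) {
            for (int j = 0; j < needle.length; j++) {
                if (haystack[i + j] != needle[j]) {
                    continue outer;
                }
            }
            return i;
        }
        return -1;
    }

    // Looks for a non-first boundary the way the quoted comment describes:
    // the text searched for is CR LF followed by the boundary.
    static int findNextBoundary(byte[] body, String boundary, int from) {
        byte[] needle = ("\r\n" + boundary).getBytes(StandardCharsets.US_ASCII);
        return indexOf(body, needle, from);
    }

    public static void main(String[] args) {
        String boundary = "--=_Part_0"; // shortened, made-up boundary
        byte[] withCrLf = ("part one\r\n" + boundary + "\r\n").getBytes(StandardCharsets.US_ASCII);
        byte[] withCrOnly = ("part one\r" + boundary + "\r\n").getBytes(StandardCharsets.US_ASCII);

        System.out.println(findNextBoundary(withCrLf, boundary, 0));   // prints 8
        System.out.println(findNextBoundary(withCrOnly, boundary, 0)); // prints -1
    }
}
```

With only a CR in front of the boundary, the search never matches and a reader built this way would keep consuming the stream until it runs out, which matches the failure seen above.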
3) Thanks to Vadim, I learned how to use Ethereal and could go back in
time to some previous life when I used to do such things frequently.
After some time looking and after learning some features of the tool, I
concluded that, except for the random numbers and the point where the
packets were split, the messages sent by Sun JavaMail RI and GNU Mail
were almost identical. A-L-M-O-S-T.
For some reason, perhaps some feature of Inetlib, perhaps because the
wrong Inetlib function is being called, the message generated when GNU
Mail is used does not have the LF on the headers of the parts, only the CR.
As "headers" is an overloaded term in the SOAP context, here is an example:
From Sun JavaMail RI:
^M
------=_Part_0_69510262.1116534622553^M
Content-Type: text/xml; charset=UTF-8^M
Content-Transfer-Encoding: binary^M
Content-Id: <334C15FC1CBBA36BFF64CF55A4016EF2>^M
^M
<soapenv:Envelope
soapenv:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/"
xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/"
xmlns:xsd="http://www.w3.org/2001/XMLSchema"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:SOAP-ENC="http://schemas.xmlsoap.org/soap/encoding/"><soapenv:Body/></soapenv:Envelope>
^M
------=_Part_0_69510262.1116534622553^M
Content-Type: text/html^M
Content-Transfer-Encoding: binary^M
Content-Id: <D8DF3BE596E312ED354472AC67B554C2>^M
^M
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN"
"http://www.w3.org/TR/REC-html40/loose.dtd">^M
<html>^M
(...)
Note that all "boundaries" (------=_Part_0_69510262.1116534622553) start
on a new line, following a CR LF.
Here is a fragment in hex:
0040 81 34 f4 7f 0d 0a 2d 2d 2d 2d 2d 2d 3d 5f 50 61 .4....-- ----=_Pa
Note the "0d 0a" before the '------=_Pa'
In the GNU Mail case, the LFs are gone:
^M
------=_Part_0_70279660.1116534635530^M
Content-Type: text/xml; charset=UTF-8^MContent-Transfer-Encoding:
binary^MContent-Id:
<46B3E0073E572CF16B063980ECEA7587>^M^M<soapenv:Envelope
soapenv:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/"
xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/"
xmlns:xsd="http://www.w3.org/2001/XMLSchema"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:SOAP-ENC="http://schemas.xmlsoap.org/soap/encoding/"><soapenv:Body/></soapenv:Envelope>
^M------=_Part_0_70279660.1116534635530^M
Content-Type: text/html^MContent-Transfer-Encoding: binary^MContent-Id:
<195BFE4772082B5EEAC9D1126DAA6DCE>^M^M<!DOCTYPE HTML PUBLIC "-//W3C//DTD
HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">^M
<html>^M
(...)
If you look closely, you will notice that '------=_Part_0_...' does not
start on a new line; consequently AXIS will not find it and will just
pass over it as if it were never there.
Here is a hex fragment:
0040 81 35 26 fd 0d 2d 2d 2d 2d 2d 2d 3d 5f 50 61 72 .5&..--- ---=_Par
As you see, no '0a', only '0d'.
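The same check that was done by eye on the hex dumps can be scripted. The sketch below counts CR bytes (0x0d) that are not followed by LF (0x0a); the class and method names are hypothetical, and the two byte arrays are just the few bytes around the boundary taken from the fragments above.

```java
// Hypothetical helper for inspecting captured bytes; not part of GNU Mail or AXIS.
class BareCrCheck {

    // Counts CR bytes that are not immediately followed by LF.
    static int countBareCr(byte[] data) {
        int count = 0;
        for (int i = 0; i < data.length; i++) {
            boolean isCr = data[i] == 0x0d;
            boolean followedByLf = i + 1 < data.length && data[i + 1] == 0x0a;
            if (isCr && !followedByLf) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // Bytes around the boundary in the Sun JavaMail RI capture: 7f 0d 0a 2d 2d
        byte[] sun = {0x7f, 0x0d, 0x0a, 0x2d, 0x2d};
        // Bytes around the boundary in the GNU Mail capture: fd 0d 2d 2d
        byte[] gnu = {(byte) 0xfd, 0x0d, 0x2d, 0x2d};

        System.out.println(countBareCr(sun)); // prints 0
        System.out.println(countBareCr(gnu)); // prints 1
    }
}
```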
CONCLUSION:
I really don't know whether this assumption (that "boundary" markers always
start on a new line or, equivalently, that all markers after the first one
are preceded by a CR LF) is something some AXIS person arrived at by reverse
engineering the messages sent by the Sun Reference Implementation, or whether
it is written in some standard (WS-I Basic Profile 1.0? JavaMail Spec
on multipart messages? RFC for MIME? RFC for Mail? HTTP?).
Someone may be trying to guard against the chance that the text contains an
example of a marker (with exactly the same random numbers?), but that
example marker could very well be at the left margin too. I really don't know
why the code does not just keep looking for the marker text itself (starting
at the '-----').
On the other hand, I don't think Classpathx Mail should add CR LF in all
other places and not in the Part headers. I believe that was not
intentional.
The next step is to see where in Classpathx Mail these headers are emitted
and make sure they are emitted with the proper CR LF sequence (at least
for consistency).
We should also file a bug against AXIS to get rid of this CR LF requirement
in the search and to provide some decent diagnostic message and perhaps some
logging, as the code is currently bare.
Regards to all,
Fernando