Re: HTTPURLConnection.connect() buffers its entire input.
From: Chris Burdess
Date: Fri, 9 Sep 2005 09:13:48 +0100
David Daney wrote:
> It seems that the current implementation of HTTPURLConnection.connect()
> buffers the entire response before returning.
> Is that a correct analysis?
Yes.
> This can be problematic if the content is larger than the heap. It
> is even worse than that, as it makes a copy of the content, so the
> content can only be half as large as the heap.
> Does anyone know the rationale behind doing it this way?
Our implementation uses the inetlib HTTP client in order to leverage
numerous HTTP features such as chunked and compressed transfer-codings,
TLS, and HTTP 1.1.
The design of the inetlib HTTP client is based on callbacks. You
register a listener to receive notification of HTTP response data,
rather than pulling the data yourself. This leaves the client in proper
control of the stream and permits correct handling of HTTP persistent
connections (reuse of the same TCP connection for multiple HTTP
requests).
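To make the callback model concrete, here is a minimal sketch of a push-style consumer. The BodyListener interface and the deliver method are hypothetical stand-ins invented for illustration; they are not the inetlib API. The point is only that the client owns the read loop and the listener handles each chunk as it arrives.

```java
import java.nio.charset.StandardCharsets;

public class CallbackSketch {
    /** Hypothetical listener interface (an assumption, not the inetlib one). */
    interface BodyListener {
        void onData(byte[] buf, int off, int len);
        void onComplete();
    }

    /** Hypothetical stand-in for a push-style client: it drives the loop and calls the listener. */
    static void deliver(byte[][] chunks, BodyListener listener) {
        for (byte[] chunk : chunks) {
            listener.onData(chunk, 0, chunk.length);
        }
        listener.onComplete();
    }

    /** Consumes each chunk as it arrives, so memory use does not grow with the body size. */
    static final class CountingListener implements BodyListener {
        long total;
        boolean done;

        public void onData(byte[] buf, int off, int len) {
            total += len;
        }

        public void onComplete() {
            done = true;
        }
    }

    public static void main(String[] args) {
        byte[][] chunks = {
            "hello, ".getBytes(StandardCharsets.US_ASCII),
            "world".getBytes(StandardCharsets.US_ASCII)
        };
        CountingListener listener = new CountingListener();
        deliver(chunks, listener);
        System.out.println(listener.total + " bytes, done=" + listener.done);
    }
}
```

Because the listener never asks for data, the client can finish reading a response and keep the connection in a known state regardless of what the caller does.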
The design of the URLConnection API is pull-based. Therefore we either
have to buffer an entire response before returning, or use multiple
threads, a pipe, and a much more complex implementation to manage
cleanup of resources. Also note that with HTTP/1.1 chunked encoding,
you can have headers (trailers) after the response body, which is not
something that most developers will expect. This means that in a
non-buffered implementation you could have:
    connection.getHeaderField("My-Header"); // null: trailer not yet received
    InputStream in = connection.getInputStream();
    // read from in until read() returns -1
    connection.getHeaderField("My-Header"); // non-null: trailer now parsed
In practice I haven't seen this in many servers, but it is still a
possibility.
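For reference, the "multiple threads and a pipe" alternative could look roughly like the sketch below. It uses the standard java.io piped streams; the PushSource interface and the asInputStream and drain helpers are hypothetical names made up for this example, and error handling is deliberately simplified.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.PipedInputStream;
import java.io.PipedOutputStream;

public class PipeAdapterSketch {
    /** Hypothetical push-style source: it decides when to write body data to the sink. */
    interface PushSource {
        void writeTo(OutputStream sink) throws IOException;
    }

    /** Wraps a push-style source in a pull-style InputStream using a pipe and a producer thread. */
    static InputStream asInputStream(PushSource source) throws IOException {
        PipedInputStream in = new PipedInputStream(8192); // bounded buffer, independent of body size
        PipedOutputStream out = new PipedOutputStream(in);
        Thread producer = new Thread(() -> {
            // Closing the pipe on exit lets the reader see end-of-stream.
            try (PipedOutputStream sink = out) {
                source.writeTo(sink);
            } catch (IOException e) {
                // Simplification: a failure here looks like a normal end-of-stream
                // to the reader. A real implementation would have to report it.
            }
        }, "push-to-pull");
        producer.setDaemon(true);
        producer.start();
        return in;
    }

    /** Reads the stream to the end and returns the number of bytes seen. */
    static long drain(InputStream in) throws IOException {
        byte[] buf = new byte[4096];
        long total = 0;
        int n;
        while ((n = in.read(buf)) != -1) {
            total += n;
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        PushSource body = sink -> {
            byte[] chunk = new byte[1000];
            for (int i = 0; i < 1000; i++) {
                sink.write(chunk);
            }
        };
        try (InputStream in = asInputStream(body)) {
            System.out.println("read " + drain(in) + " bytes through an 8 KiB pipe");
        }
    }
}
```

Even in this small form the cleanup questions show up: the producer thread must be stopped if the caller abandons the stream, and a failure in the producer has to reach the reader somehow.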
Tom Tromey and I have discussed the possibility of this non-buffered
implementation and of a hybrid model which uses a heuristic based on
the content length to decide which of these implementations to use, but
we haven't really had time to thrash it all out yet.
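As a rough illustration of what such a heuristic might look like, here is a small sketch. Nothing in it has been agreed: the 64 KiB threshold, the Mode enum and the choice to stream whenever the length is unknown are all assumptions made for the example.

```java
public class HybridSketch {
    enum Mode { BUFFERED, STREAMED }

    /** Hypothetical threshold; the real value would need measurement. */
    static final long BUFFER_LIMIT = 64 * 1024;

    /**
     * Picks a strategy from the declared Content-Length.
     * A negative length means the header was absent, as with chunked responses.
     */
    static Mode choose(long contentLength) {
        if (contentLength < 0) {
            return Mode.STREAMED; // size unknown, so buffering could exhaust the heap
        }
        return contentLength <= BUFFER_LIMIT ? Mode.BUFFERED : Mode.STREAMED;
    }

    public static void main(String[] args) {
        System.out.println(choose(1024));
        System.out.println(choose(10L * 1024 * 1024));
        System.out.println(choose(-1));
    }
}
```

Small responses would keep the simple buffered path, and only large or unbounded ones would pay for the more complex streaming path.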
If you are dealing with streaming servers or with very large responses,
you probably shouldn't be using the URLConnection API in any case -
consider using the inetlib client directly as it will be more
efficient.
--
Chris Burdess
"They that can give up essential liberty to obtain a little safety
deserve neither liberty nor safety." - Benjamin Franklin