
Re: java.io.DataInputStream.readLine misbehaviour


From: Mark Wielaard
Subject: Re: java.io.DataInputStream.readLine misbehaviour
Date: Sun, 02 Nov 2003 18:55:44 +0100

Hi,

On Sun, 2003-11-02 at 17:33, Dalibor Topic wrote:
>
> That's a bug in Classpath then. Trying to 'read ahead' after \r fails
> for those systems whose 'end of line' is a plain \r.
> [...]
> In fact, while you're at it, check out the rationale behind kaffe's 
> implementation as described in this thread:
> http://www.kaffe.org/pipermail/kaffe/2002-October/040501.html

Thanks for that pointer. You already put a lot of thought into this.

> > BTW I don't think we have to worry about the unsynchronized thing since
> > streams are already not thread safe.
> 
> They should be. My experience from working on kaffe's readers and 
> writers and testing against JDK was that the JDK implementation was very 
> thread safe, whereas kaffe's IO had thread safety kludged in afterwards 
> by me for some classes. From what I remember, Classpath code wasn't 
> thread safe at all, either.

They have to be? Oh, bummer. That would be so brain-dead, and it would
make creating an efficient implementation impossible, since you would
then have to do lots of synchronization :{

> For what it's worth, yes, there is code out there that expects threads 
> to be able to read from readers/writers and not trip over their feet ;)

But what is the expected behavior? What does "not trip over their feet"
mean in this context? Does it mean that if they synchronize on the
stream and read/write one after another, they will get "real" return
values? Or does it mean that if some threads read from or write to the
stream at random, without any synchronization, they still expect some
coherent return values? And you talk about readers/writers. Does that
mean that this behavior is only guaranteed for character streams, or for
all possible streams?

> AFAIK, the Java APIs explicitly mention when a method/class is not 
> thread safe, so by default we should assume that APIs are to be thread 
> safe.

I couldn't find this in my Java Class Library book (and don't have any
meaningful internet connection to check other sources). At least the
descriptions of InputStream, FilterInputStream and DataInputStream don't
seem to make any claims about thread safety.

> I mean, it's built into the language, why would you want to limit 
> yourself to single threads when you're doing something as important as IO?

I agree that you should use multiple threads whenever that is beneficial
to your program design. But I don't think it is unreasonable to ask the
programmer to specify the synchronization behavior of the program, since
those are often high-level aspects of the design, and low-level
fine-grained locking will often not be beneficial to the users (IMHO).
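
To make "the programmer specifies the synchronization" concrete, here is
a minimal sketch, assuming the caller treats two consecutive bytes as one
record and locks on the stream for the whole record. The class and method
names are made up for illustration; nothing here is Classpath or kaffe
code.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo class, for illustration only.
class CallerSideLocking {

    // Reads one two-byte record while holding the stream's monitor, so
    // no other cooperating thread can slip in between the two reads.
    static int readRecord(DataInputStream in) throws IOException {
        synchronized (in) {
            int high = in.readUnsignedByte();
            int low = in.readUnsignedByte();
            return (high << 8) | low;
        }
    }

    // Two threads drain a stream of records whose two bytes are equal.
    // Returns the number of records that arrived intact.
    static int countIntactRecords(int records) {
        byte[] data = new byte[records * 2];
        for (int i = 0; i < records; i++) {
            data[2 * i] = (byte) i;
            data[2 * i + 1] = (byte) i;
        }
        DataInputStream in =
            new DataInputStream(new ByteArrayInputStream(data));
        AtomicInteger intact = new AtomicInteger();
        Runnable reader = () -> {
            try {
                while (true) {
                    int r = readRecord(in);
                    if ((r >> 8) == (r & 0xff)) {
                        intact.incrementAndGet();
                    }
                }
            } catch (EOFException end) {
                // Stream drained; this thread is done.
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        };
        Thread a = new Thread(reader);
        Thread b = new Thread(reader);
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return intact.get();
    }
}
```

The lock here covers a unit that only the program knows about (a whole
record). Per-method locking inside the stream could not provide that
guarantee, whatever it costs.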

> > That is overhead for every FilterInputStream, that should be avoided.
> > But I agree that the Classpath way is not very clean.
> > 
> > Would a solution be to do the following in the constructor?
> > 
> >   public DataInputStream (InputStream in)
> >   {
> >     super (in.markSupported() ? in : new BufferedInputStream(in, 1));
> >   }
> 
> No. Even if the superstream supports marking, a mark/reset based 
> solution would delete a previous mark set by the user of the 
> DataInputStream class each time a \r occurs. That would be quite an 
> ugly bug to find.

Cannot be that hard to find such a bug, you found it before it was even
implemented :)
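
For the record, the mark problem can be reproduced in a few lines. This
is a minimal sketch, assuming a readLine() that peeks for the \n with
mark()/read()/reset(); peekForLineFeed() is a hypothetical stand-in for
that step, not code from either project.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, for illustration only.
class MarkClobberDemo {

    // Peeks at the next byte the way a mark/reset based readLine()
    // might after seeing '\r': mark, read one byte, and go back if
    // that byte was not '\n'.
    static void peekForLineFeed(InputStream in) throws IOException {
        in.mark(1);
        int next = in.read();
        if (next != '\n' && next != -1) {
            in.reset();
        }
    }

    // Returns the byte the caller sees after its own reset().
    static int byteAfterUserReset() {
        byte[] data = "AB\rCD".getBytes(StandardCharsets.US_ASCII);
        InputStream in = new ByteArrayInputStream(data);
        try {
            in.mark(16);          // the user's mark, in front of 'A'
            in.read();            // 'A'
            in.read();            // 'B'
            in.read();            // '\r', where readLine() would stop
            peekForLineFeed(in);  // internal mark replaces the user's
            in.reset();           // user expects to be back at 'A'
            return in.read();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println((char) byteAfterUserReset()); // prints C
    }
}
```

The user's reset() lands just after the \r instead of at the start,
because a stream only remembers one mark.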

> > Another question is how often (ever?) does this (mixing readLine() with
> > other DataInputStream and/or FilterInputStream calls) happen in real
> > programs. The regression test of kaffe looks contrived, it even uses
> > StringBufferInputStream which is an ugly (and thankfully deprecated)
> > class.
> 
> Happened in real life with very popular libraries like Xerces. See the 
> kaffe mailing list thread. The user was not pleased at all, and wanted 
> to revert to the broken 'read ahead, and unread' method. I had to 
> figure out a way to fix both issues, that's the whole story. If someone 
> can find a more elegant way to do it, that doesn't create any new bugs, 
> I'm all ears. I've been there before and spent some time researching 
> this, though. You may have more luck. ;)

OK. Let me try to summarize the behavior we want so we can at least
create some good tests:

DataInputStream.readLine():
- Should not block once it has seen a \r, but return as soon as
  possible, even when it cannot be sure whether the next character is
  a \n, to prevent programs from blocking/deadlocking.
- If there is an unconsumed \n in the stream right after a readLine()
  then that \n should be transparently consumed for all these cases:
  - readLine() is called again.
  - another DataInputStream read method is called.
  - a non-overridden (Filter)InputStream method is called.
  - a method in the originally wrapped InputStream is called.
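
The list above, except for its last item, can be sketched with a "skip
the next \n" flag. This is only a rough illustration under my own
assumptions: LazyLineFeedInputStream and its skipLineFeed field are
hypothetical names, not the kaffe or Classpath code, and skip(),
available(), mark() and reset() are left untouched for brevity.

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical class, for illustration only.
class LazyLineFeedInputStream extends FilterInputStream {

    // True when the last readLine() ended on '\r' and a directly
    // following '\n' still has to be dropped.
    private boolean skipLineFeed;

    LazyLineFeedInputStream(InputStream in) {
        super(in);
    }

    // Returns the next line without its terminator, or null at end of
    // stream. Never reads past a '\r', so it cannot block on a '\n'
    // that may never arrive.
    String readLine() throws IOException {
        int c = read();               // drops a pending '\n' first
        if (c == -1) {
            return null;
        }
        StringBuilder line = new StringBuilder();
        while (c != -1 && c != '\n') {
            if (c == '\r') {
                skipLineFeed = true;  // decide about '\n' later
                break;
            }
            line.append((char) c);
            c = read();
        }
        return line.toString();
    }

    @Override
    public int read() throws IOException {
        int c = in.read();
        if (skipLineFeed) {
            skipLineFeed = false;
            if (c == '\n') {
                c = in.read();        // swallow the second half of \r\n
            }
        }
        return c;
    }

    @Override
    public int read(byte[] buf, int off, int len) throws IOException {
        if (len == 0) {
            return 0;
        }
        int first = read();           // handles a pending '\n'
        if (first == -1) {
            return -1;
        }
        buf[off] = (byte) first;
        if (len == 1 || in.available() <= 0) {
            return 1;
        }
        int more = in.read(buf, off + 1, len - 1);
        return more < 0 ? 1 : 1 + more;
    }

    // Mixes readLine() with a plain read() over \r\n and bare \r.
    static String demo() {
        byte[] data = "one\r\ntwo\rthree\r\n4"
            .getBytes(StandardCharsets.US_ASCII);
        try {
            LazyLineFeedInputStream s = new LazyLineFeedInputStream(
                new ByteArrayInputStream(data));
            String a = s.readLine();
            String b = s.readLine();
            String c = s.readLine();
            int d = s.read();         // must skip the pending '\n'
            return a + "|" + b + "|" + c + "|" + (char) d;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Calling demo() should give "one|two|three|4". A caller that reads from
the wrapped stream directly would still see the leftover \n, since the
flag lives in the wrapper, so the last requirement stays open.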

That will be an ugly nut to crack. Guaranteeing that last requirement
is especially nasty. And the method is (rightly) deprecated, since
programmers should use BufferedReader.readLine() anyway (specified since
1.1). Sigh.

But having a good test case will be a very valuable start.
Any volunteers? :)

Cheers,

Mark
