classpath

Re: An interessting change for shared char[] in String/StringBuffer


From: Eric Blake
Subject: Re: An interessting change for shared char[] in String/StringBuffer
Date: Thu, 08 Apr 2004 07:24:13 -0600
User-agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.4) Gecko/20030624

Tom Tromey wrote:
> Chr Ullenboom <address@hidden> writes:
>
>> From http://java.sun.com/j2se/1.5.0/jcp/beta1/index.html#java.lang:
>> "In this release, the sharing between String and StringBuffer has been
>> eliminated."
>> Does this mean we should change the implementation in Classpath too? I
>> like this optimization...
>
> I don't think we necessarily have to change this.  IMO it would depend
> on whether the change is observable by user code.  Our implementation
> doesn't always share, anyway.  It only shares if the buffer is mostly
> in use.

Part of Sun's rationale for their change is that in 1.5, they introduced java.lang.StringBuilder, a non-synchronized copy of StringBuffer with otherwise identical semantics. Similar to gnu.java.lang.StringBuffer used in gcj, StringBuilder, where it exists, allows the compiler to emit more efficient string concatenation (and jikes already knows how to use it). My understanding of Sun's implementation of StringBuilder is that it is rather simplistic; I base this on a bug report on Sun's site complaining that the javadoc was misleading because of the non-public superclass. They renamed the old StringBuffer into a new package-private class java.lang.AbstractStringBuffer (or some such name) that holds all the implementation, then created StringBuffer and StringBuilder as public extension classes with no further implementation, other than the fact that all the StringBuffer methods add synchronization. So perhaps they made the change on sharing the char[] because their class-shuffling broke something.
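
To make that class layout concrete, here is a rough sketch of how I understand it. All class names are hypothetical and this is not Sun's or Classpath's source; it only shows a package-private base class doing the work, with one subclass adding synchronization and the other adding nothing.

```java
// Hypothetical sketch only; names and growth policy are assumptions.
abstract class AbstractBufferSketch {
    char[] value = new char[16];
    int count;

    // All of the real work lives here, without any locking.
    AbstractBufferSketch append(String s) {
        if (s == null) s = "null";
        int needed = count + s.length();
        if (needed > value.length) {
            char[] bigger = new char[Math.max(needed, value.length * 2 + 2)];
            System.arraycopy(value, 0, bigger, 0, count);
            value = bigger;
        }
        s.getChars(0, s.length(), value, count);
        count = needed;
        return this;
    }

    @Override
    public String toString() {
        // Always copies, so the String never shares the buffer's char[].
        return new String(value, 0, count);
    }
}

// Plays the role of StringBuffer: same code, plus synchronization.
final class SyncBufferSketch extends AbstractBufferSketch {
    @Override
    synchronized SyncBufferSketch append(String s) {
        super.append(s);
        return this;
    }

    @Override
    public synchronized String toString() {
        return super.toString();
    }
}

// Plays the role of StringBuilder: same code, no locking.
final class PlainBuilderSketch extends AbstractBufferSketch {
    @Override
    PlainBuilderSketch append(String s) {
        super.append(s);
        return this;
    }
}
```

Both subclasses produce the same result for the same sequence of appends; only the locking differs.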

I agree with Tom that we would have to benchmark it to see if sharing or not sharing is more efficient, before blindly choosing one way over the other just to match Sun. If I understand correctly, back in JDK 1.0, Sun did NOT use char[] sharing - it was added later as an optimization before JIT compilers were as good as they are now (and now Sun claims to be deleting it as an optimization). Also, we will have to be careful that we handle serialization of StringBuffer correctly, whichever way we choose.
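
For anyone who has not looked at the optimization under discussion, here is a minimal sketch of char[] sharing with copy-on-write. User code cannot construct a java.lang.String around an existing array, so the hypothetical SharedStr class stands in for String. The "at least half full" threshold is my assumption for what "mostly in use" means, not the value Classpath actually uses.

```java
// Hypothetical stand-in for String: it may point at the buffer's own array.
final class SharedStr {
    final char[] chars;
    final int length;

    SharedStr(char[] chars, int length) {
        this.chars = chars;
        this.length = length;
    }

    @Override
    public String toString() {
        return new String(chars, 0, length);
    }
}

final class SharingBufferSketch {
    private char[] value = new char[16];
    private int count;
    private boolean shared; // true while a SharedStr may still point at value

    SharingBufferSketch append(String s) {
        int needed = count + s.length();
        if (shared || needed > value.length) {
            // Copy-on-write: once the array has been handed out, any
            // mutation works on a fresh copy instead.
            char[] copy = new char[Math.max(needed, value.length * 2 + 2)];
            System.arraycopy(value, 0, copy, 0, count);
            value = copy;
            shared = false;
        }
        s.getChars(0, s.length(), value, count);
        count = needed;
        return this;
    }

    SharedStr toSharedStr() {
        // Assumed threshold: share only when at least half the array is used,
        // so a long-lived string does not pin a mostly empty array.
        if (count * 2 >= value.length) {
            shared = true;
            return new SharedStr(value, count);
        }
        char[] exact = new char[count];
        System.arraycopy(value, 0, exact, 0, count);
        return new SharedStr(exact, count);
    }

    boolean isShared() {
        return shared;
    }
}
```

The saving is one array copy at conversion time when the buffer is discarded afterwards; the cost is the extra flag check on every mutation and a copy if the buffer is reused.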

I also wonder if the following implementation would be more efficient. In the common case, StringBuffer/StringBuilder is used for appends, and then converted to a String just before being discarded. Currently, for every append, we adjust the underlying char[] and copy the appended String into that array. Would it be better to just build a String[] that caches all the appended Strings, and then create a single char[] at the time toString() is called, rather than updating the char[] for every append()? Of course, we would have to create the char[] for any non-append() method. And one of the disadvantages of this method is that we end up creating intermediate Strings when we append primitive types, whereas the current implementation can update the char[] without creating any intermediate objects. I haven't coded this up to experiment on the difference, but it would be an interesting experiment.
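
An untested-for-speed sketch of that idea, with a hypothetical class name, might look like the following. It only covers append() and toString(); any other method would first have to flatten the cached Strings into a char[], which is not shown.

```java
// Hypothetical sketch of a builder that defers all copying to toString().
final class DeferredBuilderSketch {
    private String[] parts = new String[8];
    private int partCount;
    private int totalLength;

    DeferredBuilderSketch append(String s) {
        if (s == null) s = "null";
        if (partCount == parts.length) {
            // Grow the String[] of references, not a char[] of contents.
            String[] bigger = new String[parts.length * 2];
            System.arraycopy(parts, 0, bigger, 0, partCount);
            parts = bigger;
        }
        parts[partCount++] = s;
        totalLength += s.length();
        return this;
    }

    DeferredBuilderSketch append(int i) {
        // The drawback noted above: a primitive needs an intermediate String.
        return append(String.valueOf(i));
    }

    @Override
    public String toString() {
        // Single allocation of exactly the right size, filled in one pass.
        char[] out = new char[totalLength];
        int pos = 0;
        for (int i = 0; i < partCount; i++) {
            String p = parts[i];
            p.getChars(0, p.length(), out, pos);
            pos += p.length();
        }
        return new String(out);
    }
}
```

Note that new String(char[]) copies the array once more in user code; an implementation inside java.lang could hand the array to the String directly, which is where this would have to live to be a fair comparison.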

--
Someday, I might put a cute statement here.

Eric Blake             address@hidden




