Re: String.equals optimisation

From: Simon Kitching
Subject: Re: String.equals optimisation
Date: Tue, 12 Jul 2005 17:25:08 +1200

Hi Archie,

On Mon, 2005-07-11 at 20:27 -0500, Archie Cobbs wrote:
> Simon Kitching wrote:
> > * Class.getName returns strings that have been interned. I don't
> >   think this is explicitly required by the java specs but is
> >   certainly true for Sun's JVM and seems likely to be done by
> >   any sensible JVM.
> You definitely make some good arguments, but this one is not
> necessarily true. In fact, I'd argue a JVM that interns every
> class' name (even if only on demand) is potentially wasting
> a bunch of heap space.

I'm assuming that the Class object would contain a reference to the
interned string, so there is only one copy of the string, i.e. somewhere
in the ClassLoader.defineClass method there is this sort of thing:

public Class defineClass(String name, ...) {
  Class newClass = new Class();
  newClass.name = name.intern();  // pseudocode: keep only the pooled copy
}

The extra space used for interning is therefore just a single extra
reference (as a reference to the string is contained in both the Class
object and the String class internal pool). Yes, that is a little space
wasted, but not a bunch.

> I.e., is there something special about class names which means
> they should be treated differently from any other String randomly
> created and used in a Java application? (rhetorical question)
> Otherwise, why not intern all Strings? Etc.

I do wonder why Java specified that all literal and constant strings in
a class file are automatically interned. Being able to compare literals
with == is not that useful. 

Maybe the most important goal was to compress class representations in
memory. In particular, when class A references a static final String
field from some other class, A gets a copy of that string, not a
reference to it, so without the intern mechanism to merge instances back
together again such strings would get duplicated in RAM when the classes
were loaded.
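
To illustrate that copy-then-merge behaviour, here is a minimal sketch.
The class names (ConstantCopyDemo, Other, A) and the GREETING field are
made up for the example. Other.GREETING is a compile-time constant, so
the compiler copies its value into A's own constant pool; the two
copies compare identical at run time only because literal and constant
strings are interned.

```java
public class ConstantCopyDemo {
    // Hypothetical stand-in for "some other class" that owns the constant.
    static class Other {
        static final String GREETING = "hello";
    }

    // Hypothetical stand-in for class A. The compiler inlines the value of
    // Other.GREETING here, so A's class file carries its own copy.
    static class A {
        static String greeting() {
            return Other.GREETING;
        }
    }

    // True when both class files end up sharing one String instance.
    static boolean sameInstance() {
        return A.greeting() == Other.GREETING;
    }

    public static void main(String[] args) {
        System.out.println(sameInstance()); // prints "true"
    }
}
```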

I'd be interested to hear of other reasons for Java's requirement to
intern all literal strings and constants.

But, strangely, I do think that interning classnames (which is optional)
is particularly useful. When ClassLoader resolves a class it has loaded,
it must do lots of lookups to find other classes. Surely being able to
do this using identity to compare classnames would be a significant
timesaver. And class resolution is the biggest issue in application
startup time, so improving this seems like a good idea.
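
A rough sketch of what I mean by lookup using identity follows. This is
not how any real ClassLoader is written; the IdentityLookupDemo class,
its register/find helpers and the com.example.Widget name are all
invented for illustration, and a plain Object stands in for the loaded
class. The point is only that once every name is interned, a table can
compare keys with == and never look at the characters.

```java
import java.util.IdentityHashMap;
import java.util.Map;

public class IdentityLookupDemo {
    // Hypothetical registry of loaded classes, keyed by interned name.
    // IdentityHashMap compares keys with == rather than equals().
    private static final Map<String, Object> loaded = new IdentityHashMap<>();

    static void register(String name, Object cls) {
        loaded.put(name.intern(), cls);
    }

    // The caller is assumed to pass an interned name; a non-interned
    // copy with the same characters will not be found.
    static Object find(String internedName) {
        return loaded.get(internedName);
    }

    public static void main(String[] args) {
        register("com.example.Widget", new Object());

        String copy = new String("com.example.Widget"); // distinct instance
        System.out.println(find("com.example.Widget") != null); // true
        System.out.println(find(copy) != null);                 // false
        System.out.println(find(copy.intern()) != null);        // true
    }
}
```

The cost is that every caller has to intern before looking up, which
is why this only pays off when the names are long-lived and looked up
often.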

In the general case, whether interning a string proves useful or not
depends upon the usage pattern for that string. I guess the usage
patterns for literal strings and classnames are pretty well known: long
lifetimes, and either:
* comparisons against them are common, or
* duplication of the content is common.
But only users know the usage patterns for the dynamic string objects
they create, so it's up to them to decide when to use intern...
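
For anyone unsure what that choice looks like in user code, here is a
small sketch. The InternDemo class and its build helper are invented
for the example; String.intern itself is the standard method. A string
built at run time is a separate object from the literal with the same
contents until the user explicitly interns it.

```java
public class InternDemo {
    // Builds a fresh String at run time; the result is not in the pool.
    static String build(String a, String b) {
        return new StringBuilder(a).append(b).toString();
    }

    public static void main(String[] args) {
        String dynamic = build("foo", "bar");
        String literal = "foobar"; // literals are interned automatically

        System.out.println(dynamic == literal);          // false
        System.out.println(dynamic.equals(literal));     // true
        System.out.println(dynamic.intern() == literal); // true
    }
}
```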

> In any case, to provide two concrete counter-examples:
>    $ cat >
>    public class zz {
>      public static void main(String[] args) {
>          zz z = new zz();
>          System.out.println(z.getClass().getName() == "zz");
>      }
>    }
>    $ javac
>    $ java zz
>    true
>    $ jc -Xint zz
>    false
>    $ jamvm zz
>    false


$ gij

$ gcj -o zz --main=zz
$ zz

Note for others reading this thread: all this is really irrelevant
anyway. The classname stuff was just one example I suggested for when
strings being compared might perform better with String.equals optimised
for comparison by identity. As shown above, Sun's Java might benefit
from this; other JVMs currently won't. But it's only one example.
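
For reference, the kind of change under discussion looks roughly like
the sketch below. It is not the actual String.equals source from any
library; IdentityFirstEquals and equalsWithIdentityCheck are names made
up here, and the first parameter stands in for "this". The only
difference from a plain character-by-character comparison is the
reference check at the top.

```java
import java.util.Objects;

public class IdentityFirstEquals {
    // Sketch of an equals() with an identity fast path. "self" plays the
    // role of "this" and is assumed to be non-null.
    static boolean equalsWithIdentityCheck(String self, Object other) {
        Objects.requireNonNull(self, "self");
        if (self == other) {
            return true; // same object: no need to look at the characters
        }
        if (!(other instanceof String)) {
            return false; // also covers other == null
        }
        String that = (String) other;
        if (self.length() != that.length()) {
            return false;
        }
        for (int i = 0; i < self.length(); i++) {
            if (self.charAt(i) != that.charAt(i)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        String a = "zz";
        String b = new String("zz"); // equal contents, different object
        System.out.println(equalsWithIdentityCheck(a, a));    // true (fast path)
        System.out.println(equalsWithIdentityCheck(a, b));    // true (slow path)
        System.out.println(equalsWithIdentityCheck(a, "zy")); // false
    }
}
```

Whether the extra branch is a net win depends on how often the two
operands really are the same object, which is the usage-pattern
question again.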

> On the other hand, comparing reference equality is very low cost,
> so it seems like adding "==" to equals() might make good sense.
> Of course, the "real" answer lies in empirical testing (something
> I can't claim to have done).


