Wednesday, 10 November 2010

The cost of Java's finalize( ) method

We all know that the finalize method is generally bad, and shouldn't be used. However this is mostly for reasons of determinism and the fact that it is in no way like a c++ destructor, which means most of the reasons that it could be useful turn out to cause a bug (removing listeners).

What about the performance penalty? I was genuinely shocked at how bad performance got when using finalizable objects. The code snippet I used is below...


public class FinalizerTest
{
  public static void main(String[] args)
  {
    final long currentTimeMillis = System.currentTimeMillis();
   
    for (int i = 0; i < 1000000; i++)
    {
      new MyObject();
    }
   
    System.out.println("done in " + (System.currentTimeMillis() - currentTimeMillis) + "ms instances of MyObject created is " + MyObject.i + " and disposed " + MyObject.disposed);
  }
 
  static class MyObject
  {
    static int i = 0, disposed = 0;
   
    public MyObject()
    {
      i++;
    }
   
    @Override
    protected void finalize() throws Throwable
    {
      super.finalize();
    }
  }
}

Wit h the finalize method commented out and -verbose:gc on the command line the output was...

[GC 512K->106K(1984K), 0.0009172 secs]
[GC 618K->106K(1984K), 0.0003271 secs]
[GC 618K->106K(1984K), 0.0002747 secs]
[GC 618K->106K(1984K), 0.0000591 secs]
[GC 618K->106K(1984K), 0.0000533 secs]
[GC 618K->106K(1984K), 0.0000597 secs]
[GC 618K->106K(1984K), 0.0000590 secs]
[GC 618K->106K(1984K), 0.0000589 secs]
[GC 618K->106K(1984K), 0.0000587 secs]
[GC 618K->106K(1984K), 0.0000585 secs]
[GC 618K->106K(1984K), 0.0000586 secs]
[GC 618K->106K(1984K), 0.0000585 secs]
[GC 618K->106K(1984K), 0.0000586 secs]
[GC 618K->106K(1984K), 0.0000585 secs]
[GC 618K->106K(1984K), 0.0000584 secs]
done in 15ms instances of MyObject created is 1000000

With the finalize method implemented and -verbose:gc on the command line the output was...

[GC 512K->458K(1984K), 0.0033486 secs]
[GC 970K->968K(1984K), 0.0042475 secs]
[GC 1480K->1414K(1984K), 0.0032794 secs]
[GC 1926K->1861K(2496K), 0.0029895 secs]
[Full GC 1861K->1861K(2496K), 0.0178491 secs]
[GC 2373K->2372K(3680K), 0.0050145 secs]
[GC 2884K->2870K(3680K), 0.0036973 secs]
[GC 3382K->3369K(3936K), 0.0034726 secs]
[Full GC 3369K->3098K(3936K), 0.0227427 secs]
[GC 3610K->3609K(5744K), 0.0044828 secs]
[GC 4121K->4109K(5744K), 0.0036602 secs]
[GC 4621K->4556K(5744K), 0.0030668 secs]
[GC 5068K->5003K(5744K), 0.0032344 secs]
[GC 5515K->5450K(6000K), 0.0029319 secs]
[Full GC 5450K->4573K(6000K), 0.0283003 secs]
[GC 5085K->5084K(8200K), 0.0055667 secs]
[GC 5596K->5582K(8200K), 0.0041402 secs]
[GC 6094K->6085K(8200K), 0.0043665 secs]
[GC 6597K->6532K(8200K), 0.0028525 secs]
[GC 7044K->6979K(8200K), 0.0026690 secs]
[GC 7491K->7427K(8200K), 0.0026212 secs]
[GC 7939K->7874K(8456K), 0.0030884 secs]
[Full GC 7874K->6583K(8456K), 0.0454533 secs]
[GC 7415K->7414K(11872K), 0.0089436 secs]
[GC 8246K->8231K(11872K), 0.0063717 secs]
[GC 9063K->9054K(11872K), 0.0055485 secs]
[GC 9886K->9820K(11872K), 0.0046484 secs]
[GC 10652K->10587K(11872K), 0.0046174 secs]
[GC 11419K->11354K(12256K), 0.0046537 secs]
[Full GC 11354K->10217K(12256K), 0.0540124 secs]
[GC 11433K->11432K(18376K), 0.0124211 secs]
[GC 12648K->12623K(18376K), 0.0088227 secs]
[GC 13839K->13795K(18376K), 0.0084357 secs]
[GC 15011K->14882K(18376K), 0.0061044 secs]
[GC 16098K->15969K(18376K), 0.0063174 secs]
[GC 17185K->17056K(18376K), 0.0068424 secs]
[GC 18272K->18143K(19400K), 0.0071953 secs]
[Full GC 18143K->16065K(19400K), 0.0836026 secs]
[GC 17921K->17919K(28824K), 0.0205388 secs]
[GC 19775K->19750K(28824K), 0.0140122 secs]
[GC 21606K->21576K(28824K), 0.0124343 secs]
[GC 23432K->23239K(28824K), 0.0098796 secs]
[GC 25095K->24901K(28824K), 0.0105540 secs]
[GC 26757K->26564K(28824K), 0.0100624 secs]
[GC 28420K->28226K(30104K), 0.0102745 secs]
[Full GC 28226K->24606K(30104K), 0.1285938 secs]
[GC 27422K->27421K(44148K), 0.0293065 secs]
done in 2687ms instances of MyObject created is 1000000

Yikes, we've gone from 15ms to 2687ms! But this is java profiling after all and nothing is quite as it seems. If the call to super.finalize() is commented out the test takes zero milliseconds, I believe this is happening because the finalize method is now dead code so the JIT is removing it. If I put a static int and increment it every time finalize is called the time varies wildly from 1900ms to 3110ms.

Thursday, 4 November 2010

Finding if a number is a prime

Update 06/04/2012
Modified the for loop as proposed by Alan (comment #1) which makes the method roughly twice as fast.