Performance Suggestions

A discussion forum for JFreeChart (a 2D chart library for the Java platform).
FunkyELF
Posts: 11
Joined: Wed Nov 29, 2006 3:48 pm

Post by FunkyELF » Wed Apr 18, 2007 8:34 pm

I'd like some performance suggestions.
I'm running into java.lang.OutOfMemoryError: Java heap space errors.

I'm reading in a 7 MB file that contains 122 different data sets. Each set has 1,600 XY data points.

When I open one of these files, I read all of the data sets and run some very basic analysis on them. I used to keep each data set in memory as an instance of a class DataFile, which extends XYSeries. Because of the performance issues and the memory errors, I now call XYSeries.clear() on each data set once it has been read and analysed. I also store the seek position of each data set and use Java's RandomAccessFile to re-read the data whenever that particular data set needs to be viewed.

I am guessing that when I kept all of the data in memory, its size blew up well beyond 7 MB. Once the points are stored as XYDataItems in the ArrayList inside XYSeries, they're probably not that small anymore. I see that XYDataItem holds 2 Number objects, so I was really storing 122*1600*2, or 390,400, objects in memory.
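The seek-and-re-read approach described above can be sketched roughly as follows. The file layout is not given in the post, so this assumes a hypothetical binary layout of consecutive x/y double pairs; `SeekingReader`, `writeSet`, `readSet` and `demo` are illustrative names, not part of JFreeChart.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;

// Hypothetical helper; the on-disk layout (x/y double pairs) is an assumption.
final class SeekingReader {

    /** Appends one data set to the end of the file and returns the offset where it starts. */
    static long writeSet(RandomAccessFile file, double[] xs, double[] ys) throws IOException {
        long start = file.length();
        file.seek(start);
        for (int i = 0; i < xs.length; i++) {
            file.writeDouble(xs[i]);
            file.writeDouble(ys[i]);
        }
        return start;
    }

    /** Re-reads count points from a remembered offset into a double[2][count] array. */
    static double[][] readSet(RandomAccessFile file, long offset, int count) throws IOException {
        double[][] data = new double[2][count];
        file.seek(offset);
        for (int i = 0; i < count; i++) {
            data[0][i] = file.readDouble(); // x value
            data[1][i] = file.readDouble(); // y value
        }
        return data;
    }

    /** Round trip through a temporary file: write two sets, then re-read only the second. */
    static double[][] demo() {
        try {
            File tmp = File.createTempFile("datasets", ".bin");
            tmp.deleteOnExit();
            try (RandomAccessFile file = new RandomAccessFile(tmp, "rw")) {
                writeSet(file, new double[] {0, 1}, new double[] {10, 11});
                long second = writeSet(file, new double[] {2, 3, 4}, new double[] {20, 30, 40});
                return readSet(file, second, 3);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Only the offset and point count need to stay in memory per data set. The cost is a disk read each time a data set is displayed.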

I am wondering if there is a better approach to this type of problem.

Should I store the entire data file in memory in a better format, such as an ArrayList of doubles, and then create the charts as needed? That way I wouldn't be going to the disk every time I need to show different data.

Thanks for the suggestions.
~Eric

Taqua
JFreeReport Project Leader
Posts: 698
Joined: Fri Mar 14, 2003 3:34 pm

Post by Taqua » Thu Apr 19, 2007 10:43 am

As usual: the default implementations are fine for small datasets, but in general they are definitely unsuitable for larger amounts of data.

Implement your own XYDataset, ignore XYSeries and all the other classes, and you should see your performance go up again. Now, let's assume the worst: that your data points do not share the same x-values. In that case, you would have to store 122 series of XY points. Each series can be represented as a 2-dimensional array, so you would create a 2x1600 double array per series (consuming 2x1600x8 bytes, since a primitive double takes 8 bytes).

Each series would have its own 2-dimensional double array, so your dataset would consume 122x1600x2x8 bytes for the data (plus at least 122x2x8 bytes of JVM-managed overhead for the arrays).

That means your dataset would consume roughly 3 MB if you use primitive doubles. If you use java.lang.Double objects instead, each value also carries an object header and a reference, so the required size would grow to roughly 8 MB or more, depending on the JVM. I would suggest storing Double objects so that you don't have to create new objects on the fly all the time (as the XYDataset interface requires you to return java.lang.Number instances).
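The size estimate can be checked with a short sketch. A Java primitive double occupies 8 bytes. The sizes used here for a boxed java.lang.Double (16 bytes per object plus a 4-byte reference) are assumptions typical of a 32-bit JVM and will differ on other JVMs. `HeapEstimate` is an illustrative name.

```java
// Back-of-the-envelope heap estimate; array and object header overheads are ignored.
final class HeapEstimate {
    static final int SERIES = 122;
    static final int POINTS = 1_600;

    /** Payload bytes when x and y are stored as primitive doubles (8 bytes each). */
    static long primitiveBytes(int series, int points) {
        return (long) series * points * 2 * Double.BYTES;
    }

    /** Rough bytes for boxed values; both size parameters are JVM-dependent assumptions. */
    static long boxedBytes(int series, int points, int bytesPerObject, int bytesPerReference) {
        return (long) series * points * 2 * (bytesPerObject + bytesPerReference);
    }

    public static void main(String[] args) {
        System.out.println("primitive: " + primitiveBytes(SERIES, POINTS) + " bytes");
        System.out.println("boxed:     " + boxedBytes(SERIES, POINTS, 16, 4) + " bytes");
    }
}
```

Under these assumptions the primitive layout needs about 3.1 MB and the boxed layout about 7.8 MB, so either fits comfortably in a default heap.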

As the number of points is already known, I would use plain object arrays instead of lists. (If the size is not known, use a List while reading the values in, but then switch to arrays in the dataset implementation for faster access.)

So if you use a sane (i.e. not the default) implementation, you can easily hold all of the data in memory.
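A minimal sketch of the "collect in a List, then switch to an array" idea follows. It converts to a primitive double[] rather than a Double[]; that is a choice made for this sketch, and `ListToArray` is an illustrative name.

```java
import java.util.ArrayList;
import java.util.List;

final class ListToArray {

    /** Copies values collected during parsing into a compact primitive array. */
    static double[] toPrimitive(List<Double> values) {
        double[] out = new double[values.size()];
        for (int i = 0; i < out.length; i++) {
            out[i] = values.get(i); // auto-unboxing Double -> double
        }
        return out;
    }

    /** Simulates reading an unknown number of values, then freezing them into an array. */
    static double[] demo() {
        List<Double> buffer = new ArrayList<>();
        buffer.add(1.5);
        buffer.add(2.5);
        buffer.add(4.0);
        return toPrimitive(buffer);
    }
}
```

The List is only needed while the final size is unknown and can be discarded once the array has been built.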

david.gilbert
JFreeChart Project Leader
Posts: 11734
Joined: Fri Mar 14, 2003 10:29 am

Post by david.gilbert » Thu Apr 19, 2007 11:46 am

Taqua wrote: I would suggest to store Double-objects so that you don't have to create new objects on the fly all the time (as the XYDataSet interface requires you to return java.lang.Number instances).
Actually, you can implement an XYDataset using double primitives as the backing store. For example, see DefaultXYDataset. The JFreeChart classes will only call the getXValue() and getYValue() methods, which return these primitives directly, not the getX() and getY() methods, which must return Number objects. This avoids having to create Number instances over and over.

Those Number object methods are still useful, for example if you plan to display your data in a JTable. If you are doing that, then Thomas's advice does apply: you should use a dataset that stores the values as Number objects.
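The shape of a primitive-backed store can be sketched without JFreeChart on the classpath. `PrimitiveXYStore` is a hypothetical stand-alone class whose method names mirror the XYDataset accessors mentioned above. A real dataset would implement the XYDataset interface (or simply pass each double[2][n] array to DefaultXYDataset), which this sketch does not do.

```java
// Hypothetical stand-alone class; it mirrors XYDataset accessor names but does not implement it.
final class PrimitiveXYStore {
    // data[series][0] holds the x values, data[series][1] holds the y values
    private final double[][][] data;

    PrimitiveXYStore(int seriesCount) {
        data = new double[seriesCount][][];
    }

    void setSeries(int series, double[] xs, double[] ys) {
        if (xs.length != ys.length) {
            throw new IllegalArgumentException("x and y arrays must have equal length");
        }
        data[series] = new double[][] {xs, ys};
    }

    int getSeriesCount() {
        return data.length;
    }

    int getItemCount(int series) {
        return data[series][0].length;
    }

    // Primitive accessors: no object is allocated per call.
    double getXValue(int series, int item) {
        return data[series][0][item];
    }

    double getYValue(int series, int item) {
        return data[series][1][item];
    }

    // Number accessors box on demand, e.g. for a table view.
    Number getX(int series, int item) {
        return Double.valueOf(getXValue(series, item));
    }

    Number getY(int series, int item) {
        return Double.valueOf(getYValue(series, item));
    }
}
```

The rendering path only touches the primitive accessors, so boxing happens only when a caller asks for a Number.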
David Gilbert
JFreeChart Project Leader

FunkyELF
Posts: 11
Joined: Wed Nov 29, 2006 3:48 pm

Post by FunkyELF » Thu Apr 19, 2007 2:24 pm

Thanks for the replies.

Right now my DataFile class extends XYSeries. I'll look into implementing XYDataset instead and skipping the series, as suggested.

In the meantime, I think I found the source of my memory leak.

I'm going to start a new topic and edit this post to include a link to the new one.

EDIT:
Here is the link
http://www.jfree.org/phpBB2/viewtopic.php?p=60172#60172
