Filling area under normal distribution curve from

A discussion forum for JFreeChart (a 2D chart library for the Java platform).
Higeath
Posts: 9
Joined: Sun Oct 23, 2016 3:44 pm

Post by Higeath » Fri Oct 28, 2016 10:00 am

I am trying to fill the area under the curve in two regions, e.g. from -infinity to -2 and from 2 to infinity. I have seen various topics about this but still cannot figure it out: with an XYSeries it only works for a single shaded area. Another topic suggested using 2 datasets and 2 renderers, which I tried without success, as seen below. Is there a simple solution to this problem?

Basically, I want to achieve this: [image]

Code:

public class JFreeChartPanel extends JPanel {

    private final XYPlot plot;
    double mean = 0.0, sd = 1.0;
    XYDataset dataset = initDataset();
    NumberAxis domain;
    
    public JFreeChartPanel(){
        JFreeChart chart = ChartFactory.createXYLineChart(
            "Normal Distribution",
            "X", 
            "PDF", 
            dataset,
            PlotOrientation.VERTICAL,
            false,
            false,
            false
        );

        plot=chart.getXYPlot();
        domain = (NumberAxis) plot.getDomainAxis();
        domain.setAutoRangeStickyZero(false); //Fixes the margin issue with 0
        domain.setTickUnit(new NumberTickUnit(sd)); //Spacing on X-axis should be standard deviation + mean   
        
        XYSeriesCollection  dataset1 = new XYSeriesCollection();
        XYSeriesCollection  dataset2 = new XYSeriesCollection();
        
        XYSeries area1 = new XYSeries("area1");
        area1.add(-2, 0);
        area1.add(0, 0);
        ((XYSeriesCollection) dataset1).addSeries(area1);
        
        XYSeries area2 = new XYSeries("area2");
        area1.add(2, 0);
        area1.add(4, 0);
        dataset2.addSeries(area2);
        
        XYDifferenceRenderer renderer1 = new XYDifferenceRenderer();
        XYDifferenceRenderer renderer2 = new XYDifferenceRenderer();

        plot.setDataset(1, dataset1);
        plot.setRenderer(1, renderer1);
        plot.setDataset(2, dataset2);
        plot.setRenderer(2, renderer2);
        
        plot.setDomainAxis(domain);
        final ChartPanel chartPanel = new ChartPanel(chart);
        setLayout(new BorderLayout());
        add(chartPanel);
    }

    private XYDataset initDataset() {
        double minX=mean-(4*sd),maxX=mean+(4*sd);  //Minimum and Maximum values on X-axis (4 deviations)
        Function2D normal = new NormalDistributionFunction2D(mean, sd);
        XYDataset dataset = DatasetUtilities.sampleFunction2D(normal, minX, maxX, 100, "Normal");
        return dataset;
    }
}

John Matthews
Posts: 513
Joined: Wed Sep 12, 2007 3:18 pm

Re: Filling area under normal distribution curve from

Post by John Matthews » Fri Oct 28, 2016 11:21 am

Instead of an XYDifferenceRenderer, try an XYAreaRenderer with multiple series. A similar question is posed here.
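As a rough sketch of the multiple-series idea, the curve can be sampled separately over each region that should be shaded. The class and method names below are hypothetical, the cut points -2 and 2 are taken from the question, and the code is plain Java with no JFreeChart calls. In a chart, each list would presumably go into its own XYSeries drawn by the area renderer, while the full curve stays in a separate line series.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of JFreeChart.
class TailSeriesSketch {

    /** Normal PDF with the given mean and standard deviation. */
    static double pdf(double x, double mean, double sd) {
        double z = (x - mean) / sd;
        return Math.exp(-0.5 * z * z) / (sd * Math.sqrt(2.0 * Math.PI));
    }

    /** Samples the PDF at n evenly spaced points on [from, to]; each entry is {x, y}. */
    static List<double[]> sample(double mean, double sd, double from, double to, int n) {
        List<double[]> points = new ArrayList<>();
        double step = (to - from) / (n - 1);
        for (int i = 0; i < n; i++) {
            double x = from + i * step;
            points.add(new double[] {x, pdf(x, mean, sd)});
        }
        return points;
    }

    public static void main(String[] args) {
        double mean = 0.0, sd = 1.0;
        // Assumed cut points: shade x <= -2 and x >= 2, leave the middle unfilled.
        List<double[]> leftTail = sample(mean, sd, mean - 4 * sd, -2.0, 25);
        List<double[]> rightTail = sample(mean, sd, 2.0, mean + 4 * sd, 25);
        System.out.println("left tail points: " + leftTail.size());
        System.out.println("right tail points: " + rightTail.size());
    }
}
```

Sampling each tail up to its cut point keeps the filled regions from overlapping the unshaded middle.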

paradoxoff
Posts: 1634
Joined: Sat Feb 17, 2007 1:51 pm

Re: Filling area under normal distribution curve from

Post by paradoxoff » Fri Oct 28, 2016 1:04 pm

Higeath wrote: Other topic suggested using 2 datasets and 2 renderers which I've tried without success as seen below.
Were these topics in the JFreeChart forum? If so, do you have a link?
I ask because the suggestion is wrong. All you need is one XYDifferenceRenderer and one dataset with two series. The first series should display your function. The y values of the second series must simply be 0 for x < -gap/2.0 and x > gap/2.0 (where gap is the width of the "blank area" around the mean), and larger than the values of the first series otherwise.
You have the following components in your plot:
- three datasets: one created in the initDataset method, and two more created "manually" in the constructor. Your "dataset2" has a series "area2", but you never add values to that series (probably a typo).
- three renderers: one XYLineAndShapeRenderer created in the ChartFactory method, and two created "manually". The two XYDifferenceRenderers are assigned to dataset1 and dataset2. Since dataset1 and dataset2 only have one series each, the XYDifferenceRenderers can't actually draw the "difference between series".

Here is a working example:

Code:

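import java.awt.Color;
import javax.swing.JFrame;
import org.jfree.chart.ChartFactory;
import org.jfree.chart.ChartPanel;
import org.jfree.chart.JFreeChart;
import org.jfree.chart.axis.NumberAxis;
import org.jfree.chart.axis.NumberTickUnit;
import org.jfree.chart.plot.PlotOrientation;
import org.jfree.chart.plot.XYPlot;
import org.jfree.chart.renderer.xy.XYDifferenceRenderer;
import org.jfree.data.function.Function2D;
import org.jfree.data.function.NormalDistributionFunction2D;
import org.jfree.data.general.DatasetUtilities;
import org.jfree.data.xy.XYDataset;
import org.jfree.data.xy.XYSeries;
import org.jfree.data.xy.XYSeriesCollection;

public class NormalDistribution {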

    double mean = 0.0, sd = 1.0;

    public NormalDistribution(JFrame frame) {
        XYDataset dataset = initDataset();
        JFreeChart chart = ChartFactory.createXYLineChart(
                "Normal Distribution",
                "X",
                "PDF",
                dataset,
                PlotOrientation.VERTICAL,
                false,
                false,
                false
        );

        XYPlot plot = chart.getXYPlot();
        NumberAxis domain = (NumberAxis) plot.getDomainAxis();
        domain.setAutoRangeStickyZero(false); //Fixes the margin issue with 0
        domain.setTickUnit(new NumberTickUnit(sd)); //Spacing on X-axis should be standard deviation + mean   

        XYSeriesCollection dataset1 = (XYSeriesCollection) dataset;

        XYSeries area1 = new XYSeries("area1");
        area1.add(-4, 0);
        area1.add(-1, 0);
        area1.add(-1, 0.4);
        area1.add(1, 0.4);
        area1.add(1, 0);
        area1.add(4, 0);
        dataset1.addSeries(area1);
        XYDifferenceRenderer renderer1 = new XYDifferenceRenderer();
        Color none = new Color(0, 0, 0, 0);
        renderer1.setNegativePaint(none);//hide the area where the values for the second series are higher
        renderer1.setSeriesPaint(1, none);//hide the outline of the "difference area"

        plot.setRenderer(0, renderer1);

        final ChartPanel chartPanel = new ChartPanel(chart);
        frame.getContentPane().add(chartPanel);
    }

    private XYDataset initDataset() {
        double minX = mean - (4 * sd), maxX = mean + (4 * sd);  //Minimum and Maximum values on X-axis (4 deviations)
        Function2D normal = new NormalDistributionFunction2D(mean, sd);
        XYDataset dataset = DatasetUtilities.sampleFunction2D(normal, minX, maxX, 100, "Normal");
        return dataset;
    }

    public static void main(String[] args) {
        JFrame frame = new JFrame("Normal Distribution Demo");
        new NormalDistribution(frame);
        frame.pack();
        frame.setDefaultCloseOperation(JFrame.EXIT_ON_CLOSE);
        frame.setVisible(true);
    }

}
Last edited by paradoxoff on Fri Oct 28, 2016 3:45 pm, edited 1 time in total.

Higeath
Posts: 9
Joined: Sun Oct 23, 2016 3:44 pm

Re: Filling area under normal distribution curve from

Post by Higeath » Fri Oct 28, 2016 3:34 pm

Paradoxoff, thanks, it works great.

I have one question, though: we are basically drawing a rectangle that rises above the highest Y value, so that it is not visible, correct?

paradoxoff
Posts: 1634
Joined: Sat Feb 17, 2007 1:51 pm

Re: Filling area under normal distribution curve from

Post by paradoxoff » Fri Oct 28, 2016 4:16 pm

Not exactly.
The "rectangle" is a step function that changes at the x values marking the left and right borders of the "uncolored center range".
The missing color in this range has two causes:
1. The area below the series 1 data is not painted at all, since the y values of the series 2 points are higher than those of the series 1 points.
2. The area above the series 1 data, or more precisely between the series 1 and series 2 data, is painted in a fully transparent color.
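The two points above can be modelled in a few lines of plain Java. This is only a sketch of the rule as described in this thread, not the renderer's actual code. The class and method names are hypothetical. The step borders of -1 and 1 and the top value of 0.4 are taken from the working example.

```java
// Hypothetical model of the shading rule; it does not call JFreeChart.
class DifferenceShadingSketch {

    /** Normal PDF with the given mean and standard deviation. */
    static double pdf(double x, double mean, double sd) {
        double z = (x - mean) / sd;
        return Math.exp(-0.5 * z * z) / (sd * Math.sqrt(2.0 * Math.PI));
    }

    /** Height of the step series at x: 'top' inside [x1, x2], 0 elsewhere. */
    static double step(double x, double x1, double x2, double top) {
        return (x >= x1 && x <= x2) ? top : 0.0;
    }

    /** True where the curve lies above the step series, i.e. where the positive paint applies. */
    static boolean isFilled(double x, double mean, double sd,
                            double x1, double x2, double top) {
        return pdf(x, mean, sd) > step(x, x1, x2, top);
    }

    public static void main(String[] args) {
        double top = 0.4; // must exceed the peak of the curve (about 0.399 for sd = 1)
        for (double x = -3.0; x <= 3.0; x += 1.0) {
            System.out.println("x = " + x + " filled: "
                    + isFilled(x, 0.0, 1.0, -1.0, 1.0, top));
        }
    }
}
```

Inside the step the curve is below the second series, so only the (transparent) negative paint would apply there. Outside the step the second series is 0 and the area under the curve is filled.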

Higeath
Posts: 9
Joined: Sun Oct 23, 2016 3:44 pm

Re: Filling area under normal distribution curve from

Post by Higeath » Fri Oct 28, 2016 4:36 pm

Paradoxoff, thanks. One last question, if I may: what if I want to clear the filling? I tried xyseries.clear(), but that makes the whole graph green, since we use XYDifferenceRenderer.

paradoxoff
Posts: 1634
Joined: Sat Feb 17, 2007 1:51 pm

Re: Filling area under normal distribution curve from

Post by paradoxoff » Fri Oct 28, 2016 4:42 pm

Either replace the XYDifferenceRenderer with a normal XYLineAndShapeRenderer, or call setPositivePaint(none) on the renderer, where "none" is the transparent color from my first post.

Higeath
Posts: 9
Joined: Sun Oct 23, 2016 3:44 pm

Re: Filling area under normal distribution curve from

Post by Higeath » Fri Oct 28, 2016 5:07 pm

Paradoxoff, thank you again. The one issue I've noticed is that if I set the positivePaint to none, then clear the XYSeries, add some more values, and change the positivePaint back to green, it fills the entire graph.

Edit: The issue must occur because I change the values in series 0; I basically change the mean/sd.

paradoxoff
Posts: 1634
Joined: Sat Feb 17, 2007 1:51 pm

Re: Filling area under normal distribution curve from

Post by paradoxoff » Fri Oct 28, 2016 6:40 pm

Whenever you change the data in the first series, you have to make sure that the values in the second series match the conditions that I mentioned in my first post.
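One way to keep the second series consistent is to rebuild its six points from the current mean and sd every time they change. The sketch below is plain Java with hypothetical names. The 4 standard deviation range matches the code in this thread. The 5% headroom above the peak is an assumption chosen for illustration.

```java
// Hypothetical helper; each returned row {x, y} would be added to the second series.
class AreaSeriesSync {

    /** Peak of the normal PDF, reached at x = mean. */
    static double peak(double sd) {
        return 1.0 / (sd * Math.sqrt(2.0 * Math.PI));
    }

    /** Step points: 0 outside [x1, x2], above the curve's peak inside it. */
    static double[][] stepPoints(double mean, double sd, double x1, double x2) {
        double top = peak(sd) * 1.05; // assumed 5% headroom above the curve
        return new double[][] {
            {mean - 4 * sd, 0.0},
            {x1, 0.0},
            {x1, top},
            {x2, top},
            {x2, 0.0},
            {mean + 4 * sd, 0.0}
        };
    }

    public static void main(String[] args) {
        // Recompute after a (hypothetical) change of mean and sd.
        for (double[] p : stepPoints(2.0, 0.5, 1.5, 2.5)) {
            System.out.println(p[0] + ", " + p[1]);
        }
    }
}
```

Because the top of the step is derived from sd, it stays above the curve even when a smaller sd makes the peak taller.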

Higeath
Posts: 9
Joined: Sun Oct 23, 2016 3:44 pm

Re: Filling area under normal distribution curve from

Post by Higeath » Fri Oct 28, 2016 6:51 pm

The values do match those requirements. I even tested it with only one series, going from -infinity to e.g. 4; when I then change the mean to something different, the graph is completely filled with green, and changing the value of X again, e.g. from 4 to 3, no longer affects the graph.

Whenever I update mean/sd, the dataset is updated:

Code:

    public void setMean(double mean) {
       this.mean = mean;
       plot.setDataset(initDataset());
    }
    private XYDataset initDataset() {
        double minX=mean-(4*sd),maxX=mean+(4*sd);  //Minimum and Maximum values on X-axis (4 deviations)
        Function2D normal = new NormalDistributionFunction2D(mean, sd);
        XYDataset dataset = DatasetUtilities.sampleFunction2D(normal, minX, maxX, 100, "Normal");
        return dataset;
    }

    public void shadeArea(double x1, double x2, boolean isBetween){
        System.out.println(x1+" "+x2);
        areaRenderer.setPositivePaint(Color.GREEN);
        area.clear();
        if(isBetween){
            double abovePDF = Math.ceil(dataset.getYValue(0,dataset.getItemCount(0)/2)*100)/100; //Value that is rounded up to 2 decimal places for the mean PDF value
            area.add(mean-sd*4, 0);
            area.add(x1, 0);
            area.add(x1, abovePDF);
            area.add(x2, abovePDF);
            area.add(x2, 0);
            area.add(mean+sd*4, 0);

        }else{
            area.add(x1, 0);
            area.add(x2, 0);
        }
    }


The values in the XYSeries are updated properly; the renderer just fills the whole area for some reason instead of taking the series into account.

Even after changing the mean from 1 to 2 (which fills the entire area) and then back to 1, the area is still completely filled.

paradoxoff
Posts: 1634
Joined: Sat Feb 17, 2007 1:51 pm

Re: Filling area under normal distribution curve from

Post by paradoxoff » Mon Oct 31, 2016 10:08 am

Do you have a complete snippet that is showing the wrong behaviour and that I can test?

paradoxoff
Posts: 1634
Joined: Sat Feb 17, 2007 1:51 pm

Re: Filling area under normal distribution curve from

Post by paradoxoff » Mon Oct 31, 2016 10:04 pm

You are creating a new XYSeriesCollection whenever the mean and sd change, and you are updating your XYSeries "area1".
I guess that the reason for the error is that you do not add "area1" to your new dataset.
Here is an example that behaves as expected. Note that I have used the proper math to calculate the maximum of the function instead of retrieving it from the dataset.

Code:

package jfree;

import java.awt.Color;
import org.jfree.chart.ChartFactory;
import org.jfree.chart.ChartPanel;
import org.jfree.chart.JFreeChart;
import org.jfree.chart.axis.NumberAxis;
import org.jfree.chart.axis.NumberTickUnit;
import org.jfree.chart.plot.PlotOrientation;
import org.jfree.chart.plot.XYPlot;
import org.jfree.chart.renderer.xy.XYDifferenceRenderer;
import org.jfree.data.function.Function2D;
import org.jfree.data.function.NormalDistributionFunction2D;
import org.jfree.data.general.DatasetUtilities;
import org.jfree.data.xy.XYDataset;
import org.jfree.data.xy.XYSeriesCollection;
import org.jfree.data.xy.XYSeries;
import javax.swing.JFrame;

/**
 *
 * @author peter
 */
public class NormalDistribution {

    private XYPlot plot;

    private XYSeriesCollection dataset;

    private NumberAxis domainAxis;

    public NormalDistribution(JFrame frame, double mean, double sd) {
        dataset = new XYSeriesCollection();
        JFreeChart chart = ChartFactory.createXYLineChart(
                "Normal Distribution",
                "X",
                "PDF",
                dataset,
                PlotOrientation.VERTICAL,
                false,
                false,
                false
        );

        plot = chart.getXYPlot();
        domainAxis = (NumberAxis) plot.getDomainAxis();
        domainAxis.setAutoRangeStickyZero(false); //Fixes the margin issue with 0

        XYDifferenceRenderer renderer1 = new XYDifferenceRenderer();
        Color none = new Color(0, 0, 0, 0);
        renderer1.setNegativePaint(none);//hide the area where the values for the second series are higher
        renderer1.setSeriesPaint(1, none);//hide the outline of the "difference area"

        plot.setRenderer(0, renderer1);
        updateData(mean, sd);
        final ChartPanel chartPanel = new ChartPanel(chart);
        frame.getContentPane().add(chartPanel);
    }

    private XYSeries createFunctionSeries(double mean, double sd) {
        double minX = mean - (4 * sd), maxX = mean + (4 * sd);  //Minimum and Maximum values on X-axis (4 deviations)
        Function2D normal = new NormalDistributionFunction2D(mean, sd);
        XYSeries functionSeries = DatasetUtilities.sampleFunction2DToSeries(normal, minX, maxX, 100, "Normal");
        return functionSeries;
    }

    private XYSeries createAreaSeries(double mean, double sd) {
        XYSeries result = new XYSeries("");
        double fmax = 1 / sd / Math.sqrt(2 * Math.PI) * 1.05;
        result.add(mean - 4 * sd, 0);
        result.add(mean - sd, 0);
        result.add(mean - sd, fmax);
        result.add(mean + sd, fmax);
        result.add(mean + sd, 0);
        result.add(mean + 4 * sd, 0);
        return result;
    }

    public void updateData(double mean, double sd) {
        dataset.removeAllSeries();
        dataset.addSeries(createFunctionSeries(mean, sd));
        dataset.addSeries(createAreaSeries(mean, sd));
        domainAxis.setTickUnit(new NumberTickUnit(sd));
    }

    public static void main(String[] args) {
        JFrame frame = new JFrame("Normal Distribution Demo");
        NormalDistribution nd = new NormalDistribution(frame, 0.0, 1.0);
        frame.pack();
        frame.setDefaultCloseOperation(JFrame.EXIT_ON_CLOSE);
        frame.setVisible(true);
        for (int i = 1; i < 20; i++) {
            nd.updateData(0, i / 5.0);
            try {
                Thread.sleep(2000);
            } catch (InterruptedException ie) {
                ie.printStackTrace();
            }
        }
    }

}
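
For reference, the `fmax` value in createAreaSeries follows from evaluating the normal density at its mean, where it peaks. The symbols below are the usual ones for mean and standard deviation. The factor 1.05 is the headroom used in the code above.

```latex
f(x) = \frac{1}{\sigma\sqrt{2\pi}}\,
       \exp\!\left(-\frac{(x-\mu)^2}{2\sigma^2}\right),
\qquad
f_{\max} = f(\mu) = \frac{1}{\sigma\sqrt{2\pi}}

% Example for \sigma = 1:
f_{\max} \approx 0.3989, \qquad 1.05\, f_{\max} \approx 0.4189
```

This also shows why the hard-coded 0.4 in the first example only works for sd = 1: for sd = 0.5 the peak is about 0.798, so a step height of 0.4 would fall below the curve.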

