# Draw an Ellipse Summary Plot in R

- A correlation chart of two sets (x and y) of data.
- Uses quantiles.
- Visualizes the effect of a factor.

Fig. 1. Ellipse summary plot

The ellipse plot belongs to the family of box-and-whisker plots and resembles the double box plot. An ellipse plot of the five-number summary is built with the following steps.

- Draw an ellipse inscribed in the quartile box of a double box plot.
- Draw an ellipse inscribed in the box spanned by the range of x and y.
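The two boxes in these steps can be computed from the five-number summaries of x and y. The following is a minimal sketch, assuming base R's `quantile()` with its default interpolation; the sample data and the names `quartile_box`, `range_box`, and `center` are hypothetical and are not taken from the elliplot source.

```r
# Hypothetical sample data; any two numeric vectors of equal length will do.
set.seed(1)
x <- rnorm(200)
y <- 0.5 * x + rnorm(200)

# Five-number summary: minimum, 1st quartile, median, 3rd quartile, maximum.
probs <- c(0, 0.25, 0.5, 0.75, 1)
qx <- quantile(x, probs = probs, names = FALSE)
qy <- quantile(y, probs = probs, names = FALSE)

# Each box is described by its lower and upper limits on x and y.
quartile_box <- list(x = qx[c(2, 4)], y = qy[c(2, 4)])  # Q1 to Q3
range_box    <- list(x = qx[c(1, 5)], y = qy[c(1, 5)])  # minimum to maximum
center       <- c(x = qx[3], y = qy[3])                 # medians
```

An ellipse is then inscribed in each box, with `center` as the shared reference point.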

Fig. 2. Ellipse plot overlaid on a double box plot

The centers of the ellipses are fixed at the median. To achieve that, an "ellipse" here is, strictly speaking, not a single ellipse.

Fig. 3. Four inscribed quarter ellipses

An ellipse here is synthesized from four quarter ellipses. As shown in Figure 3, a box is divided into four rectangles by the lines through the median. A quarter ellipse is drawn in each rectangle. The bluish rectangle indicates one of these quarters.
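The construction described above can be sketched in a few lines of R. This is an illustration only, not the code used by the author: the helper `quarter_ellipse_path()` and its arguments are hypothetical. It walks once around the center and, in each quadrant, uses the horizontal and vertical radius of the rectangle on that side of the median.

```r
# Hypothetical helper: outline of an "ellipse" built from four quarter ellipses.
# center: c(cx, cy), the medians; xlim, ylim: c(lower, upper) of the enclosing box.
quarter_ellipse_path <- function(center, xlim, ylim, n = 50) {
  theta <- seq(0, 2 * pi, length.out = 4 * n + 1)
  # Horizontal radius depends on the side: right or left of the median.
  rx <- ifelse(cos(theta) >= 0, xlim[2] - center[1], center[1] - xlim[1])
  # Vertical radius depends on the side: above or below the median.
  ry <- ifelse(sin(theta) >= 0, ylim[2] - center[2], center[2] - ylim[1])
  data.frame(x = center[1] + rx * cos(theta),
             y = center[2] + ry * sin(theta))
}

# Example with made-up limits and an off-center median.
path <- quarter_ellipse_path(center = c(0.2, -0.1),
                             xlim = c(-1, 2), ylim = c(-0.5, 1.5))
plot(path, type = "n")
rect(-1, -0.5, 2, 1.5, border = "grey")   # enclosing box
abline(v = 0.2, h = -0.1, lty = 3)        # lines through the median
polygon(path$x, path$y, border = "blue")  # synthesized ellipse
```

The four quarters join continuously because each one touches the box exactly where the lines through the median meet its edges.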

Ellipses are drawn outward from the center (median) to the quantiles.

For quartiles (five-number summary), the following two ellipses are drawn.

- 1st Quartile to 3rd Quartile (interquartile range)
- Minimum to Maximum (range)

For octiles (nine-number summary), the following four ellipses are drawn.

- 3rd Octile to 5th Octile
- 2nd Octile to 6th Octile (shown in Figure 3)
- 1st Octile to 7th Octile
- Minimum to Maximum
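The pairing of octiles in the list above can be written out as follows. This is a sketch for one axis, assuming base R's `quantile()`; the sample data and the name `bands` are hypothetical.

```r
set.seed(1)
x <- rnorm(500)  # hypothetical sample

# Nine-number summary: minimum, 1st to 7th octiles, maximum.
# Index 1 is the minimum, index 5 the median (4th octile), index 9 the maximum.
oct <- quantile(x, probs = seq(0, 1, by = 1/8), names = FALSE)

# Pair the k-th value below the median with the k-th value above it.
bands <- data.frame(
  label = c("3rd-5th octile", "2nd-6th octile", "1st-7th octile", "min-max"),
  lower = oct[4:1],
  upper = oct[6:9]
)
print(bands)
```

Repeating this for y gives the limits of the four nested boxes, each of which receives one ellipse.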

The benefits of the ellipse plot come from the benefits of a quantile summary. Figure 4 shows a comparison between ellipse plots and scatter plots.

Fig. 4. Comparison between ellipse plots and scatter plots

The shape of a distribution is clearer in the ellipse plot, because the information is organized. A raw scatter plot is too sparse when the sample size is small, and too crowded when the sample size is large (Figure 4).

Fig. 5. Comparison between ellipse plots and double box plot

Compared with a double box plot, an ellipse plot is useful for comparing distributions because its shape is simple. While a double box plot is tightly bound to the five-number summary, an ellipse plot can use other quantiles, such as the nine-number summary and more (Figure 5).

Fig. 6. Ellipse plot and histogram on each axis

An ellipse plot looks something like a combined histogram of the x and y axes (Figure 6).

But it is not.

Fig. 7. Ellipse plot for uniform distribution

Figure 7 shows an ellipse plot of a normally distributed data set (x) and a uniformly distributed (white noise) data set (y). Though the histogram along the y axis is flat, the ellipse plot shows a triangular peak along the y axis.

So an ellipse plot shows a steeper slope toward the center. It represents an integral from the center outward, that is, the cumulative distribution.
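One way to check this reading numerically is to look at the gaps between consecutive quantiles, which set the spacing of the nested ellipses along one axis. The sketch below uses simulated samples (hypothetical data, not the data behind Figure 7) and base R's `quantile()`.

```r
set.seed(1)
probs <- seq(0, 1, by = 1/8)
u <- runif(10000)  # uniform sample (hypothetical data)
z <- rnorm(10000)  # normal sample (hypothetical data)

# Gaps between consecutive octiles along one axis.
gap_u <- diff(quantile(u, probs, names = FALSE))
gap_z <- diff(quantile(z, probs, names = FALSE))

round(gap_u, 2)  # roughly equal gaps for the uniform sample
round(gap_z, 2)  # gaps narrow toward the median for the normal sample
```

Evenly spaced quantiles appear as evenly spaced contours, which read as a constant slope (a triangular profile) even though the histogram is flat. For the normal sample the contours crowd together near the median, which reads as a steeper slope at the center.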

This is awesome, how do you suppress the factor labels on the plot?

Hi, Will.

Unfortunately, there is no option to do that.

The only way at the moment is to check out the source and edit it to suppress the labels.

https://code.google.com/p/cowares-excel-hello/source/browse/trunk/contributed/cran/elliplot/R/ellipseplot.R

Factor labels are written by lines 152-154 of the source file at the above URL.

Hi Tomizono,

very nice visualization. Are you planning to create an R package out of your routines? I think it would be useful for many people.

Hi, Holger,

Yes, I will try that.

Thanks for your advice.

I made a package, elliplot, and released it.

https://tomizonor.wordpress.com/2013/09/28/elliplot-1-1-0/

I forgot to mention where to get the ellipse plot source. You can download it from http://code.google.com/p/cowares-excel-hello/wiki/ellipseplot_r

