## Interactive Main and Variance

This article tries to explain multiple concepts from statistics using a small Javascript illustration of the correlation of two variables.

The normal distribution in one dimension is described by the probability distribution

$$f(x) = \frac{1}{\sigma \sqrt{2\pi}} e^{-\frac{(x-\mu)^2}{2\sigma^2}}$$

where $$\mu$$ is the mean and $$\sigma$$ is the standard deviation. While the mean is easily understood, the standard deviation measures how spread out the distribution is. A large $$\sigma$$ implies that numbers further from the mean are more likely.

## The Interactive Experiment

You can add points to the 'canvas' below by clicking with the mouse. After adding a point the mean and variation is recomputed and displayed. In the image you can see the mean displayed as a red dot. Around the mean there is two ellipsis showing the 'iso-bar' for (\sigma) and (2\sigma) - something I'll explain in more details below.

The numerical values of the mean are shown in the first table next to the canvas. The second table shows the computed covariance.

### Points

You must use a browser which supports the canvas element.

### Mean and covariance

You may edit the mean and covariance before clicking generate. However, there is no checks to detect if the data is invallid.

xy

## Computations

The mean can be computed as

$$\bar{x} = \frac{1}{n} \sum_{i=1}^n x_i,$$ and $$\bar{y} = \frac{1}{n} \sum_{i=1}^n y_i.$$

However, we use an updating formula.

The elements of the covariance matrix are computed using the means via the following formulas

$$C_{11} = \frac{1}{n} \sum_{i=1}^n (x_i-\bar{x})^2$$

$$C_{22} = \frac{1}{n} \sum_{i=1}^n (y_i-\bar{x})^2$$

$$C_{12} = C_{21} = \frac{1}{n} \sum_{i=1}^n (x_i-\bar{x})(y_i-\bar{y}).$$

However, for illustration purposes I need something that corresponds to the standard deviation. Thus, I need to compute the square root of $$C$$. For an 2 by 2 matrix as $$C$$ there exists an easy formula, see e.g. this Wikipedia article and square root(s) of 2 by 2 matrices. If we define the square root of the determinant of $$C$$ as $$s = \sqrt{C_{11}C_{22}-C_{12}^2}$$. Observe that $$C$$ is positive definite which implies that the determinant is positive and that $$s$$ is a positive real number. Furthermore, define $$t = \sqrt{C_{11}+C_{22}+2s}$$. Then we have

$$S = \frac{1}{t} \left[ \begin{array}{cc} C_{11}+s & C_{12} \ C_{12} & C_{22}+s \end{array}\right]$$

The 'standard deviation' matrix is used to draw the ellipses. Essentially, the matrix is used to transform a unit circle. It is also used for the 'add points' functionality where random two numbers are taken from $$N(0,1)$$ to generate a random point $$[x \, y ]$$ and then transformed by multiplicatilon by $$S$$.

You can use the comment system below or send my an email mic.jacobsen@gmail.com