Ellipses of Inertia

$ \def \half {\frac{1}{2}} \def \hieruit {\quad \Longrightarrow \quad} \def \slechts {\quad \Longleftrightarrow \quad} \def \EN {\quad \mbox{and} \quad} \newcommand{\dq}[2]{\displaystyle \frac{\partial #1}{\partial #2}} \newcommand{\oq}[2]{\partial #1 / \partial #2} $ A measure for the width of the bell-shaped Gauss curve in one dimension is the spread called $\sigma$ : $$ \sigma = \sqrt{\sigma_{xx}} \qquad \mbox{in} \quad g(x) = \frac{1}{\sigma\sqrt{2\pi}} e^{-\half (x^2/\sigma^2) } \hieruit g(\sigma) = \frac{1}{\sigma\sqrt{2\pi}} e^{-\half} $$ Analogously, a sensible "width" of the 2-D skewed Gaussian distribution could be conceived as the value of the exponent which is such that: $$ g(x,y) = \frac{ e^{-\half \left[ \sigma_{yy} (x-\mu_x)^2 - 2 \sigma_{xy} (x-\mu_x) (y-\mu_y) + \sigma_{xx} (y-\mu_y)^2 \right] / ( \sigma_{xx} \sigma_{yy} - \sigma_{xy}^2 ) }} {2\pi\sqrt{\sigma_{xx}\sigma_{yy}-\sigma_{xy}^2 }} $$ $$ = \frac{1}{2\pi\sqrt{\sigma_{xx}\sigma_{yy}-\sigma_{xy}^2 }} \; e^{-\half} \hieruit $$ $$ \frac{ \sigma_{yy} (x-\mu_x)^2 - 2 \sigma_{xy} (x-\mu_x) (y-\mu_y) + \sigma_{xx} (y-\mu_y)^2 }{ \sigma_{xx}\sigma_{yy}-\sigma_{xy}^2 } = 1 $$ Thus, instead of being characterized by a single number, the "size" of the 2-D skewed Gaussian distribution is bound by a curve of the form $F(x,y)=1$. This outcome is far more transparent in eigenvector coordinates, where the same function is written as: $$ (x-\mu_x)^2 / \lambda_1 + (y-\mu_y)^2 / \lambda_2 = 1 $$ Substitute: $$ \begin{array}{lll} x := x-\mu_x & & a^2 := \lambda_1 \\ x := y-\mu_y & & b^2 := \lambda_2 \end{array} $$ Then the equation for the curve $F(x,y)=1$ is: $$ \left( \frac{x}{a} \right)^2 + \left( \frac{y}{b} \right)^2 = 1 $$ This means that the area of dominant values for the skewed 2-D Gauss function is bounded by an ellipse with (its midpoint in the centre) and its (half) axes equal to the square roots of the main moments of inertia. $$ a := \sqrt{\lambda_1} \EN b := \sqrt{\lambda_2} $$ Reason why this ellipse is commonly called the ellipse of inertia. Redefine the function $F(x,y)$ as: $$ F(x,y) = \sigma_{yy} (x-\mu_x)^2 - 2 \sigma_{xy} (x-\mu_x) (y-\mu_y) + \sigma_{xx} (y-\mu_y)^2 = \sigma_{xx}\sigma_{yy} - \sigma_{xy}^2 $$ Some general formulas for the (partial) differentiation of implicit functions are: $$ 0 = dF = \dq{F}{x} dx + \dq{F}{y} dy \hieruit \frac{dy}{dx} = - \frac{\oq{F}{x}}{\oq{F}{y}} \EN \frac{dx}{dy} = - \frac{\oq{F}{y}}{\oq{F}{x}} $$ Enabling us to find tangent lines, as follows: $$ \frac{dy}{dx} = 0 \slechts \dq{F}{x} = 0 \qquad \EN \qquad \frac{dx}{dy} = 0 \slechts \dq{F}{y} = 0 $$ Applied to the curve at hand: $$ F(x,y) = \sigma_{yy} (x-\mu_x)^2 - 2 \sigma_{xy} (x-\mu_x) (y-\mu_y) + \sigma_{xx} (y-\mu_y)^2 $$ First do $dy/dx$ for finding the horizontal tangents: $$ \dq{F}{x} = 0 \hieruit \sigma_{yy} (x-\mu_x) - \sigma_{xy} (y-\mu_y) = 0 \hieruit (x-\mu_x) = \frac{\sigma_{xy}}{\sigma_{yy}} (y-\mu_y) $$ Substitute into $F(x,y) = \sigma_{xx}\sigma_{yy} - \sigma_{xy}^2$ : $$ \sigma_{yy} \left[ \frac{\sigma_{xy}}{\sigma_{yy}} (y-\mu_y) \right]^2 - 2 \sigma_{xy} \left[ \frac{\sigma_{xy}}{\sigma_{yy}} (y-\mu_y) \right] (y-\mu_y) + \sigma_{xx} (y-\mu_y)^2 = \sigma_{xx}\sigma_{yy} - \sigma_{xy}^2 $$ $$ \slechts \frac{ (y-\mu_y)^2 }{\sigma_{yy}} \left[\sigma_{xy}^2 - 2 \sigma_{xy}^2 + \sigma_{xx}\sigma_{yy} \right] = \sigma_{xx}\sigma_{yy} - \sigma_{xy}^2 $$ $$ \slechts \frac{ (y-\mu_y)^2 }{\sigma_{yy}} = 1 \slechts y = \mu_y \pm \sqrt{\sigma_{yy}} $$ $$ \EN (x-\mu_x) = \frac{\sigma_{xy}}{\sigma_{yy}} (\pm \sqrt{\sigma_{yy}}) \hieruit x = \mu_x \pm \frac{\sigma_{xy}}{\sqrt{\sigma_{yy}}} $$ In very much the same way we find the vertical tangents: $$ \frac{dx}{dy} = 0 \slechts x = \mu_x \pm \sqrt{\sigma_{xx}} \EN y = \mu_y \pm \frac{\sigma_{xy}}{\sqrt{\sigma_{xx}}} $$ When visualizing the result, it is seen that the ellipse of inertia is enclosed by a Bounding Box which is defined by the diagonal: $$ (\mu_x-\sqrt{\sigma_{xx}},\mu_y-\sqrt{\sigma_{yy}}) - (\mu_x+\sqrt{\sigma_{xx}},\mu_y+\sqrt{\sigma_{yy}}) $$ Because Gauss functions are expensive to compute, it is desirable to have an estimate of the cut-off value, beyond which the values of Gauss function can be safely set to zero. Such is the case if, beyond certain values for $(x,y)$: $$ e^{ - \half \left[ \sigma_{yy} (x-\mu_x)^2 - 2 \sigma_{xy} (x-\mu_x) (y-\mu_y) + \sigma_{xx} (y-\mu_y)^2 \right] / ( \sigma_{xx} \sigma_{yy} - \sigma_{xy}^2 ) } < \epsilon \slechts $$ $$ - \half \frac{ \sigma_{yy} (x-\mu_x)^2 - 2 \sigma_{xy} (x-\mu_x) (y-\mu_y) + \sigma_{xx} (y-\mu_y)^2 } { \sigma_{xx} \sigma_{yy} - \sigma_{xy}^2 } < \ln(\epsilon) \slechts $$ $$ \frac{ \sigma_{yy} (x-\mu_x)^2 - 2 \sigma_{xy} (x-\mu_x) (y-\mu_y) + \sigma_{xx} (y-\mu_y)^2 } { \sigma_{xx} \sigma_{yy} - \sigma_{xy}^2 } > 2\, \ln(1/\epsilon) $$ Again, the outcome is far more transparent in eigenvector coordinates, where: $$ (x-\mu_x)^2 / \lambda_1 + (y-\mu_y)^2 / \lambda_2 > 2\, \ln(1 / \epsilon) $$ Substitute: $$ \begin{array}{lll} x := x-\mu_x & & a^2 := \lambda_1 . 2 \, \ln(1/\epsilon) \\ x := y-\mu_y & & b^2 := \lambda_2 . 2 \, \ln(1/\epsilon) \end{array} $$ Then the above condition for the uninteresting area can be written as: $$ \left( \frac{x}{a} \right)^2 + \left( \frac{y}{b} \right)^2 > 1 $$ This means that the area of interesting values for the Gauss function is bounded by still another ellipse of inertia with (half) axes: $$ a := \sqrt{ \lambda_1 . 2 \, \ln(1/\epsilon) } $$ $$ b := \sqrt{ \lambda_2 . 2 \, \ln(1/\epsilon) } $$ Hence it is possible to define a (Pascal) function for the part of the plane where values of the skewed 2-D Gauss function are worthwile to be calculated, under the assumption that the vector $mu$ and the tensor $sigma$ are known globally:
 function interesting(x,y : double) : boolean;
 var
   Det : double;
 begin
   Det := sigma.xx*sigma.yy - sqr(sigma.xy);
   interesting :=
   ( sigma.yy*sqr(x-mu.x) - 2*sigma.xy*(x-mu.x)*(y-mu.x) 
   + sigma.xx*sqr(y-mu.y) < Det * 2*ln(1/epsilon) );
 end;
Sometimes, it may be necessary to have kind of an estimate of the area which is covered by an ellipse of inertia, for the reason that the number of points (i.e.pixels) inside the area of interest is proportional to such an area. One could, for example, carry out a Flood Fill on "black" pixels inside. It is known that the area of an ellipse with axes $a$ and $b$ is given by: $$ \pi \, a.b = \pi \, \sqrt{\lambda_1} \sqrt{\lambda_2} = \pi \, \sqrt{ \lambda_1 \lambda_2 } = \ln(1/\epsilon) \; 2 \pi \, \sqrt{ \sigma_{xx} \sigma_{yy} - \sigma_{xy}^2 } $$ Hence the area of interest is exactly equal to the norming factor, times the natural logarithm of one divided by the desired accuracy. The Bounding Box diagonal of the ellipse of interest is given by: $$ (\mu_x - \sigma_x, \mu_y - \sigma_y) - (\mu_x + \sigma_x, \mu_y + \sigma_y) $$ $$ \mbox{where} \quad \sigma_x := \sqrt{ \sigma_{xx} . 2 \, \ln(1/\epsilon) } \EN \sigma_y := \sqrt{ \sigma_{yy} . 2 \, \ln(1/\epsilon) } $$ Resulting in a BB area which exceeds the area of the ellipse with a factor greater than $ > 8/2\pi $ .