Suppose we have a collection of points in the plane and we want to draw a
straight line through these points, in such a way that the line is a "best
fit".
An equation for a straight line can always be set up as follows:
$$
\cos(\theta) (x - p) + \sin(\theta) (y - q) = 0
$$
Here $\theta$ is the angle of the line's normal with the x-axis and $(p,q)$
is a point supporting the line. The distance of an arbitrary point $\vec{r}_k
= (x_k,y_k)$ to the line is given by the length of the projection of the point's
vector $\vec{r}_k$ onto the line. The normal $\vec{n}$ of the line is given by
$\vec{n} = (\cos(\theta),\sin(\theta))$. Hence the length of the projection is:
$$
\left| \frac{(\vec{r}_k \cdot \vec{n})}{(\vec{n} \cdot \vec{n})} \vec{n} \right| =
\left| \cos(\theta) (x_k - p) + \sin(\theta) (y_k - q) \right|
$$
For the straight line to be a "best fit", it will be required that the sum of
the weighted squares of all distances to the line shall be a minimum:
$$
\sum_k w_k \left[ \cos(\theta) (x_k - p) + \sin(\theta) (y_k - q) \right]^2
= \mbox{minimum}(p,q,\theta)
$$
Here the weights $w_k$ are chosen in such a way that $\sum_k w_k = 1$ .
Working out a bit:
$$
\cos^2(\theta) \sum_k w_k (x_k - p)^2 +
\sin^2(\theta) \sum_k w_k (y_k - q)^2 +
$$ $$
2 \sin(\theta) \cos(\theta) \sum_k w_k (x_k - p) (y_k - q)
= \mbox{minimum}(p,q,\theta)
$$
Let us solve just one part of the puzzle, namely: how the points $(p,q)$
must be selected in such a way that a minimum may be reached with respect to
this choice. For certain parts of the above expression this would mean that:
$$
\sum_{k} w_k (x_k - p)^2 = \mbox{minimum} \quad \mbox{and} \quad
\sum_{k} w_k (y_k - q)^2 = \mbox{minimum}
$$
It is sufficient to consider only the expression in $x$. Differentiate to $p$:
$$
\frac{d}{dp} \left[\sum_k w_k x_k^2 - 2 p \sum_k w_k x_k + p^2 \sum_k w_k
\right] = - 2 \sum_k w_k x_k + 2 p \sum_k w_k = 0
$$ $$
\quad \Longrightarrow \quad p = \mu_x = \sum_k w_k x_k
$$
And in very much the same way:
$$
q = \mu_y = \sum_k w_k y_k
$$
Define second order momenta with respect to the midpoint as usual:
$$
\sigma_{xx} = \sum_k w_k (x_k - \mu_x)^2 \quad \mbox{and} \quad
\sigma_{yy} = \sum_k w_k (y_k - \mu_y)^2
$$ $$
\sigma_{xy} = \sum_k w_k (x_k - \mu_x) (y_k - \mu_y)
$$
We can concentrate now on minimalization with respect to the angle $\theta$:
$$
\cos^2(\theta) \sigma_{xx} + \sin^2(\theta) \sigma_{yy}
+ 2 \sin(\theta) \cos(\theta) \sigma_{xy} = \mbox{minimum}(\theta)
$$
Extreme values may be found by differentiation to the independent variable:
$$
- 2 \sin(\theta) \cos(\theta) \sigma_{xx} + 2 \cos(\theta) \sin(\theta) \sigma_{yy}
+ 2 \cos^2(\theta) \sigma_{xy} - 2 \sin^2(\theta) \sigma_{xy} = 0
$$
Which leads to the equation:
$$
- \sin(2 \theta) (\sigma_{xx} - \sigma_{yy}) + \cos(2 \theta) 2 \sigma_{xy} = 0
$$
A solution is:
$$
\tan(2\theta) = \frac{2 \sigma_{xy}}{\sigma_{xx} - \sigma_{yy}}
\quad \Longrightarrow \quad \theta =
\frac{1}{2} \left[ \arctan\left(\frac{2 \sigma_{xy}}{\sigma_{xx} - \sigma_{yy}}\right) + k\cdot\pi \right]
$$
with $k = 0,1,2, \cdots$ . But the extreme value can still be a minimum or a maximum.
In order to decide about this, we have to evaluate the expression:
$$
M(\theta) = \cos^2(\theta) \sigma_{xx} + \sin^2(\theta) \sigma_{yy}
+ 2 \sin(\theta)\cos(\theta) \sigma_{xy}
$$
for two different values of $\,\theta$ , namely with $\,k=0,1$ .