Least squares method of data approximation
Data fitting by approximation involves obtaining a function that “comes close to” the data points but does not necessarily pass through all of them. In an earlier lesson you did this by “eyeballing” the data and drawing a line through it. Here we use mathematical concepts to define precisely what “comes close to” means, and we derive the linear least squares method that is commonly used for data fitting.
The basis of the linear least squares method is this: for each data point, we take the vertical difference between the data point and the approximating line, square that difference, and minimize the sum of all such squares. Minimization problems usually involve taking the derivative of a function and finding the values where that derivative is zero.
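The quantity being minimized can be sketched in a few lines of Python. This is only an illustration: the function name, the parameter names `slope` and `intercept`, and the data values are made up for this example and do not come from the lesson.

```python
def sum_of_squared_differences(xs, ys, slope, intercept):
    """Sum of squared vertical differences between the data and a candidate line."""
    total = 0.0
    for x, y in zip(xs, ys):
        # Vertical difference between the data point and the line at this x
        difference = y - (slope * x + intercept)
        total += difference ** 2
    return total


# Made-up illustrative data lying exactly on the line y = 2x + 1
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

exact = sum_of_squared_differences(xs, ys, 2.0, 1.0)   # the line through the data
poorer = sum_of_squared_differences(xs, ys, 1.5, 1.0)  # a line that fits less well
print(exact, poorer)
```

A line that fits the data better gives a smaller sum, which is why minimizing this sum is a reasonable definition of “comes close to”.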
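Suppose you have a set of n data points
$$(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$$
and you want to approximate this data with a linear function
$$f(x) = ax + b$$
that comes close to the data points. You must therefore find the values of a and b that best fit the data. First compute the error between each actual data point and the fitting function
$$e_i = y_i - (a x_i + b)$$
Then the sum of the squared errors for all the data points will be
$$z = \sum_{i=1}^{n} e_i^2 = \sum_{i=1}^{n} (y_i - a x_i - b)^2$$
The goal is to find the values of a and b that minimize z. Taking the partial derivative of z with respect to a (holding b constant) and setting it to zero yields
$$\frac{\partial z}{\partial a} = \sum_{i=1}^{n} -2 x_i (y_i - a x_i - b) = 0$$
Doing the same for b yields
$$\frac{\partial z}{\partial b} = \sum_{i=1}^{n} -2 (y_i - a x_i - b) = 0$$
Dividing each equation by -2 and factoring out the unknowns a and b gives us
$$a \sum_{i=1}^{n} x_i^2 + b \sum_{i=1}^{n} x_i = \sum_{i=1}^{n} x_i y_i$$
$$a \sum_{i=1}^{n} x_i + b\,n = \sum_{i=1}^{n} y_i$$
These are two equations in two unknowns, a and b. The coefficients may look strange, but they are known quantities because they consist only of the data values $x_i$ and $y_i$. So we rewrite these two equations in two unknowns as
$$\begin{pmatrix} \sum x_i^2 & \sum x_i \\ \sum x_i & n \end{pmatrix} \begin{pmatrix} a \\ b \end{pmatrix} = \begin{pmatrix} \sum x_i y_i \\ \sum y_i \end{pmatrix}$$
This system of two equations in two unknowns can now be solved for a and b using a standard approach such as elimination or Cramer's rule.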
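As a minimal sketch of that last step, the following Python function builds the sums that make up the coefficients and solves the two equations with Cramer's rule. It assumes the fitting function is written as f(x) = ax + b, with a the slope and b the intercept; the function name `fit_line` and the sample data are hypothetical, chosen only for illustration.

```python
def fit_line(xs, ys):
    """Return (a, b) minimizing the sum of squared errors for f(x) = a*x + b."""
    n = len(xs)
    # Known coefficients of the two equations, built only from the data values
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xx = sum(x * x for x in xs)
    sum_xy = sum(x * y for x, y in zip(xs, ys))

    # Determinant of the 2x2 system:
    #   a*sum_xx + b*sum_x = sum_xy
    #   a*sum_x  + b*n     = sum_y
    det = sum_xx * n - sum_x * sum_x
    if det == 0:
        raise ValueError("need at least two data points with distinct x values")

    # Cramer's rule for the two unknowns
    a = (sum_xy * n - sum_x * sum_y) / det
    b = (sum_xx * sum_y - sum_x * sum_xy) / det
    return a, b


# Made-up data lying exactly on y = 2x + 1, so the fit should recover a = 2, b = 1
a, b = fit_line([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0])
print(a, b)
```

The determinant check guards against the case where every x value is the same: the two equations are then no longer independent, and no unique line exists.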
This mathematical approach can be extended to include other approximating functions besides a linear function. However, the derivation for these least squares algorithms is beyond the scope of this lesson.