]>

Least squares method of data approximation

Data fitting by approximation involves obtaining a function that “comes close to” the data points, but does not necessarily pass through all data points. In an earlier lesson you did this by “eye-balling” the data and drawing a line through it. Here we answer the question: what is meant by “comes close to” by using mathematical concepts. We will derive the algorithm for the linear least squares method that is commonly used for data fitting.

The basis of the linear least squares method is this: for each data point, we look at the difference between the data point and the approximating line, square that difference, and minimize the sum of all such squares. We know that minimization problems usually involve taking a derivative of a function and finding the values where the resulting function is zero.

Suppose you have a set of data points ( x k , y k ) MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfeBSjuyZL2yd9gzLbvyNv2CaerbuLwBLnhiov2DGi1BTfMBaeXatLxBI9gBaerbd9wDYLwzYbItLDharqqtubsr4rNCHbGeaGqiVu0Je9sqqrpepC0xbbL8F4rqqrFfpeea0xe9Lq=Jc9vqaqpepm0xbba9pwe9Q8fs0=yqaqpepae9pg0FirpepeKkFr0xfr=xfr=xb9adbaqaaeGaciGaaiaabeqaamaabaabaaGcbaWaaeWaaeaacaWG4bWaaSbaaSqaaiaadUgaaeqaaOGaaiilaiaadMhadaWgaaWcbaGaam4AaaqabaaakiaawIcacaGLPaaaaaa@3C6D@ and you want to approximate this data with a linear function y ( x ) = a x + b MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfeBSjuyZL2yd9gzLbvyNv2CaerbuLwBLnhiov2DGi1BTfMBaeXatLxBI9gBaerbd9wDYLwzYbItLDharqqtubsr4rNCHbGeaGqiVu0Je9sqqrpepC0xbbL8F4rqqrFfpeea0xe9Lq=Jc9vqaqpepm0xbba9pwe9Q8fs0=yqaqpepae9pg0FirpepeKkFr0xfr=xfr=xb9adbaqaaeGaciGaaiaabeqaamaabaabaaGcbaGaamyEamaabmaabaGaamiEaaGaayjkaiaawMcaaiabg2da9iaadggacaWG4bGaey4kaSIaamOyaaaa@3E23@ that comes close to the data points. You must therefore find the values of a and b that best fit the data. First compute the error between the actual data points and the fitting function

e k = y k ( a x k + b ) MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfeBSjuyZL2yd9gzLbvyNv2CaerbuLwBLnhiov2DGi1BTfMBaeXatLxBI9gBaerbd9wDYLwzYbItLDharqqtubsr4rNCHbGeaGqiVu0Je9sqqrpepC0xbbL8F4rqqrFfpeea0xe9Lq=Jc9vqaqpepm0xbba9pwe9Q8fs0=yqaqpepae9pg0FirpepeKkFr0xfr=xfr=xb9adbaqaaeGaciGaaiaabeqaamaabaabaaGcbaGaamyzamaaBaaaleaacaWGRbaabeaakiabg2da9iaadMhadaWgaaWcbaGaam4AaaqabaGccqGHsisldaqadaqaaiaadggacaWG4bWaaSbaaSqaaiaadUgaaeqaaOGaey4kaSIaamOyaaGaayjkaiaawMcaaaaa@426F@

Then the sum of the errors for all the data points will be

z = e 1 2 + e 2 2 + e 3 2 + + e n 2 = k = 1 n [ y k ( a x k + b ) ] 2 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfeBSjuyZL2yd9gzLbvyNv2CaerbuLwBLnhiov2DGi1BTfMBaeXatLxBI9gBaerbd9wDYLwzYbItLDharqqtubsr4rNCHbGeaGqiVu0Je9sqqrpepC0xbbL8F4rqqrFfpeea0xe9Lq=Jc9vqaqpepm0xbba9pwe9Q8fs0=yqaqpepae9pg0FirpepeKkFr0xfr=xfr=xb9adbaqaaeGaciGaaiaabeqaamaabaabaaGcbaGaamOEaiabg2da9iaadwgadaqhaaWcbaGaaGymaaqaaiaaikdaaaGccqGHRaWkcaWGLbWaa0baaSqaaiaaikdaaeaacaaIYaaaaOGaey4kaSIaamyzamaaDaaaleaacaaIZaaabaGaaGOmaaaakiabgUcaRiabl+UimjabgUcaRiaadwgadaqhaaWcbaGaamOBaaqaaiaaikdaaaGccqGH9aqpdaaeWbqaamaadmaabaGaamyEamaaBaaaleaacaWGRbaabeaakiabgkHiTmaabmaabaGaamyyaiaadIhadaWgaaWcbaGaam4AaaqabaGccqGHRaWkcaWGIbaacaGLOaGaayzkaaaacaGLBbGaayzxaaWaaWbaaSqabeaacaaIYaaaaaqaaiaadUgacqGH9aqpcaaIXaaabaGaamOBaaqdcqGHris5aaaa@5B2B@

The goal is to find the values of a and b that minimize z. Taking the partial derivative of z with respect to a (holding b constant) yields

z a | b = 0 = k = 1 n 2 [ y k ( a x k + b ) ] x k MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfeBSjuyZL2yd9gzLbvyNv2CaerbuLwBLnhiov2DGi1BTfMBaeXatLxBI9gBaerbd9wDYLwzYbItLDharqqtubsr4rNCHbGeaGqiVu0Je9sqqrpepC0xbbL8F4rqqrFfpeea0xe9Lq=Jc9vqaqpepm0xbba9pwe9Q8fs0=yqaqpepae9pg0FirpepeKkFr0xfr=xfr=xb9adbaqaaeGaciGaaiaabeqaamaabaabaaGcbaWaaqGaaeaadaWcaaqaaiabgkGi2kaadQhaaeaacqGHciITcaWGHbaaaaGaayjcSdWaaSbaaSqaaiaadkgaaeqaaOGaeyypa0JaaGimaiabg2da9maaqahabaGaeyOeI0IaaGOmamaadmaabaGaamyEamaaBaaaleaacaWGRbaabeaakiabgkHiTmaabmaabaGaamyyaiaadIhadaWgaaWcbaGaam4AaaqabaGccqGHRaWkcaWGIbaacaGLOaGaayzkaaaacaGLBbGaayzxaaGaamiEamaaBaaaleaacaWGRbaabeaaaeaacaWGRbGaeyypa0JaaGymaaqaaiaad6gaa0GaeyyeIuoaaaa@5522@

Doing the same for b yields

z b | a = 0 = 2 k = 1 n [ y k ( a x k + b ) ] MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfeBSjuyZL2yd9gzLbvyNv2CaerbuLwBLnhiov2DGi1BTfMBaeXatLxBI9gBaerbd9wDYLwzYbItLDharqqtubsr4rNCHbGeaGqiVu0Je9sqqrpepC0xbbL8F4rqqrFfpeea0xe9Lq=Jc9vqaqpepm0xbba9pwe9Q8fs0=yqaqpepae9pg0FirpepeKkFr0xfr=xfr=xb9adbaqaaeGaciGaaiaabeqaamaabaabaaGcbaWaaqGaaeaadaWcaaqaaiabgkGi2kaadQhaaeaacqGHciITcaWGIbaaaaGaayjcSdWaaSbaaSqaaiaadggaaeqaaOGaeyypa0JaaGimaiabg2da9iabgkHiTiaaikdadaaeWbqaamaadmaabaGaamyEamaaBaaaleaacaWGRbaabeaakiabgkHiTmaabmaabaGaamyyaiaadIhadaWgaaWcbaGaam4AaaqabaGccqGHRaWkcaWGIbaacaGLOaGaayzkaaaacaGLBbGaayzxaaaaleaacaWGRbGaeyypa0JaaGymaaqaaiaad6gaa0GaeyyeIuoaaaa@5314@

Dividing each equation by -2 and factoring out the unknowns a and b gives us

a k = 1 n x k 2 + b k = 1 n x k = k = 1 n x k y k a k = 1 n x k + b n = k = 1 n y k MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfeBSjuyZL2yd9gzLbvyNv2CaerbuLwBLnhiov2DGi1BTfMBaeXatLxBI9gBaerbd9wDYLwzYbItLDharqqtubsr4rNCHbGeaGqiVu0Je9sqqrpepC0xbbL8F4rqqrFfpeea0xe9Lq=Jc9vqaqpepm0xbba9pwe9Q8fs0=yqaqpepae9pg0FirpepeKkFr0xfr=xfr=xb9adbaqaaeGaciGaaiaabeqaamaabaabaaGceaqabeaacaWGHbWaaabCaeaacaWG4bWaa0baaSqaaiaadUgaaeaacaaIYaaaaaqaaiaadUgacqGH9aqpcaaIXaaabaGaamOBaaqdcqGHris5aOGaey4kaSIaamOyamaaqahabaGaamiEamaaBaaaleaacaWGRbaabeaaaeaacaWGRbGaeyypa0JaaGymaaqaaiaad6gaa0GaeyyeIuoakiabg2da9maaqahabaGaamiEamaaBaaaleaacaWGRbaabeaakiaadMhadaWgaaWcbaGaam4AaaqabaaabaGaam4Aaiabg2da9iaaigdaaeaacaWGUbaaniabggHiLdaakeaacaWGHbWaaabCaeaacaWG4bWaaSbaaSqaaiaadUgaaeqaaaqaaiaadUgacqGH9aqpcaaIXaaabaGaamOBaaqdcqGHris5aOGaey4kaSIaamOyaiaad6gacqGH9aqpdaaeWbqaaiaadMhadaWgaaWcbaGaam4AaaqabaaabaGaam4Aaiabg2da9iaaigdaaeaacaWGUbaaniabggHiLdaaaaa@691F@

These are two equations in two unknowns, a and b. The coefficients appear to be strange but they are known quantities because they consist of the data values ( x k , y k ) MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfeBSjuyZL2yd9gzLbvyNv2CaerbuLwBLnhiov2DGi1BTfMBaeXatLxBI9gBaerbd9wDYLwzYbItLDharqqtubsr4rNCHbGeaGqiVu0Je9sqqrpepC0xbbL8F4rqqrFfpeea0xe9Lq=Jc9vqaqpepm0xbba9pwe9Q8fs0=yqaqpepae9pg0FirpepeKkFr0xfr=xfr=xb9adbaqaaeGaciGaaiaabeqaamaabaabaaGcbaWaaeWaaeaacaWG4bWaaSbaaSqaaiaadUgaaeqaaOGaaiilaiaadMhadaWgaaWcbaGaam4AaaqabaaakiaawIcacaGLPaaaaaa@3C6D@ . So we rewrite these two equations and two unknowns as

( k = 1 n x k 2 ) a + ( k = 1 n x k ) b = ( k = 1 n x k y k ) ( k = 1 n x k ) a + ( n ) b = ( k = 1 n y k ) MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfeBSjuyZL2yd9gzLbvyNv2CaerbuLwBLnhiov2DGi1BTfMBaeXatLxBI9gBaerbd9wDYLwzYbItLDharqqtubsr4rNCHbGeaGqiVu0Je9sqqrpepC0xbbL8F4rqqrFfpeea0xe9Lq=Jc9vqaqpepm0xbba9pwe9Q8fs0=yqaqpepae9pg0FirpepeKkFr0xfr=xfr=xb9adbaqaaeGaciGaaiaabeqaamaabaabaaGceaqabeaadaqadaqaamaaqahabaGaamiEamaaDaaaleaacaWGRbaabaGaaGOmaaaaaeaacaWGRbGaeyypa0JaaGymaaqaaiaad6gaa0GaeyyeIuoaaOGaayjkaiaawMcaaiaadggacqGHRaWkdaqadaqaamaaqahabaGaamiEamaaBaaaleaacaWGRbaabeaaaeaacaWGRbGaeyypa0JaaGymaaqaaiaad6gaa0GaeyyeIuoaaOGaayjkaiaawMcaaiaadkgacqGH9aqpdaqadaqaamaaqahabaGaamiEamaaBaaaleaacaWGRbaabeaakiaadMhadaWgaaWcbaGaam4AaaqabaaabaGaam4Aaiabg2da9iaaigdaaeaacaWGUbaaniabggHiLdaakiaawIcacaGLPaaaaeaadaqadaqaamaaqahabaGaamiEamaaBaaaleaacaWGRbaabeaaaeaacaWGRbGaeyypa0JaaGymaaqaaiaad6gaa0GaeyyeIuoaaOGaayjkaiaawMcaaiaadggacqGHRaWkdaqadaqaaiaad6gaaiaawIcacaGLPaaacaWGIbGaeyypa0ZaaeWaaeaadaaeWbqaaiaadMhadaWgaaWcbaGaam4AaaqabaaabaGaam4Aaiabg2da9iaaigdaaeaacaWGUbaaniabggHiLdaakiaawIcacaGLPaaaaaaa@725F@

These two equations and two unknowns can now be solved for a and b using the standard approach.

This mathematical approach can be extended to include other approximating functions besides a linear function. However, the derivation for these least squares algorithms is beyond the scope of this lesson.