
# Lecture

📗 The lecture is in person, but you can join Zoom: 8:50-9:40 or 11:00-11:50. Zoom recordings can be viewed on Canvas -> Zoom -> Cloud Recordings. They will be moved to Kaltura over the weekends.
📗 The in-class (participation) quizzes should be submitted on TopHat (Code:741565), but you can submit your answers through Form at the end of the lectures too.
📗 The Python notebooks used during the lectures can also be found on: GitHub. They will be updated weekly.


# Lecture Notes

📗 Least Squares Regression
➭ If the label \(y\) is continuous, it can still be predicted using \(\hat{f}\left(x'\right) = w_{1} x'_{1} + w_{2} x'_{2} + ... + w_{m} x'_{m} + b\).
scipy.linalg.lstsq(x, y) can be used to find the weights \(w\) and the bias \(b\): Doc.
➭ It computes the least-squares solution to \(X w = y\), or the \(w\) such that \(\left\|y - X w\right\|^{2}\) = \(\displaystyle\sum_{i=1}^{n} \left(y_{i} - w_{1} x_{i1} - w_{2} x_{i2} - ... - w_{m} x_{im} - b\right)^{2}\) is minimized (see the sketch after the table below).
sklearn.linear_model.LinearRegression performs the same linear regression.

| Item | Input (Features) | Output (Labels) | Notes |
|------|------------------|------------------|-------|
| 1 | \(\left(x_{11}, x_{12}, ..., x_{1m}\right)\) | \(y_{1} \in \mathbb{R}\) | training data |
| 2 | \(\left(x_{21}, x_{22}, ..., x_{2m}\right)\) | \(y_{2} \in \mathbb{R}\) | - |
| 3 | \(\left(x_{31}, x_{32}, ..., x_{3m}\right)\) | \(y_{3} \in \mathbb{R}\) | - |
| ... | ... | ... | ... |
| n | \(\left(x_{n1}, x_{n2}, ..., x_{nm}\right)\) | \(y_{n} \in \mathbb{R}\) | used to figure out \(y \approx \hat{f}\left(x\right)\) |
| new | \(\left(x'_{1}, x'_{2}, ..., x'_{m}\right)\) | \(y' \in \mathbb{R}\) | guess \(y' = \hat{f}\left(x'\right)\) |
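
➭ A minimal sketch of this setup, with a small made-up data set (\(n = 4\) items, \(m = 2\) features); a column of ones is appended to the features so the last entry of the solution acts as the bias \(b\).

```python
import numpy as np
from scipy.linalg import lstsq

# made-up training data: n = 4 items, m = 2 features each
x = np.array([[1.0, 2.0],
              [2.0, 0.0],
              [3.0, 1.0],
              [4.0, 3.0]])
y = np.array([6.0, 5.0, 8.0, 13.0])

# append a column of ones so the last weight acts as the bias b
X = np.column_stack([x, np.ones(len(x))])

w, residues, rank, sv = lstsq(X, y)   # minimizes ||y - X w||^2
print("weights:", w[:-1], "bias:", w[-1])

# prediction for a new item x' = (2, 2)
x_new = np.array([2.0, 2.0, 1.0])
print("prediction:", x_new @ w)
```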




📗 Design Matrix
➭ \(X\) is a matrix with \(n\) rows and \(m + 1\) columns, called the design matrix, where each row of \(X\) is a list of features of a training item plus a \(1\) at the end, meaning row \(i\) of \(X\) is \(\left(x_{i1}, x_{i2}, x_{i3}, ..., x_{im}, 1\right)\).
➭ The transpose of \(X\), denoted by \(X^\top\), flips the matrix over its diagonal, which means each column of \(X^\top\) is a training item with a \(1\) at the bottom.
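
➭ A short sketch of how the design matrix could be assembled with numpy; the feature values are made up for illustration.

```python
import numpy as np

# made-up raw features: n = 4 items, m = 2 features
x = np.array([[1.0, 2.0],
              [2.0, 0.0],
              [3.0, 1.0],
              [4.0, 3.0]])

# design matrix: append a column of ones, giving n rows and m + 1 columns
X = np.hstack([x, np.ones((len(x), 1))])
print(X.shape)   # (4, 3), that is (n, m + 1)

# each column of the transpose is one training item with a 1 at the bottom
print(X.T)
```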

📗 Matrix Inversion
➭ \(X w = y\) can be solved using \(w = y / X\) (not proper notation) or \(w = X^{-1} y\) only if \(X\) is square and invertible.
➭ \(X\) has \(n\) rows and \(m + 1\) columns, so it is usually not square and thus not invertible.
➭ \(X^\top X\) has \(m + 1\) rows and \(m + 1\) columns and is invertible if \(X\) has linearly independent columns (the features are not linearly related).
➭ \(X^\top X w = X^\top y\) is used instead of \(X w = y\), which can be solved as \(w = \left(X^\top X\right)^{-1} \left(X^\top y\right)\).
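
➭ A sketch of the normal equations on a small made-up design matrix, checking that the result agrees with scipy.linalg.lstsq.

```python
import numpy as np
from scipy.linalg import solve, lstsq

# made-up design matrix (last column of ones) and labels
X = np.array([[1.0, 2.0, 1.0],
              [2.0, 0.0, 1.0],
              [3.0, 1.0, 1.0],
              [4.0, 3.0, 1.0]])
y = np.array([6.0, 5.0, 8.0, 13.0])

# normal equations: (X^T X) w = X^T y
w = solve(X.T @ X, X.T @ y)
print(w)

# should match the least squares solution from lstsq
print(np.allclose(w, lstsq(X, y)[0]))   # True when X has independent columns
```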

📗 Matrix Inverses
scipy.linalg.inv(A) can be used to compute the inverse of A: Doc.
scipy.linalg.solve(A, b) can be used to solve for \(w\) in \(A w = b\) and is faster than computing the inverse: Doc.
➭ The reason is that computing the inverse is effectively solving \(A w = e_{1}\), \(A w = e_{2}\), ... \(A w = e_{n}\), where \(e_{j}\) is the vector with \(1\) at position \(j\) and \(0\) everywhere else, for example, \(e_{1} = \begin{bmatrix} 1 \\ 0 \\ 0 \\ ... \end{bmatrix}\), \(e_{2} = \begin{bmatrix} 0 \\ 1 \\ 0 \\ ... \end{bmatrix}\), \(e_{3} = \begin{bmatrix} 0 \\ 0 \\ 1 \\ ... \end{bmatrix}\) ...
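
➭ To illustrate this point, the sketch below recovers the inverse of a small made-up matrix by solving one system per standard basis vector.

```python
import numpy as np
from scipy.linalg import inv, solve

A = np.array([[2.0, 1.0],
              [1.0, 3.0]])

# solving A w = e_j for every standard basis vector recovers the columns of A^{-1}
E = np.eye(2)                       # columns are e_1, e_2
columns = [solve(A, E[:, j]) for j in range(2)]
A_inv = np.column_stack(columns)

print(np.allclose(A_inv, inv(A)))   # True
```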

Grade Regression Example ➭ Find the linear relationship between exam 1 and exam 2 grades.
➭ Code for simple linear regression: Notebook.
➭ Code for multiple regression: Notebook.



📗 LU Decomposition
➭ A square matrix \(A\) can be written as \(A = L U\), where \(L\) is a lower triangular matrix and \(U\) is an upper triangular matrix.
➭ For example, if \(A\) is 3 by 3, then \(\begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix} = \begin{bmatrix} l_{11} & 0 & 0 \\ l_{21} & l_{22} & 0 \\ l_{31} & l_{32} & l_{33} \end{bmatrix} \begin{bmatrix} u_{11} & u_{12} & u_{13} \\ 0 & u_{22} & u_{23} \\ 0 & 0 & u_{33} \end{bmatrix}\).
➭ Sometimes, a permutation matrix is required to reorder the rows of \(A\), so \(P A = L U\) is used, where \(P\) is a permutation matrix (reordering of the rows of the identity matrix \(I\)).
scipy.linalg.lu(A) can be used to find \(P, L, U\) matrices: Doc.
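
➭ A small made-up example of scipy.linalg.lu; note that SciPy returns \(P, L, U\) with \(A = P L U\), which matches the \(P A = L U\) form above after transposing the permutation matrix.

```python
import numpy as np
from scipy.linalg import lu

# made-up matrix whose leading zero forces a row reordering
A = np.array([[0.0, 2.0, 1.0],
              [1.0, 1.0, 0.0],
              [2.0, 0.0, 3.0]])

P, L, U = lu(A)                         # SciPy's convention: A = P @ L @ U
print(np.allclose(A, P @ L @ U))        # True
print(np.allclose(P.T @ A, L @ U))      # True: P^T A = L U matches the P A = L U form
```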

📗 LU Decomposition Solve
➭ Solving \(A w = b\) and \(A w = c\) with two separate calls to solve computes the same LU decomposition of \(A\) twice.
➭ It is faster to compute the LU decomposition once and then solve using the LU matrices instead of \(A\).
scipy.linalg.lu_factor(A) can be used to compute the LU factorization of \(A\) (the \(L\) and \(U\) matrices packed into one array, plus the pivot indices): Doc.
scipy.linalg.lu_solve((lu, p), b) can be used to solve \(A w = b\) where lu is the LU decomposition and p is the permutation.

📗 Comparison for Solving Multiple Systems
➭ To solve \(A w = b\), \(A w = c\) for square invertible \(A\):

| Method | Procedure | Speed comparison |
|--------|-----------|------------------|
| 1 | inv(A) @ b then inv(A) @ c | Slow |
| 2 | solve(A, b) then solve(A, c) | Fast |
| 3 | lu, p = lu_factor(A) then lu_solve((lu, p), b) then lu_solve((lu, p), c) | Faster |


➭ When \(A = X^\top X\) and \(b = X^\top y\), solving \(A w = b\) gives the solution to the linear regression problem. If the same features are used to make predictions for several different target variables, it is faster to factor \(X^\top X\) once with lu_factor and reuse lu_solve for each target, as in the sketch below.
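
➭ A sketch of this pattern with a made-up design matrix and two different label vectors: the factorization of \(X^\top X\) is computed once and reused.

```python
import numpy as np
from scipy.linalg import lu_factor, lu_solve, solve

# made-up design matrix (last column of ones) and two different label vectors
X = np.array([[1.0, 2.0, 1.0],
              [2.0, 0.0, 1.0],
              [3.0, 1.0, 1.0],
              [4.0, 3.0, 1.0]])
y1 = np.array([6.0, 5.0, 8.0, 13.0])
y2 = np.array([3.0, 4.0, 6.0, 9.0])

A = X.T @ X                    # same A for both regressions

# factor once, then reuse the factorization for each right hand side
lu, piv = lu_factor(A)
w1 = lu_solve((lu, piv), X.T @ y1)
w2 = lu_solve((lu, piv), X.T @ y2)

# same answers as calling solve twice, but A is only factored once
print(np.allclose(w1, solve(A, X.T @ y1)))   # True
print(np.allclose(w2, solve(A, X.T @ y2)))   # True
```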



📗 Numerical Instability
➭ Division by a small number close to 0 may lead to inaccurate answers.
➭ Inverting a matrix that is close to singular, or solving a system with such a matrix, can lead to inaccurate solutions too.
➭ How close a matrix is to being singular is usually measured by its condition number, not its determinant.
numpy.linalg.cond can be used to find the condition number: Doc.
➭ A larger condition number means the solution can be more inaccurate.
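
➭ A small made-up comparison: a well-conditioned matrix versus one whose rows are nearly linearly dependent.

```python
import numpy as np

# a well conditioned matrix versus a nearly singular one
A = np.array([[2.0, 0.0],
              [0.0, 1.0]])
B = np.array([[1.0, 1.0],
              [1.0, 1.0001]])      # second row almost equals the first

print(np.linalg.cond(A))           # small: solutions are reliable
print(np.linalg.cond(B))           # very large: solutions may be inaccurate
```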

TopHat Invertibility Discussion ➭ Code to invert matrices: Notebook.
➭ Discuss what should be the solution and why Python computes it incorrectly.

📗 Multicollinearity
➭ In linear regression, a large condition number of the design matrix is related to multicollinearity.
➭ Multicollinearity occurs when multiple features are highly linearly correlated.
➭ One simple rule of thumb is that the regression has multicollinearity if the condition number is larger than 30.
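
➭ A sketch with made-up data in which one feature is almost a multiple of another, so the design matrix's condition number ends up far above 30.

```python
import numpy as np

n = 100
rng = np.random.default_rng(0)
x1 = rng.normal(size=n)
x2 = 2 * x1 + rng.normal(scale=0.01, size=n)   # almost a multiple of x1
X = np.column_stack([x1, x2, np.ones(n)])      # design matrix with a column of ones

# a condition number far above 30 suggests multicollinearity
print(np.linalg.cond(X))
```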




📗 Notes and code adapted from the course taught by Yiyin Shen Link and Tyler Caraza-Harter Link






Last Updated: April 29, 2024 at 1:10 AM