Linear Algebra


  • A vector $v \in \mathbb{R}^n$ is a tuple of $n$ real numbers.
  • A matrix $A \in \mathbb{R}^{m \times n}$ is a collection of $m \times n$ real numbers arranged in $m$ rows and $n$ columns.

Operations

  • Vector
    • $+$, $-$, $\odot$, and scalar multiplication ($\alpha \cdot$) are element-wise operations.
    • Norm: $||v|| = \sqrt{\sum_{i=1}^n v_i^2}$
    • Inner product: $v \cdot w = \sum_{i=1}^n v_i \cdot w_i = ||v|| \cdot ||w|| \cdot \cos(\theta)$
    • Cross product:
    $v \times w = (v_1, v_2, v_3) \times (w_1, w_2, w_3) = (v_2 w_3 - v_3 w_2,\ v_3 w_1 - v_1 w_3,\ v_1 w_2 - v_2 w_1)$, with $||v \times w|| = ||v|| \cdot ||w|| \cdot \sin(\theta)$
    • The cross product is anti-commutative: $v \times w = -(w \times v)$.
  • Matrix
    • $+$, $-$, $\odot$, and scalar multiplication ($\alpha \cdot$) are element-wise operations.
    • Matrix multiplication: $C = A \cdot B$, where $c_{ij} = \sum_{k} a_{ik} b_{kj}$; defined when the number of columns of $A$ equals the number of rows of $B$.
    • Matrix transpose: $A^T$: interchange rows and columns, i.e. $(A^T)_{ij} = a_{ji}$.
    • Trace of a matrix: $tr(A) = \sum_{i=1}^n a_{ii}$
    • Determinant of a $3 \times 3$ matrix: $\det\left(\begin{bmatrix} a & b & c \\ d & e & f \\ g & h & i \end{bmatrix}\right) = a (e i - f h) - b (d i - f g) + c (d h - e g)$
    • Minor of a matrix $M_{i, j}$: defined for each element of the matrix, the minor is the determinant of the submatrix that remains after removing the row and the column containing that element.
    • Cofactor of a minor: $C_{i, j} = (-1)^{i+j} M_{i, j}$
    • Adjoint of a matrix: $Adj(A) = [C_{i, j}]^T$, where $C_{i, j}$ is the cofactor of $A_{i, j}$
    • Inverse of a matrix: $A^{-1} = \frac{1}{\det(A)} Adj(A)$
      • The inverse does not exist when $\det(A) = 0$, which means the matrix is not full rank: one of the rows (or columns) is a linear combination of the other rows (or columns).
  • Matrix-Vector
    • $y = A \cdot x$: Multiplying matrix $A$ by vector $x$ produces a linear combination of the columns of $A$, weighted by the entries of $x$. This introduces the concept of a linear transformation. (See the NumPy sketch after this list.)
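
A minimal NumPy sketch of the operations above; the particular vectors and matrices are only illustrative.

```python
import numpy as np

v = np.array([1.0, 2.0, 3.0])
w = np.array([4.0, 5.0, 6.0])

# Element-wise vector operations and scalar multiplication
print(v + w, v - w, v * w, 2.0 * v)
print(np.linalg.norm(v))                      # ||v|| = sqrt(sum(v_i^2))

# Inner and cross product
print(np.dot(v, w))                           # sum(v_i * w_i)
print(np.cross(v, w))                         # (v2 w3 - v3 w2, v3 w1 - v1 w3, v1 w2 - v2 w1)

A = np.array([[4.0, 7.0],
              [2.0, 6.0]])
B = np.array([[1.0, 2.0],
              [3.0, 4.0]])

print(A @ B)                                  # matrix multiplication
print(A.T)                                    # transpose
print(np.trace(A))                            # sum of diagonal entries
print(np.linalg.det(A))                       # determinant
print(np.linalg.inv(A))                       # inverse (exists since det(A) != 0)

# Matrix-vector product: a linear combination of the columns of A
x = np.array([3.0, -1.0])
print(A @ x)
print(3.0 * A[:, 0] - 1.0 * A[:, 1])          # same result
```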

TODO: Add properties of different types of matrices (diagonal, symmetric, etc)
TODO: Add notes on solving linear equations

Linear Transformation

Applying a matrix $A$ to a vector $x$ can be viewed as a linear transformation of the vector. A linear transformation is a mapping that preserves the following properties (checked numerically in the sketch after the list):

  • $A (x + y) = A x + A y$
  • $A (\alpha x) = \alpha A x$
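
A quick numerical check of the two properties, using arbitrary illustrative values:

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [0.0, 3.0]])
x = np.array([1.0, -2.0])
y = np.array([4.0, 0.5])
alpha = 2.5

# Additivity: A(x + y) == Ax + Ay
print(np.allclose(A @ (x + y), A @ x + A @ y))        # True

# Homogeneity: A(alpha x) == alpha (A x)
print(np.allclose(A @ (alpha * x), alpha * (A @ x)))  # True
```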

TODO: Add notes about translation, rotation, scaling, reflection?

Eigenvalues and Eigenvectors

For a square matrix $A$, an eigenvalue $\lambda$ and its eigenvector $v \neq 0$ satisfy $A v = \lambda v$. In other words, transforming $v$ by $A$ is equivalent to scaling $v$ by the eigenvalue $\lambda$, so the direction of $v$ is preserved.

  • The sum of the eigenvalues is the trace of the matrix, $tr(A)$.
  • The product of the eigenvalues is the determinant of the matrix, $\det(A)$.
  • The eigenvalues are the roots of the characteristic polynomial $\det(A - \lambda I) = 0$ (this holds for any square matrix, not just invertible ones).
  • If $A$ is invertible, the eigenvalues of $A^{-1}$ are the reciprocals of the eigenvalues of $A$, and the eigenvectors are the same.
  • The eigenvalues of $A^T$ are the same as the eigenvalues of $A$.
  • If the matrix is singular, $\det(A) = 0$, at least one eigenvalue is $0$.
  • For a symmetric matrix $A$, the eigenvalues are real and the eigenvectors can be chosen to be orthogonal. (See the sketch below.)
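
A short NumPy sketch verifying several of these properties on a small symmetric matrix (values chosen only for illustration):

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 3.0]])                    # symmetric, so eigenvalues are real

eigvals, eigvecs = np.linalg.eig(A)           # columns of eigvecs are eigenvectors

# A v = lambda v for each eigenpair
for lam, v in zip(eigvals, eigvecs.T):
    print(np.allclose(A @ v, lam * v))        # True

# Sum of eigenvalues == trace, product == determinant
print(np.isclose(eigvals.sum(), np.trace(A)))         # True
print(np.isclose(eigvals.prod(), np.linalg.det(A)))   # True

# Eigenvectors of the symmetric matrix are orthonormal
print(np.allclose(eigvecs.T @ eigvecs, np.eye(2)))    # True
```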

Diagonalization

A square matrix $A$ is said to be diagonalizable if there exists an invertible matrix $P$ such that $A = PDP^{-1}$, where $D$ is a diagonal matrix.

Consider the matrix $P = \begin{bmatrix} v_1 & v_2 & \ldots \end{bmatrix}$, where $v_1, v_2, \ldots$ are linearly independent eigenvectors of $A$ (if the eigenvalues are distinct, the corresponding eigenvectors are linearly independent). Then, by definition,

$$
\begin{aligned}
A \cdot P &= A \cdot \begin{bmatrix} v_1 & v_2 & \ldots \end{bmatrix} \\
&= \begin{bmatrix} \lambda_1 v_1 & \lambda_2 v_2 & \ldots \end{bmatrix} \\
&= \begin{bmatrix} v_1 & v_2 & \ldots \end{bmatrix} \cdot \begin{bmatrix} \lambda_1 & 0 & \ldots \\ 0 & \lambda_2 & \ldots \\ \vdots & \vdots & \ddots \end{bmatrix} \\
&= P \cdot D
\end{aligned}
$$

Multiplying both sides by $P^{-1}$ on the right,

$$
A \cdot P \cdot P^{-1} = P \cdot D \cdot P^{-1} \implies A = P \cdot D \cdot P^{-1}
$$

Therefore, $A$ is diagonalizable using its linearly independent eigenvectors and the diagonal matrix of eigenvalues. If $A$ is not diagonalizable, it is called a defective matrix. One reason this can happen: if the eigenvalues are repeated, the eigenvectors might not be linearly independent, so there may be too few independent eigenvectors to form $P$. (TODO: More information on this is confusing)
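
A minimal NumPy sketch of the factorization, assuming a small matrix with distinct eigenvalues (the values are only illustrative):

```python
import numpy as np

A = np.array([[4.0, 1.0],
              [2.0, 3.0]])                        # distinct eigenvalues, so diagonalizable

eigvals, P = np.linalg.eig(A)                     # columns of P are eigenvectors
D = np.diag(eigvals)

# Verify A = P D P^{-1}
print(np.allclose(A, P @ D @ np.linalg.inv(P)))   # True
```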

Inverse using eigen decomposition

If $A$ can be eigen-decomposed and none of its eigenvalues are zero, then $A$ is invertible and $A^{-1} = P \cdot D^{-1} \cdot P^{-1}$, where $D^{-1}$ simply has the reciprocals $1/\lambda_i$ on its diagonal.
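
Continuing the sketch above, a quick check of this identity on the same illustrative matrix:

```python
import numpy as np

A = np.array([[4.0, 1.0],
              [2.0, 3.0]])
eigvals, P = np.linalg.eig(A)

# All eigenvalues are non-zero, so A^{-1} = P D^{-1} P^{-1}
D_inv = np.diag(1.0 / eigvals)
A_inv = P @ D_inv @ np.linalg.inv(P)
print(np.allclose(A_inv, np.linalg.inv(A)))   # True
```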

Singular Value Decomposition

A generalisation of eigendecomposition to an $m \times n$ matrix $A$.

SVD is a factorization of a real or complex matrix into a rotation/reflection, followed by a scaling, followed by another rotation/reflection. For a matrix $M \in \mathbb{R}^{m \times n}$, the SVD is given by $M = U \Sigma V^T$, where $U \in \mathbb{R}^{m \times m}, \Sigma \in \mathbb{R}^{m \times n}, V \in \mathbb{R}^{n \times n}$. Here $U$ and $V$ are orthogonal matrices (unitary in the complex case) and $\Sigma$ is a rectangular diagonal matrix of non-negative real numbers.

  • The diagonal entries of $\Sigma$ are called the singular values of $M$.
  • The columns of $U$ are called the left singular vectors of $M$.
  • The columns of $V$ are called the right singular vectors of $M$.
  • The singular values of $M$ are (usually) sorted in non-increasing order.
  • The SVD is not unique. (See the sketch below.)
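
A minimal NumPy sketch of the decomposition on an arbitrary $2 \times 3$ matrix (values are only illustrative):

```python
import numpy as np

M = np.array([[3.0, 2.0, 2.0],
              [2.0, 3.0, -2.0]])                  # 2 x 3

U, s, Vt = np.linalg.svd(M, full_matrices=True)   # U: 2x2, s: singular values, Vt: 3x3

# Rebuild the rectangular Sigma and check M = U Sigma V^T
Sigma = np.zeros(M.shape)
Sigma[:len(s), :len(s)] = np.diag(s)
print(np.allclose(M, U @ Sigma @ Vt))             # True

# Singular values are returned in non-increasing order
print(s)
```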

Understanding SVD through Eigen decomposition

For a matrix $A$, we want to find $U, V, \Sigma$ such that $A = U \Sigma V^T$.

Consider $A^TA$, which is a symmetric and positive semi-definite matrix. We can write its eigendecomposition as $A^TA = V \Lambda V^T$, where $V$ is an orthogonal matrix and $\Lambda$ is a diagonal matrix of the eigenvalues of $A^TA$. The eigenvalues $\lambda_i$ are non-negative (and strictly positive when $A$ has full column rank). Thus, for any unit eigenvector $v_i$, we have $A^TAv_i = \lambda_i v_i$.

Let $\sigma_i = \sqrt{\lambda_i}$ be the diagonal elements of $\Sigma$; these are called the singular values of $A$. Next, define $u_i = \frac{Av_i}{\sigma_i}$ (for $\sigma_i \neq 0$): $A$ maps the orthogonal unit vectors $v_i$ to another set of orthogonal vectors $Av_i$, but these have to be normalised.

Since $A^TA v_i = \lambda_i v_i = \sigma_i^2 v_i$,

$$
\begin{aligned}
||A v_i||^2 &= (Av_i)^T Av_i \\
&= v_i^T A^T A v_i \\
&= v_i^T (\sigma_i^2 v_i) \\
&= \sigma_i^2 ||v_i||^2 = \sigma_i^2
\end{aligned}
$$

so $||A v_i|| = \sigma_i$.

So we can take another set of orthogonal unit vectors $u_i = \frac{Av_i}{\sigma_i}$.

Thus, $Av_i = \sigma_i u_i$, and stacking this over all $i$ for the entire matrix, $AV = U \Sigma$. Multiplying both sides by $V^{-1}$ on the right (and using $V^{-1} = V^T$ since $V$ is orthogonal), we get $A = U \Sigma V^{-1} = U \Sigma V^T$.
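
A small NumPy sketch of this construction, assuming an illustrative $3 \times 2$ matrix with full column rank; it builds $V$ and the $\sigma_i$ from the eigendecomposition of $A^TA$, forms $u_i = Av_i / \sigma_i$, and checks the result (this yields the reduced $U$ with orthonormal columns):

```python
import numpy as np

A = np.array([[3.0, 1.0],
              [1.0, 3.0],
              [0.0, 2.0]])                         # 3 x 2, full column rank

# Eigendecomposition of A^T A gives V and the squared singular values
lam, V = np.linalg.eigh(A.T @ A)                   # eigh returns ascending eigenvalues
lam, V = lam[::-1], V[:, ::-1]                     # reorder to non-increasing
sigma = np.sqrt(lam)

# u_i = A v_i / sigma_i should be orthonormal
U = (A @ V) / sigma
print(np.allclose(U.T @ U, np.eye(2)))             # True

# A v_i = sigma_i u_i stacks up to A = U Sigma V^T
print(np.allclose(A, U @ np.diag(sigma) @ V.T))    # True
```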

TODO: The intuition behind coming up with $u_i = \frac{Av_i}{\sigma_i}$ has to be checked.