Supp-1: Linear Algebra - 3
2021-1νκΈ°, λνμμ βν΅κ³μ λ°μ΄ν°λ§μ΄λβ μμ μ λ£κ³ 곡λΆν λ°λ₯Ό μ 리ν κΈμ λλ€. μ§μ μ μΈμ λ νμμ λλ€ :)
Set-up
formulas.
1. $\displaystyle UU^T = \sum^n_{i=1} \mathbf{u}_i \mathbf{u}_i^T$, where $U = \left( \mathbf{u}_1, \dots, \mathbf{u}_n \right) \in \mathbb{R}^{n \times n}$
2. $\displaystyle UDU^T = \sum^n_{i=1} d_i \mathbf{u}_i \mathbf{u}_i^T$, where $D = \text{diag}(d_i)$
νλ ¬κ³±μ νννλ μ μ©ν λ°©μμ΄λ μ΅μν΄μ§λ©΄ μ’λ€! μμΌλ‘ λμ€λ λ΄μ©μμ μμ£Ό λ±μ₯νλ€!
Definition. Orthogonal Matrix
An <orthogonal matrix> $U$ is a matrix such that $UU^T = U^TU = I_n$.
Spectral Decomposition
Theorem. Spectral Decomposition; Eigen Decomposition
For a real symmetric matrix $A \in \mathbb{R}^{n\times n}$, there exists a diagonal matrix $D = \text{diag}(d_1, \dots, d_n)$ and an orthogonal matrix $U$ s.t. $A = UDU^T$.
μ¦, λͺ¨λ symmetric matrixλ $UDU^T$μ ννλ‘ decompose κ°λ₯ν¨μ λ§νλ€!
Note.
- Each $d_i$ is an real eigenvalue of $A$.
- $\mathbf{u}_i$ is the eigenvector associated with $d_i$, where $U = \left( \mathbf{u}_1, \dots, \mathbf{u}_n \right)$.
- $\displaystyle A = UDU^T = \sum^n_{i=1} d_i \mathbf{u}_i \mathbf{u}_i^T$
<Spectral Decomposition>μ κΈ°ννμ μ΄ν΄κ° μλ°λμ΄μΌ νλ€.
λ¨Όμ , orthogonal matrix $U$μ λν΄ $U$λ μ νλ³νμμ μΌμ’ μ rotation matrixμ μν μ νλ€.
\[\mathbf{x} \longrightarrow U \mathbf{x}\]μ΄μ $A\mathbf{x}$λ₯Ό ν΄μνκΈ° μν΄ $UDU^T\mathbf{x}$λ₯Ό λ¨κ³λ³λ‘ ν΄μν΄λ³΄μ.
1. $U^T \mathbf{x}$; νμ μ ν΅ν΄ μ’νμΆ λ³ν
$U^T$λ νμ λ³νμ μννλ€. λ°λΌμ, $e_1$, $e_2$ μ’νκ³ μμ μλ $\mathbf{x}$λ₯Ό $u_1$, $u_2$ μ’νκ³λ‘ νμ λ³ννλ€.
μ΄λ, $u_1$, $u_2$λ $U^T$μ κ° μ΄μΌλ‘μ¨ $A$μ eigen vectorμ΄λ€.
κ·Έλ¦¬κ³ $U^T \mathbf{x}$λ₯Ό κ³μ°ν΄λ³΄λ©΄,
\[U^T \mathbf{x} = \left(u_1, u_2 \right) \cdot \mathbf{x} = \begin{pmatrix} (\mathbf{x}^T u_1) \cdot u_1 \\ (\mathbf{x}^T u_2) \cdot u_2 \end{pmatrix}\]κ° λκ³ , μ΄κ²μ $\mathbf{x}$λ₯Ό $u_1$, $u_2$ μΆμΌλ‘ <projection>ν κ²κ³Ό κ°λ€.
2. $D \left( U^T \mathbf{x} \right)$; μ€μΉΌλΌ κ³±
$D$λ diagnomal matrixμ΄κΈ° λλ¬Έμ $u_1$, $u_2$ λ°©ν₯μ 벑ν°μ λν μ€μΉΌλΌ κ³±μ νλ μν μ΄λ€.
3. $U \left( D U^T \mathbf{x} \right)$; μλμ μ’νμΆμΌλ‘ μ-νμ
λ§μ§λ§μΌλ‘ μ²μμ κ³±ν $U^T$μ μ-νμ λ³νμΈ $U$λ₯Ό μ μ©ν΄ λ€μ $e_1$, $e_2$μ μ’νκ³λ‘ λλλ € μ€λ€.
κ·Έλμ μμ μ 리νλ©΄, μλμ κ°λ€.
\[UDU^T \mathbf{x} = UD \begin{pmatrix} u_1^T \mathbf{x} \\ \vdots \\ u_n^T \mathbf{x} \end{pmatrix} = U \begin{pmatrix} d_1 u_1^T \mathbf{x} \\ \vdots \\ d_n u_n^T \mathbf{x} \end{pmatrix} = d_1 u_1 u_1^T \mathbf{x} + \cdots + d_n u_n u_n^T \mathbf{x}\]Singular Value Decomposition
<Sepctral Theorem>μ λμΉ νλ ¬μ λν΄μ κΈ°μ ν μ 리μλ€. μ΄κ²μ μΌλ°μ μΈ ννμ νλ ¬λ‘ μΌλ°νν κ²μ΄ λ°λ‘ <νΉμ΄κ° λΆν΄ Singular Value Decomposition>μ΄λ€. π
Theorem. Singular Value Decomposition
For any $n \times p$ matrix $A$, there exist matrices $U \in \mathbb{R}^{n\times n}$, $D \in \mathbb{R}^{n\times p}$, and $V \in \mathbb{R}^{p\times p}$ s.t.
\[A = UDV^T\]where
- $D = \text{diag}(d_1, \dots, d_p)$ with $d_i \ge 0$
- $d_j$s are called <singular value> of $A$.
- $U$ is an orthogonal matrix from $AA^T = U(\Sigma\Sigma^T)U^T$
- $V$ is an orthogonal matrix from $A^T A = V(\Sigma^T \Sigma)V^T$
Low Rank Approximation
SVDλ₯Ό μ μ΄μ©νλ©΄, νλ ¬ $A$μ λν <Low Rank Approximation> $A_k$λ₯Ό μ λν μ μλ€.
Let $A = UDV^T$ and assume that $n \gg p$, then
\[A = \begin{bmatrix} \vert & & \vert \\ \mathbf{u}_1 & \cdots & \mathbf{u}_n \\ \vert & & \vert \\ \end{bmatrix} \begin{bmatrix} d_1 & & \\ & \ddots \\ & & d_p \\ & O & \end{bmatrix} \begin{bmatrix} - & \mathbf{v}_1^T & - \\ & \vdots & \\ - & \mathbf{v}_p^T & - \\ \end{bmatrix} = d_1 \mathbf{u}_1 \mathbf{v}_1^T + \cdots + d_p \mathbf{u}_p \mathbf{v}_p^T\]μ΄λ, νκΈ°μ νΈμλ₯Ό μν΄ singular value $d_i$μ λν΄ descending orderλ‘ νκΈ°λ₯Ό νλ€.
\[\text{where} \quad d_1 \ge d_2 \cdots \ge d_p\]μ£Όλͺ©ν μ μ μμ μ°λ³μ $\mathbf{u}_i \mathbf{v}_i^T$μ rank-1 matrixλΌλ μ μ΄λ€. μ¦, SVDλ₯Ό νκ² λλ©΄ νλ ¬ $A$λ₯Ό rank-1μ matrixλ‘ λΆν΄ν κ²μ΄ λλ€.
μ΄μ μμ κ²°κ³Όλ₯Ό κ°μ§κ³ Approximation $A_k$λ₯Ό μ λν μ μλ€.
\[A_k = \begin{bmatrix} \vert & & \vert \\ \mathbf{u}_1 & \cdots & \mathbf{u}_k \\ \vert & & \vert \\ \end{bmatrix} \begin{bmatrix} d_1 & & \\ & \ddots & \\ & & d_k \\ & O & \end{bmatrix} \begin{bmatrix} - & \mathbf{v}_1^T & - \\ & \vdots & \\ - & \mathbf{v}_k^T & - \\ \end{bmatrix} = d_1 \mathbf{u}_1 \mathbf{v}_1^T + \cdots + d_k \mathbf{u}_k \mathbf{v}_k^T\]μ¦, $A_k$λ $A$λ₯Ό SVDλ‘ λΆν΄ν΄ ννν κ²μμ μμ $k$κ°μ rank-1 matrixλ₯Ό λͺ¨μ κ²μ΄λ€! μ°λ¦¬κ° $d_i$λ₯Ό λ΄λ¦Όμ°¨μμΌλ‘ μ λ ¬ν΄ νννκΈ° λλ¬Έμ, λ€λ‘ κ°μλ‘ μμ $d_j$λ₯Ό μ»κ² λμ΄ μλ€. κ·Έλμ $A_k$λ μμ $k$κ° $d_i$λ§μ μ νν΄ $A$λ₯Ό κ·Όμ¬ν κ²μ΄λ€!
μ΄λ, <Low Rank Approximation>μ errorλ μλμ κ°μ΄ <Frobenius norm>μΌλ‘ μ μνλ€.
\[\begin{aligned} \| A - A_k \|^2_F &= \text{tr}\left( (A-A_k)^T (A-A_k) \right) \\ &= d_{k+1}^2 + d_{k+2}^2 + \cdots + d_p^2 \end{aligned}\]κ·Έλμ, λ§μ½ λ·λΆλΆμ $d_{k+1}, \cdots d_p$μ ν¬κΈ°κ° μλ€λ©΄, Low Rank Approxλ μΆ©λΆν μ νν κ·Όμ¬λ₯Ό μνν¨μ 보μ₯ν μ μλ€!!
ps. μ΄λ° Low Rank Approxλ μκ°λ³΄λ€ μμ£Ό μ¬μ©λλ κΈ°λ²μ΄λΌκ³ νλ€.
ps. Netflixμ μΆμ² μκ³ λ¦¬μ¦ Contestμμ μ΄ SVDλ₯Ό νμ©ν΄ Low Rank Approxλ₯Ό νλ μΆμ² μκ³ λ¦¬μ¦μ ꡬνν νμ΄ μ°μΉμ μ°¨μ§νλ€κ³ νλ€.
μ§κΈκΉμ§ νλ ¬μ λΆν΄νλ λ κ°μ§ λ°©λ²μΈ <Spectral Decomposition>κ³Ό <Singular Value Decomposition>λ₯Ό μ΄ν΄λ΄€λ€. μ΄ λ κ°λ μ μ΄μ΄μ§λ λ΄μ©μΈ νλ ¬μ <Nonnegative Definite>, <Positive Definite>λ₯Ό μ μν λ λ°νμ΄ λλ€.