Supp-1: Linear Algebra - 2
2021-1νκΈ°, λνμμ βν΅κ³μ λ°μ΄ν°λ§μ΄λβ μμ μ λ£κ³ 곡λΆν λ°λ₯Ό μ 리ν κΈμ λλ€. μ§μ μ μΈμ λ νμμ λλ€ :)
Eigen value & Eigen vector
Definition. eigen value & eigen vector
For $A \in \mathbb{R}^{n \times n}$, $\lambda \in \mathbb{C}$ is called an <eigen value> of $A$ if
\[A \mathbf{x} = \lambda \mathbf{x} \quad \text{for some} \; \mathbf{x} \in \mathbb{R} \setminus \{\mathbf{0}\}\]Here, $\mathbf{x}$ is called an <eigen vector> of $A$ associated with $\lambda$.
Note.
$\det (A - \lambda I_n) = 0$ if and only if $\lambda$ is an <eigen value>.
Proof.
$A\mathbf{x} = \lambda\mathbf{x}$μμ μ 리νλ©΄ $(A - \lambda I_n) \mathbf{x} = 0$μ μ»λλ€. μ΄λ, $\mathbf{x}$μ 쑰건μ μν΄ $\mathbf{x} \ne \mathbf{0}$μ΄λ€. μ¦, $\mathbf{x}$λ non-trivial solutionμ κ°λλ€λ λ§μ΄λ€.
μ΄λ€ νλ ¬ $A$μ λν΄ $A$κ° invertibleμ΄λΌλ©΄, $A\mathbf{x} = 0$μμ $\mathbf{x}$λ trivial solution $\mathbf{0}$λ§μ μ»λλ€. λ°λΌμ, $(A - \lambda I_n) \mathbf{x} = 0$μμ $\mathbf{x}$μ΄ non-trivial solutionμ κ°μ§λ €λ©΄, $(A - \lambda I_n)$μ΄ non-invertible, μ¦ $\det(A - \lambda I_n) = 0$μ΄μ΄μΌ νλ€!
$\det(A-\lambda I_n)$λ $\lambda$μ λν $n$-th order polynomialμ΄λ€. λ°λΌμ, $A$λ μ νν $n$κ°μ complex eigenvalues (up to multiplicity)λ₯Ό κ°λλ€. (Fundamental Theorem of Algebra) μ΄λ° $\det (A - \lambda I_n)$μ <νΉμ± λ€νμ Characteristic Polynomial>μ΄λΌκ³ λΆλ₯Έλ€!
νλ ¬ $A$λ₯Ό μ νλ³νμ κ΄μ μμ μ΄ν΄λ³Έλ€λ©΄, <eigen vector>λ μ νλ³ν $A$λ₯Ό μ μ©νμ λμ κ²°κ³Όκ° μμ μ μμλ°°κ° λλ 0μ΄ μλ 벑ν°λ₯Ό λ§νλ€.
μ΄λ° μ±μ§μ <eigen vector>κ° μ νλ³ν $A$ μλμμ λ°©ν₯μ΄ λ³΄μ‘΄λκ³ , ν¬κΈ°(scale)λ§ λ³νλ 벑ν°μμ μλ―Ένλ€.
Image from βλ€ν¬ νλ‘κ·Έλλ¨Έβ
μλ₯Ό λ€μ΄, μ§κ΅¬μ μμ μ΄λμ΄λΌλ 3μ°¨μ νμ λ³νμ μκ°ν λ, μ΄ νμ λ³νμ μν΄ λ³νμ§ μλ <eigen vector>λ νμ μΆμ΄ λλ©°, κ·Έλμ <eigen value>λ $1$μ΄ λλ€!
<eigen vector>μ <eigen value>λ λ€μμ μ΄ν΄λ³Ό <Eigen Decomposition> λ€λ₯Έ λ§λ‘ <Spectral Decomposition>μμ μ£Όμνκ² μ¬μ©λλ€. μ΄λ, λͺ¨λ μ λ°©νλ ¬μ λν΄ <Eigen Decomposition>μ΄ κ°λ₯ν κ²μ μλλ€. μ΄κ²μ΄ κ°λ₯νλ €λ©΄, $n\times n$ νλ ¬ $A$κ° $n$κ°μ linearly indenpendent eigen vectorλ₯Ό κ°μ§ λλ§ κ°λ₯νλ€! (μ‘°κ±΄μ΄ κ½€ κΉλ€λ‘λ€β¦!)
Definition. Trace
The sum of all diagonal elements of $A \in \mathbb{R}^{n\times n}$ is called the <trace> of $A$, denoted $\text{tr}(A)$.
Properties.
- $\text{tr}(A) = \text{tr}(A^T)$
- $\text{tr}(A+B) = \text{tr}(A) + \text{tr}(B)$
- $\text{tr}(AB) = \text{tr}(BA)$, for $A \in \mathbb{R}^{n\times p}$, $B \in \mathbb{R}^{p\times n}$ π
- $\text{tr}(AA^T) = \text{tr}(A^TA)$
- $\displaystyle \text{tr}(A) = \sum^n_{i=1} \lambda_i$, where $\lambda_i$βs are eigenvalues of $A$. π
π μ΄λͺ¨μ§κ° λΆμ μ±μ§μ νΉν μ€μν μ±μ§μ λλ€! μ λ κ³Όμ μ κΌ μμμΌ ν©λλ€!
Matrix Calculus
μΌμ°¨μ λ³μμ λν΄μ νλ <Calculus>λ₯Ό λ λμ μ°¨μμμ νλ κ²μ΄ <Vector Calculus> λλ <Matrix Calculus>μ΄λ€.
Definition. Vector Calculus
Let $\mathbf{a} \in \mathbb{R}^{p \times 1}$ represents a mapping $f: \mathbb{R}^p \rightarrow \mathbb{R}$. Then, $\mathbf{a}^T \mathbf{x} = a_1 x_1 + a_2 x_2 \cdots + a_p x_p$
\[\frac{\partial}{\partial \mathbf{x}} \left( \mathbf{a}^T \mathbf{x} \right) \overset{\text{def}}{=} \left( \frac{\partial}{\partial x_i} \left( \mathbf{a}^T \mathbf{x}\right) \right)_{1 \le i \le p} = \mathbf{a}\]μμ μν©μ scalarμΈ $\mathbf{a}^T \mathbf{x}$λ₯Ό vector $\mathbf{x}$λ‘ λ―ΈλΆνλ μν©μ΄λ€!
Definition. Matrix Calculus
Let $A \in \mathbb{R}^{n\times p}$ represents a mapping $f: \mathbb{R}^p \rightarrow \mathbb{R}^n$. Then
\[A \mathbf{x} = \begin{pmatrix} a_{11} x_1 + a_{12} x_2 + \cdots a_{1p} x_p \\ a_{21} x_1 + a_{22} x_2 + \cdots a_{2p} x_p \\ \vdots \\ a_{n1} x_1 + a_{n2} x_2 + \cdots a_{np} x_p \\ \end{pmatrix} = \begin{pmatrix} \mathbf{a}_1^T \mathbf{x} \\ \mathbf{a}_2^T \mathbf{x} \\ \vdots \\ \mathbf{a}_n^T \mathbf{x} \\ \end{pmatrix}\]1. vector-by-vector
\[\frac{\partial}{\partial \mathbf{x}} \left( A \mathbf{x} \right) \overset{\text{def}}{=} \left( \frac{\partial}{\partial \mathbf{x}} \left( \mathbf{a}_i^T \mathbf{x} \right) \right)_{1 \le i \le n} = \left( \mathbf{a}_i^T \right)_{1 \le i \le n} = A\]2. <Quadratic Form> Calculus
Let $A \in \mathbb{R}^{n \times n}$, then A <Quadratic Form> $\mathbf{x}^T A \mathbf{x}$ is
\[\mathbf{x}^T A \mathbf{x} = \sum^n_{i=1} x_i \left( A \mathbf{x}\right)_i = \sum^n_{i=1} x_i \left( \sum^n_{j=1} a_{ij} x_j \right) = \sum^n_{i=1} \sum^n_{j=1} a_{ij} x_i x_j\]Take derivative on $\mathbf{x}^T A \mathbf{x}$
\[\begin{aligned} \frac{\partial}{\partial \mathbf{x}} \left( \mathbf{x}^T A \mathbf{x} \right) &\overset{\text{def}}{=} \left( \frac{\partial}{\partial x_i} \left( \mathbf{x}^T A \mathbf{x} \right) \right)_{1 \le i \le n} \\ &= \left( \frac{\partial}{\partial x_k} \left( \sum^n_{i=1} \sum^n_{j=1} a_{ij} x_i x_j \right) \right)_{1 \le k \le n} \end{aligned}\]μ΄λ, μμΌλ‘ μ€μΌμΉν΄μ νμΈν΄λ³΄λ©΄, $\displaystyle\frac{\partial}{\partial x_k} \left(\mathbf{x}^T A \mathbf{x}\right)$λ $\mathbf{x}^T A \mathbf{x}$λ₯Ό λμ΄ νμ λ $k$λ²μ§Έ ν, $k$λ²μ§Έ μ΄ λΆλΆμ λ―ΈλΆνλ νμλ€. κ·Έλμ μ΄κ²μ μ μ 리νλ©΄, μλμ κ²°κ³Όλ₯Ό μ»λλ€.
\[\frac{\partial}{\partial x_k} \left(\mathbf{x}^T A \mathbf{x}\right) = \sum^n_{i=1} a_{ik} x_i + \sum^n_{j=1} a_{kj}x_j = \left(A^T\mathbf{x}\right)_k + \left(A\mathbf{x}\right)_k\]λ°λΌμ, $\displaystyle\frac{\partial}{\partial \mathbf{x}} \left( \mathbf{x}^T A \mathbf{x} \right)$μ
\[\frac{\partial}{\partial \mathbf{x}} \left( \mathbf{x}^T A \mathbf{x} \right) = \left( \left(A^T\mathbf{x}\right)_k + \left(A\mathbf{x}\right)_k \right)_{1\le k \le n} = A^T \mathbf{x} + A \mathbf{x} = \left( A^T + A \right) \mathbf{x}\]μ¬κΈ°μ λ§μ½ $A$κ° symmetric matrixλΌλ©΄ μλμ κ°λ€.
\[\frac{\partial}{\partial \mathbf{x}} \left( \mathbf{x}^T A \mathbf{x} \right) = \left( A^T + A \right) \mathbf{x} = 2A \mathbf{x}\]2. <Quadratic Form> Calculus - (2)
μ΄λ²μ $\mathbf{x}$λ‘ νλ²λ λ―ΈλΆμ ν΄λ³΄μ!
\[\frac{\partial^2}{\partial \mathbf{x} \partial \mathbf{x^T}} \left( \mathbf{x}^T A \mathbf{x} \right) \overset{\text{def}}{=} \left( \frac{\partial^2}{\partial x_j \partial x_i} \left( \mathbf{x}^T A \mathbf{x} \right) \right)_{1 \le i, \, j \, \le n} = \frac{\partial}{\partial \mathbf{x}} \left( \left( A^T + A \right) \mathbf{x} \right) = (A^T + A)\]μ΄μ΄μ§λ λ΄μ©μμ <Spectral Decomposition>μ <Singular Value Decomposition> λ± μ ν λ³νμ μ’λ λ₯ν λ΄μ©μ λ€λ£¬λ€! π€―