2021-1ν•™κΈ°, λŒ€ν•™μ—μ„œ β€˜ν†΅κ³„μ  λ°μ΄ν„°λ§ˆμ΄λ‹β€™ μˆ˜μ—…μ„ λ“£κ³  κ³΅λΆ€ν•œ λ°”λ₯Ό μ •λ¦¬ν•œ κΈ€μž…λ‹ˆλ‹€. 지적은 μ–Έμ œλ‚˜ ν™˜μ˜μž…λ‹ˆλ‹€ :)

7 minute read

2021-1ν•™κΈ°, λŒ€ν•™μ—μ„œ β€˜ν†΅κ³„μ  λ°μ΄ν„°λ§ˆμ΄λ‹β€™ μˆ˜μ—…μ„ λ“£κ³  κ³΅λΆ€ν•œ λ°”λ₯Ό μ •λ¦¬ν•œ κΈ€μž…λ‹ˆλ‹€. 지적은 μ–Έμ œλ‚˜ ν™˜μ˜μž…λ‹ˆλ‹€ :)

Eigen value & Eigen vector


Definition. eigen value & eigen vector

For $A \in \mathbb{R}^{n \times n}$, $\lambda \in \mathbb{C}$ is called an <eigen value> of $A$ if

\[A \mathbf{x} = \lambda \mathbf{x} \quad \text{for some} \; \mathbf{x} \in \mathbb{R} \setminus \{\mathbf{0}\}\]

Here, $\mathbf{x}$ is called an <eigen vector> of $A$ associated with $\lambda$.


Note.

$\det (A - \lambda I_n) = 0$ if and only if $\lambda$ is an <eigen value>.

Proof.

$A\mathbf{x} = \lambda\mathbf{x}$μ—μ„œ μ •λ¦¬ν•˜λ©΄ $(A - \lambda I_n) \mathbf{x} = 0$을 μ–»λŠ”λ‹€. μ΄λ•Œ, $\mathbf{x}$의 쑰건에 μ˜ν•΄ $\mathbf{x} \ne \mathbf{0}$이닀. 즉, $\mathbf{x}$λŠ” non-trivial solution을 κ°–λŠ”λ‹€λŠ” 말이닀.

μ–΄λ–€ ν–‰λ ¬ $A$에 λŒ€ν•΄ $A$κ°€ invertible이라면, $A\mathbf{x} = 0$μ—μ„œ $\mathbf{x}$λŠ” trivial solution $\mathbf{0}$λ§Œμ„ μ–»λŠ”λ‹€. λ”°λΌμ„œ, $(A - \lambda I_n) \mathbf{x} = 0$μ—μ„œ $\mathbf{x}$이 non-trivial solution을 가지렀면, $(A - \lambda I_n)$이 non-invertible, 즉 $\det(A - \lambda I_n) = 0$이어야 ν•œλ‹€!


$\det(A-\lambda I_n)$λŠ” $\lambda$에 λŒ€ν•œ $n$-th order polynomial이닀. λ”°λΌμ„œ, $A$λŠ” μ •ν™•νžˆ $n$개의 complex eigenvalues (up to multiplicity)λ₯Ό κ°–λŠ”λ‹€. (Fundamental Theorem of Algebra) 이런 $\det (A - \lambda I_n)$을 <νŠΉμ„± 닀항식 Characteristic Polynomial>이라고 λΆ€λ₯Έλ‹€!


ν–‰λ ¬ $A$λ₯Ό μ„ ν˜•λ³€ν™˜μ˜ κ΄€μ μ—μ„œ μ‚΄νŽ΄λ³Έλ‹€λ©΄, <eigen vector>λŠ” μ„ ν˜•λ³€ν™˜ $A$λ₯Ό μ μš©ν–ˆμ„ λ•Œμ˜ κ²°κ³Όκ°€ μžμ‹ μ˜ μƒμˆ˜λ°°κ°€ λ˜λŠ” 0이 μ•„λ‹ˆ 벑터λ₯Ό λ§ν•œλ‹€.

이런 μ„±μ§ˆμ€ <eigen vector>κ°€ μ„ ν˜•λ³€ν™˜ $A$ μ•„λž˜μ—μ„œ λ°©ν–₯이 보쑴되고, 크기(scale)만 λ³€ν•˜λŠ” λ²‘ν„°μž„μ„ μ˜λ―Έν•œλ‹€.

예λ₯Ό λ“€μ–΄, μ§€κ΅¬μ˜ μžμ „μš΄λ™μ΄λΌλŠ” 3차원 νšŒμ „λ³€ν™˜μ„ 생각할 λ•Œ, 이 νšŒμ „ λ³€ν™˜μ— μ˜ν•΄ λ³€ν•˜μ§€ μ•ŠλŠ” <eigen vector>λŠ” νšŒμ μΆ•μ΄ 되며, κ·ΈλŒ€μ˜ <eigen value>λŠ” $1$이 λœλ‹€!


<eigen vector>와 <eigen value>λŠ” λ’€μ—μ„œ μ‚΄νŽ΄λ³Ό <Eigen Decomposition> λ‹€λ₯Έ 말둜 <Spectral Decomposition>μ—μ„œ μ£Όμš”ν•˜κ²Œ μ‚¬μš©λœλ‹€. μ΄λ•Œ, λͺ¨λ“  정방행렬에 λŒ€ν•΄ <Eigen Decomposition>이 κ°€λŠ₯ν•œ 것은 μ•„λ‹ˆλ‹€. 이것이 κ°€λŠ₯ν•˜λ €λ©΄, $n\times n$ ν–‰λ ¬ $A$κ°€ $n$개의 linearly indenpendent eigen vectorλ₯Ό κ°€μ§ˆ λ•Œλ§Œ κ°€λŠ₯ν•˜λ‹€! (쑰건이 κ½€ κΉŒλ‹€λ‘­λ‹€β€¦!)

πŸ‘‰ Spectral Decomposition



Definition. Trace

The sum of all diagonal elements of $A \in \mathbb{R}^{n\times n}$ is called the <trace> of $A$, denoted $\text{tr}(A)$.


Properties.

  • $\text{tr}(A) = \text{tr}(A^T)$
  • $\text{tr}(A+B) = \text{tr}(A) + \text{tr}(B)$
  • $\text{tr}(AB) = \text{tr}(BA)$, for $A \in \mathbb{R}^{n\times p}$, $B \in \mathbb{R}^{p\times n}$ 🎈
  • $\text{tr}(AA^T) = \text{tr}(A^TA)$
  • $\displaystyle \text{tr}(A) = \sum^n_{i=1} \lambda_i$, where $\lambda_i$’s are eigenvalues of $A$. 🎈

🎈 이λͺ¨μ§€κ°€ 뢙은 μ„±μ§ˆμ€ 특히 μ€‘μš”ν•œ μ„±μ§ˆμž…λ‹ˆλ‹€! μœ λ„ 과정을 κΌ­ μ•Œμ•„μ•Ό ν•©λ‹ˆλ‹€!


Matrix Calculus

일차원 λ³€μˆ˜μ— λŒ€ν•΄μ„œ ν•˜λ˜ <Calculus>λ₯Ό 더 높은 μ°¨μ›μ—μ„œ ν•˜λŠ” 것이 <Vector Calculus> λ˜λŠ” <Matrix Calculus>이닀.


Definition. Vector Calculus

Let $\mathbf{a} \in \mathbb{R}^{p \times 1}$ represents a mapping $f: \mathbb{R}^p \rightarrow \mathbb{R}$. Then, $\mathbf{a}^T \mathbf{x} = a_1 x_1 + a_2 x_2 \cdots + a_p x_p$

\[\frac{\partial}{\partial \mathbf{x}} \left( \mathbf{a}^T \mathbf{x} \right) \overset{\text{def}}{=} \left( \frac{\partial}{\partial x_i} \left( \mathbf{a}^T \mathbf{x}\right) \right)_{1 \le i \le p} = \mathbf{a}\]

μœ„μ˜ 상황은 scalar인 $\mathbf{a}^T \mathbf{x}$λ₯Ό vector $\mathbf{x}$둜 λ―ΈλΆ„ν•˜λŠ” 상황이닀!


Definition. Matrix Calculus

Let $A \in \mathbb{R}^{n\times p}$ represents a mapping $f: \mathbb{R}^p \rightarrow \mathbb{R}^n$. Then

\[A \mathbf{x} = \begin{pmatrix} a_{11} x_1 + a_{12} x_2 + \cdots a_{1p} x_p \\ a_{21} x_1 + a_{22} x_2 + \cdots a_{2p} x_p \\ \vdots \\ a_{n1} x_1 + a_{n2} x_2 + \cdots a_{np} x_p \\ \end{pmatrix} = \begin{pmatrix} \mathbf{a}_1^T \mathbf{x} \\ \mathbf{a}_2^T \mathbf{x} \\ \vdots \\ \mathbf{a}_n^T \mathbf{x} \\ \end{pmatrix}\]

1. vector-by-vector

\[\frac{\partial}{\partial \mathbf{x}} \left( A \mathbf{x} \right) \overset{\text{def}}{=} \left( \frac{\partial}{\partial \mathbf{x}} \left( \mathbf{a}_i^T \mathbf{x} \right) \right)_{1 \le i \le n} = \left( \mathbf{a}_i^T \right)_{1 \le i \le n} = A\]

2. <Quadratic Form> Calculus

Let $A \in \mathbb{R}^{n \times n}$, then A <Quadratic Form> $\mathbf{x}^T A \mathbf{x}$ is

\[\mathbf{x}^T A \mathbf{x} = \sum^n_{i=1} x_i \left( A \mathbf{x}\right)_i = \sum^n_{i=1} x_i \left( \sum^n_{j=1} a_{ij} x_j \right) = \sum^n_{i=1} \sum^n_{j=1} a_{ij} x_i x_j\]

Take derivative on $\mathbf{x}^T A \mathbf{x}$

\[\begin{aligned} \frac{\partial}{\partial \mathbf{x}} \left( \mathbf{x}^T A \mathbf{x} \right) &\overset{\text{def}}{=} \left( \frac{\partial}{\partial x_i} \left( \mathbf{x}^T A \mathbf{x} \right) \right)_{1 \le i \le n} \\ &= \left( \frac{\partial}{\partial x_k} \left( \sum^n_{i=1} \sum^n_{j=1} a_{ij} x_i x_j \right) \right)_{1 \le k \le n} \end{aligned}\]

μ΄λ•Œ, μ†μœΌλ‘œ μŠ€μΌ€μΉ˜ν•΄μ„œ 확인해보면, $\displaystyle\frac{\partial}{\partial x_k} \left(\mathbf{x}^T A \mathbf{x}\right)$λŠ” $\mathbf{x}^T A \mathbf{x}$λ₯Ό λ‚˜μ—΄ ν–ˆμ„ λ•Œ $k$번째 ν–‰, $k$번째 μ—΄ 뢀뢄을 λ―ΈλΆ„ν•˜λŠ” ν–‰μœ„λ‹€. κ·Έλž˜μ„œ 이것을 잘 μ •λ¦¬ν•˜λ©΄, μ•„λž˜μ˜ κ²°κ³Όλ₯Ό μ–»λŠ”λ‹€.

\[\frac{\partial}{\partial x_k} \left(\mathbf{x}^T A \mathbf{x}\right) = \sum^n_{i=1} a_{ik} x_i + \sum^n_{j=1} a_{kj}x_j = \left(A^T\mathbf{x}\right)_k + \left(A\mathbf{x}\right)_k\]

λ”°λΌμ„œ, $\displaystyle\frac{\partial}{\partial \mathbf{x}} \left( \mathbf{x}^T A \mathbf{x} \right)$은

\[\frac{\partial}{\partial \mathbf{x}} \left( \mathbf{x}^T A \mathbf{x} \right) = \left( \left(A^T\mathbf{x}\right)_k + \left(A\mathbf{x}\right)_k \right)_{1\le k \le n} = A^T \mathbf{x} + A \mathbf{x} = \left( A^T + A \right) \mathbf{x}\]

μ—¬κΈ°μ„œ λ§Œμ•½ $A$κ°€ symmetric matrix라면 μ•„λž˜μ™€ κ°™λ‹€.

\[\frac{\partial}{\partial \mathbf{x}} \left( \mathbf{x}^T A \mathbf{x} \right) = \left( A^T + A \right) \mathbf{x} = 2A \mathbf{x}\]

2. <Quadratic Form> Calculus - (2)

μ΄λ²ˆμ—” $\mathbf{x}$둜 ν•œλ²ˆλ” 미뢄을 ν•΄λ³΄μž!

\[\frac{\partial^2}{\partial \mathbf{x} \partial \mathbf{x^T}} \left( \mathbf{x}^T A \mathbf{x} \right) \overset{\text{def}}{=} \left( \frac{\partial^2}{\partial x_j \partial x_i} \left( \mathbf{x}^T A \mathbf{x} \right) \right)_{1 \le i, \, j \, \le n} = \frac{\partial}{\partial \mathbf{x}} \left( \left( A^T + A \right) \mathbf{x} \right) = (A^T + A)\]

μ΄μ–΄μ§€λŠ” λ‚΄μš©μ—μ„  <Spectral Decomposition>와 <Singular Value Decomposition> λ“± μ„ ν˜• λ³€ν™˜μ˜ 쒀더 λ”₯ν•œ λ‚΄μš©μ„ 닀룬닀! 🀯

πŸ‘‰ Supp-1: Linear Algebra - 3