Computer Vision
[Table of Contents]
Preliminary
- Linear Algebra
- Bayes' Theorem
Image Classification
Image Processing Introduction
- Convolution
- smoothing
- Bilinear
- Average
- Gaussian
- Edge
- Gradient Image
- Corners: Harris Corner Detection
- Eigen Decomposition
- Blob
- Laplacian of Gaussian; LoG
Image Classification + CV
- SIFT (2004)
- Finding Scale-Space Extrema
- Keypoint Filtering
- Orientation Assignment
- Calculating Descriptor
- Spatial Pyramid Matching (2006)
- Discriminative vs. Generative Model
Image Classification + DL
- MLP
- Loss Functions
- Gradient Descent
- SGD
- Momentum
- CNN
- Overfitting Issue
- Dropout
- Weight decay
- Early Stopping
- Network Initialization
- Learning from scratch
- Xavier Initialization (2010)
- He Initialization (2015)
CNN Architectures
- LeNet (1998)
- AlexNet
- LRN; Local Response Normalization
- VGGNet (2014)
- ResNet (2016)
- Degradation Problem
- Skip Connection
- Batch Normalization (2015)
- Beyond ResNet
- DenseNet (2017)
- Channel-wise concatenation
- SENet (2017)
- Squeeze & Excitation
Object Detection
- Support Vector Machine
- Linear SVM + Separable Case
- Linear SVM + Non-Separable Case
- Soft margin
- Non-Linear SVM
- Kernel Method
- Multi-Class SVM
- Pedestrian Detection + SVM (2005)
- HOG (Histogram of Oriented Gradients) + SVM
- R-CNN (Region-based CNN) (2014)
- Object proposal
- Selective Search
- Fast R-CNN (2015)
- ROI pooling
- Faster R-CNN (2015)
- Fast R-CNN + RPN (Region Proposal Network)
Semantic Segmentation
- Fully Convolutional Network (FCN Family)
- FCN (2015)
- DeepLab
- Convolutional Encoder-Decoders
- U-Net
- DeConvNet (2015)
- FCN (2015)
- 1x1 conv
- adding skip connection
- DeepLab (2017)
- Atrous Convolution
- CRF; Fully-Connected Conditional Random Field
- Pyramid Scene Parsing Network (2017)
- Pyramid pooling module
- Context Encoding Network (2018)
- Attention module
- DeConvNet (2015)
- conv - deconv
- pooling - unpooling
Instance-aware Semantic Segmentation
- Multi-task Network Cascades (2016)
- Multi-scale Patch Aggregation (2016)
- Mask R-CNN (2017)
- ROI Align
Metric Learning
- Pairwise & Triplet Metric
- Mahalanobis Distance
- A first approach to distance metric learning (Pairwise)
- Large Margin Nearest Neighbor (LMNN) (Triplet)
- Metric Learning + DL
Video Vision
Video Classification + CV
- Optical Flow
- (Assumption) Color constancy
- (Assumption) Small motion
- Lucas-Kanade Flow
- STIP; Space-Time Interest Point (2005)
- Dense Trajectory
Video Classification + DL
- 3D CNN (2010)
- C3D (2015)
- Time Information Fusion (2014)
- Single Frame
- Late Fusion
- Early Fusion
- Slow Fusion
- Two-Stream Convolutional Network (2014)
Visual Tracking
- Probabilistic Tracking
- Sequential Density Estimation
- Kalman Filter
- Particle Filtering
Model Fitting
- Least Squares
- Ordinary Linear Least Squares
- Total Linear Least Squares
- RANSAC (RANdom SAmple Consensus)
- Hough Transform
Camera Models
- 2D Objects
- 2D Transformations
- Translation
- Euclidean transform
- Similarity transform
- Affine transform
- Projective transform
- 3D Objects
- homogeneous coordinates; $\overline{x} = [x, y, z, 1]$
- Pinhole Model
- Intrinsic Parameters
- Extrinsic Parameters
- Camera Calibration
- Estimating the camera parameter matrix