ResNet (code)
These are my personal notes from taking the "Computer Vision" course in the second semester of 2020 and from studying on my own. Corrections are always welcome :)
This post is my personal analysis of the code by github/KellerJordan that implements ResNet using PyTorch.
It appears to implement ResNet32.
picture from Pablo Ruiz's blog
ResNet20 code analysis
ResNet20
class ResNet(nn.Module):
    ...
        self.layers1 = self._make_layer(n, 16, 16, 1)
    ...
    def _make_layer(self, layer_count, channels, channels_in, stride):
        return nn.Sequential(
            ResBlock(channels, channels_in, stride, ...),
            *[ResBlock(channels) for _ in range(layer_count-1)])

class ResBlock(nn.Module):
    ...
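What stands out is that the model is not built as one single ResNet module; a separate ResBlock module is defined and reused.
In other words, it shows that a module inheriting from nn.Module can use other modules inside it to design the model structure! (Strictly speaking, this means the outer module takes on a dependency on the inner one.)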
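As a minimal sketch of that idea (TinyBlock and TinyNet are names I made up for illustration, they are not part of KellerJordan's code), a module can hold other modules as attributes, and PyTorch registers the children and their parameters automatically:

```python
import torch
import torch.nn as nn

class TinyBlock(nn.Module):
    """Hypothetical inner module: one 3x3 conv followed by ReLU."""
    def __init__(self, channels):
        super().__init__()
        self.conv = nn.Conv2d(channels, channels, kernel_size=3, padding=1)

    def forward(self, x):
        return torch.relu(self.conv(x))

class TinyNet(nn.Module):
    """Hypothetical outer module that is built out of TinyBlock modules."""
    def __init__(self):
        super().__init__()
        self.block1 = TinyBlock(4)
        self.block2 = TinyBlock(4)

    def forward(self, x):
        return self.block2(self.block1(x))

net = TinyNet()
child_names = [name for name, _ in net.named_children()]
num_params = sum(p.numel() for p in net.parameters())
y = net(torch.randn(1, 4, 8, 8))

print(child_names)  # ['block1', 'block2']
print(num_params)   # 296 = 2 * (4*4*3*3 weights + 4 biases)
print(y.shape)      # torch.Size([1, 4, 8, 8])
```

The outer module never lists the inner parameters itself; assigning a module to an attribute is enough for them to show up in net.parameters().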
The ResBlock modules are chained together with nn.Sequential.
def _make_layer(self, layer_count, channels, channels_in, stride):
    return nn.Sequential(
        ResBlock(channels, channels_in, stride, ...),
        *[ResBlock(channels) for _ in range(layer_count-1)])
As the [ResBlock(channels) for _ in range(layer_count-1)] part shows, the ResBlocks after the first one keep the number of channels unchanged.
In the code the count is passed in as the layer_count variable, and its default value is reportedly 5.
So each _make_layer call creates one leading ResBlock, which is the only place where the channel count and stride can change (for example from 16 to 32 channels), followed by 4 ResBlocks that keep the channel count.
Each ResBlock has 2 conv layers, so a single _make_layer call creates 10 conv layers.
Also, [ResBlock(channels) for _ in range(layer_count-1)] is a list comprehension (an inline for), which keeps the code compact.
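To check the count of 10, here is a rough sketch. SketchBlock and make_stage are hypothetical stand-ins of my own, and the sketch only covers the case where the channel count stays the same (like layers1), so it is not the original ResBlock or _make_layer:

```python
import torch.nn as nn

class SketchBlock(nn.Module):
    """Hypothetical stand-in for ResBlock: two 3x3 convs plus an identity skip."""
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False)
        self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False)
        self.relu = nn.ReLU()

    def forward(self, x):
        out = self.relu(self.conv1(x))
        out = self.conv2(out)
        return self.relu(out + x)

def make_stage(layer_count, channels):
    # Same pattern as the original: a list comprehension unpacked into nn.Sequential
    return nn.Sequential(*[SketchBlock(channels) for _ in range(layer_count)])

stage = make_stage(layer_count=5, channels=16)
num_blocks = len(stage)
num_convs = sum(isinstance(m, nn.Conv2d) for m in stage.modules())

print(num_blocks, num_convs)  # 5 10
```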
Inside nn.Sequential() the list is written as *[...]: a Python list can be unpacked with * so that its elements are passed to nn.Sequential() as separate arguments. Here is an example:
import torch.nn as nn

# Unpack a plain Python list of layers into nn.Sequential with *
layers = [nn.Linear(2, 2), nn.Linear(2, 2)]
net = nn.Sequential(*layers)
print(net)
This ResNet code calls _make_layer() three times.
class ResNet(nn.Module):
    def __init__(self, ...):
        ...
        self.layers1 = self._make_layer(n, 16, 16, 1)
        self.layers2 = self._make_layer(n, 32, 16, 2)
        self.layers3 = self._make_layer(n, 64, 32, 2)
        ...
Let's compare the structure drawn in the diagram with the code.
picture from Pablo Ruiz's blog
class ResNet(nn.Module):
    def forward(self, x):
        out = self.conv1(x)
        out = self.norm1(out)
        out = self.relu1(out)
        out = self.layers1(out)  # in: 16, out: 16
        out = self.layers2(out)  # in: 16, out: 32
        out = self.layers3(out)  # in: 32, out: 64
        out = self.avgpool(out)
        out = out.view(out.size(0), -1)
        out = self.linear(out)  # in: 64, out: 10
        return out
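To see how the tensor shape changes along this forward pass, here is a rough shape trace. It rests on several assumptions of mine: a CIFAR-10 sized input (3x32x32), global average pooling for avgpool, and a single conv standing in for each stage of 5 ResBlocks. Only the channel counts and strides follow the _make_layer arguments; nothing else is taken from the original code.

```python
import torch
import torch.nn as nn

# Stand-ins (assumptions): one conv per stage instead of a stack of ResBlocks
stem = nn.Conv2d(3, 16, kernel_size=3, stride=1, padding=1)
stage1 = nn.Conv2d(16, 16, kernel_size=3, stride=1, padding=1)
stage2 = nn.Conv2d(16, 32, kernel_size=3, stride=2, padding=1)
stage3 = nn.Conv2d(32, 64, kernel_size=3, stride=2, padding=1)
avgpool = nn.AdaptiveAvgPool2d(1)  # assumed global average pooling
linear = nn.Linear(64, 10)

out = torch.randn(2, 3, 32, 32)  # assumed CIFAR-10 sized batch of 2 images
shapes = {}
steps = [("conv1", stem), ("layers1", stage1), ("layers2", stage2),
         ("layers3", stage3), ("avgpool", avgpool)]
for name, layer in steps:
    out = layer(out)
    shapes[name] = tuple(out.shape)

out = out.view(out.size(0), -1)  # flatten to (batch, 64)
shapes["flatten"] = tuple(out.shape)
out = linear(out)
shapes["linear"] = tuple(out.shape)

for name, shape in shapes.items():
    print(name, shape)
# conv1 (2, 16, 32, 32)
# layers1 (2, 16, 32, 32)
# layers2 (2, 32, 16, 16)
# layers3 (2, 64, 8, 8)
# avgpool (2, 64, 1, 1)
# flatten (2, 64)
# linear (2, 10)
```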
Counting the weighted layers in ResNet (the first conv, the conv layers of the three stages, and the final linear layer):
1 + (10 + 10 + 10) + 1 = 32
So this code implements ResNet32!
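The same arithmetic as a tiny helper. The function name and its parameters are my own; the only inputs taken from the post are 3 stages, 2 conv layers per ResBlock, and n ResBlocks per stage:

```python
def resnet_depth(n, convs_per_block=2, num_stages=3):
    """Weighted-layer count for this CIFAR-style layout (hypothetical helper)."""
    first_conv = 1
    stage_convs = num_stages * n * convs_per_block
    final_linear = 1
    return first_conv + stage_convs + final_linear

print(resnet_depth(5))  # 32
print(resnet_depth(3))  # 20
```

This is the 6n + 2 pattern, so n = 5 gives ResNet32 and n = 3 would give ResNet20.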
ResBlock
This is the ResBlock part, where the skip connection, the centerpiece of ResNet, is implemented.
class ResBlock(nn.Module):
    ...
    def forward(self, x):
        residual = x  # store residual
        out = self.conv1(x)
        out = self.bn1(out)
        out = self.relu1(out)
        out = self.conv2(out)
        out = self.bn2(out)
        out += residual  # skip connection!
        out = self.relu2(out)
        return out
It was only after reading this code that ResNet fully clicked for me, haha.
Note that a ResBlock uses 2 conv layers.
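Here is a runnable sketch of such a block, under my own assumptions: SketchResBlock is a hypothetical name, and the 3x3 kernels, padding and bias=False are guesses, since the constructor is elided above. It also shows why the skip connection matters: if the conv branch outputs zero, the block simply passes ReLU(x) through.

```python
import torch
import torch.nn as nn

class SketchResBlock(nn.Module):
    """Hypothetical channel-preserving block that mirrors the forward() above."""
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.relu1 = nn.ReLU()
        self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)
        self.relu2 = nn.ReLU()

    def forward(self, x):
        residual = x                      # store the input for the skip connection
        out = self.relu1(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        out = out + residual              # skip connection
        return self.relu2(out)

block = SketchResBlock(16)
x = torch.randn(2, 16, 32, 32)
y = block(x)
print(y.shape)  # torch.Size([2, 16, 32, 32])

# Zero out the second conv so the conv branch contributes nothing
nn.init.zeros_(block.conv2.weight)
y_zero_branch = block(x)
passes_input_through = torch.allclose(y_zero_branch, torch.relu(x))
print(passes_input_through)  # True
```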
residual projection options
This implementation also provides options that pass the residual through self.projection once instead of adding it directly.
class ResBlock(nn.Module):
    def __init__(self, num_filters, ...):
        ...
        if res_option == 'A':
            self.projection = IdentityPadding(num_filters, channels_in, stride)
        elif res_option == 'B':
            self.projection = ConvProjection(num_filters, channels_in, stride)
        elif res_option == 'C':
            self.projection = AvgPoolPadding(num_filters, channels_in, stride)
        ...
    def forward(self, x):
        ...
        if self.projection:  # residual projection!
            residual = self.projection(x)
        ...
Each option handles the 2D residual image (feature map) in a different way:
- IdentityPadding(): passes the residual image through as is
- ConvProjection(): applies a convolution to the residual image
- AvgPoolPadding(): applies average pooling to the residual image
More details on the residual projection options can be found through this link!
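To get a feel for what the three options might do to a tensor, here is a sketch built on my own guesses. The Sketch* classes, the pad_channels helper, and details such as strided slicing and a 1x1 conv are assumptions for illustration, not the repository's actual classes. The point is only that every variant has to turn a (16, 32, 32) input into the (32, 16, 16) shape of the main branch so the two can be added.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def pad_channels(x, channels_out):
    """Hypothetical helper: append zero-filled channels up to channels_out."""
    extra = channels_out - x.size(1)
    return F.pad(x, (0, 0, 0, 0, 0, extra))  # pad order: W, H, then C

class SketchIdentityPadding(nn.Module):
    """Guess at option A: subsample spatially, zero-pad the channels."""
    def __init__(self, channels_out, stride):
        super().__init__()
        self.channels_out, self.stride = channels_out, stride

    def forward(self, x):
        return pad_channels(x[:, :, ::self.stride, ::self.stride], self.channels_out)

class SketchConvProjection(nn.Module):
    """Guess at option B: a strided 1x1 conv changes channels and resolution."""
    def __init__(self, channels_out, channels_in, stride):
        super().__init__()
        self.conv = nn.Conv2d(channels_in, channels_out, kernel_size=1, stride=stride)

    def forward(self, x):
        return self.conv(x)

class SketchAvgPoolPadding(nn.Module):
    """Guess at option C: average-pool spatially, zero-pad the channels."""
    def __init__(self, channels_out, stride):
        super().__init__()
        self.channels_out = channels_out
        self.pool = nn.AvgPool2d(stride, stride=stride)

    def forward(self, x):
        return pad_channels(self.pool(x), self.channels_out)

x = torch.randn(2, 16, 32, 32)
shortcuts = {
    "A": SketchIdentityPadding(32, stride=2),
    "B": SketchConvProjection(32, 16, stride=2),
    "C": SketchAvgPoolPadding(32, stride=2),
}
outputs = {name: module(x) for name, module in shortcuts.items()}
for name, out in outputs.items():
    print(name, tuple(out.shape))
# A (2, 32, 16, 16)
# B (2, 32, 16, 16)
# C (2, 32, 16, 16)
```

In this sketch only the conv variant adds learnable parameters; the two padding variants fill the new channels with zeros.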
I think KellerJordan's ResNet is really good code, since the model implementation is so cleanly done, haha.
There is also another implementation of ResNet.
That code implements everything from ResNet18, ResNet34, ResNet50 and ResNet101 up to ResNet152.
It separates its modules well and is fairly clean, but the lack of comments is a pity.