Inception-v4, Inception-ResNet and the Impact of Residual...
Previous: GoogLeNet (Inception V1, V2, and V3)
Pure Inception Blocks: Figure 9
Residual Inception Blocks: Figure 15
Overall Structure

Stem

Inception-A

Inception-B

Inception-C

Reduction-A

Reduction-B

Architecture Overview (Based on the Paper)
| Layer | Filter | Filter Size | Stride | Padding | Size of Feature Map |
|---|---|---|---|---|---|
| Input | - | - | - | - | 299 x 299 x 3 |
| Stem | - | - | - | - | 35 x 35 x 384 |
| Inception-A (x 4) | - | - | - | - | 35 x 35 x 384 |
| Reduction-A | - | - | - | - | 17 x 17 x 1024 |
| Inception-B (x 7) | - | - | - | - | 17 x 17 x 1024 |
| Reduction-B | - | - | - | - | 8 x 8 x 1536 |
| Inception-C (x 3) | - | - | - | - | 8 x 8 x 1536 |
| Average Pooling | - | 8 x 8 | - | - | 1 x 1 x 1536 |
| Dropout (keep 0.8) | - | - | - | - | 1 x 1536 |
| Softmax | - | - | - | - | 1 x 1000 |
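
The stage-by-stage feature map sizes above can be traced with a short script. This is a minimal sketch, not the paper's code: it assumes each Reduction block shrinks the grid with an unpadded 3 x 3, stride-2 window, and the names `STAGES`, `reduce_size`, and `trace` are made up for illustration.

```python
def reduce_size(size, kernel=3, stride=2):
    """Spatial size after an unpadded window (assumed 3 x 3, stride 2)."""
    return (size - kernel) // stride + 1

# (stage name, repeats, output channels, does the stage shrink the grid?)
# Repeats and channel counts are copied from the overview table.
STAGES = [
    ("Stem", 1, 384, False),
    ("Inception-A", 4, 384, False),
    ("Reduction-A", 1, 1024, True),
    ("Inception-B", 7, 1024, False),
    ("Reduction-B", 1, 1536, True),
    ("Inception-C", 3, 1536, False),
]

def trace(start_size=35):
    """Return (name, repeats, (height, width, channels)) for every stage."""
    size = start_size  # the stem already outputs a 35 x 35 grid
    rows = []
    for name, repeats, channels, shrinks in STAGES:
        if shrinks:
            size = reduce_size(size)
        rows.append((name, repeats, (size, size, channels)))
    return rows

for name, repeats, (h, w, c) in trace():
    print(f"{name:12s} x{repeats} -> {h} x {w} x {c}")
```

Only the two Reduction blocks change the spatial size; the repeated Inception blocks keep both the grid and the channel count fixed, which is what allows them to be stacked 4, 7, and 3 times.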
Architecture Stem
| Layer | Filter | Filter Size | Stride | Padding | Size of Feature Map |
|---|---|---|---|---|---|
| Input | - | - | - | - | 299 x 299 x 3 |
| Convolution 1 | 32 | 3 x 3 | 2 | - | 149 x 149 x 32 |
| Convolution 2 | 32 | 3 x 3 | 1 | - | 147 x 147 x 32 |
| Convolution 3 | 64 | 3 x 3 | 1 | 1 | 147 x 147 x 64 |
| Max Pool 1 (Concatenate: Convolution 4) | - | 3 x 3 | 2 | - | 73 x 73 x 64 |
| Convolution 4 (Concatenate: Max Pool 1) | 96 | 3 x 3 | 2 | - | 73 x 73 x 96 |
| Filter Concat 1 (Max Pool 1 + Convolution 4) | 64 + 96 = 160 | - | - | - | 73 x 73 x 160 |
| Convolution 5 (Concatenate: Convolution 7→10) | 64 | 1 x 1 | 1 | - | 73 x 73 x 64 |
| Convolution 6 | 96 | 3 x 3 | 1 | - | 71 x 71 x 96 |
| Convolution 7 (Concatenate: Convolution 5→6) | 64 | 1 x 1 | 1 | - | 73 x 73 x 64 |
| Convolution 8 | 64 | 7 x 1 | 1 | (3, 0) | 73 x 73 x 64 |
| Convolution 9 | 64 | 1 x 7 | 1 | (0, 3) | 73 x 73 x 64 |
| Convolution 10 | 96 | 3 x 3 | 1 | - | 71 x 71 x 96 |
| Filter Concat 2 (Convolution 6 + Convolution 10) | 96 + 96 = 192 | - | - | - | 71 x 71 x 192 |
| Convolution 11 (Concatenate: Max Pool 3) | 192 | 3 x 3 | 2 | - | 35 x 35 x 192 |
| Max Pool 3 (Concatenate: Convolution 11) | - | 3 x 3 | 2 | - | 35 x 35 x 192 |
| Filter Concat 3 (Convolution 11 + Max Pool 3) | 192 + 192 = 384 | - | - | - | 35 x 35 x 384 |
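
The feature map sizes in the stem follow from the usual output-size rule, out = floor((in + 2 x padding - kernel) / stride) + 1. The sketch below applies it to the size-changing layers of the stem; `out_size`, `MAIN_PATH`, and `trace_stem` are hypothetical names, and a "-" in the Padding column is read as padding 0. The 1 x 1, 7 x 1, and 1 x 7 convolutions are left out of the path because they keep the 73 x 73 grid.

```python
def out_size(size, kernel, stride=1, padding=0):
    """Output length along one axis: floor((size + 2*padding - kernel) / stride) + 1."""
    return (size + 2 * padding - kernel) // stride + 1

# Size-changing layers of the stem: (table name, kernel, stride, padding).
# Parallel layers that are concatenated share one entry because they must
# produce the same grid size.
MAIN_PATH = [
    ("Convolution 1", 3, 2, 0),
    ("Convolution 2", 3, 1, 0),
    ("Convolution 3", 3, 1, 1),
    ("Max Pool 1 / Convolution 4", 3, 2, 0),
    ("Convolution 6 / Convolution 10", 3, 1, 0),
    ("Convolution 11 / Max Pool 3", 3, 2, 0),
]

def trace_stem(size=299):
    """Return (name, spatial size) after each layer in MAIN_PATH."""
    sizes = []
    for name, kernel, stride, padding in MAIN_PATH:
        size = out_size(size, kernel, stride, padding)
        sizes.append((name, size))
    return sizes

for name, size in trace_stem():
    print(f"{name:32s} -> {size} x {size}")

# A 7 x 1 kernel keeps the grid only if it is padded along its 7-long axis
# (by 3) and left unpadded along its 1-long axis.
h = out_size(73, kernel=7, padding=3)
w = out_size(73, kernel=1, padding=0)
print("7 x 1 convolution keeps", h, "x", w)
```

Branches that feed a Filter Concat must agree on height and width, which is why each pooling layer and its parallel convolution use the same kernel, stride, and padding.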
Architecture Inception-A
| Layer | Filter | Filter Size | Stride | Padding | Size of Feature Map |
|---|---|---|---|---|---|
| Filter Concat (input) | - | - | - | - | 35 x 35 x 384 |
| Average Pool 1 (Branch 1) | - | 1 x 1 | 1 | - | 35 x 35 x 384 |
| Convolution 1 (Branch 1) | 96 | 1 x 1 | 1 | - | 35 x 35 x 96 |
| Convolution 2 (Branch 2) | 96 | 1 x 1 | 1 | - | 35 x 35 x 96 |
| Convolution 3 (Branch 3) | 64 | 1 x 1 | 1 | - | 35 x 35 x 64 |
| Convolution 4 (Branch 3) | 96 | 3 x 3 | 1 | 1 | 35 x 35 x 96 |
| Convolution 5 (Branch 4) | 64 | 1 x 1 | 1 | - | 35 x 35 x 64 |
| Convolution 6 (Branch 4) | 96 | 3 x 3 | 1 | 1 | 35 x 35 x 96 |
| Convolution 7 (Branch 4) | 96 | 3 x 3 | 1 | 1 | 35 x 35 x 96 |
| Filter Concat (Convolution 1 + 2 + 4 + 7) | 96 + 96 + 96 + 96 = 384 | - | - | - | 35 x 35 x 384 |
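
To see why the block can be repeated without changing the tensor shape, the branch outputs can be added up in a few lines. This is a rough sketch under stated assumptions: the branch layout is copied from the table, weight counts are kernel height x kernel width x input channels x output channels (biases and batch normalization ignored), and `BRANCHES` and `branch_summary` are illustrative names only.

```python
IN_CHANNELS = 384  # channels entering Inception-A

# ("pool",) keeps the channel count; ("conv", kh, kw, out_channels) changes it.
BRANCHES = {
    "Branch 1": [("pool",), ("conv", 1, 1, 96)],
    "Branch 2": [("conv", 1, 1, 96)],
    "Branch 3": [("conv", 1, 1, 64), ("conv", 3, 3, 96)],
    "Branch 4": [("conv", 1, 1, 64), ("conv", 3, 3, 96), ("conv", 3, 3, 96)],
}

def branch_summary(layers, in_channels):
    """Return (output channels, convolution weight count) for one branch."""
    channels, weights = in_channels, 0
    for layer in layers:
        if layer[0] == "conv":
            _, kh, kw, out_channels = layer
            weights += kh * kw * channels * out_channels
            channels = out_channels
    return channels, weights

summaries = {name: branch_summary(layers, IN_CHANNELS)
             for name, layers in BRANCHES.items()}

# Concatenation stacks the branch outputs along the channel axis.
concat_channels = sum(channels for channels, _ in summaries.values())
total_weights = sum(weights for _, weights in summaries.values())

for name, (channels, weights) in summaries.items():
    print(f"{name}: {channels} channels, {weights} weights")
print("Concat:", concat_channels, "channels;", total_weights, "weights in total")
```

Each branch first narrows the 384 input channels with a 1 x 1 convolution before any 3 x 3 convolution runs, so the 3 x 3 kernels operate on 64 or 96 channels rather than all 384.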