Convolutional Neural Network

1. Depthwise Separable Convolutions
- 1.1. How does it work?
- 1.2. Should the first layer for RGB input be a depthwise separable convolution?

Convolution
Visualize a convolutional layer: https://ezyang.github.io/convolution-visualizer/

1. Depthwise Separable Convolutions

Depthwise separable convolutions can act as a drop-in replacement for Conv2D. They use far fewer parameters and operations, and often match or improve accuracy.
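To make the parameter saving concrete, here is a small counting sketch. The layer sizes (3 × 3 kernel, 64 input channels, 128 output channels) are hypothetical, and it assumes a depth multiplier of 1 and one bias per output channel.

```python
def conv2d_params(k, c_in, c_out, bias=True):
    # A regular convolution learns one k x k x c_in filter per output channel.
    return k * k * c_in * c_out + (c_out if bias else 0)


def separable_conv2d_params(k, c_in, c_out, bias=True):
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes channels
    return depthwise + pointwise + (c_out if bias else 0)


k, c_in, c_out = 3, 64, 128    # hypothetical layer sizes
standard = conv2d_params(k, c_in, c_out)
separable = separable_conv2d_params(k, c_in, c_out)
print(standard, separable)     # 73856 8896
```

The multiply-add count per output position scales the same way, so the saving in operations is of similar size.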

This layer performs a spatial convolution on each channel of its input, independently, before mixing output channels via a pointwise convolution (a 1 × 1 convolution).
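The two steps can be sketched in NumPy as follows. This is a minimal illustration, not a library implementation: it assumes stride 1, no padding, no bias, a depth multiplier of 1, and channels-last arrays; the function names and array sizes are made up for the example.

```python
import numpy as np


def depthwise_conv2d(x, dw_kernels):
    """x: (H, W, C); dw_kernels: (k, k, C). Stride 1, no padding."""
    k = dw_kernels.shape[0]
    h_out, w_out = x.shape[0] - k + 1, x.shape[1] - k + 1
    out = np.zeros((h_out, w_out, x.shape[2]))
    for i in range(h_out):
        for j in range(w_out):
            patch = x[i:i + k, j:j + k, :]  # (k, k, C)
            # Each channel is filtered only by its own k x k kernel
            # (cross-correlation, as in deep learning frameworks).
            out[i, j, :] = np.sum(patch * dw_kernels, axis=(0, 1))
    return out


def pointwise_conv2d(x, pw_kernel):
    """x: (H, W, C_in); pw_kernel: (C_in, C_out). A 1 x 1 convolution."""
    return x @ pw_kernel  # mixes channels independently at each pixel


def separable_conv2d(x, dw_kernels, pw_kernel):
    return pointwise_conv2d(depthwise_conv2d(x, dw_kernels), pw_kernel)


rng = np.random.default_rng(0)
x = rng.normal(size=(8, 8, 4))   # hypothetical 8 x 8 input with 4 channels
dw = rng.normal(size=(3, 3, 4))  # one 3 x 3 kernel per input channel
pw = rng.normal(size=(4, 5))     # mixes 4 channels into 5
y = separable_conv2d(x, dw, pw)
print(y.shape)                   # (6, 6, 5)
```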

The "Patches Are All You Need?" paper (ConvMixer) seems to be doing the same thing, but with GELUs, batch norms, and a residual connection mixed in with the depthwise and pointwise convolutions, i.e. the pointwise convolution doesn't immediately follow the depthwise convolution.

1.1. How does it work?

(From Deep Learning with Python by François Chollet, p. 258.)

It is equivalent to separating the learning of spatial features from the learning of channel-wise features.
The assumption is that spatial locations in intermediate activations are highly correlated, but different channels are highly independent.
This assumption serves as a useful prior that helps the model make more efficient use of its training data.
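One way to see the separation: a depthwise separable convolution gives the same output as a regular convolution whose kernel is constrained to be the product of a per-channel spatial kernel and a channel-mixing matrix. The NumPy check below is a sketch under assumptions (stride 1, no padding, no bias, depth multiplier 1, NumPy 1.20 or newer for sliding_window_view); the array sizes are arbitrary.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

rng = np.random.default_rng(0)
x = rng.normal(size=(8, 8, 4))   # (H, W, C_in), arbitrary sizes
dw = rng.normal(size=(3, 3, 4))  # spatial kernels, one per channel
pw = rng.normal(size=(4, 5))     # channel-mixing weights (C_in, C_out)

# All 3 x 3 patches: shape (H_out, W_out, C_in, k, k)
patches = sliding_window_view(x, (3, 3), axis=(0, 1))

# Separable: spatial filtering per channel, then channel mixing.
spatial = np.einsum("hwcij,ijc->hwc", patches, dw)
separable = np.einsum("hwc,co->hwo", spatial, pw)

# Regular convolution with a kernel restricted to the product form.
full_kernel = np.einsum("ijc,co->ijco", dw, pw)  # (k, k, C_in, C_out)
standard = np.einsum("hwcij,ijco->hwo", patches, full_kernel)

print(np.allclose(separable, standard))  # True
```

The restriction to product-form kernels is the prior: the layer cannot learn a different spatial pattern for every (input channel, output channel) pair.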

1.2. Should the first layer for RGB input be a depthwise separable convolution?

No. The assumption that underlies separable convolution, “feature channels are largely independent,” does not hold for RGB images. Red, green, and blue color channels are actually highly correlated in natural images.

As such, the first layer has to be the usual Conv2D layer.
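A minimal Keras sketch of this layout, assuming TensorFlow's bundled Keras is installed. The input size, filter counts, and number of classes are hypothetical placeholders, not values taken from the book.

```python
from tensorflow import keras


def build_model(num_classes=10):
    inputs = keras.Input(shape=(64, 64, 3))  # hypothetical RGB input size
    # First layer: regular Conv2D, because R, G, and B are highly correlated.
    x = keras.layers.Conv2D(32, 3, padding="same", activation="relu")(inputs)
    # Later layers: separable convolutions on intermediate feature maps.
    x = keras.layers.SeparableConv2D(64, 3, padding="same", activation="relu")(x)
    x = keras.layers.MaxPooling2D(2)(x)
    x = keras.layers.SeparableConv2D(128, 3, padding="same", activation="relu")(x)
    x = keras.layers.GlobalAveragePooling2D()(x)
    outputs = keras.layers.Dense(num_classes, activation="softmax")(x)
    return keras.Model(inputs, outputs)


model = build_model()
model.summary()
```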
