An Introduction to Convolutional Neural Networks
Introduction:
What is an artificial neural network?
In recent times, terms like deep learning, machine learning and artificial intelligence have become so common that even school kids are somewhat familiar with them. The rise of machine learning has brought Artificial Neural Networks (ANNs) to prominence. ANNs are computational processing systems that learn from large amounts of data, and they take inspiration from the way biological nervous systems operate. A typical network has an input layer, one or more hidden layers stacked upon each other, and an output layer. The basic computational units of a neural network are called neurons. Loosely like neurons in the human brain, these fundamental blocks take in input signals, process them and produce an output. Figure 1 below shows the basic structure of an ANN architecture:
Figure 1
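To make the idea of a neuron concrete, here is a minimal sketch in plain Python. It assumes the usual textbook model, in which a neuron computes a weighted sum of its inputs, adds a bias and passes the result through an activation function. The sigmoid activation and all the numbers are assumptions chosen for illustration; they do not come from any particular network.

```python
import math

def sigmoid(z):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def neuron(inputs, weights, bias):
    """Weighted sum of the inputs plus a bias, passed through an activation."""
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    return sigmoid(z)

# Hypothetical input signals, weights and bias, for illustration only.
inputs = [0.5, -1.0, 2.0]
weights = [0.4, 0.3, 0.1]
bias = 0.1

output = neuron(inputs, weights, bias)
print(output)  # a value between 0 and 1
```

During training, the weights and the bias are the quantities the network adjusts; here they are simply fixed by hand.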
What are convolutional neural networks?
In this article, we will focus on Convolutional Neural Networks (CNNs). CNNs
are a kind of ANN designed for tasks such as image processing and pattern
recognition within images. A grayscale image is simply a two-dimensional
signal that can be represented as a matrix of pixel intensities; a colour
image adds one such matrix per colour channel. Because the input to such
systems always consists of images, the CNN architecture can be set up with
this structure in mind.
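As a small illustration of an image as a matrix, the sketch below stores a tiny grayscale image as a list of rows. The 4x4 size and the pixel values are made up for the example, and scaling the intensities to the range 0 to 1 is a common preprocessing step assumed here rather than something CNNs strictly require.

```python
# A hypothetical 4x4 grayscale image: each entry is a pixel intensity
# from 0 (black) to 255 (white).
image = [
    [0,   0,   255, 255],
    [0,   0,   255, 255],
    [255, 255, 0,   0],
    [255, 255, 0,   0],
]

height = len(image)      # number of rows
width = len(image[0])    # number of columns

# Scale every pixel to the range [0, 1] (assumed preprocessing step).
normalised = [[pixel / 255 for pixel in row] for row in image]

print(height, width)     # 4 4
print(normalised[0])     # [0.0, 0.0, 1.0, 1.0]
```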
Methodology:
A CNN comprises three important layers
which can be stacked together to form the CNN architecture. These are
convolutional layers, pooling layers and fully connected layers.
1. Convolutional layer
This layer is based on the linear mathematical operation of convolution, in
which a small filter (also called a kernel) slides across the input. At each
position, the overlapping values of the input and the filter are multiplied
element by element and summed to give one output value. When data reaches this
layer, convolution takes place between the input and a filter of a particular
size. The output for each filter is a 2D activation map which records where in
the image the filter's pattern appears.
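The sliding-filter computation can be sketched in a few lines of plain Python. This is a minimal illustration, assuming a stride of 1 and no padding; the function name convolve2d, the image and the vertical-edge filter are all hypothetical. Like most deep learning libraries, the sketch does not flip the filter, so strictly speaking it computes a cross-correlation.

```python
def convolve2d(image, kernel):
    """Slide the kernel over the image (stride 1, no padding).

    At each position the overlapping values are multiplied element by
    element and summed, giving one entry of the activation map.
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    activation_map = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            total = 0
            for a in range(kh):
                for b in range(kw):
                    total += image[i + a][j + b] * kernel[a][b]
            row.append(total)
        activation_map.append(row)
    return activation_map

# Hypothetical 4x5 image: bright on the left, dark on the right.
image = [
    [1, 1, 1, 0, 0],
    [1, 1, 1, 0, 0],
    [1, 1, 1, 0, 0],
    [1, 1, 1, 0, 0],
]

# Hypothetical 3x3 filter that responds to vertical edges.
kernel = [
    [1, 0, -1],
    [1, 0, -1],
    [1, 0, -1],
]

activation_map = convolve2d(image, kernel)
print(activation_map)  # [[0, 3, 3], [0, 3, 3]]
```

The map is zero where the filter sits on a uniform region and positive where it straddles the bright-to-dark edge. In a real CNN the filter values are learned during training rather than written by hand.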
2. Pooling layer
The convolutional layer is followed by a pooling layer. The primary aim of
this layer is to reduce the computational complexity of the model and make it
more cost-effective. It does so by summarising each small region of the
activation map with a single value. Pooling is therefore lossy: it reduces the
spatial dimensionality of the representation and discards some detail.
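One common form of pooling is max pooling, which keeps only the largest value in each window. The sketch below assumes non-overlapping 2x2 windows; the article does not prescribe a particular pooling method, so both the choice of max pooling and the values in the feature map are assumptions for illustration.

```python
def max_pool2d(feature_map, size=2):
    """Keep the largest value in each non-overlapping size x size window."""
    pooled = []
    for i in range(0, len(feature_map) - size + 1, size):
        row = []
        for j in range(0, len(feature_map[0]) - size + 1, size):
            window = [feature_map[i + a][j + b]
                      for a in range(size) for b in range(size)]
            row.append(max(window))
        pooled.append(row)
    return pooled

# Hypothetical 4x4 activation map.
feature_map = [
    [1, 3, 2, 0],
    [4, 6, 1, 1],
    [0, 2, 9, 5],
    [1, 1, 3, 7],
]

pooled = max_pool2d(feature_map)
print(pooled)  # [[6, 2], [2, 9]]
```

The 4x4 map shrinks to 2x2, so the layers that follow have a quarter as many values to process.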
3. Fully connected layer
The fully connected layer consists of neurons that are connected directly to
the neurons of the two adjacent layers. The output of the previous layers is
flattened into a single vector and fed to the fully connected layer. This
layer combines the extracted features into a score for each class, and
classification of the image takes place.
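The flatten-then-classify step might look like the sketch below. It assumes a two-class problem and a softmax function to turn the scores into probabilities, which is a common choice but not the only one. The feature map, weights and biases are hypothetical values picked by hand; in practice the weights and biases are learned.

```python
import math

def flatten(feature_map):
    """Turn a 2D feature map into a single list of values."""
    return [value for row in feature_map for value in row]

def fully_connected(inputs, weights, biases):
    """Each output neuron is a weighted sum over every input, plus a bias."""
    return [sum(x * w for x, w in zip(inputs, neuron_weights)) + b
            for neuron_weights, b in zip(weights, biases)]

def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    largest = max(scores)  # subtracted for numerical stability
    exps = [math.exp(s - largest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical 2x2 feature map from the earlier layers.
feature_map = [[6, 2], [2, 9]]

# Hypothetical weights: one row per output class, one weight per input.
weights = [
    [0.1, 0.0, 0.0, 0.1],  # class 0
    [0.0, 0.2, 0.2, 0.0],  # class 1
]
biases = [0.0, 0.0]

features = flatten(feature_map)                       # [6, 2, 2, 9]
scores = fully_connected(features, weights, biases)   # one score per class
probabilities = softmax(scores)
predicted_class = probabilities.index(max(probabilities))

print(predicted_class)  # 0
```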
Figure 2 below shows the
structure of the layers of a CNN which have been described above.
Figure 2
Observation:
Although a CNN can be formed from a relatively small number of layers, there
is no set way of formulating a CNN architecture. A common architecture stacks
convolutional layers, followed by pooling and fully connected layers. Another
practice is to stack multiple convolutional layers before each pooling layer
so as to handle more complex features. Also, to reduce computational
complexity, large convolutional layers are often split into several smaller
ones. CNNs are very powerful machine learning algorithms, but they can be
resource-heavy too. One way to address this problem is to reduce the spatial
dimensionality of the input images.
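The saving from splitting a large convolutional layer can be checked with simple arithmetic. The sketch below compares one 5x5 layer with two stacked 3x3 layers. It assumes a stride of 1, ignores biases, and uses a hypothetical 64 input and output channels. Both options let each output value depend on a 5x5 patch of the input.

```python
def conv_weights(kernel_size, in_channels, out_channels):
    """Number of weights in one convolutional layer (biases ignored)."""
    return kernel_size * kernel_size * in_channels * out_channels

def receptive_field(kernel_sizes):
    """Input patch width seen by one output of stacked stride-1 layers."""
    field = 1
    for k in kernel_sizes:
        field += k - 1
    return field

channels = 64  # hypothetical channel count, for illustration only

one_large = conv_weights(5, channels, channels)
two_small = 2 * conv_weights(3, channels, channels)

print(receptive_field([5]), one_large)     # 5 102400
print(receptive_field([3, 3]), two_small)  # 5 73728
```

Under these assumptions the two smaller layers cover the same patch with roughly 28% fewer weights.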
Conclusion:
To conclude, convolutional neural networks focus on one specific type of
input, namely images, which makes it easier to set up an architecture suited
to that input. Applications of CNNs span image and video analysis, medical
image processing, image classification and other areas of computer vision.
References:
1. https://www.researchgate.net/publication/285164623_An_Introduction_to_Convolutional_Neural_Networks
2. https://www.upgrad.com/blog/basic-cnn-architecture/
Written by,
Mugdha Deshpande
SY EnTC