MNIST Handwritten Digit Recognition Using CNN Project


Added on 2022/08/25

AI Summary

This project report details the application of Convolutional Neural Networks (CNNs) for handwritten digit recognition using the MNIST dataset. The project begins with an introduction to CNNs, highlighting their importance in image and pattern recognition, and contrasts them with traditional classification algorithms. The problem definition focuses on the MNIST dataset, comprising 70,000 images of handwritten digits, with 42,000 used for training and 28,000 for testing. The report justifies the use of CNNs, emphasizing their architecture, including convolutional, pooling, and fully connected layers, and their ability to automatically learn spatial hierarchies of features. The project describes the approach, including the use of a 28x28x1 input layer, the application of convolution operations, and the extraction of features. The results show the performance of the CNN model, achieving a maximum accuracy of 95.43% after 40 epochs with a batch size of 250. The report concludes by emphasizing the effectiveness of CNNs in image analysis projects.

Running head: APPLICATION OF CONVOLUTIONAL NEURAL NETWORK
Application of Convolutional Neural Network
Name of the Student
Name of the University
Author's note


Introduction
In the field of machine learning, the Convolutional Neural Network (CNN) is a deep learning
algorithm used mainly for image recognition and pattern classification. The algorithm takes an
input image and assigns importance, in the form of learnable weights and biases, to the various
objects present in the image, which allows it to differentiate one object from another.
A CNN requires much less pre-processing than other traditional classification algorithms.
In primitive methods the filters had to be hand-engineered, whereas a Convolutional Neural
Network, given enough training, is able to learn the filters needed for classification.
Convolutional neural networks can support object detection, face recognition and video
analysis through segmentation and pattern recognition methods.
Problem definition
For this project the MNIST handwritten digit dataset is chosen. The dataset contains
70,000 images of handwritten digits, which are used both for training the Convolutional Neural
Network and for testing it. Out of the 70,000 images, 42,000 scanned images of digits are
utilized for training the network and the remaining 28,000 for testing the developed CNN model.
All the images utilized in the experiment are grayscale images with a size of 28×28 pixels. As
every image is 28×28 pixels, each image can be represented as a 784-dimensional vector of pixel
values.
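
The relationship between the 28×28 image size and the 784-dimensional vector can be checked with a short sketch. It uses plain Python lists and a hypothetical all-zero placeholder image rather than real MNIST data; the function name `flatten` is an illustrative assumption, not part of the project code.

```python
# Sketch: flatten a 28x28 grayscale image into a 784-dimensional vector
# and check the 42,000 / 28,000 split described in the text.

HEIGHT, WIDTH = 28, 28  # image size stated in the report


def flatten(image):
    """Turn a list of rows (HEIGHT x WIDTH) into one flat list of pixels."""
    return [pixel for row in image for pixel in row]


# Hypothetical placeholder image: every pixel is 0 (black).
blank_image = [[0] * WIDTH for _ in range(HEIGHT)]
vector = flatten(blank_image)

TOTAL_IMAGES = 70_000
TRAIN_IMAGES = 42_000
TEST_IMAGES = TOTAL_IMAGES - TRAIN_IMAGES

print(len(vector))   # 784
print(TEST_IMAGES)   # 28000
```

In practice an array library would usually handle this reshaping; the list version is only meant to make the arithmetic visible.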

Justification for Using CNN
The CNN is a variant of the neural network that is used heavily in the field of Computer
Vision. It derives its name from the type of hidden layers it contains. The hidden layers of a
CNN consist of convolutional layers, pooling layers, normalization layers and fully connected
layers (Dong, Loy & Tang, 2016). This means that, instead of relying only on ordinary activation
functions, the network also applies convolution and pooling functions in its hidden layers.
Convolution operates on two signals (in 1D) or two images (in 2D): one is the input and the
other is a filter applied to the input, and together they produce the output image. Pooling is a
sample-based discretization process. Its objective is to down-sample the input representation,
which reduces its dimensionality and allows assumptions to be made about the features contained
in the binned sub-regions (Jin et al., 2017). A CNN is thus a deep neural network whose hidden
layers apply convolution and pooling functions along with an activation function that introduces
non-linearity.
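
The down-sampling effect of pooling can be illustrated with a minimal sketch. It assumes max pooling over non-overlapping 2×2 blocks, which is one common choice; the report does not state which pooling variant was used, and the function name and sample values are hypothetical.

```python
# Sketch: 2x2 max pooling with stride 2 on a small feature map.
# Assumption: max pooling; the report does not name the pooling type.

def max_pool_2x2(feature_map):
    """Down-sample a 2D list by taking the maximum of each 2x2 block."""
    rows, cols = len(feature_map), len(feature_map[0])
    pooled = []
    for r in range(0, rows - 1, 2):
        pooled_row = []
        for c in range(0, cols - 1, 2):
            block = (feature_map[r][c], feature_map[r][c + 1],
                     feature_map[r + 1][c], feature_map[r + 1][c + 1])
            pooled_row.append(max(block))
        pooled.append(pooled_row)
    return pooled


# Hypothetical 4x4 feature map used only for illustration.
feature_map = [
    [1, 3, 2, 0],
    [4, 2, 1, 1],
    [0, 1, 5, 6],
    [2, 2, 7, 8],
]
pooled = max_pool_2x2(feature_map)
print(pooled)  # [[4, 2], [2, 8]]
```

The 4×4 input becomes a 2×2 output, so each spatial dimension is halved while the strongest response in each sub-region is kept.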
Figure 1: Count of images with different labels

The CNN is a class of artificial neural network that has become dominant in several computer
vision tasks and is attracting interest across a variety of domains, including radiology. It is
a deep learning model for processing data that has a grid pattern, such as images; it is
inspired by the organization of the visual cortex and designed to learn spatial hierarchies of
features automatically and adaptively. This learning takes place through backpropagation, using
several building blocks such as convolution layers, pooling layers and fully connected layers.
A CNN is a mathematical construct typically built from these three kinds of layers. The
convolution and pooling layers perform feature extraction, while the fully connected layer maps
the extracted features to the final output (Nah, Hyun Kim & Mu Lee, 2017). The convolution layer
plays a key role in a CNN; it is made up of a stack of mathematical operations such as
convolution, which is a specialized type of linear operation. The process of optimizing
parameters such as kernels is termed training. It is performed to minimize the difference
between the outputs and the ground truth labels, using an optimization algorithm such as
gradient descent together with backpropagation.
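
The idea of training as repeated parameter updates that shrink the gap between output and label can be sketched on a toy model. This is not the CNN from the project: it assumes a single weight, a squared-error loss and plain gradient descent, and the sample values and learning rate are hypothetical.

```python
# Sketch: gradient descent on a one-weight toy model (not a CNN).
# Model: prediction = weight * x. Loss: squared error against the label.

def loss(weight, x, label):
    return (weight * x - label) ** 2


def gradient(weight, x, label):
    # d/dw (w*x - y)^2 = 2 * (w*x - y) * x
    return 2 * (weight * x - label) * x


weight = 0.0
x, label = 2.0, 6.0    # hypothetical sample; the ideal weight is 3.0
learning_rate = 0.1    # hypothetical step size

initial_loss = loss(weight, x, label)
for step in range(50):
    # Move the weight against the gradient to reduce the loss.
    weight -= learning_rate * gradient(weight, x, label)
final_loss = loss(weight, x, label)

print(initial_loss, final_loss, weight)
```

In a CNN the same kind of update is applied to every kernel value and weight, with backpropagation supplying the gradients layer by layer.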
Approach for solving the problem
In the MNIST dataset, every image is 28×28×1. The total number of neurons in the input
layer would be 28×28 = 784, which is manageable. However, if the size of the image is
1000×1000, then 10⁶ neurons are required in the input layer. Such a large number of neurons is
computationally inefficient. A CNN is used to address this issue. The CNN extracts the features
of the image and converts it to a lower dimension without losing its characteristics. This means
that only 1000 neurons are required in the first layer of the feedforward neural network
(Milletari, Navab & Ahmadi, 2016). An image has a width and a height, so it makes sense to
represent the information it contains as a two-dimensional matrix or structure. Images are
further encoded into colour channels, so that the image data is represented as the intensity of
each colour channel at a given point.
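
The input-size arithmetic above can be made concrete with a short sketch. The hidden layer of 100 units is a hypothetical figure chosen only to show how the weight count grows when raw pixels are fully connected to the next layer; it does not come from the project.

```python
# Sketch: input-layer size for a fully connected network on raw pixels.

def input_neurons(height, width, channels=1):
    """One input neuron per pixel per channel."""
    return height * width * channels


mnist_inputs = input_neurons(28, 28, 1)        # 28 x 28 x 1 image
large_inputs = input_neurons(1000, 1000, 1)    # 1000 x 1000 x 1 image

# Hypothetical hidden layer of 100 units, fully connected to the inputs.
HIDDEN_UNITS = 100
mnist_weights = mnist_inputs * HIDDEN_UNITS
large_weights = large_inputs * HIDDEN_UNITS

print(mnist_inputs, mnist_weights)   # 784 78400
print(large_inputs, large_weights)   # 1000000 100000000
```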
The CNN was trained for 40 epochs with a batch size of 250, and the maximum accuracy
recorded was 95.43%.
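
The training schedule implied by these settings can be worked out with a short sketch. It assumes that all 42,000 training images are used in every epoch, which the report does not state explicitly.

```python
# Sketch: number of weight updates implied by the reported settings.
# Assumption: every epoch passes over all 42,000 training images.
import math

TRAIN_IMAGES = 42_000
BATCH_SIZE = 250
EPOCHS = 40

batches_per_epoch = math.ceil(TRAIN_IMAGES / BATCH_SIZE)
total_updates = batches_per_epoch * EPOCHS

print(batches_per_epoch)  # 168
print(total_updates)      # 6720
```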
The information contained in an image is the intensity of every channel across the height
and width of the image. Hence, the intensity of a colour channel at every point can be
represented in a matrix. Each image contains horizontal and vertical edges that combine to form
the image. The convolution operation is used with a few filters to detect these edges. The input
layer of the CNN holds the image data, which is represented as a three-dimensional matrix. The
convolution layer is known as the feature extractor layer, as the features of the image are
extracted in this layer (Gatys, Ecker & Bethge, 2016).
The loss recorded during the experiment is shown in the following graph:

The convolution layer is connected to one part of the image at a time, known as the
receptive field, and performs the convolution operation by calculating the dot product between
the filter and that receptive field. The result of each operation is a single number in the
output volume. The filter is then moved across the image and the process is repeated until the
whole image has been covered. The output becomes the input for the next layer.
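
The sliding dot product described above can be sketched as follows. The example assumes no padding and a stride of 1, and uses a hypothetical 4×4 image with a simple vertical-edge filter; as in most CNN implementations, the filter is applied without flipping, which is strictly a cross-correlation.

```python
# Sketch: slide a filter over an image and take a dot product at each position.
# Assumptions: no padding, stride 1, filter applied without flipping.

def convolve2d_valid(image, kernel):
    """Return the feature map produced by sliding kernel over image."""
    k_rows, k_cols = len(kernel), len(kernel[0])
    out_rows = len(image) - k_rows + 1
    out_cols = len(image[0]) - k_cols + 1
    output = []
    for r in range(out_rows):
        row = []
        for c in range(out_cols):
            # Dot product between the filter and the current receptive field.
            total = 0
            for i in range(k_rows):
                for j in range(k_cols):
                    total += image[r + i][c + j] * kernel[i][j]
            row.append(total)
        output.append(row)
    return output


# Hypothetical image: dark on the left, bright on the right.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
# Simple filter that responds to a left-to-right increase in intensity.
vertical_edge_kernel = [
    [-1, 0, 1],
    [-1, 0, 1],
    [-1, 0, 1],
]
feature_map = convolve2d_valid(image, vertical_edge_kernel)
print(feature_map)  # [[3, 3], [3, 3]]
```

Each entry of the feature map is one number produced from one receptive field; a uniform image would give zeros everywhere because there is no edge for the filter to respond to.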
Conclusion
Convolutional Neural Networks are very useful in image analysis projects. In the above
project, different layers were considered at random in a specific periodic sequence while
training the network. The reason behind this was to treat every case as different and to observe
how the network behaves in each case during tests. Throughout the project, the maximum and
minimum accuracy levels were checked with different hidden layers, using a batch size of 250.
Among all the tests and epochs, the maximum accuracy recorded was 95.43%.

References
Dong, C., Loy, C. C., & Tang, X. (2016, October). Accelerating the super-resolution
convolutional neural network. In European conference on computer vision (pp. 391-407).
Springer, Cham.
Gatys, L. A., Ecker, A. S., & Bethge, M. (2016). Image style transfer using convolutional neural
networks. In Proceedings of the IEEE conference on computer vision and pattern
recognition (pp. 2414-2423).
Jin, K. H., McCann, M. T., Froustey, E., & Unser, M. (2017). Deep convolutional neural
network for inverse problems in imaging. IEEE Transactions on Image Processing,
26(9), 4509-4522.
Milletari, F., Navab, N., & Ahmadi, S. A. (2016, October). V-net: Fully convolutional neural
networks for volumetric medical image segmentation. In 2016 Fourth International
Conference on 3D Vision (3DV) (pp. 565-571). IEEE.
Nah, S., Hyun Kim, T., & Mu Lee, K. (2017). Deep multi-scale convolutional neural network for
dynamic scene deblurring. In Proceedings of the IEEE Conference on Computer Vision
and Pattern Recognition (pp. 3883-3891).
Shi, W., Caballero, J., Huszár, F., Totz, J., Aitken, A. P., Bishop, R., ... & Wang, Z. (2016).
Real-time single image and video super-resolution using an efficient sub-pixel
convolutional neural network. In Proceedings of the IEEE conference on computer vision
and pattern recognition (pp. 1874-1883).