Building a Handwritten Multi-Digit Calculator | by Neerav Gala | Dec, 2020


Using Convolutional Neural Networks (Keras API with TensorFlow backend)

  • Set up the mathematical expression and compute the answer
  • If incorrect, update the model with the correct answer
  • Convolutional Neural Networks (refer to this youtube video by MIT 6.S191)
  • Keras API
  1. Introduction
  2. Creating the CNN Model
  3. Model Predictions
  4. Creating the Calculator
  5. Model Update
  6. Conclusion

1.1 What is a CNN?

Convolutional Neural Networks (CNNs) are a subclass of Deep Learning algorithms mainly used for analyzing visual imagery. Large amounts of training data, increasing computational power and advanced deep learning techniques have paved the way for CNNs to perform complex visual tasks [1].

  • Google Photos for Image Classification [4]
  • Facebook Artificial Intelligence Research (FAIR) for Language Translation [5]

1.2 Objective

The objective of this article is to show you how to build a CNN model that can do the following:


2.1 Preparing the dataset

The dataset used for this project can be found here (file name is dataset.csv). It is an extension of the MNIST dataset and contains 85,709 images of Arabic numerals (0 to 9) and mathematical operators (+, -, * and /). Each image is 28 X 28 pixels. To make them easy to encode, the mathematical operators are labelled numerically as follows:

  • 10 represents “/” (division)
  • 11 represents “+” (addition)
  • 12 represents “-” (subtraction)
  • 13 represents “*” (multiplication)
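
The numeric labels can be kept in a small lookup table. The sketch below is only an illustration: the names LABEL_TO_SYMBOL, SYMBOL_TO_LABEL and decode_labels are hypothetical, and the mapping assumes that labels 10 to 13 stand for /, +, - and * respectively, which is what the prediction example in section 3.2 implies.

```python
# Assumed mapping: digits keep their own value, 10-13 are the operators.
LABEL_TO_SYMBOL = {digit: str(digit) for digit in range(10)}
LABEL_TO_SYMBOL.update({10: "/", 11: "+", 12: "-", 13: "*"})

# Reverse lookup, handy when a symbol has to be turned back into a label.
SYMBOL_TO_LABEL = {symbol: label for label, symbol in LABEL_TO_SYMBOL.items()}


def decode_labels(labels):
    """Turn a sequence of class labels into their printed symbols."""
    return [LABEL_TO_SYMBOL[int(label)] for label in labels]


print(decode_labels([9, 8, 10, 7]))  # ['9', '8', '/', '7']
```

Keeping both directions of the mapping in one place avoids scattering the magic numbers 10 to 13 through the rest of the code.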

2.2 Building and training the CNN Model

The CNN model is built using Keras’ Sequential model class. The table below shows the list of layers (and hyperparameters) passed to create the model.

Image by author
  1. Next, the Max Pooling layer (pooling size of 2 X 2) downsamples the feature maps to reduce computational load, memory usage and the number of parameters.
  2. Finally, Dropout is used as a regularization method. It randomly ignores 25% of the nodes during every training iteration.
  1. Max Pooling layer (pooling size of 2 X 2) for downsampling.
  2. Dropout for regularization.
  1. Next, a fully-connected (Dense) layer that acts as an Artificial Neural Network.
  2. Dropout for regularization.
  3. Finally, another fully-connected (Dense) layer with Softmax activation as the output layer. The softmax function transforms the raw outputs of this layer into a probability distribution over the classes. The class with the highest probability is taken as the model prediction.
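
The layer sequence described above could be assembled roughly as follows. This is a sketch, not the author's exact model: the filter counts, kernel sizes, Dense width, the second and third dropout rates, the optimizer and the loss are all assumptions, since the article only states the 2 X 2 pooling size, the 25% dropout and the Softmax output. The 14 output classes follow from the ten digits plus four operators.

```python
from tensorflow import keras
from tensorflow.keras import layers

NUM_CLASSES = 14  # digits 0-9 plus the four operator labels (10-13)


def build_model():
    """Two conv blocks followed by a dense classifier (hyperparameters assumed)."""
    model = keras.Sequential([
        keras.Input(shape=(28, 28, 1)),
        # First convolution block
        layers.Conv2D(32, (5, 5), padding="same", activation="relu"),  # assumed
        layers.MaxPooling2D(pool_size=(2, 2)),
        layers.Dropout(0.25),  # 25% as stated in the article
        # Second convolution block
        layers.Conv2D(64, (3, 3), padding="same", activation="relu"),  # assumed
        layers.MaxPooling2D(pool_size=(2, 2)),
        layers.Dropout(0.25),  # rate assumed
        # Classifier
        layers.Flatten(),
        layers.Dense(256, activation="relu"),  # width assumed
        layers.Dropout(0.5),  # rate assumed
        layers.Dense(NUM_CLASSES, activation="softmax"),
    ])
    model.compile(
        optimizer="adam",                  # assumed
        loss="categorical_crossentropy",   # assumes one-hot encoded labels
        metrics=["accuracy"],
    )
    return model


model = build_model()
model.summary()
```

Training would then be a call to model.fit on the image array and the one-hot labels, with a validation split to obtain the val_loss and val_accuracy columns.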
OUTPUT[]:
Epoch 1/5
- 412s - loss: 0.2837 - accuracy: 0.9143 - val_loss: 0.0606 - val_accuracy: 0.9831
Epoch 2/5
- 439s - loss: 0.0776 - accuracy: 0.9771 - val_loss: 0.0348 - val_accuracy: 0.9901
Epoch 3/5
- 410s - loss: 0.0622 - accuracy: 0.9818 - val_loss: 0.0292 - val_accuracy: 0.9919
Epoch 4/5
- 413s - loss: 0.0578 - accuracy: 0.9837 - val_loss: 0.0285 - val_accuracy: 0.9916
Epoch 5/5
- 410s - loss: 0.0540 - accuracy: 0.9845 - val_loss: 0.0322 - val_accuracy: 0.9917

3.1 Element Separation

To analyze the mathematical expression, it first needs to be broken down into individual elements, which can then be identified by the model.

In []: len(out)
Out[]: 14
In []: elements_array.shape
Out[]: (14, 28, 28, 1)
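
One simple way to get such an array is to cut the expression wherever a blank column separates two symbols. The sketch below is a stand-in technique, not necessarily the method used in the full code: split_elements is a hypothetical helper, and it assumes a grayscale image with ink as positive values on a zero background, symbols that do not overlap horizontally, and crops no larger than 28 pixels in either direction.

```python
import numpy as np


def split_elements(image, size=28):
    """Split a 2-D image into symbols, using blank columns as separators.

    Returns an array of shape (n_symbols, size, size, 1).
    """
    has_ink = image.sum(axis=0) > 0  # True for every column that contains ink
    crops, start = [], None
    for col, ink in enumerate(has_ink):
        if ink and start is None:
            start = col                          # a symbol begins here
        elif not ink and start is not None:
            crops.append(image[:, start:col])    # the symbol ended
            start = None
    if start is not None:                        # symbol touching the right edge
        crops.append(image[:, start:])

    # Centre every crop on a blank size x size canvas.
    batch = np.zeros((len(crops), size, size, 1), dtype=np.float32)
    for i, crop in enumerate(crops):
        height, width = crop.shape
        top, left = (size - height) // 2, (size - width) // 2
        batch[i, top:top + height, left:left + width, 0] = crop
    return batch


# Synthetic strip with three rectangular "symbols".
demo = np.zeros((28, 60))
demo[5:20, 2:10] = 1.0
demo[8:18, 20:30] = 1.0
demo[4:24, 45:55] = 1.0
print(split_elements(demo).shape)  # (3, 28, 28, 1)
```

Column splitting merges symbols that touch or overlap, so a contour-based approach is more robust for real handwriting.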

3.2 Prediction of elements

The elements array is passed through the model for prediction. The model's Softmax output layer returns a probability for each class (the ten digits and the four operators), and the class with the highest probability is chosen as the prediction.
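
To make the "highest probability wins" step concrete, here is a small NumPy sketch. The raw scores are made-up numbers and only three classes are used for brevity. With a Keras model, the probabilities would come from model.predict and only the argmax step would be needed.

```python
import numpy as np


def softmax(scores):
    """Convert raw scores to probabilities along the last axis."""
    shifted = scores - scores.max(axis=-1, keepdims=True)  # numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)


# Two elements, three classes each (illustrative values only).
scores = np.array([[0.1, 2.0, -1.0],
                   [3.0, 0.0, 0.5]])
probs = softmax(scores)
pred = probs.argmax(axis=-1)  # index of the most probable class per element
print(pred)  # [1 0]
```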

In []: print(elements_pred)
Out[]: [ 9  8 10  7  6 13  5  4 11  3  2 12  1  0]

Once all the individual elements are identified, it is time to create the mathematical expression and calculate it.

In []: m_exp_str = math_expression_generator(elements_pred)
In []: print(m_exp_str)
Out[]: 98 / 76 * 54 + 32 - 10
In []: print(equation)
Out[]: 98 / 76 * 54 + 32 - 10 = 91.63
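
The conversion from predicted labels to an evaluated expression might look like the sketch below. It is a guess at the behavior, not the author's math_expression_generator: build_expression and evaluate are hypothetical names, labels 10 to 13 are assumed to mean /, +, - and *, and the input is assumed to be well formed (numbers and operators alternating, starting and ending with a number).

```python
OPERATORS = {10: "/", 11: "+", 12: "-", 13: "*"}  # assumed label mapping


def build_expression(labels):
    """Join consecutive digit labels into numbers and map 10-13 to operators."""
    tokens, number = [], ""
    for label in labels:
        label = int(label)
        if label in OPERATORS:
            if number:
                tokens.append(number)
            tokens.append(OPERATORS[label])
            number = ""
        else:
            number += str(label)
    if number:
        tokens.append(number)
    return " ".join(tokens)


def evaluate(expression):
    """Evaluate 'number op number ...' with the usual operator precedence."""
    tokens = expression.split()
    terms, signs = [float(tokens[0])], []
    # First pass: fold * and / into the current term, left to right.
    for i in range(1, len(tokens), 2):
        op, value = tokens[i], float(tokens[i + 1])
        if op == "*":
            terms[-1] *= value
        elif op == "/":
            terms[-1] /= value
        else:
            signs.append(op)
            terms.append(value)
    # Second pass: combine the terms with + and -.
    result = terms[0]
    for op, value in zip(signs, terms[1:]):
        result = result + value if op == "+" else result - value
    return result


labels = [9, 8, 10, 7, 6, 13, 5, 4, 11, 3, 2, 12, 1, 0]
expression = build_expression(labels)
print(f"{expression} = {evaluate(expression):.2f}")
# 98 / 76 * 54 + 32 - 10 = 91.63
```

Parsing the tokens by hand avoids calling eval on a string produced from model predictions.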

What makes Machine Learning algorithms so powerful is their ability to learn from their mistakes. To do that, the model needs to be retrained with the correct information.
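
A minimal sketch of preparing that correct information: the expression typed in by the user is turned back into class labels and one-hot encoded, ready to be paired with the element images. The helper names are hypothetical, the label mapping (10 to 13 for /, +, - and *) is an assumption, and the final fit call is shown only as a comment because it needs the trained model and the image batch.

```python
import numpy as np

SYMBOL_TO_LABEL = {str(digit): digit for digit in range(10)}
SYMBOL_TO_LABEL.update({"/": 10, "+": 11, "-": 12, "*": 13})  # assumed mapping


def labels_from_expression(expression):
    """Map every non-space character of the expression to its class label."""
    return [SYMBOL_TO_LABEL[ch] for ch in expression if not ch.isspace()]


def one_hot(labels, num_classes=14):
    """One row per element, with a 1.0 in the column of its class."""
    return np.eye(num_classes, dtype=np.float32)[labels]


corrected = labels_from_expression("98 / 76 * 54 + 32 - 10")
targets = one_hot(corrected)
print(targets.shape)  # (14, 14)

# With the trained Keras model and the matching image batch, the update
# could then be a short extra training run, for example:
# model.fit(elements_array, targets, epochs=1, verbose=0)
```

The number of corrected labels has to match the number of separated elements, so it is worth checking that before retraining.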

This article aims to teach you the basics of how CNN models can be used to build simple tools like this calculator. In the future, these basic concepts can help you perform complex visual analysis and build sophisticated models.

  • Full code can be found here.

[1] ‘Convolutional neural network’, Wikipedia. Dec. 03, 2020, Accessed: Dec. 09, 2020. [Online]. Available: https://en.wikipedia.org/w/index.php?title=Convolutional_neural_network&oldid=992073762.
