
Convolutional Neural Networks

Python Convolutional Neural Network: Creating a CNN in Keras, TensorFlow and Plain Python

Today, Python is the most common language used to build and train neural networks, and convolutional neural networks in particular. In this article, we’ll discover why Python is so popular and how all major deep learning frameworks support it, including the powerful platforms TensorFlow, Keras, and PyTorch.

We’ll also go through two tutorials to help you create your own Convolutional Neural Networks in Python: 1. building a convolutional neural network in Keras, and 2. creating a CNN from scratch using NumPy. In the end, we’ll discuss convolutional neural networks in the real world.

Why Use Python for Deep Learning?

Python is the language most commonly used today to build and train neural networks and in particular, convolutional neural networks.

Here are a few reasons for its popularity:

  1. The Python syntax makes it easy to express mathematical concepts, so even those unfamiliar with the language can start building mathematical models easily
  2. Python was designed to be easy to use and quick to learn, and has an accessible syntax
  3. There are many Python frameworks and libraries available for machine and deep learning, including NumPy, scikit-learn, as well as the “big three” deep learning frameworks which we discuss in the following section
  4. Python is suitable for collaborative coding and implementation, because its code is readable and easy to convey to others
  5. Python has a large community supporting the language

Python Deep Learning Frameworks

All major deep learning frameworks support Python. Of these, the most popular and powerful platforms are TensorFlow, Keras (which is typically used as a front-end wrapper for TensorFlow), and PyTorch.

Below is a quick description of each of the frameworks, and installation instructions to get you started.


TensorFlow is Google’s open source deep learning framework. It provides powerful capabilities for structuring datasets, working with Tensors (multidimensional arrays, a basic building block of neural networks) and constructing deep learning architectures.


You can work with TensorFlow directly to build new neural network algorithms and finely customize neural network models.


If you want to work with standard neural network models, use TensorFlow with the Keras front end, which is packaged with TensorFlow (more about Keras below).


Downloading the current release (CPU-only)
pip install tensorflow


Downloading the GPU package for CUDA-enabled GPUs
pip install tensorflow-gpu


Running TensorFlow container
docker pull tensorflow/tensorflow

docker run -it -p 8888:8888 tensorflow/tensorflow


Pre-installed cloud environment: Google Colab is a free cloud-based environment you can use to learn TensorFlow

Keras is an open source deep learning library. It enables fast experimentation by giving developers access to standard neural network models with a simple programming model.


Keras is a high-level deep learning framework which runs on top of TensorFlow, Microsoft Cognitive Toolkit or Theano (but in practice, most commonly used with TensorFlow). Keras provides convenient programming abstractions that let you work with deep learning constructs like models, layers and hyperparameters, not with tensors and matrices.




Installing with TensorFlow backend using pip*
pip install --upgrade tensorflow-gpu


Installing with Theano backend on Ubuntu using Anaconda
conda install numpy scipy mkl <nose> <sphinx> <pydot-ng>

conda install theano pygpu


Installing with CNTK backend on Linux using pip*
sudo apt-get install openmpi-bin

pip install cntk-gpu


* Remove “-gpu” to install the non-GPU version (recommended for beginners)



PyTorch is a middle ground between TensorFlow and Keras – it is powerful and allows you to manipulate tensors and lower-level constructs, but is also easy to use and provides convenient abstractions that save time.


It offers a workflow similar to NumPy, and has an imperative runtime model, allowing you to write neural network code in Python and run it immediately to see how it works, rather than wait for the full experiment to run.


PyTorch makes it easy to write your own code without sacrificing versatile and powerful features.





Installing via Anaconda
conda install pytorch torchvision -c pytorch
Installing via pip on Python 3.x
pip3 install torch torchvision
Installing via pip on Python 2.x
pip install torch torchvision
Verifying the installation: run this PyTorch code; it should display a random tensor
from __future__ import print_function
import torch
x = torch.rand(5, 3)
print(x)

CNN Python Tutorial #1: Building a Convolutional Neural Network in Keras

In this tutorial you will use Keras to build a CNN that can identify handwritten digits. We’ll use the MNIST dataset of 70,000 handwritten digits (from 0-9).

The tutorial steps below are summarized – for full details and code see the full tutorial by Eijaz Allibhai.

  1. Loading the dataset

Load the training and testing MNIST images into the variables X_train and X_test, with y_train and y_test used to hold the matching digits. Keep in mind that the shape of every image in the MNIST dataset is 28 x 28 pixels.

from keras.datasets import mnist

(X_train, y_train), (X_test, y_test) = mnist.load_data()
  2. Pre-processing

Reshape the inputs (X_train and X_test) to the shape the CNN model expects. The reshape function takes as arguments the number of images (60,000 for X_train and 10,000 for X_test), the shape of each image (28×28), and the number of color channels – 1 in this case, because the images are greyscale.

Then, one-hot-encode the target variable, creating a binary column for each target label – in our case, ‘0’, ‘1’, ‘2’, etc., because we are recognizing digits.

from keras.utils import to_categorical

X_train = X_train.reshape(60000,28,28,1)

X_test = X_test.reshape(10000,28,28,1)

y_train = to_categorical(y_train)

y_test = to_categorical(y_test)
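To see what to_categorical is doing under the hood, here is a minimal NumPy equivalent of one-hot encoding (our own illustrative sketch, not Keras code):

```python
import numpy as np

def one_hot(labels, num_classes=10):
    # Each label becomes a vector with a 1 at the label's index and 0 elsewhere.
    encoded = np.zeros((len(labels), num_classes))
    encoded[np.arange(len(labels)), labels] = 1
    return encoded

# The digit 5 maps to a vector whose sixth entry (index 5) is 1.
print(one_hot([5, 0, 4]))
```

A categorical_crossentropy loss then compares the model’s 10 output probabilities against these 10-element target vectors.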

  3. Building the model

Use the code below to build a CNN model, via the convenient Sequential object in Keras. The model will include:

  • Two “Conv2D” or 2-dimensional convolutional layers. The first layer uses 64 filters, while the second uses 32, and the ‘kernel’ or filter size for both is 3×3 pixels.
  • A “flatten” layer that turns the inputs into a vector
  • A “dense” layer that takes that vector and generates probabilities for 10 target labels, using a Softmax activation function.
from keras.models import Sequential

from keras.layers import Dense, Conv2D, Flatten

model = Sequential()

model.add(Conv2D(64, kernel_size=3, activation='relu', input_shape=(28,28,1)))

model.add(Conv2D(32, kernel_size=3, activation='relu'))

model.add(Flatten())

model.add(Dense(10, activation='softmax'))
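As a sanity check, you can trace the shapes by hand: each 3×3 convolution without padding shrinks each side by 2, so 28×28 becomes 26×26 after the first layer and 24×24 after the second, and flattening yields 24·24·32 values feeding the dense layer. A small sketch of that arithmetic:

```python
def conv_output_side(side, kernel=3, stride=1, padding=0):
    # Standard "valid" convolution output-size formula.
    return (side + 2 * padding - kernel) // stride + 1

side = conv_output_side(28)    # after the first Conv2D: 26
side = conv_output_side(side)  # after the second Conv2D: 24
flat = side * side * 32        # values entering the Dense layer
print(side, flat)              # 24 18432
```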
  4. Compiling the model

Compile the model, providing three parameters:

  • Optimizer – use the ‘adam’ optimizer, which adjusts the learning rate throughout training (read our guide to neural network hyperparameters to understand learning rate)
  • Loss function – use a ‘categorical_crossentropy’ loss function, a common choice for classification. The lower the score, the better the model is performing.
  • Metrics – use the ‘accuracy’ metric to get an accuracy score when the model runs on the validation set.
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
  5. Training the model

Train the model using the Keras fit() function, providing the training data, target data, validation data, and the number of epochs the experiment should run (the number of times training should be repeated on the data):

model.fit(X_train, y_train, validation_data=(X_test, y_test), epochs=3)
  6. Making predictions

The predict() function returns an array of 10 numbers per image – the probabilities that the image contains each possible digit from 0 to 9. Run a prediction for the first four images in the test set, and display the first four values in y_test to compare to the actual results.

model.predict(X_test[:4])

y_test[:4]

You’ll see that the model was correct – it predicted 7, 2, 1 and 0 for the first four images, which are the correct values in y_test.
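To turn a probability vector into a single predicted digit, take the index of the largest probability. A NumPy sketch of that final step (the probabilities below are made up for illustration, not real model output):

```python
import numpy as np

# Hypothetical softmax outputs for two images (each row sums to 1).
probs = np.array([[0.01, 0.02, 0.05, 0.02, 0.05, 0.05, 0.05, 0.70, 0.03, 0.02],
                  [0.05, 0.03, 0.80, 0.02, 0.02, 0.02, 0.02, 0.02, 0.01, 0.01]])

# argmax along each row picks the most probable digit.
predicted_digits = probs.argmax(axis=1)
print(predicted_digits)  # [7 2]
```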

CNN Python Tutorial #2: Creating a CNN From Scratch using NumPy

In this tutorial you’ll see how to build a CNN from scratch using the NumPy library. This is considered more difficult than using a deep learning framework, but will give you a much better understanding of what is happening behind the scenes of the deep learning process.

The following tutorial steps are summarized – see the full tutorial and code by Ahmed Gad.

  1. Reading the input image

Use this code to read an input image and convert it to grayscale:


import skimage.color
import skimage.data

img = skimage.data.chelsea()  # sample image bundled with scikit-image; substitute your own input image

img = skimage.color.rgb2gray(img)
  2. Preparing filters

Prepare a filter bank for the first convolutional layer. Create a zero array of size (2=num_filters, 3=num_rows_filter, 3=num_columns_filter), and fill it with two filters of size 3×3. Each filter is a 2D array because the input image is grayscale and has only 1 color channel.

import numpy

l1_filter = numpy.zeros((2,3,3))
l1_filter[0, :, :] = numpy.array([[-1, 0, 1],
                                  [-1, 0, 1],
                                  [-1, 0, 1]])
l1_filter[1, :, :] = numpy.array([[1,   1,  1],
                                  [0,   0,  0],
                                  [-1, -1, -1]])
  3. First convolutional layer

Convolve the image by passing the filters over it, using the conv() function.

l1_feature_map = conv(img, l1_filter)

Here is how the filter bank is implemented. It checks if the number of image channels matches the filter depth, if filter dimensions are equal and if the filter has an odd size.

import sys

def conv(img, conv_filter):
    if len(img.shape) > 2 or len(conv_filter.shape) > 3:
        if img.shape[-1] != conv_filter.shape[-1]:
            print("Error: Number of channels in both image and filter must match.")
            sys.exit()
    if conv_filter.shape[1] != conv_filter.shape[2]:
        print('Error: Filter must be a square matrix. I.e. number of rows and columns must match.')
        sys.exit()
    if conv_filter.shape[1]%2==0:
        print('Error: Filter must have an odd size. I.e. number of rows and columns must be odd.')
        sys.exit()

Then an empty feature map array is allocated, each filter is convolved with the image, and for multi-channel inputs the per-channel convolution results are summed into a single feature map. conv_map is the array that holds this sum for the current filter.

    feature_maps = numpy.zeros((img.shape[0]-conv_filter.shape[1]+1,
                                img.shape[1]-conv_filter.shape[1]+1,
                                conv_filter.shape[0]))

    for filter_num in range(conv_filter.shape[0]):
        print("Filter ", filter_num + 1)
        curr_filter = conv_filter[filter_num, :]
        if len(curr_filter.shape) > 2:
            conv_map = conv_(img[:, :, 0], curr_filter[:, :, 0])
            for ch_num in range(1, curr_filter.shape[-1]):
                conv_map = conv_map + conv_(img[:, :, ch_num],
                                            curr_filter[:, :, ch_num])
        else:
            conv_map = conv_(img, curr_filter)
        feature_maps[:, :, filter_num] = conv_map
    return feature_maps
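The conv() function delegates the actual 2D convolution of a single channel to a conv_() helper, which this excerpt doesn’t show. Here is a minimal illustrative sketch of such a helper (our own version, not necessarily identical to the original tutorial’s implementation):

```python
import numpy as np

def conv_(img, conv_filter):
    # Slide the filter over the image ("valid" convolution, stride 1)
    # and record the sum of elementwise products at each position.
    filter_size = conv_filter.shape[0]
    out_rows = img.shape[0] - filter_size + 1
    out_cols = img.shape[1] - filter_size + 1
    result = np.zeros((out_rows, out_cols))
    for r in range(out_rows):
        for c in range(out_cols):
            region = img[r:r + filter_size, c:c + filter_size]
            result[r, c] = np.sum(region * conv_filter)
    return result

# A 6x6 image with a vertical dark-to-bright edge between columns 2 and 3.
img = np.tile([0, 0, 0, 1, 1, 1], (6, 1)).astype(float)
edge_filter = np.array([[-1, 0, 1], [-1, 0, 1], [-1, 0, 1]])
print(conv_(img, edge_filter).shape)  # (4, 4)
```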
  4. ReLU Layer

Here is how you apply a ReLU activation after the convolution operation:

l1_feature_map_relu = relu(l1_feature_map)

The relu function is implemented as follows. It loops through every element in the feature map, keeping the value if it is larger than 0 and otherwise outputting 0.

def relu(feature_map):
    # Preparing the output of the ReLU activation function.
    relu_out = numpy.zeros(feature_map.shape)
    for map_num in range(feature_map.shape[-1]):
        for r in numpy.arange(0, feature_map.shape[0]):
            for c in numpy.arange(0, feature_map.shape[1]):
                relu_out[r, c, map_num] = numpy.maximum(feature_map[r, c, map_num], 0)
    return relu_out
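The triple loop is written for clarity; because NumPy operations broadcast elementwise, the same ReLU can be expressed in a single vectorized line, which is how you would normally write it:

```python
import numpy as np

def relu_vectorized(feature_map):
    # Elementwise max(x, 0) over the whole array at once.
    return np.maximum(feature_map, 0)

fm = np.array([[-1.5, 2.0],
               [0.0, -3.0]])
print(relu_vectorized(fm))  # negatives clamped to 0, positives unchanged
```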
  5. Max Pooling Layer

You apply max pooling on the results of the first convolution as follows:

l1_feature_map_relu_pool = pooling(l1_feature_map_relu, 2, 2)

Pooling is implemented as follows. The pooling function we define accepts the output of the ReLU layer, the pooling mask size, and the stride. It loops through the input channel by channel and, for each channel, clips out regions according to the size and stride used, storing the maximum of each region in the pool_out array.

def pooling(feature_map, size=2, stride=2):
    pool_out = numpy.zeros((numpy.uint16((feature_map.shape[0]-size)/stride+1),
                            numpy.uint16((feature_map.shape[1]-size)/stride+1),
                            feature_map.shape[-1]))
    for map_num in range(feature_map.shape[-1]):
        r2 = 0
        for r in numpy.arange(0, feature_map.shape[0]-size+1, stride):
            c2 = 0
            for c in numpy.arange(0, feature_map.shape[1]-size+1, stride):
                pool_out[r2, c2, map_num] = numpy.max(feature_map[r:r+size, c:c+size, map_num])
                c2 = c2 + 1
            r2 = r2 + 1
    return pool_out
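When the feature map’s sides divide evenly by the pool size, non-overlapping max pooling can also be done without explicit loops, using a reshape trick. This is our own sketch, not part of the tutorial:

```python
import numpy as np

def max_pool_2x2(channel):
    # Split an (H, W) array into non-overlapping 2x2 blocks
    # and take the maximum of each block.
    h, w = channel.shape
    return channel.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

x = np.array([[1, 2, 5, 6],
              [3, 4, 7, 8],
              [9, 1, 2, 3],
              [4, 5, 6, 7]])
print(max_pool_2x2(x))
# [[4 8]
#  [9 7]]
```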
  6. Building the remaining layers in the CNN Model

Here is how to stack the remaining layers to build a full CNN model. We define a second and third convolution, with ReLU and pooling steps in between.

l2_filter = numpy.random.rand(3, 5, 5, l1_feature_map_relu_pool.shape[-1])  
l2_feature_map = conv(l1_feature_map_relu_pool, l2_filter)  
l2_feature_map_relu = relu(l2_feature_map)  
l2_feature_map_relu_pool = pooling(l2_feature_map_relu, 2, 2)  

l3_filter = numpy.random.rand(1, 7, 7, l2_feature_map_relu_pool.shape[-1])  
l3_feature_map = conv(l2_feature_map_relu_pool, l3_filter)  
l3_feature_map_relu = relu(l3_feature_map)  
l3_feature_map_relu_pool = pooling(l3_feature_map_relu, 2, 2)

And that’s it! You just built a full CNN architecture from scratch in NumPy.

Convolutional Neural Networks in the Real World

In this article we explained the basics of Python for deep learning and provided two tutorials to create your own Convolutional Neural Networks in Python. When you start working on CNN projects, processing and generating predictions for real images, audio and video, you’ll run into some practical challenges:

  • tracking experiments

    Tracking experiment progress, source code, and hyperparameters across multiple CNN experiments. CNNs can have many variations and hyperparameter tweaks, and testing each will require running multiple experiments and tracking their results.

  • running experiments across multiple machines

    Running experiments across multiple machines—CNNs are computationally intensive, and you will probably need to run on multiple machines or specialized GPU hardware. Provisioning these machines, configuring them and distributing the work among them can be difficult.

  • managing training datasets

    Managing training data—CNN projects often involve images or other rich media, and training sets can weigh anywhere from gigabytes upwards. Copying data to training machines and re-copying it for every new experiment is time consuming and error prone.


MissingLink is a deep learning platform that can help you automate these operational aspects of CNN, so you can concentrate on building winning experiments.

Learn more about MissingLink and see how easy it is.

Train Deep Learning Models 20X Faster

Let us show you how you can:

  • Run experiments across hundreds of machines
  • Easily collaborate with your team on experiments
  • Reproduce experiments with one click
  • Save time and immediately understand what works and what doesn’t

MissingLink is the most comprehensive deep learning platform to manage experiments, data, and resources more frequently, at scale and with greater confidence.

Request your personal demo to start training models faster
