Keras Conv2D: Working with CNN 2D Convolutions in Keras

This article explains how to create 2D convolutional layers in Keras, as part of a Convolutional Neural Network (CNN) architecture.

2D convolutional layers take a three-dimensional input, typically an image with three color channels. They pass a filter, also called a convolution kernel, over the image, inspecting a small window of pixels at a time, for example 3×3 or 5×5 pixels in size, and moving the window until they have scanned the entire image. The convolution operation calculates the dot product of the pixel values in the current filter window with the weights defined in the filter.

Conv2D: 2d convolution layer

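In Keras, you create 2D convolutional layers using the keras.layers.Conv2D class. Unlike low-level TensorFlow convolution code, you don’t have to define the weight variables or apply the activation in a separate step: Keras creates the kernel and bias for you, and the activation argument attaches the activation function to the layer. Pooling is still added as its own layer, as described later in this article.

The Conv2D constructor has the following signature, shown with its default values.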

keras.layers.Conv2D(filters, kernel_size, strides=(1, 1), padding='valid', data_format=None, dilation_rate=(1, 1), activation=None, use_bias=True, kernel_initializer='glorot_uniform', bias_initializer='zeros', kernel_regularizer=None, bias_regularizer=None, activity_regularizer=None, kernel_constraint=None, bias_constraint=None)

To understand the parameters in detail, see Understanding and Tuning Conv2D Parameters below.

We will also show how to run CNNs at scale across dozens of machines, both on and off the cloud, using the MissingLink deep learning platform.

What is a 2D Convolution Layer, the Convolution Kernel and its Role in CNN Image Classification

Briefly, some background. A convolution layer “scans” a source image with a filter of, for example, 5×5 pixels, to extract features which may be important for classification. This filter is also called the convolution kernel. The kernel contains weights, which are tuned during training to achieve the most accurate predictions.

With a 5×5 kernel, the model computes, for each 5×5 pixel region, the dot product between the image pixel values and the weights defined in the filter.
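To make the window-by-window dot product concrete, here is a minimal pure-Python sketch for a single-channel image. The function name conv2d_valid, the 4×4 image and the 3×3 kernel are illustrative assumptions, not part of Keras. It uses a stride of 1 and no padding, and, like Keras, it does not flip the kernel.

```python
def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` (lists of lists), stride 1, no padding."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    output = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            # Dot product of the current window with the kernel weights.
            total = 0
            for di in range(kh):
                for dj in range(kw):
                    total += image[i + di][j + dj] * kernel[di][dj]
            row.append(total)
        output.append(row)
    return output


# Hypothetical 4x4 grayscale image and a 3x3 vertical-edge kernel.
image = [
    [1, 2, 3, 0],
    [4, 5, 6, 1],
    [7, 8, 9, 2],
    [1, 0, 1, 3],
]
kernel = [
    [1, 0, -1],
    [1, 0, -1],
    [1, 0, -1],
]

feature_map = conv2d_valid(image, kernel)
print(feature_map)  # [[-6, 12], [-4, 7]]
```

A 4×4 input scanned by a 3×3 kernel leaves only 2×2 valid positions, which is why the output is smaller than the input.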

A 2D convolution layer means that the input of the convolution operation is three-dimensional, for example, a color image which has a value for each pixel across three channels: red, green and blue. However, it is called a “2D convolution” because the movement of the filter across the image happens in two dimensions. The filter itself spans all three channels: it holds a separate set of weights for each channel, and the per-channel results are summed into a single value for each position.
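One way to see that each filter covers every input channel is to count the trainable parameters of a layer. The helper below is a hypothetical sketch (conv2d_param_count is not a Keras function), assuming the standard layout in which each filter holds kernel_height × kernel_width × input_channels weights plus one bias.

```python
def conv2d_param_count(filters, kernel_size, in_channels, use_bias=True):
    """Number of trainable parameters in a 2D convolution layer."""
    kh, kw = kernel_size
    # Every filter has one weight per kernel position per input channel.
    weights = kh * kw * in_channels * filters
    # One bias value per filter, if biases are enabled.
    biases = filters if use_bias else 0
    return weights + biases


rgb_params = conv2d_param_count(32, (3, 3), in_channels=3)
gray_params = conv2d_param_count(32, (3, 3), in_channels=1)
print(rgb_params)   # 896
print(gray_params)  # 320
```

The same 32 filters need almost three times as many weights on an RGB image as on a grayscale one, because each filter extends through all three channels.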

After the convolution ends, the features are downsampled, and then the same convolutional structure repeats again. At first, the convolution identifies features in the original image (for example in a cat, the body, legs, tail, head), then it identifies sub-features within smaller parts of the image (for example, within the head, the ears, whiskers, eyes). Eventually, this process is meant to identify the essential features that can help classify the image. Learn more in our guide to Convolutional Neural Networks (CNN).


Building a Convolutional Neural Network in Keras: A Brief Primer

To help you understand the Conv2D operation, here is a quick primer on how to build Convolutional Neural Networks in Keras.

A CNN architecture has three main parts:

  • A convolutional layer that extracts features from a source image.
  • A pooling layer that downsamples each feature to reduce its dimensionality and focus on the most important elements.
  • A fully connected layer that flattens the features identified in the previous layers into a vector, and predicts probabilities that the image belongs to each one of several possible labels.
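The downsampling done by a pooling layer can be sketched in a few lines of pure Python. This is an illustrative max-pooling routine under assumed names (max_pool2d and the sample feature map are made up for the example), not the Keras implementation.

```python
def max_pool2d(feature_map, pool=2, stride=2):
    """Keep the largest value in each pool x pool window."""
    pooled = []
    for i in range(0, len(feature_map) - pool + 1, stride):
        row = []
        for j in range(0, len(feature_map[0]) - pool + 1, stride):
            window = [
                feature_map[i + di][j + dj]
                for di in range(pool)
                for dj in range(pool)
            ]
            row.append(max(window))
        pooled.append(row)
    return pooled


# Hypothetical 4x4 feature map produced by a convolutional layer.
features = [
    [1, 3, 2, 0],
    [4, 6, 5, 1],
    [7, 2, 9, 8],
    [0, 1, 3, 4],
]

pooled = max_pool2d(features)
print(pooled)  # [[6, 5], [7, 9]]
```

A 2×2 window with a stride of 2 halves each spatial dimension while keeping the strongest response in every region.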

CNN process in TensorFlow or Keras

 

In Keras, you build a CNN architecture using the following process:

1. Reshape the input data into a format suitable for the convolutional layers, using X_train.reshape() and X_test.reshape()

2. For class-based classification, one-hot encode the categories using the to_categorical() function.

3. Build the model using the Sequential.add() function. For a 2D convolutional layer, the command looks like the following.

model.add(Conv2D(64, kernel_size=3, activation='relu', input_shape=(28,28,1)))

4. Add a pooling layer, for example using the Sequential.add(MaxPooling2D()) function – not showing all parameters.

5. Add a “flatten” layer which prepares a vector for the fully connected layers, using Sequential.add(Flatten()).

6. Add one or more fully connected layers using Sequential.add(Dense(...)). Typically you will follow each fully connected layer with a dropout layer (learn more about dropout in our guide to neural network hyperparameters), using Sequential.add(Dropout(...)).

7. Compile the model using model.compile().

8. Train the model using model.fit(), supplying X_train and y_train, the training images and their known classification results. X_test and y_test, the held-out images and their known results, can be passed as validation data.

9. Use model.predict() to generate a prediction.
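The nine steps above can be sketched end to end as follows. This is a minimal illustration, not a tuned model: the random arrays stand in for a real dataset, and the layer sizes, the adam optimizer and the single epoch are assumptions chosen only to keep the example fast.

```python
import numpy as np
import keras
from keras import layers

num_classes = 10

# Stand-in data (assumption): random 28x28 grayscale "images" and labels.
rng = np.random.default_rng(0)
X_train = rng.random((64, 28, 28)).astype("float32")
y_train = rng.integers(0, num_classes, size=64)
X_test = rng.random((16, 28, 28)).astype("float32")
y_test = rng.integers(0, num_classes, size=16)

# Step 1: add the channel axis that Conv2D expects (channels_last).
X_train = X_train.reshape(-1, 28, 28, 1)
X_test = X_test.reshape(-1, 28, 28, 1)

# Step 2: one-hot encode the class labels.
y_train = keras.utils.to_categorical(y_train, num_classes)
y_test = keras.utils.to_categorical(y_test, num_classes)

# Steps 3 to 6: convolution, pooling, flatten, dense and dropout layers.
model = keras.Sequential()
model.add(keras.Input(shape=(28, 28, 1)))
model.add(layers.Conv2D(64, kernel_size=3, activation="relu"))
model.add(layers.MaxPooling2D(pool_size=(2, 2)))
model.add(layers.Flatten())
model.add(layers.Dense(32, activation="relu"))
model.add(layers.Dropout(0.5))
model.add(layers.Dense(num_classes, activation="softmax"))

# Step 7: compile the model.
model.compile(optimizer="adam",
              loss="categorical_crossentropy",
              metrics=["accuracy"])

# Step 8: train, using the held-out arrays as validation data.
model.fit(X_train, y_train,
          validation_data=(X_test, y_test),
          epochs=1, batch_size=16, verbose=0)

# Step 9: predict class probabilities for the held-out images.
probabilities = model.predict(X_test, verbose=0)
print(probabilities.shape)  # (16, 10)
```

Each row of probabilities holds one softmax distribution over the ten classes, so taking the argmax along the last axis gives the predicted label.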


Keras CNN example and Keras Conv2D

Here is a simple code example to show you the context of Conv2D in a complete Keras model. The example was created by Andy Thomas. This model has two 2D convolutional layers, added by the two Conv2D calls in the code.

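import keras
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense

# x_train, y_train, x_test, y_test, input_shape, num_classes, batch_size and
# epochs are assumed to be prepared beforehand (see steps 1 and 2 above)

# assumption: a standard History callback that records per-epoch metrics
history = keras.callbacks.History()

# building the model
model = Sequential()
model.add(Conv2D(32, kernel_size=(5, 5), strides=(1, 1),
                 activation='relu',
                 input_shape=input_shape))
model.add(MaxPooling2D(pool_size=(2, 2), strides=(2, 2)))
model.add(Conv2D(64, (5, 5), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Flatten())
model.add(Dense(1000, activation='relu'))
model.add(Dense(num_classes, activation='softmax'))

# training (recent Keras versions name the SGD argument learning_rate, not lr)
model.compile(loss=keras.losses.categorical_crossentropy,
              optimizer=keras.optimizers.SGD(learning_rate=0.01),
              metrics=['accuracy'])
model.fit(x_train, y_train,
          batch_size=batch_size,
          epochs=epochs,
          verbose=1,
          validation_data=(x_test, y_test),
          callbacks=[history])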

# evaluating and printing results
score = model.evaluate(x_test, y_test, verbose=0)
print('Test loss:', score[0])
print('Test accuracy:', score[1])
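It can help to trace how the spatial size shrinks through the model above. The sketch below assumes input_shape is (28, 28, 1), as for MNIST-style images; the example itself does not state the value. The helper valid_output_size is a hypothetical name and applies the usual formula for 'valid' padding.

```python
def valid_output_size(size, window, stride=1):
    """Output length along one axis for padding='valid'."""
    return (size - window) // stride + 1


size = 28  # assumed input height and width
trace = []

size = valid_output_size(size, window=5, stride=1)  # Conv2D(32, 5x5)
trace.append(size)
size = valid_output_size(size, window=2, stride=2)  # MaxPooling2D(2x2)
trace.append(size)
size = valid_output_size(size, window=5, stride=1)  # Conv2D(64, 5x5)
trace.append(size)
size = valid_output_size(size, window=2, stride=2)  # MaxPooling2D(2x2)
trace.append(size)

# Flatten turns the final 4x4 map with 64 channels into one vector.
flattened = size * size * 64

print(trace)      # [24, 12, 8, 4]
print(flattened)  # 1024
```

Under that assumption, the Dense(1000) layer receives a vector of 1,024 values per image.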

Understanding and Tuning the Parameters of Keras Conv2D

When adding a Conv2D layer using Sequential.add(), there are numerous parameters you can use, as defined by the underlying keras.layers.Conv2D class (see the Keras documentation).

Here is the full signature of the Keras Conv2D constructor:

keras.layers.Conv2D(filters, kernel_size, strides=(1, 1), padding='valid', data_format=None, dilation_rate=(1, 1), activation=None, use_bias=True, kernel_initializer='glorot_uniform', bias_initializer='zeros', kernel_regularizer=None, bias_regularizer=None, activity_regularizer=None, kernel_constraint=None, bias_constraint=None)

Below we explain each of these parameters, what it does, and some best practices for setting and tuning it. To get more background about tuning neural networks, see our guide on neural network hyperparameters.

Keras Conv2D Parameter | What it Does | Best Practices and Tuning
filters | Sets the number of filters used in the convolution operation. | Earlier 2D convolutional layers, closer to the input, learn fewer filters, while later convolutional layers, closer to the output, learn more filters. The number of filters you select should depend on the complexity of your dataset and the depth of your neural network. A common setting to start with is [32, 64, 128] for three layers, and if there are more layers, increasing to [256, 512, 1024], etc.
kernel_size | Specifies the height and width of the convolutional filter in pixels, as an integer or a 2-tuple of integers. Odd sizes such as 3, 5 or 7 are the usual choice. | Filter size may be determined by the CNN architecture you are using, for example VGGNet exclusively uses (3, 3) filters. If not, use a 5×5 or 7×7 filter to learn larger features and then quickly reduce to 3×3. If your images are smaller than 128×128, consider working with smaller filters of 1×1 and 3×3.
strides=(1, 1) | A 2-tuple of integers, specifying how the convolutional filter should “step” along the x and y-axis of the source image. | In most cases, it’s okay to leave the strides parameter with the default (1, 1). However, you may increase it to (2, 2) to reduce the size of the output volume.
padding='valid' | Takes one of two values: valid or same. Valid means the input is not zero-padded, so the output of the convolution will be smaller than the original image. Same means the input is zero-padded, so with a stride of 1 the convolution output is the same size as the input. | The default Keras value is valid, but it is often effective to set it to same for most of the layers, then reduce spatial dimensions using max pooling or strided convolutions.
data_format=None | Specifies the ordering of the dimensions in the input: channels_last or channels_first. | The TensorFlow backend to Keras uses channels_last ordering. Do not change this parameter unless you are using Theano as your backend.
dilation_rate=(1, 1) | A 2-tuple of integers, controlling the dilation rate for dilated convolution. Dilated convolution is a convolution applied to the input volume with defined gaps (the filter samples pixels at spaced intervals instead of a contiguous window). | Dilated convolutions are useful when you work with higher resolution images but still want to focus on fine-grained details, or when constructing a network with fewer parameters.
activation=None | Specifies the name of the activation function you want to apply after performing the convolution. | To learn more about activation functions and their impact on your neural network, see our guide to neural network activation functions.
use_bias=True | Controls whether a bias vector is added to the convolutional layer. | Typically you’ll want to leave this value as True, although some implementations of ResNet will leave the bias parameter out.
kernel_initializer='glorot_uniform' | The method used to initialize the kernel weights of the layer prior to training. | The default is glorot_uniform, which is Xavier Glorot uniform initialization. This is suitable for most CNNs. For deeper networks, such as VGGNet, you may want to use he_normal, which uses the MSRA initialization method.
bias_initializer='zeros' | Controls how the bias vector is initialized before training starts. | You should typically leave this as the default, zeros, meaning the bias will be initially filled with zeros.
kernel_regularizer=None, bias_regularizer=None, activity_regularizer=None | These parameters control the type and amount of regularization. Regularization is a method which helps avoid overfitting and improves the ability of your model to generalize from training examples to a real population. | For large datasets and deep networks, kernel regularization is strongly recommended. You can use either L1 or L2 regularization. If you detect signs of overfitting, consider using L2 regularization. Tune the amount of regularization, starting with values of 0.0001-0.001. For bias and activity, we recommend leaving the default values for most scenarios.
kernel_constraint=None, bias_constraint=None | Impose constraints on the Conv2D layer, such as unit normalization, non-negativity, min-max normalization. | These are advanced settings which should be left at defaults unless you have a special reason to use them in your model.
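As a rough illustration of several of these settings together, the sketch below configures two Conv2D layers and checks their output shapes. The 32×32 RGB input, the filter counts and the L2 factor of 0.0005 are assumptions picked from the ranges suggested above, not recommended values for any particular dataset.

```python
import numpy as np
import keras

# Hypothetical batch holding one 32x32 RGB image (channels_last).
batch = np.zeros((1, 32, 32, 3), dtype="float32")

# 'same' padding with the default stride keeps the spatial size.
same_conv = keras.layers.Conv2D(
    filters=64,
    kernel_size=(3, 3),
    padding="same",
    activation="relu",
    kernel_initializer="he_normal",
    kernel_regularizer=keras.regularizers.l2(0.0005),
)

# A stride of (2, 2) halves the height and width of the output.
strided_conv = keras.layers.Conv2D(
    filters=64,
    kernel_size=(3, 3),
    strides=(2, 2),
    padding="same",
)

same_shape = tuple(same_conv(batch).shape)
strided_shape = tuple(strided_conv(batch).shape)
print(same_shape)     # (1, 32, 32, 64)
print(strided_shape)  # (1, 16, 16, 64)
```

The last dimension of each output equals the filters argument, while padding and strides control the height and width.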

 


Running CNN at Scale on Keras with MissingLink

In this article, we explained how to create 2D convolutional layers in Keras. When you start working on Convolutional Neural Networks and running large numbers of experiments, you’ll run into some practical challenges:

  • Tracking experiments: Tracking experiment progress and hyperparameters can be challenging when you run a large number of experiments. You will have to scale up your experiments to tune your CNN and try all relevant variations of network architecture and hyperparameters.

  • Running experiments on multiple machines: CNNs can take a long time to run, especially with large datasets. You will want to run your CNNs on more machines and GPUs, either on-premise or in the cloud. It can be very time consuming to provision these machines, distribute experiments between them and monitor progress.

  • Managing training data: In computer vision projects with images, video or other rich media, training sets can be very large. Copying the data to each training machine, replacing it for each new experiment and managing changes to datasets can be difficult. To scale up you must do this in an automated way.

 

MissingLink is a deep learning platform that does all of this for you, and lets you concentrate on building the most accurate model. Learn more to see how easy it is.
