Deep Learning Frameworks

Keras Conv1D: Working with 1D Convolutional Neural Networks in Keras

Keras provides convenient methods for creating Convolutional Neural Networks (CNNs) in 1, 2, or 3 dimensions: Conv1D, Conv2D, and Conv3D. This page explains what a 1D CNN is used for and how to create one in Keras, focusing on the Conv1D layer and its parameters.

 

In this page you will learn:

  • What 1D Convolutional Neural Networks are
  • The difference between 1D and 2D CNNs
  • Keras CNN example with Keras Conv1D
  • Understanding Keras Conv1D parameters

What are 1D Convolutional Neural Networks?

Convolutional Neural Network (CNN) models were originally developed for image classification: the model accepts a two-dimensional input representing an image’s pixels and color channels, and learns to extract internal features from it, in a process called feature learning.

This same process can be applied to one-dimensional sequences of data. The model extracts features from sequence data and maps the internal features of the sequence. A 1D CNN is very effective at deriving features from a fixed-length segment of the overall dataset, where the location of the feature within the segment is not important.
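To make the "location is not important" point concrete, here is a minimal pure-Python sketch (no Keras): a 1D filter slides over a sequence and produces its strongest response wherever the pattern occurs, regardless of position. The spike pattern and filter values are arbitrary choices for illustration.

```python
# A 1D "filter" slides over a sequence and responds wherever the
# pattern occurs, no matter where in the sequence it appears.

def conv1d_valid(sequence, kernel):
    """Cross-correlate kernel with sequence ('valid' padding, stride 1)."""
    k = len(kernel)
    return [sum(sequence[i + j] * kernel[j] for j in range(k))
            for i in range(len(sequence) - k + 1)]

# The same spike pattern, appearing early in one sequence and late in another.
seq_a = [0, 0, 1, 3, 1, 0, 0, 0, 0]
seq_b = [0, 0, 0, 0, 0, 1, 3, 1, 0]
spike_filter = [1, 3, 1]  # matches the spike shape

out_a = conv1d_valid(seq_a, spike_filter)
out_b = conv1d_valid(seq_b, spike_filter)

# The strongest response lands exactly where the pattern sits in each sequence.
print(out_a.index(max(out_a)))  # 2
print(out_b.index(max(out_b)))  # 5
```

The filter learns (or here, is hand-set to) a shape; convolution then finds that shape anywhere in the segment, which is why exact position does not matter.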

 

1D Convolutional Neural Networks work well for:

 

  • Analysis of a time series of sensor data.
  • Analysis of signal data over a fixed-length period, for example, an audio recording.
  • Natural Language Processing (NLP), although Recurrent Neural Networks that leverage Long Short-Term Memory (LSTM) cells are often more promising than CNNs here, since they take into account the proximity of words to create trainable patterns.

What is the Difference Between a 1D CNN and a 2D CNN?

CNNs work the same way whether they have 1, 2, or 3 dimensions. The difference is the structure of the input data and how the filter, also known as a convolution kernel or feature detector, moves across the data.
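The difference in how the filter moves can be sketched with a simple position count. This is a pure-Python illustration (no Keras) assuming 'valid' padding and stride 1; the input sizes are arbitrary examples.

```python
# How many positions a filter visits ('valid' padding, stride 1)
# in 1D vs 2D. The convolution operation itself is the same; only
# the way the window moves across the input differs.

def positions_1d(length, kernel_size):
    # the window slides along a single axis
    return length - kernel_size + 1

def positions_2d(height, width, kh, kw):
    # the window slides along two axes: every (row, column) offset
    return (height - kh + 1) * (width - kw + 1)

# 1D: a sequence of 9 steps, window of size 2 -> 8 positions (one axis)
print(positions_1d(9, 2))        # 8

# 2D: a 5x5 image, 2x2 window -> 16 positions (two axes)
print(positions_2d(5, 5, 2, 2))  # 16
```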

1D Convolutional Neural Network Example

 

 

In this natural language processing (NLP) example, a sentence is made up of 9 words. Each word is represented by a vector (a word embedding). The filter covers at least one word; a height parameter specifies how many words the filter should consider at once. In this example the height is 2, meaning the filter moves 8 times to fully scan the data.
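The example above can be sketched in pure Python. The embedding size of 4 and the all-ones filter weights are arbitrary choices for illustration; what matters is that a filter of height 2 sliding over 9 words produces 8 outputs.

```python
# Sketch of the NLP example: 9 words, each represented by an embedding
# vector, and a filter of height 2 spanning the full embedding width.

embedding_dim = 4  # arbitrary embedding size for illustration
sentence = [[0.1 * (w + d) for d in range(embedding_dim)] for w in range(9)]

filter_height = 2  # how many words the filter considers at once
kernel = [[1.0] * embedding_dim for _ in range(filter_height)]

def conv_over_words(words, kernel):
    k = len(kernel)
    outputs = []
    for i in range(len(words) - k + 1):  # slide down one word at a time
        acc = sum(words[i + r][d] * kernel[r][d]
                  for r in range(k) for d in range(embedding_dim))
        outputs.append(acc)
    return outputs

feature_map = conv_over_words(sentence, kernel)
print(len(feature_map))  # 8 -- the filter moves 8 times over 9 words
```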

 

2D Convolutional Example

 

In a 2D convolutional network, each pixel within the image is represented by its x and y position as well as the depth, representing image channels (red, green, and blue). The filter in this example is 2×2 pixels. It moves over the image both horizontally and vertically.

 

Another difference is that 1D networks allow you to use larger filter sizes. With a 2D convolution layer, a 3 × 3 convolution window contains 3 × 3 = 9 feature vectors, and a 7 × 7 window contains 49, making it a very broad selection. With a 1D convolution layer, a window of size 7 or 9 contains only 7 or 9 feature vectors, so you can easily afford 1D convolution windows of size 7 or 9.

Keras CNN Example with Keras Conv1D

This example is based on the excellent tutorial by Jason Brownlee. It shows how to develop one-dimensional convolutional neural networks for time series classification, using the problem of human activity recognition.

 

  1. Load training and testing datasets
from keras.utils import to_categorical  # one-hot encodes the class labels

# Note: load_dataset_group() is a helper defined earlier in Brownlee's
# tutorial; it loads the input signals and labels for one split.
def load_dataset(prefix=''):
	# load all train
	trainX, trainy = load_dataset_group('train', prefix + 'HARDataset/')
	print(trainX.shape, trainy.shape)
	# load all test
	testX, testy = load_dataset_group('test', prefix + 'HARDataset/')
	print(testX.shape, testy.shape)
	# zero-offset class values so they start at 0
	trainy = trainy - 1
	testy = testy - 1
	# one hot encode y
	trainy = to_categorical(trainy)
	testy = to_categorical(testy)
	print(trainX.shape, trainy.shape, testX.shape, testy.shape)
	return trainX, trainy, testX, testy

 

  2. Fit and evaluate model

Now that we have the data loaded into memory ready for modeling, we can define, fit, and evaluate a 1D CNN model. The Conv1D layers are the first two layers in the model definition below.

from keras.models import Sequential
from keras.layers import Conv1D, MaxPooling1D, Dropout, Flatten, Dense

def evaluate_model(trainX, trainy, testX, testy):
	verbose, epochs, batch_size = 0, 10, 32
	n_timesteps, n_features, n_outputs = trainX.shape[1], trainX.shape[2], trainy.shape[1]
	model = Sequential()
	model.add(Conv1D(filters=64, kernel_size=3, activation='relu', input_shape=(n_timesteps,n_features)))
	model.add(Conv1D(filters=64, kernel_size=3, activation='relu'))
	model.add(Dropout(0.5))
	model.add(MaxPooling1D(pool_size=2))
	model.add(Flatten())
	model.add(Dense(100, activation='relu'))
	model.add(Dense(n_outputs, activation='softmax'))
	model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
	# fit network
	model.fit(trainX, trainy, epochs=epochs, batch_size=batch_size, verbose=verbose)
	# evaluate model
	_, accuracy = model.evaluate(testX, testy, batch_size=batch_size, verbose=0)
	return accuracy
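It can help to trace how the tensor shapes change through this model. The sketch below is pure Python (no Keras needed) and assumes inputs of 128 timesteps × 9 features, the shapes used in the HAR dataset in Brownlee's tutorial.

```python
# Tracing the tensor shapes through the model above,
# assuming HAR inputs of 128 timesteps x 9 features.

def conv1d_out(timesteps, kernel_size):
    # 'valid' padding, stride 1: output shrinks by kernel_size - 1
    return timesteps - kernel_size + 1

def maxpool1d_out(timesteps, pool_size):
    # non-overlapping pooling halves the timesteps for pool_size=2
    return timesteps // pool_size

t = 128                  # n_timesteps
t = conv1d_out(t, 3)     # after Conv1D(64, 3): 126 timesteps x 64 filters
t = conv1d_out(t, 3)     # after Conv1D(64, 3): 124 x 64
t = maxpool1d_out(t, 2)  # after MaxPooling1D(2): 62 x 64
flat = t * 64            # after Flatten(): 3968 values into the Dense layer
print(t, flat)           # 62 3968
```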

 

  3. Summarize results

We will repeat the evaluation of the model multiple times, then summarize the performance of the model across each of those runs. For example, we can call evaluate_model() a total of 10 times. This will result in a population of model evaluation scores that must be summarized.

from numpy import mean, std

def summarize_results(scores):
	print(scores)
	m, s = mean(scores), std(scores)
	print('Accuracy: %.3f%% (+/-%.3f)' % (m, s))
 
# run an experiment
def run_experiment(repeats=10):
	# load data
	trainX, trainy, testX, testy = load_dataset()
	# repeat experiment
	scores = list()
	for r in range(repeats):
		score = evaluate_model(trainX, trainy, testX, testy)
		score = score * 100.0
		print('>#%d: %.3f' % (r+1, score))
		scores.append(score)
	# summarize results
	summarize_results(scores)
 
# run the experiment
run_experiment()

Understanding Keras Conv1D Parameters

A 1D convolution layer creates a convolution kernel that passes over a single spatial (or temporal) dimension to produce a tensor of outputs (see documentation).

Here is the full signature of the Keras Conv1D function:

keras.layers.Conv1D(filters, kernel_size, strides=1, padding='valid', data_format='channels_last', dilation_rate=1, activation=None, use_bias=True, kernel_initializer='glorot_uniform', bias_initializer='zeros', kernel_regularizer=None, bias_regularizer=None, activity_regularizer=None, kernel_constraint=None, bias_constraint=None)

Below we explain each of these parameters, what it does, and some best practices for setting and tuning it. To get more background about tuning neural networks, see our guide on neural network hyperparameters.

Keras Conv1D parameters:

  • filters: Sets the number of output filters used in the convolution operation. (Integer)
  • kernel_size: Specifies the size of the convolutional window. (Integer, or tuple/list of a single integer)
  • strides: Specifies the shift size of the convolution window. (Integer, or tuple/list of a single integer)
  • padding: Takes one of three values: “valid”, “same” or “causal”. Valid means the input is not zero-padded, so the output of the convolution will be smaller than the input. Same means the input is zero-padded, so the convolution output can be the same size as the input. Causal means the output at each timestep does not depend on future inputs, which is useful when modeling temporal data where the model should not violate the temporal order. (String)
  • data_format: Specifies the order of the dimensions in the inputs: channels_last or channels_first. (String, one of “channels_last” (default) or “channels_first”)
  • dilation_rate: Controls the dilation rate for dilated convolution. A dilated convolution is applied to the input with defined gaps: the filter skips input values at the specified rate rather than scanning contiguous values. (Integer, or tuple/list of a single integer)
  • activation: Specifies the name of the activation function to apply after performing the convolution. If the parameter is not specified, no activation is applied. Activation functions can be applied either through the activation argument or by adding a separate Activation layer; see our in-depth guide to neural network activation functions. (String or callable)
  • use_bias: Controls whether a bias vector is added to the convolutional layer. (Boolean)
  • kernel_initializer: A function applied to the kernel weights matrix, used to initialize all values prior to training. (Initializer name or function)
  • bias_initializer: Controls how the bias vector is initialized before training starts. (Initializer name or function)
  • kernel_regularizer, bias_regularizer, activity_regularizer: Control the type and amount of regularization applied to the kernel, the bias, and the layer output, respectively. Regularization is a method which helps avoid overfitting and improves the ability of your model to generalize from training examples to a real population. (Regularizer function)
  • kernel_constraint, bias_constraint: Impose constraints on the kernel weights matrix and the bias vector, respectively. (Constraint function)
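The effect of the padding modes on output length can be sketched as follows. This pure-Python illustration assumes stride 1 and dilation rate 1; the input length of 128 and kernel size of 3 match the example model earlier on this page.

```python
# How the padding modes affect output length (stride 1, dilation 1).

def out_len(length, kernel_size, padding):
    if padding == 'valid':
        # no zero padding: the output shrinks by kernel_size - 1
        return length - kernel_size + 1
    if padding in ('same', 'causal'):
        # zero padded ('causal' pads only on the left, preserving
        # temporal order): the output length is preserved
        return length
    raise ValueError(padding)

print(out_len(128, 3, 'valid'))   # 126
print(out_len(128, 3, 'same'))    # 128
print(out_len(128, 3, 'causal'))  # 128
```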

 

See the Keras documentation for more information on initializers and regularizers.

Running CNN at Scale on Keras with MissingLink

In this article we explained how to create a 1D Convolutional Neural Network in Keras with the Conv1D method. When you start working on Convolutional Neural Networks and running large numbers of experiments, you’ll run into some practical challenges:

Tracking experiments

Tracking experiment progress can be challenging when you run a large number of experiments to tune hyperparameters. You will have to scale up your experiments on multiple machines to perform trial and error on your models in a reasonable period of time.

Running experiments across multiple machines

Convolutional networks can take a long time to run. You will typically run CNNs on GPUs, either on-premises or in the cloud, and executing experiments on multiple machines can be time-consuming and wasteful of resources, due to idle time and inefficient resource allocation.

Managing training datasets

Even with Conv1D networks, which primarily process text, social media, and other sequence data, datasets can be very large. Copying the data to training machines, replacing it for new experiments, and tweaking the dataset to improve results can become a major burden.

MissingLink is a deep learning platform that can help you automate CNN on Keras, so you can concentrate on building winning experiments. Sign up for free to see how easy it is.

Learn More About Deep Learning Frameworks