Keras Conv1D: Working with 1D Convolutional Neural Networks in Keras
Keras provides convenient layers for creating Convolutional Neural Networks (CNNs) of 1, 2, or 3 dimensions: Conv1D, Conv2D and Conv3D. This page explains what a 1D CNN is used for and how to create one in Keras, focusing on the Conv1D layer and its parameters.
In this page you will learn:
- What are 1D Convolutional Neural Networks
- The difference between 1D and 2D CNN
- Keras CNN example with Keras Conv1D
- Understanding Keras Conv1D parameters
What Are 1D Convolutional Neural Networks?
Convolutional Neural Network (CNN) models were developed for image classification. The model accepts a two-dimensional input representing an image’s pixels and color channels, and learns to extract features from it in a process called feature learning.
This same process can be applied to one-dimensional sequences of data. The model extracts features from sequence data and maps the internal features of the sequence. A 1D CNN is very effective for deriving features from a fixed-length segment of the overall dataset, where it is not so important where the feature is located in the segment.
1D Convolutional Neural Networks work well for:
- Analysis of a time series of sensor data.
- Analysis of signal data over a fixed-length period, for example, an audio recording.
- Natural Language Processing (NLP), although Recurrent Neural Networks that use Long Short-Term Memory (LSTM) cells are often considered more promising than CNNs for this task, as they take into account the proximity of words to create trainable patterns.
The Difference Between 1D and 2D CNN
CNNs work the same way whether they have 1, 2, or 3 dimensions. The difference is the structure of the input data and how the filter, also known as a convolution kernel or feature detector, moves across the data.
Consider a natural language processing (NLP) example in which a sentence is made up of 9 words. Each word is represented by a vector. The filter covers at least one word; a height parameter (the kernel size) specifies how many words the filter should consider at once. In this example the height is 2, so with a stride of 1 the filter occupies 8 positions as it scans the whole sentence.
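The count of 8 positions can be checked with a short sketch. The helper name `window_positions` and the sample sentence are made up for illustration; the sketch assumes no padding and treats each word as one step of the sequence.

```python
def window_positions(sequence_length, kernel_size, stride=1):
    """Start index of every position a 1D filter visits when no padding is used."""
    if kernel_size > sequence_length:
        return []
    return list(range(0, sequence_length - kernel_size + 1, stride))


# A hypothetical 9-word sentence; in a real model each word would be a vector.
sentence = "the quick brown fox jumps over the lazy dog".split()
kernel_size = 2

starts = window_positions(len(sentence), kernel_size)
for start in starts:
    # The filter sees kernel_size consecutive words at each position.
    print(start, sentence[start:start + kernel_size])

print("positions:", len(starts))  # 9 - 2 + 1 = 8
```

With a stride of 1, the number of positions is the sequence length minus the kernel size, plus one.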
In a 2D convolutional network, each pixel within the image is represented by its x and y position as well as the depth, representing image channels (red, green, and blue). A 2×2 pixel filter, for example, moves over the image both horizontally and vertically.
Another difference between 1D and 2D networks is that 1D networks allow you to use larger filter sizes. With a 2D convolution layer, a 3 × 3 convolution window contains 3 × 3 = 9 feature vectors, and a filter of size 7 contains 7 × 7 = 49 feature vectors, making it a very broad selection. With a 1D convolution layer, a window of size 3 contains only 3 feature vectors, and a filter of size 7 or 9 contains only 7 or 9. You can thus easily afford 1D convolution windows of size 7 or 9.
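To make the cost difference concrete, the sketch below counts the feature vectors covered by one window and the trainable parameters of one layer. The function names and the choice of 64 input channels and 64 filters are assumptions made for illustration; the parameter formula is the standard count of one weight per window position, input channel and filter, plus one bias per filter.

```python
def conv1d_params(kernel_size, in_channels, filters, use_bias=True):
    """Trainable parameters of a 1D convolution layer."""
    weights = kernel_size * in_channels * filters
    return weights + (filters if use_bias else 0)


def conv2d_params(kernel_size, in_channels, filters, use_bias=True):
    """Trainable parameters of a 2D convolution layer with a square kernel."""
    weights = kernel_size * kernel_size * in_channels * filters
    return weights + (filters if use_bias else 0)


for size in (3, 7, 9):
    print(
        "size", size,
        "| 1D window:", size, "feature vectors",
        "| 2D window:", size * size, "feature vectors",
    )

# Hypothetical layer: 64 input channels, 64 filters, kernel size 7.
print(conv1d_params(7, 64, 64))  # 28736
print(conv2d_params(7, 64, 64))  # 200768
```

For the same nominal kernel size, the 2D layer here has roughly seven times as many weights as the 1D layer.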
Keras CNN Example with Keras Conv1D
This example is based on the excellent tutorial by Jason Brownlee. It shows how to develop one-dimensional convolutional neural networks for time series classification, using the problem of human activity recognition.
1. Load training and testing datasets
from keras.utils import to_categorical

# load_dataset_group() is assumed to be defined as in the referenced tutorial; it is not shown here.
def load_dataset(prefix=''):
    # load all train
    trainX, trainy = load_dataset_group('train', prefix + 'HARDataset/')
    print(trainX.shape, trainy.shape)
    # load all test
    testX, testy = load_dataset_group('test', prefix + 'HARDataset/')
    print(testX.shape, testy.shape)
    # zero-offset class values
    trainy = trainy - 1
    testy = testy - 1
    # one hot encode y
    trainy = to_categorical(trainy)
    testy = to_categorical(testy)
    print(trainX.shape, trainy.shape, testX.shape, testy.shape)
    return trainX, trainy, testX, testy
2. Fit and evaluate model
Now that we have the data loaded into memory ready for modeling, we can define, fit, and evaluate a 1D CNN model. The Conv1D operation appears in the first two layers added to the model.
from keras.models import Sequential
from keras.layers import Conv1D, Dense, Dropout, Flatten, MaxPooling1D

def evaluate_model(trainX, trainy, testX, testy):
    verbose, epochs, batch_size = 0, 10, 32
    # trainX has shape (samples, timesteps, features); trainy is one-hot with shape (samples, classes)
    n_timesteps, n_features, n_outputs = trainX.shape[1], trainX.shape[2], trainy.shape[1]
    model = Sequential()
    # two stacked 1D convolution layers, each with 64 filters and a window of 3 time steps
    model.add(Conv1D(filters=64, kernel_size=3, activation='relu', input_shape=(n_timesteps, n_features)))
    model.add(Conv1D(filters=64, kernel_size=3, activation='relu'))
    model.add(Dropout(0.5))
    model.add(MaxPooling1D(pool_size=2))
    model.add(Flatten())
    model.add(Dense(100, activation='relu'))
    model.add(Dense(n_outputs, activation='softmax'))
    model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
    # fit network
    model.fit(trainX, trainy, epochs=epochs, batch_size=batch_size, verbose=verbose)
    # evaluate model
    _, accuracy = model.evaluate(testX, testy, batch_size=batch_size, verbose=0)
    return accuracy
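It can help to trace how the sequence length changes through the layers of this model. The sketch below does that arithmetic in plain Python, without Keras. The input dimensions (128 time steps, 9 features, 6 classes) are an assumption based on how the human activity recognition dataset is commonly described; they are not stated in this article, and the helper names are made up.

```python
def conv1d_valid(length, kernel_size, stride=1):
    """Output length of a 1D convolution with 'valid' (no) padding."""
    return (length - kernel_size) // stride + 1


def max_pool1d(length, pool_size):
    """Output length of 1D max pooling whose stride equals the pool size."""
    return (length - pool_size) // pool_size + 1


# Assumed input dimensions (not given in the article).
n_timesteps, n_features, n_outputs = 128, 9, 6
filters = 64

shapes = [("input", (n_timesteps, n_features))]
steps = conv1d_valid(n_timesteps, kernel_size=3)
shapes.append(("conv1d_1", (steps, filters)))
steps = conv1d_valid(steps, kernel_size=3)
shapes.append(("conv1d_2", (steps, filters)))
shapes.append(("dropout", (steps, filters)))  # dropout does not change the shape
steps = max_pool1d(steps, pool_size=2)
shapes.append(("max_pooling1d", (steps, filters)))
shapes.append(("flatten", (steps * filters,)))
shapes.append(("dense_1", (100,)))
shapes.append(("dense_2", (n_outputs,)))

for name, shape in shapes:
    print(f"{name:>14}: {shape}")
```

Under these assumptions each convolution shortens the sequence by 2 steps (128 to 126 to 124), pooling halves it to 62, and flattening yields 62 × 64 = 3968 values per sample.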
3. Summarize results
We will repeat the evaluation of the model multiple times, then summarize the performance of the model across each of those runs. For example, we can call evaluate_model() a total of 10 times. This will result in a population of model evaluation scores that must be summarized.
from numpy import mean, std

# summarize scores
def summarize_results(scores):
    print(scores)
    m, s = mean(scores), std(scores)
    print('Accuracy: %.3f%% (+/-%.3f)' % (m, s))

# run an experiment
def run_experiment(repeats=10):
    # load data
    trainX, trainy, testX, testy = load_dataset()
    # repeat experiment
    scores = list()
    for r in range(repeats):
        score = evaluate_model(trainX, trainy, testX, testy)
        score = score * 100.0
        print('>#%d: %.3f' % (r+1, score))
        scores.append(score)
    # summarize results
    summarize_results(scores)

# run the experiment
run_experiment()
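The summary step can be tried without training anything. The sketch below uses made-up scores and the standard library instead of NumPy. It assumes that `mean` and `std` in the listing above are the NumPy functions, whose `std` computes the population standard deviation by default, so `statistics.pstdev` is used as the stand-in.

```python
from statistics import mean, pstdev

# Hypothetical accuracy scores (in percent) from three repeated runs.
scores = [90.0, 91.0, 92.0]

m = mean(scores)
s = pstdev(scores)  # population standard deviation (divides by n, not n - 1)
print('Accuracy: %.3f%% (+/-%.3f)' % (m, s))  # Accuracy: 91.000% (+/-0.816)
```

Reporting the spread alongside the mean shows how much the result varies between runs, which matters because the weights are initialized randomly each time.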
Understanding Keras Conv1D Parameters
A 1D convolution layer creates a convolution kernel that passes over a single spatial (or temporal) dimension to produce a tensor of outputs (see documentation).
Here is the full signature of the Keras Conv1D layer:
keras.layers.Conv1D(filters, kernel_size, strides=1, padding='valid', data_format='channels_last', dilation_rate=1, activation=None, use_bias=True, kernel_initializer='glorot_uniform', bias_initializer='zeros', kernel_regularizer=None, bias_regularizer=None, activity_regularizer=None, kernel_constraint=None, bias_constraint=None)
Below we explain each of these parameters, what it does, and some best practices for setting and tuning it. To get more background about tuning neural networks, see our guide on neural network hyperparameters.
|Keras Conv1D Parameter||What it Does||Data Type|
|filters||Sets the number of output filters used in the convolution operation.||Integer|
|kernel_size||Specifies the size of the convolutional window.||An integer or tuple/list of a single integer|
|strides||Specifies the shift size of the convolution window.||An integer or tuple/list of a single integer|
|padding||Selects one of three padding modes: “valid” applies no padding; “same” pads the input so that the output has the same length as the original input; “causal” pads only the start of the sequence, so that the output at step t does not depend on later input steps. Causal padding is useful when modeling temporal data where the model should not violate the temporal order.||A string, one of “valid”, “causal” or “same”|
|data_format||Specifies the order of the dimensions in the inputs: “channels_last” corresponds to inputs with shape (batch, steps, channels), and “channels_first” to inputs with shape (batch, channels, steps).||A string, one of “channels_last” (default) or “channels_first”|
|dilation_rate||Controls the dilation rate for dilated convolution. Dilated convolution is a convolution applied to the input with defined gaps (the filter skips certain steps of the sequence instead of covering adjacent steps).||An integer or tuple/list of a single integer|
|activation||Specifies the activation function to apply after performing the convolution. If the parameter is not specified, no activation is applied. Activation functions can either be applied through the activation argument or by creating a separate Activation layer. See our in-depth guide to neural network activation functions.||Name of an activation function (string) or a callable|
|kernel_initializer||The initializer applied to the kernel weights matrix. Used to initialize all values prior to training.||Initializer name or instance (default “glorot_uniform”)|
|bias_initializer||Controls how the bias vector is initialized before training starts.||Initializer name or instance (default “zeros”)|
|kernel_regularizer, bias_regularizer, activity_regularizer||These parameters control the type and amount of regularization. Regularization is a method which helps avoid overfitting and improve the ability of your model to generalize from training examples to a real population.||Regularizer function|
|kernel_constraint, bias_constraint||Impose constraints on the kernel matrix and the bias vector, respectively.||Constraint function|
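The way kernel_size, strides, padding and dilation_rate interact is easiest to see through the output length they produce. The sketch below is a plain-Python approximation of that calculation, not a call into Keras; the function name `conv1d_output_length` is made up, and the formula follows the standard definition of a dilated kernel's effective size.

```python
def conv1d_output_length(input_length, kernel_size, padding="valid",
                         stride=1, dilation_rate=1):
    """Number of output steps of a 1D convolution for a given configuration."""
    if padding not in ("valid", "same", "causal"):
        raise ValueError("padding must be 'valid', 'same' or 'causal'")
    # A dilated kernel spans more input steps than it has weights.
    effective_kernel = kernel_size + (kernel_size - 1) * (dilation_rate - 1)
    if padding == "valid":
        length = input_length - effective_kernel + 1
    else:
        # 'same' and 'causal' pad so that every input step yields one output step.
        length = input_length
    # Ceiling division accounts for the stride.
    return (length + stride - 1) // stride


print(conv1d_output_length(128, 3, "valid"))                   # 126
print(conv1d_output_length(128, 3, "same"))                    # 128
print(conv1d_output_length(128, 3, "causal"))                  # 128
print(conv1d_output_length(128, 3, "valid", stride=2))         # 63
print(conv1d_output_length(128, 3, "valid", dilation_rate=2))  # 124
```

"same" and "causal" give the same output length; they differ only in where the padding goes, with "causal" placing all of it before the start of the sequence.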
Running CNN at Scale on Keras with MissingLink
In this article, we explained how to create a 1D Convolutional Neural Network in Keras with the Conv1D method. When you start working on Convolutional Neural Networks and running large numbers of experiments, you’ll run into some practical challenges:
- Tracking experiment progress: this can be challenging when you run a large number of experiments to tune hyperparameters. You will have to scale up your experiments on multiple machines to perform trial and error on your models in a reasonable period of time.
- Scaling experiments across machines: convolutional networks can take a long time to run. You will typically run CNNs on GPUs, either on-premise or in the cloud, and executing the experiment on multiple machines can be time-consuming and waste resources, due to idle time and inefficient resource allocation.
- Managing training data: even for Conv1D networks, which primarily process text, social media data, and other sequence data, datasets can be very large. Copying the data to training machines, replacing it for each new experiment, and tweaking the dataset to improve results can become a major burden.