Working with 1D Convolutional Neural Networks in Keras
Two Quick Tutorials
Keras provides convenient methods for creating Convolutional Neural Networks (CNNs) of 1, 2, or 3 dimensions: Conv1D, Conv2D and Conv3D. This page explains what a 1D CNN is used for, and how to create one in Keras, focusing on the Conv1D function and its parameters. To get you started, we'll provide you with a quick Keras Conv1D tutorial. Because training a 1D CNN is computationally intensive and time-consuming, we will also show how to scale up CNNs with the MissingLink deep learning platform.
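Before diving into the Keras specifics, it may help to see what a single 1D convolution filter actually computes. The sketch below is plain NumPy, not Keras code; the function name is our own, and it illustrates one filter with "valid" padding sliding over a single-channel sequence.

```python
import numpy as np

def conv1d_valid(x, kernel):
    """Minimal single-channel 1D convolution with 'valid' padding,
    illustrating what one Keras Conv1D filter computes at each position."""
    n, k = len(x), len(kernel)
    # One output per position the kernel can fully cover: n - k + 1 positions
    return np.array([np.dot(x[i:i + k], kernel) for i in range(n - k + 1)])

signal = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
kernel = np.array([1.0, 0.0, -1.0])   # simple difference-style filter
out = conv1d_valid(signal, kernel)
print(out)          # [-2. -2. -2.] — each value is x[i] - x[i+2]
print(out.shape)    # (3,) — 5 - 3 + 1 positions
```

A real Conv1D layer learns many such kernels at once and applies them across all input channels, but the sliding dot product is the core operation.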
Convolutional Neural Network (CNN) models were developed for image classification, in which the model accepts a two-dimensional input representing an image’s pixels and color channels, in a process called feature learning.
This same process can be applied to one-dimensional sequences of data. The model extracts features from sequence data and maps the internal features of the sequence. A 1D CNN is very effective for deriving features from a fixed-length segment of the overall dataset, where it is not so important where in the segment the feature is located.
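Those fixed-length segments are typically produced by slicing a long series into windows. The sketch below (plain NumPy, a hypothetical helper of our own) shows the kind of overlapping segmentation that feeds a 1D CNN:

```python
import numpy as np

def to_windows(series, window, step):
    # Split a 1D series into fixed-length, overlapping segments —
    # the kind of input samples a 1D CNN classifies
    return np.array([series[i:i + window]
                     for i in range(0, len(series) - window + 1, step)])

x = np.arange(10)
w = to_windows(x, window=4, step=2)
print(w.shape)   # (4, 4): segments starting at indices 0, 2, 4, 6
print(w[1])      # [2 3 4 5]
```

The human activity recognition dataset used later in this tutorial is pre-segmented in exactly this spirit: fixed-length windows of raw sensor readings.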
1D Convolutional Neural Networks work well for:

Analysis of time series sensor data
Audio signal analysis
Natural language processing on fixed-length text segments
CNNs work the same way whether they have 1, 2, or 3 dimensions. The difference is the structure of the input data and how the filter, also known as a convolution kernel or feature detector, moves across the data.
In this natural language processing (NLP) example, a sentence is made up of 9 words, each represented by a word vector. The filter covers at least one word at a time; a height parameter specifies how many words the filter should consider at once. In this example the height is 2, meaning the filter slides through 8 positions to fully scan the data.
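The figure of 8 positions follows from the standard "valid" sliding-window arithmetic, sketched here as a one-line helper of our own:

```python
def n_positions(seq_len, window):
    # Number of positions a 1D window of the given size can occupy
    # when sliding over a sequence without padding ('valid' mode)
    return seq_len - window + 1

print(n_positions(9, 2))  # 8 — the 9-word sentence scanned by a height-2 filter
```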
In a 2D convolutional network, each pixel within the image is represented by its x and y position as well as the depth, representing image channels (red, green, and blue). The filter in this example is 2×2 pixels. It moves over the images both horizontally and vertically.
Another difference is that 1D networks let you afford larger convolution windows. In a 2D convolution layer, a 3 × 3 window contains 3 × 3 = 9 feature vectors, and a 7 × 7 window contains 49, making it a very broad selection. In a 1D convolution layer, a window of size 7 or 9 contains only 7 or 9 feature vectors, so you can easily afford 1D convolution windows of that size.
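The same asymmetry shows up in parameter counts. The arithmetic below uses the standard weight-count formulas for convolution layers; the choice of 64 input channels and 64 filters is just an illustrative assumption:

```python
def conv1d_params(kernel_size, in_channels, filters):
    # Each filter has kernel_size * in_channels weights, plus one bias
    return kernel_size * in_channels * filters + filters

def conv2d_params(kh, kw, in_channels, filters):
    # Each filter has kh * kw * in_channels weights, plus one bias
    return kh * kw * in_channels * filters + filters

print(conv1d_params(7, 64, 64))     # 28736 weights for a size-7 1D kernel
print(conv2d_params(7, 7, 64, 64))  # 200768 weights for a 7x7 2D kernel
```

A size-7 window is cheap in 1D but roughly seven times more expensive in 2D, which is why 2D CNNs usually stick to 3 × 3 kernels.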
This Keras Conv1D example is based on the excellent tutorial by Jason Brownlee. It shows how to develop one-dimensional convolutional neural networks for time series classification, using the problem of human activity recognition.
1. Load training and testing datasets
# load_dataset_group() reads one split of the UCI HAR dataset;
# it is defined earlier in the full tutorial
from keras.utils import to_categorical

def load_dataset(prefix=''):
    # load all train
    trainX, trainy = load_dataset_group('train', prefix + 'HARDataset/')
    print(trainX.shape, trainy.shape)
    # load all test
    testX, testy = load_dataset_group('test', prefix + 'HARDataset/')
    print(testX.shape, testy.shape)
    # zero-offset class values
    trainy = trainy - 1
    testy = testy - 1
    # one hot encode y
    trainy = to_categorical(trainy)
    testy = to_categorical(testy)
    print(trainX.shape, trainy.shape, testX.shape, testy.shape)
    return trainX, trainy, testX, testy
2. Fit and evaluate model
Now that we have the data loaded into memory ready for modeling, we can define, fit, and evaluate a 1D CNN model. The Conv1D operation is highlighted in the code.
from keras.models import Sequential
from keras.layers import Conv1D, Dense, Dropout, Flatten, MaxPooling1D

def evaluate_model(trainX, trainy, testX, testy):
    verbose, epochs, batch_size = 0, 10, 32
    n_timesteps, n_features, n_outputs = trainX.shape[1], trainX.shape[2], trainy.shape[1]
    model = Sequential()
    model.add(Conv1D(filters=64, kernel_size=3, activation='relu', input_shape=(n_timesteps, n_features)))
    model.add(Conv1D(filters=64, kernel_size=3, activation='relu'))
    model.add(Dropout(0.5))
    model.add(MaxPooling1D(pool_size=2))
    model.add(Flatten())
    model.add(Dense(100, activation='relu'))
    model.add(Dense(n_outputs, activation='softmax'))
    model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
    # fit network
    model.fit(trainX, trainy, epochs=epochs, batch_size=batch_size, verbose=verbose)
    # evaluate model
    _, accuracy = model.evaluate(testX, testy, batch_size=batch_size, verbose=0)
    return accuracy
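As a sanity check on the architecture, here is the shape arithmetic layer by layer. It assumes the UCI HAR dataset's windows of 128 timesteps by 9 features (figures from the dataset, not computed here); dropout does not change shapes.

```python
# Shape of the time dimension as it flows through the model above
n = 128
n = n - 3 + 1   # Conv1D kernel_size=3, 'valid' padding -> 126
n = n - 3 + 1   # second Conv1D -> 124
n = n // 2      # MaxPooling1D pool_size=2 -> 62
flat = n * 64   # Flatten: 62 timesteps x 64 filters
print(n, flat)  # 62 3968
```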
3. Summarize results
We will repeat the evaluation of the model multiple times, then summarize the performance of the model across each of those runs. For example, we can call evaluate_model() a total of 10 times. This will result in a population of model evaluation scores that must be summarized.
from numpy import mean, std

def summarize_results(scores):
    print(scores)
    m, s = mean(scores), std(scores)
    print('Accuracy: %.3f%% (+/-%.3f)' % (m, s))

# run an experiment
def run_experiment(repeats=10):
    # load data
    trainX, trainy, testX, testy = load_dataset()
    # repeat experiment
    scores = list()
    for r in range(repeats):
        score = evaluate_model(trainX, trainy, testX, testy)
        score = score * 100.0
        print('>#%d: %.3f' % (r+1, score))
        scores.append(score)
    # summarize results
    summarize_results(scores)

# run the experiment
run_experiment()
You can find more information on the official Keras documentation page.
A 1D convolution layer creates a convolution kernel that passes over a single spatial (or temporal) dimension to produce a tensor of outputs (see documentation).
Here is the full signature of the Keras Conv1D function:
keras.layers.Conv1D(filters, kernel_size, strides=1, padding='valid',
    data_format='channels_last', dilation_rate=1, activation=None,
    use_bias=True, kernel_initializer='glorot_uniform',
    bias_initializer='zeros', kernel_regularizer=None,
    bias_regularizer=None, activity_regularizer=None,
    kernel_constraint=None, bias_constraint=None)
Below we explain each of these parameters, what it does, and some best practices for setting and tuning it. To get more background about tuning neural networks, see our guide on neural network hyperparameters.
| Keras Conv1D Parameter | What It Does | Data Type |
| --- | --- | --- |
| filters | Sets the number of output filters used in the convolution operation. | Integer |
| kernel_size | Specifies the size of the convolutional window. | Integer or tuple/list of a single integer |
| strides | Specifies the shift size of the convolution window. | Integer or tuple/list of a single integer |
| padding | Takes one of three values. "valid" means the input is not zero-padded, so the convolution output is smaller than the input. "same" means the input is zero-padded so the output is the same length as the input. "causal" means the output at each timestep does not depend on later timesteps, which is useful when modeling temporal data where the model should not violate the temporal order. | String: "valid", "causal" or "same" |
| data_format | Specifies the order of the dimensions in the inputs: channels_last or channels_first. | String: "channels_last" (default) or "channels_first" |
| dilation_rate | Controls the dilation rate for dilated convolution, a convolution applied to the input with defined gaps (the filter skips certain positions rather than scanning every one). | Integer or tuple/list of a single integer |
| activation | Specifies the activation function to apply after performing the convolution; if not specified, no activation is applied. Activations can also be applied through a separate Activation layer. See our in-depth guide to neural network activation functions. | String or callable |
| use_bias | Controls whether a bias vector is added to the convolutional layer. | Boolean |
| kernel_initializer | Initializes the kernel weights matrix before training. | Initializer function or string |
| bias_initializer | Initializes the bias vector before training. | Initializer function or string |
| kernel_regularizer, bias_regularizer, activity_regularizer | Control the type and amount of regularization, a method which helps avoid overfitting and improves the ability of your model to generalize from training examples to a real population. | Regularizer function |
| kernel_constraint, bias_constraint | Impose constraints on the kernel matrix and the bias vector. | Constraint function |
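To make the padding, strides, and dilation_rate parameters concrete, here is a sketch of the standard convolution output-length arithmetic as a plain-Python helper of our own (not a Keras API call):

```python
import math

def conv1d_output_len(n, k, stride=1, padding='valid', dilation=1):
    """Output length of a 1D convolution over n timesteps with kernel size k,
    following the standard arithmetic for each padding mode."""
    k_eff = k + (k - 1) * (dilation - 1)   # effective kernel size with dilation
    if padding == 'valid':
        return math.floor((n - k_eff) / stride) + 1
    # 'same' and 'causal' pad so that length is preserved for stride 1;
    # in general they give ceil(n / stride)
    return math.ceil(n / stride)

print(conv1d_output_len(128, 3))                    # 126 with 'valid'
print(conv1d_output_len(128, 3, padding='same'))    # 128
print(conv1d_output_len(128, 3, padding='causal'))  # 128
print(conv1d_output_len(128, 3, dilation=2))        # 124: effective kernel size 5
```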
See the Keras documentation for more information on initializers and regularizers.
In this article, we explained how to create a 1D Convolutional Neural Network in Keras with the Conv1D method. When you start working on Convolutional Neural Networks and running large numbers of experiments, you’ll run into some practical challenges:
Tracking experiment progress can be challenging when you run a large number of experiments to tune hyperparameters. You will have to scale up your experiments on multiple machines to perform trial and error on your models in a reasonable period of time.
Convolutional networks can take a long time to run. You will typically run CNNs on GPUs, either on-premise or in the cloud, and executing the experiment on multiple machines can be time-consuming and waste resources, due to idle time and inefficient resource allocation.
Even with Conv1D networks, which primarily process text, social media posts, and other sequence data, datasets can be very large. Copying the data to training machines, replacing it for new experiments, and tweaking the dataset to improve results can become a major burden.
MissingLink is a deep learning platform that does all of this for you and lets you concentrate on building the most accurate model. Learn more to see how easy it is.
The most comprehensive platform to manage experiments, data and resources more frequently, at scale and with greater confidence.
Request your personal demo to start training models faster