
TensorFlow Image Classification: Three Quick Tutorials

TensorFlow can help you build neural network models to classify images. Commonly, these will be Convolutional Neural Networks (CNN). TensorFlow is a powerful framework that lets you define, customize and tune many types of CNN architectures. MissingLink’s deep learning platform provides an additional layer for tracking and managing TensorFlow projects.

 

Following is a typical process to perform TensorFlow image classification:

 

  1. Pre-process data to generate the input of the neural network – to learn more see our guide on Using Neural Networks for Image Recognition.
  2. Reshape input if necessary using tf.reshape() to match the convolutional layer you intend to build (for example, tf.nn.conv2d() expects a 4-D input of shape [batch, height, width, channels])
  3. Create a convolutional layer using tf.nn.conv1d(), tf.nn.conv2d(), or tf.nn.conv3d(), depending on the dimensionality of the input.
  4. Create a pooling layer using tf.nn.max_pool()
  5. Repeat steps 3 and 4 for additional convolution and pooling layers
  6. Reshape the output of the convolution and pooling layers, flattening it to prepare for the fully connected layer
  7. Create a fully connected layer using the tf.matmul() function, add an activation using, for example, tf.nn.relu() (see all TensorFlow activations, or learn more in our guide to Neural Network Activation Functions), and apply a dropout using tf.nn.dropout() (learn more about dropout in our guide to neural network hyperparameters)
  8. Create a final layer for class prediction, again using tf.matmul()
  9. Store weights and biases using TensorFlow variables
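
To make these steps concrete, here is a minimal sketch in the TF 1.x style used throughout this article, assuming a 28×28 grayscale input and 10 output classes; the shapes and filter sizes are illustrative, not prescriptive:

import tensorflow as tf

# Input: flat 784-pixel vectors, reshaped to [batch, height, width, channels] (step 2)
x = tf.placeholder(tf.float32, [None, 784])
x_image = tf.reshape(x, [-1, 28, 28, 1])

# Convolutional layer (step 3); weights and biases are TensorFlow variables (step 9)
W_conv = tf.Variable(tf.truncated_normal([5, 5, 1, 32], stddev=0.1))
b_conv = tf.Variable(tf.constant(0.1, shape=[32]))
h_conv = tf.nn.relu(tf.nn.conv2d(x_image, W_conv, strides=[1, 1, 1, 1], padding='SAME') + b_conv)

# Pooling layer (step 4)
h_pool = tf.nn.max_pool(h_conv, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME')

# Flatten (step 6), then a fully connected layer with ReLU and dropout (step 7)
h_flat = tf.reshape(h_pool, [-1, 14 * 14 * 32])
W_fc = tf.Variable(tf.truncated_normal([14 * 14 * 32, 1024], stddev=0.1))
b_fc = tf.Variable(tf.constant(0.1, shape=[1024]))
h_fc = tf.nn.relu(tf.matmul(h_flat, W_fc) + b_fc)
keep_prob = tf.placeholder(tf.float32)
h_drop = tf.nn.dropout(h_fc, keep_prob)

# Final layer for class prediction (step 8)
W_out = tf.Variable(tf.truncated_normal([1024, 10], stddev=0.1))
b_out = tf.Variable(tf.constant(0.1, shape=[10]))
logits = tf.matmul(h_drop, W_out) + b_out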

 

These are just the basic steps to create the CNN model. There are additional steps to define training and evaluation, execute the model and tune it – see our full guide to TensorFlow CNN.

 

Just below, we provide three quick tutorials that can help you get hands-on with TensorFlow image classification.

Scaling Up Image Classification on TensorFlow with MissingLink

If you’re working on image classification, you probably have a large dataset and need to run your experiments on several machines. This can become challenging, and you might find yourself spending serious time setting up machines, copying data and troubleshooting.

 

MissingLink is a deep learning platform that lets you effortlessly scale TensorFlow image classification models across many machines, either on-premise or in the cloud. It also helps you manage large data sets, manage multiple experiments, and view hyperparameters and metrics across your entire team on one pane of glass.

 

Learn more and see how easy it is.


Quick tutorial #1: TensorFlow Image Classification with Transfer Learning

Modern image recognition models use millions of parameters. Training them from scratch demands large amounts of labeled training data and hundreds of GPU-hours or more of computing power. Transfer learning provides a shortcut: you take a piece of a model that has already been trained on a similar task and reuse it in a new model.

Here, we will reuse the feature extraction abilities of image classifiers trained on ImageNet, and train an additional classification layer on top of the image feature extraction module.

 

Prerequisites: Install tensorflow-hub, and a recent version of TensorFlow.
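
For orientation, here is a minimal sketch of how a TF-Hub image feature extraction module is consumed under the TF 1.x API; the module URL and placeholder are illustrative assumptions, since the retrain.py script below handles this wiring for you:

import tensorflow as tf
import tensorflow_hub as hub

# Illustrative choice: an ImageNet feature-vector module from TF Hub
module = hub.Module("https://tfhub.dev/google/imagenet/inception_v3/feature_vector/1")
height, width = hub.get_expected_image_size(module)

# A float32 batch of images with color values in [0, 1]
images = tf.placeholder(tf.float32, [None, height, width, 3])
features = module(images)  # feature vectors to feed a new classification layer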

 

The following steps are summarized; see the full tutorial on TensorFlow Hub.

1. Training the transferred model on our images

TensorFlow provides an example archive of flower photos you can use to get started. To access these photos, run:

cd ~
curl -LO http://download.tensorflow.org/example_images/flower_photos.tgz
tar xzf flower_photos.tgz

 

Then download the following code from GitHub:

mkdir ~/example_code
cd ~/example_code
curl -LO https://github.com/tensorflow/hub/raw/master/examples/image_retraining/retrain.py

 

For the most basic cases the retrainer can be run as follows:

python retrain.py --image_dir ~/flower_photos

 

This script loads the pre-trained module and trains a new classifier on top of it for the flower photos. The flower types were not in the initial ImageNet classes the network was trained on.

 

2. Bottlenecks

The initial phase analyzes the images on disk, then calculates and caches their bottleneck values. ‘Bottleneck’ refers to the layer just before the final output layer. The final retraining succeeds on new classes because the type of information required to distinguish between all 1,000 classes in ImageNet is also useful when distinguishing between new types of objects.

Every image is reused many times during training, so caching these bottleneck values on disk saves significant time. By default, they are kept in the /tmp/bottleneck directory.

 

3. Training

Training the top layer of the network starts after the bottlenecks are complete. You will see step outputs, training accuracy, validation accuracy, and cross entropy values.

 

This script will run 4,000 training steps. Each step selects ten images randomly from the training set, identifies their bottlenecks from the cache, and directs them into the final layer to generate predictions. Predictions are compared to the actual labels to update the weights of the final layer via the back-propagation process (see our in-depth guide on backpropagation).

 

Accuracy improves as the process evolves. After all the steps are complete, a final test accuracy evaluation is conducted on a separate series of images.

 

4. Using the Retrained Model

The script will write the model trained on your categories to:

/tmp/output_graph.pb

 

And a text file with the labels to:

 

/tmp/output_labels.txt

 

The model includes the TF-Hub module inlined into it and the classification layer. The two files are in a format that the C++ and Python image classification example can read.

 

Since you replaced the top layer, you need to specify the new output layer name in the script, for example with the flag --output_layer=final_result if you’re using label_image.

 

Here’s an example of how to run the label_image example with the retrained model. TensorFlow Hub modules accept inputs with color values in the range [0,1], so there is no need to set --input_mean or --input_std flags.

curl -LO https://github.com/tensorflow/tensorflow/raw/master/tensorflow/examples/label_image/label_image.py
python label_image.py \
--graph=/tmp/output_graph.pb --labels=/tmp/output_labels.txt \
--input_layer=Placeholder \
--output_layer=final_result \
--image=$HOME/flower_photos/daisy/21652746_cc379e0eea_m.jpg

 

You should see flower labels listed, typically with a daisy on top. You can substitute the --image parameter with your own images.

 

5. Training on your own categories

Once the script works successfully on the flower example images, you can teach your network to recognize other categories.
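
For example, assuming your own photos are organized into one subfolder per category (the subfolder names become the class labels; ~/my_categories is a hypothetical path), you can point the same script at that directory:

python retrain.py --image_dir ~/my_categories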


Quick Tutorial #2: Classifying Dog Images with ResNet-50

ResNet is an ultra-deep CNN architecture that can scale to hundreds or even thousands of convolutional layers. ResNet-50 is a 50-layer variant, with each layer processing successively smaller features of the source images.

 

By the end of this tutorial, you will have created code that accepts an input image and returns an estimate of the dog’s breed. If a human face is identified, the algorithm will estimate the dog breed that most resembles the face.

 

The following steps are summarized, see the full tutorial by Hamza Bendemra.

 

1. Setting up the building blocks for the algorithm

 

To create our algorithm, we will use TensorFlow, the OpenCV computer vision library and Keras, a front-end API for TensorFlow.

 

2. Detecting if an image contains a human face

To detect whether an image contains a human face, we will use an OpenCV face detection algorithm. First, convert the images to grayscale. The detectMultiScale function then executes the classifier stored in face_cascade, taking the grayscale image as a parameter.

 

import cv2
import matplotlib.pyplot as plt
%matplotlib inline

 

The following lines of code load a pre-trained face detector and return True if a face is identified.

 

face_cascade = cv2.CascadeClassifier('haarcascades/haarcascade_frontalface_alt.xml')

def face_detector(img_path):
    img = cv2.imread(img_path)
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    faces = face_cascade.detectMultiScale(gray)
    return len(faces) > 0

 

3. Detecting if an image contains a dog

 

To see if the image contains a dog, we will use a ResNet-50 model pre-trained on the ImageNet dataset. This pre-trained model provides a prediction for the object in the image.

from keras.applications.resnet50 import ResNet50
from keras.preprocessing import image
import numpy as np
from tqdm import tqdm

ResNet50_model = ResNet50(weights='imagenet')

def path_to_tensor(img_path):
    # Load the image and resize it to the 224x224 input ResNet-50 expects
    img = image.load_img(img_path, target_size=(224, 224))
    # Convert to a 3D array and add a batch dimension -> shape (1, 224, 224, 3)
    x = image.img_to_array(img)
    return np.expand_dims(x, axis=0)

 

For the final prediction, we take the argmax of the predicted probability vector to obtain an integer corresponding to the model’s predicted object class, which we can map to an object category via the ImageNet labels dictionary.
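
As a sketch of how this check can be written (assuming the standard ImageNet label ordering, in which indices 151–268 correspond to dog breeds, and Keras’s preprocess_input helper for ResNet-50):

from keras.applications.resnet50 import preprocess_input

def ResNet50_predict_labels(img_path):
    # Return the ImageNet class index predicted for this image
    img = preprocess_input(path_to_tensor(img_path))
    return np.argmax(ResNet50_model.predict(img))

def dog_detector(img_path):
    # Indices 151-268 in the ImageNet labels dictionary are dog breeds
    prediction = ResNet50_predict_labels(img_path)
    return 151 <= prediction <= 268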

 

4. Build your CNN classifier with transfer learning

To minimize training time and retain accuracy, we will be training a CNN using transfer learning. By retaining the early layers and training newly added layers, we can use the knowledge acquired by the pre-trained algorithm. Keras has several pre-trained deep learning models used for prediction, fine-tuning and feature extraction.

 

5. Model Architecture

We will create our model architecture so that the last convolutional output of ResNet-50 becomes the input to our model. Add a Global Average Pooling layer, followed by a fully connected layer with one node for each dog category and a softmax activation function.
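
A minimal sketch of this architecture in Keras, assuming the bottleneck features are stored in train_DogResnet50 and, as an illustrative count, 133 dog categories:

from keras.models import Sequential
from keras.layers import GlobalAveragePooling2D, Dense

Resnet50_model = Sequential()
# Pool the last convolutional output of ResNet-50 down to one feature vector
Resnet50_model.add(GlobalAveragePooling2D(input_shape=train_DogResnet50.shape[1:]))
# One node per dog category, with a softmax activation (133 is illustrative)
Resnet50_model.add(Dense(133, activation='softmax'))
Resnet50_model.summary()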

 

6. Compile and test the model

Compile the model, train it over 20 epochs, and then test how accurately it identifies breeds in our test dataset.

 

from keras.callbacks import ModelCheckpoint

Resnet50_model.compile(loss='categorical_crossentropy', optimizer='rmsprop', metrics=['accuracy'])
# Save the weights that achieve the best validation loss
checkpointer = ModelCheckpoint(filepath='saved_models/weights.best.ResNet50.hdf5',
                               verbose=1, save_best_only=True)
Resnet50_model.fit(train_DogResnet50, train_targets,
                   validation_data=(valid_DogResnet50, valid_targets),
                   epochs=20, batch_size=20, callbacks=[checkpointer])

 

Next, load the model weights that achieved the best validation loss, and calculate the classification accuracy on the test data.

 

Resnet50_model.load_weights('saved_models/weights.best.ResNet50.hdf5')
Resnet50_predictions = [np.argmax(Resnet50_model.predict(np.expand_dims(feature, axis=0))) for feature in test_DogResnet50]
test_accuracy = 100*np.sum(np.array(Resnet50_predictions)==np.argmax(test_targets, axis=1))/len(Resnet50_predictions)
print('Test accuracy: %.4f%%' % test_accuracy)

 

7. Predict dog breed with the model

Having developed the algorithm, we can write a function that takes an image path as input and outputs the dog breed predicted by our model.

 

from extract_bottleneck_features import *

def dog_breed(img_path):
    # Extract the ResNet-50 bottleneck features for the image
    bottleneck_feature = extract_Resnet50(path_to_tensor(img_path))
    # Predict breed probabilities and return the most likely breed name
    predicted_vector = Resnet50_model.predict(bottleneck_feature)
    return dog_names[np.argmax(predicted_vector)]

 

8. Testing our CNN Classifier

 

Write a function that determines whether the image contains a dog, human or neither.

 

If a dog is detected, provide the predicted breed. If a human is detected, provide the resembling dog breed. If neither is detected, provide an error message.

 

def dog_breed_predictor(img_path):
    breed = dog_breed(img_path)
    # Display the image being classified
    img = cv2.imread(img_path)
    cv_rgb = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
    plt.imshow(cv_rgb)
    plt.show()
    if dog_detector(img_path):
        print("This is a dog and its breed is: " + str(breed))
    elif face_detector(img_path):
        print("This is human but it looks like a: " + str(breed))
    else:
        print("I don't know what this is.")

 


Quick Tutorial #3: Classifying Flower Images with Google Inception

This tutorial shows how to classify a database of about 7,000 flower images using Google Inception. Inception is an image classifier which Google built and open-sourced. It was trained on a staggering 1.2 million images from a thousand different categories, over the course of two weeks, on some of the fastest machines in the world. Inception’s architecture is shown below.

Google Inception Image Classifier Architecture

Image Source: Google Cloud Platform

 

The following tutorial steps are summarized, see the full tutorial by Amitabha Dey.

 

1. Download training images and scripts

 

Begin by downloading the training images for your classifier. These will consist of the images that you want your classifier to recognize. Keep them in separate labeled folders, as the folder names are used as the labels for the photos they hold.

For this example, download images of 5 kinds of flowers, about 7,000 images in total.

 

Clone the project’s GitHub repository. Copy the flower_photos folder with your training images into the tf_files folder of the repository.

 

2. Retrain the network

Retrain the final layer of our network. The following flag sets the directory that retains the cache of all the bottleneck values:

 

--bottleneck_dir=tf_files/bottlenecks

 

The following flags point to the model directory and the locations for the training summaries and outputs:

 

--model_dir=tf_files/models/"${ARCHITECTURE}" \
--summaries_dir=tf_files/training_summaries/"${ARCHITECTURE}" \
--output_graph=tf_files/retrained_graph.pb \
--output_labels=tf_files/retrained_labels.txt \

 

Lastly, add the directory of our training images:

 

--image_dir=tf_files/flower_photos
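
Putting these together, the full retraining command looks roughly like this (a sketch assuming the repository’s scripts/retrain.py layout and an ${ARCHITECTURE} variable such as mobilenet_0.50_224):

IMAGE_SIZE=224
ARCHITECTURE="mobilenet_0.50_${IMAGE_SIZE}"

python -m scripts.retrain \
  --bottleneck_dir=tf_files/bottlenecks \
  --model_dir=tf_files/models/"${ARCHITECTURE}" \
  --summaries_dir=tf_files/training_summaries/"${ARCHITECTURE}" \
  --output_graph=tf_files/retrained_graph.pb \
  --output_labels=tf_files/retrained_labels.txt \
  --architecture="${ARCHITECTURE}" \
  --image_dir=tf_files/flower_photos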

 

3. Classify images

Having trained your classifier, you can now test it. Download a new image or select one from the training images, then call our label_image script to classify it:

 

python -m scripts.label_image \
--image=tf_files/flower_photos/daisy/test_image.jpg

 

You will get a readout of all the categories with their confidence scores; here, the output shows that test_image is a daisy with ~99% confidence.


TensorFlow Image Classification in the Real World

In this article, we explained the basics of image classification with TensorFlow and provided three tutorials from the community, which show how to perform classification with transfer learning, ResNet-50 and Google Inception. When you start working on real-life CNN projects to classify large image datasets, you’ll run into some practical challenges:

  • Tracking experiments – Tracking experiment source code, configuration, and hyperparameters. There are many CNN architectures; you’ll need to discover which one suits your needs and fine-tune it for your specific dataset. You’ll probably run hundreds or thousands of experiments to discover the right hyperparameters. Organizing, tracking and sharing data for all those experiments is difficult.

  • Scaling up your experiments – Image classification models are computationally intensive, and you’ll need to scale experiments across multiple machines and GPUs. Provisioning those machines, whether you have to install on-premise machines or set up machine instances in the cloud, and ensuring the right experiments run on each machine, takes serious time.

  • Managing training data – Image and video classification projects typically involve large and sometimes huge datasets. Copying these datasets to each training machine, then re-copying them when you change the project or fine-tune the training examples, is time-consuming and error-prone.

 

MissingLink is a deep learning platform that does all of this for you, and lets you concentrate on building the most accurate model. Learn more to see how easy it is.

Train Deep Learning Models 20X Faster

Let us show you how you can:

  • Run experiments across hundreds of machines
  • Easily collaborate with your team on experiments
  • Reproduce experiments with one click
  • Save time and immediately understand what works and what doesn’t

MissingLink is the most comprehensive deep learning platform to manage experiments, data, and resources more frequently, at scale and with greater confidence.

Request your personal demo to start training models faster
