
Transfer Learning: An Overview

As we strive to create machines that can handle complex tasks, deep learning is increasingly coming into the spotlight. Because many of the tasks we want our machines to perform are intuitive for humans and animals, we are developing processes for machines that mirror the way humans learn. In this article, we will explore one such process, known as transfer learning.

What Is Transfer Learning?


Transfer learning is a machine learning approach in which a model created and trained for one task is reused as the starting point for a second, related task. It differs from traditional machine learning, where every model is trained from scratch, in that the pre-trained model serves as a springboard for the new task.

This approach mimics the way humans apply knowledge learned for one task to a new task. For example, John Doe learned how to read in first grade. In tenth grade, John used his reading abilities in chemistry class. The knowledge he had gained from the primary task (learning how to read) became the foundation on which he started the secondary task (learning chemistry).

The Benefits of Using Transfer Learning and Deep Learning

Taking a model that has already been trained in a specific field, and reapplying it to another area, has many advantages. Some of the main advantages are listed below.

  1. Less training data—training a model from scratch requires a great deal of work and data. For example, if we want to create a new algorithm that can detect a frown, our model would first need to learn how to detect faces, and only then could it learn to detect expressions such as frowns. Instead, if we take a model that has already learned how to detect faces and retrain it to detect frowns, we can accomplish the same result with far less data.
  2. Models generalize better—transfer learning prepares a model to perform well on data it was not trained on, which is known as generalizing. Models trained with transfer learning generalize better from one task to another because they learn to identify features that apply in new contexts.
  3. Makes deep learning more accessible—transfer learning makes deep learning easier to use. By taking a model built by a deep learning specialist and applying it to a new problem, you can obtain the desired results without being an expert yourself.
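The "frown detection" example above can be sketched in a few lines. The snippet below is a minimal toy illustration, not a real face model: a hypothetical "pretrained" layer is kept frozen as a feature extractor, and only a small new head is trained on the second task. All weights, data, and labels are invented stand-ins.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical "pretrained" layer (e.g., the early layers of a face model).
# It stays frozen: the training loop below never updates it.
W_base = rng.normal(size=(4, 8))

def extract_features(x):
    return np.maximum(x @ W_base, 0.0)  # frozen linear layer + ReLU

# New task ("frown detection"): train only a small logistic-regression head.
X = rng.normal(size=(64, 4))
y = (X[:, 0] + X[:, 1] > 0).astype(float)  # toy stand-in labels

w_head = np.zeros(8)
for _ in range(500):
    feats = extract_features(X)
    p = 1.0 / (1.0 + np.exp(-feats @ w_head))
    grad = feats.T @ (p - y) / len(y)
    w_head -= 0.3 * grad                 # only the head's weights change

preds = 1.0 / (1.0 + np.exp(-extract_features(X) @ w_head)) > 0.5
accuracy = (preds == y.astype(bool)).mean()
```

Because the frozen base already produces useful features, the head alone reaches good training accuracy on very little data, which is the core payoff of the approach.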

What Are the Types of Transfer Learning?

Domain adaptation

In this approach, the dataset on which the model was trained differs from (but is still related to) the target dataset. A good example is a spam email filtering model. Let's say this model was trained to identify spam email for user A. When the model is then used for user B, domain adaptation is needed: even though the task is the same (filtering emails), user B receives different types of emails than user A does.
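The spam-filter scenario can be sketched with a tiny perceptron over bag-of-words features. The point of the sketch is the warm start: user B's filter begins from user A's learned weights and adapts with only a couple of examples, instead of training from scratch. All emails, tokens, and labels here are invented for illustration.

```python
from collections import defaultdict

def train(emails, labels, weights=None, epochs=10):
    """Simple perceptron over whitespace tokens; warm-starts from `weights`."""
    w = weights if weights is not None else defaultdict(float)
    for _ in range(epochs):
        for text, label in zip(emails, labels):
            score = sum(w[tok] for tok in text.split())
            pred = 1 if score > 0 else 0
            if pred != label:                  # perceptron update on mistakes
                for tok in text.split():
                    w[tok] += 1 if label == 1 else -1
    return w

# Source domain: user A's labeled mail (1 = spam, 0 = ham).
user_a_mail = ["win free prize now", "meeting notes attached",
               "free credit card offer", "lunch tomorrow"]
user_a_spam = [1, 0, 1, 0]
w = train(user_a_mail, user_a_spam)

# Domain adaptation: start from user A's weights, adapt with little B data.
user_b_mail = ["gratis credit card offer", "project update"]
user_b_spam = [1, 0]
w = train(user_b_mail, user_b_spam, weights=w, epochs=5)
```

After adaptation the filter keeps what it learned from user A ("free prize" still scores as spam) while picking up user B's vocabulary from just two emails.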

Multitask learning

This method involves two or more tasks being resolved simultaneously so that similarities and differences can be leveraged. It is based on the idea that a model that has been trained on a related task can gain skills that improve its ability in the new task.

Going back to our spam email filtering model, let's say this model is learning what features to look for when identifying spam for user A and user B. Because the users are very different, the model needs to look for different features to identify each user's spam. For example, user A is an Italian speaker, so an email in Italian is not a red flag; user B is a Chinese speaker, so an email in Italian might be considered a spam feature. While simultaneously learning to identify spam features for users A and B, the model learns that regardless of the language, emails requesting credit card details are more likely to be spam.
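This "shared signal plus per-user signal" idea can be sketched as hard parameter sharing: one weight table updated by both tasks, plus a small task-specific table per user. The data and weights below are invented; the point is that the credit-card pattern ends up in the shared table, while the language cue diverges per user.

```python
from collections import defaultdict

shared = defaultdict(float)                      # features useful for every user
per_user = {"A": defaultdict(float), "B": defaultdict(float)}

def predict(user, tokens):
    return sum(shared[t] + per_user[user][t] for t in tokens)

def update(user, text, label):
    tokens = text.split()
    pred = 1 if predict(user, tokens) > 0 else 0
    if pred != label:                            # perceptron-style update
        delta = 1.0 if label == 1 else -1.0
        for t in tokens:
            shared[t] += 0.5 * delta             # both tasks shape the shared part
            per_user[user][t] += 0.5 * delta     # only this user's head changes

data = [  # (user, email, is_spam) — Italian is normal for A, spam-like for B
    ("A", "ciao fattura allegata", 0),
    ("B", "ciao offerta speciale", 1),
    ("A", "send credit card details", 1),
    ("B", "send credit card details", 1),
]
for _ in range(10):
    for user, text, label in data:
        update(user, text, label)
```

After training, "credit" and "card" carry positive shared weight (both users flagged them), while "ciao" is negative in user A's head and positive in user B's, which is exactly the division of labor multitask learning aims for.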

Zero-shot learning

This technique involves a model trying to solve a task to which it was not exposed during training. For example, let's say we are training a model to identify animals in pictures. To identify the animals, the machine is taught to detect two attributes: the color yellow and spots. The model is then trained on multiple pictures of chicks, which it learns to identify because they are yellow but do not have spots, and dalmatians, which it learns to identify because they have spots but are not yellow.

To expand on this example, you may not have pictures of giraffes on which to train the model, but the model knows that giraffes are yellow and have spots. When the model encounters an image of a giraffe, it will be able to identify it, even though it has never seen a giraffe during training.
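The giraffe example is a form of attribute-based zero-shot classification, which can be sketched in a few lines. Here `detect_attributes` is a stand-in for the trained attribute detectors (a real model would predict the scores from pixels), and the class descriptions are the only thing we know about the unseen "giraffe" class.

```python
# Each class is described by the attributes the model was trained to detect.
class_attributes = {
    "chick":     {"yellow": 1, "spots": 0},
    "dalmatian": {"yellow": 0, "spots": 1},
    "giraffe":   {"yellow": 1, "spots": 1},  # description only, no training images
}

def detect_attributes(image):
    # Stand-in for the trained attribute detectors; here the "image" is
    # already an attribute-score dict for simplicity.
    return image

def classify(image):
    detected = detect_attributes(image)
    # Pick the class whose attribute description best matches the detection.
    return min(class_attributes,
               key=lambda c: sum((class_attributes[c][a] - detected[a]) ** 2
                                 for a in detected))

print(classify({"yellow": 1, "spots": 1}))
```

Because classification goes through the attribute description rather than through class-specific training examples, a class the model has never seen is still reachable.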

One-shot learning

This approach requires a model to learn how to categorize an object after being exposed to it only once or a few times. To do this, the model leverages information it has about known categories. For example, our animal classifying model knows how to identify a horse. The model is then exposed to a single photo of a zebra, which looks much like a horse but has black and white stripes. The model will then be able to classify zebras, without being exposed to additional pictures, because it transferred knowledge it already had about horses.
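One common way to implement this is nearest-neighbor classification in a learned feature space: a single labeled zebra example is stored, and new images are assigned to whichever stored example they are closest to. The two-dimensional "embeddings" below (body shape, stripes) are invented stand-ins for the features a pretrained model would produce.

```python
import math

# One labeled example per class in a hypothetical pretrained feature space.
support = {
    "horse": [(0.9, 0.1)],
    "zebra": [(0.9, 0.95)],  # a single zebra example is enough
}

def classify(embedding):
    # Assign the class of the nearest stored example.
    return min(support,
               key=lambda c: min(math.dist(embedding, e) for e in support[c]))

print(classify((0.85, 0.9)))
```

The heavy lifting happens in the pretrained features: because horse-like shape is already encoded, one zebra example suffices to carve out the new class.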

Applications of Deep Transfer Learning

Transfer learning has been used extensively in deep learning. Some of its main applications are natural language processing (NLP), computer vision, and speech recognition.

In NLP, transfer learning facilitates document classification and other tasks related to textual data. This is particularly challenging because, unlike image data, text data is highly diverse and unstructured.

Deep transfer learning has been used in computer vision tasks such as image classification. Much like in the human brain, the first layers of the network detect edges and shapes, while the later layers pick up finer details.

Speech recognition is another application of deep transfer learning. Models that were developed for English speech recognition have been used to improve models developed for German speech recognition.

Deep Transfer Learning with MissingLink

In this article, we explored transfer learning, how it works, and some of its applications. As with related deep learning technologies, deep transfer learning has a number of exciting real-life uses, ranging from image classification to speech recognition. However, setting up your deep learning model can be tricky, especially if you go it alone.

This is where the MissingLink Deep Learning Platform comes in. MissingLink offers the following capabilities, to help you build, train and manage your deep learning projects:

  • Experiment management—you can run and track all of your experiments with ease using MissingLink. The dashboard shows you all the relevant information, from the code to the experiment parameters.
  • Data management—your data is one of your biggest assets and protecting it is important. With MissingLink, you will be in full control of your data. Your data is all stored either in your cloud or on your servers.
  • Resource management—MissingLink’s platform allows you to conveniently scale your deep learning resources, so you can effortlessly scale your training.

You can use MissingLink’s platform to train your models applying deep transfer learning. Learn more and schedule a demo with us.

Train Deep Learning Models 20X Faster

Let us show you how you can:

  • Run experiments across hundreds of machines
  • Easily collaborate with your team on experiments
  • Reproduce experiments with one click
  • Save time and immediately understand what works and what doesn’t

MissingLink is the most comprehensive deep learning platform to manage experiments, data, and resources more frequently, at scale and with greater confidence.

Request your personal demo to start training models faster
