Integration with PyTorch (With Epochs and Batches)

This topic shows you how to integrate the MissingLink SDK with a PyTorch multilayer perceptron neural network that is trained on the MNIST dataset.

The example shows how to work with epochs and batches in nested loops, using experiment.epoch_loop in conjunction with experiment.batch_loop.

The following steps are covered:

  • Define a project callback with your credentials.
  • Create a new experiment.
  • Define an experiment context.
  • Change the loop.
  • Define a validation context.
  • Define a testing context.

Note

You can also consider trying the step-by-step tutorial for integrating the MissingLink SDK with an existing PyTorch example.

Preparation

  • You must have PyTorch installed in the same working environment in which the MissingLink SDK is installed. The SDK doesn't enforce PyTorch as one of its dependencies.

  • You must have created a new project and have its credentials (owner_id and project_token) ready. Otherwise, follow the instructions in Creating a project.

Note

Ensure that you can successfully run the basic mnist.py training script. In the steps that follow, the basic script is integrated with the MissingLink SDK to enable remote monitoring of the training, validation, and testing process.

Compare the basic script with the integrated script.

Write code

  1. Import the SDK and define your credentials at the beginning of the file (before any function definition).

    import missinglink
    
    OWNER_ID = 'Your owner id'
    PROJECT_TOKEN = 'Your project token'
    
  2. Now create a PyTorchProject instance with your credentials, which enables you to monitor the experiment in real time. In the run_training function, before the training loop, add the following statement.

    missinglink_project = missinglink.PyTorchProject(OWNER_ID, PROJECT_TOKEN)
    
  3. Create a new experiment as the outermost context, wrapping the training loop. You can provide the experiment with a name and description. Add the following statement right before the training loop.

    with missinglink_project.create_experiment(
        model,
        metrics={'loss': loss},
        display_name='MNIST multilayer perceptron',
        description='Two fully connected hidden layers') as experiment:
    

    Parameter descriptions

    • model: Reference to the model object
    • metrics: Dictionary of all the metrics that will be tracked during the experiment
    • display_name (optional): Experiment name
    • description (optional): Experiment description
  4. The metrics you gave the experiment now have a wrapped version in experiment.metrics. In order for the metrics to be monitored, you need to call the wrapped version instead of the original. After creating the experiment, write:

    loss_function = experiment.metrics['loss']
    

    You should treat this new loss_function just as if it were the original. Invoke it inside your training loop, your validation loop, and your test loop.
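
    For example, a minimal sketch (not part of the original script, assuming the usual data, target, model, and optimizer variables from the basic mnist.py) of invoking the wrapped loss in a training step:

    output = model(data)
    loss = loss_function(output, target)  # the wrapped metric; its values are reported to MissingLink
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()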

  5. Within the experiment context, change the for loop to use the experiment.epoch_loop and experiment.batch_loop generators instead of the range function.

    for epoch in experiment.epoch_loop(epochs=10):
        # `train_loader` is your `torch.utils.data.DataLoader` that loads the train data.
        for batch, (data, target) in experiment.batch_loop(iterable=train_loader):
            data, target = Variable(data), Variable(target)
    

    The iterable argument can be any iterable you wish, like a list, a file, a generator function, etc. When used with the iterable parameter, epoch_loop and batch_loop yield the index of the step and the data from the iterable.
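
    As a small illustration (hypothetical data, not part of the original script), a plain Python list also works as the iterable; each step yields the step index together with the item from the list:

    samples = [('a', 0), ('b', 1), ('c', 2)]  # any iterable: list, file, generator, ...
    for step, (data, target) in experiment.batch_loop(iterable=samples):
        print(step, data, target)  # step index plus the data from the iterable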

    Note

    Additional implementations of epoch_loop and batch_loop

    • Fixed number of epochs/batches

    epoch_loop and batch_loop can also run according to a set number of iterations, using the epochs parameter for epoch_loop and the batches parameter for batch_loop:

        for epoch in experiment.epoch_loop(epochs=10):
            for batch in experiment.batch_loop(batches=100):
                # Perform a training step
    
    • Use lambda condition

    There is an optional parameter, condition, that can be added to control how the steps are run.

    For example, if you change the above statement to the following:

        loss_value = 0.55
    
        for epoch in experiment.epoch_loop(condition=lambda _: loss_value > 0.5):
    

    Note that loss_value is not the actual loss; it is a variable created for this example. The statement above runs the training for as long as loss_value is greater than 0.5.
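
    A minimal sketch of updating loss_value inside the loop so that the condition can eventually stop training (train_one_epoch is a hypothetical helper returning the epoch's mean loss; it is not part of the original script):

        loss_value = 1.0

        for epoch in experiment.epoch_loop(condition=lambda _: loss_value > 0.5):
            loss_value = train_one_epoch()  # hypothetical helper; the lambda re-checks loss_value before each epoch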

  6. Add the experiment.validation context around the call to the validation function.

    with experiment.validation():
        validation()
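
    As a minimal sketch (hypothetical val_loader; model, Variable, and the wrapped loss_function come from the surrounding script), a validation function invoked inside this context might look like this:

    def validation():
        model.eval()
        for data, target in val_loader:  # hypothetical validation DataLoader
            data, target = Variable(data), Variable(target)
            output = model(data)
            loss_function(output, target)  # wrapped metric, invoked inside the validation context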
    
  7. Add the experiment.test context around the call to the test function.

    Choose one of the following ways to use the test context.


    Option 1: Using a DataLoader

    If you use a torch.utils.data.DataLoader to load the test data, use it as follows:

    if step % test_interval == 0:
        with experiment.test(model, test_data_object=test_loader):
            test()
    

    Parameter descriptions

    • model: The tested model.
    • test_loader: The torch.utils.data.DataLoader that loads the test data.


    Option 2: Using an Iterator

    If you use one of the torchtext.data iterators to load the test data, use it as follows:

    with experiment.test(model, test_data_object=iterator, target_attribute_name='label'):
        test()
    

    Parameter descriptions

    • model: The tested model.
    • iterator: The torchtext.data iterator that iterates over the data. Can be an Iterator, a BucketIterator, or a BPTTIterator.
    • target_attribute_name: The attribute name of the target of every batch, so that batch.target_attribute_name is the target. Defaults to 'label'.


    Option 3: Test manually

    Otherwise, use the test context manually:

    with experiment.test(model, test_iterations=1000):
        test()  # the test function calls `experiment.confusion_matrix` `test_iterations` times
    

    Parameter descriptions

    • model: The tested model.
    • test_iterations: The number of test iterations (batches) that are going to be performed.

    Use the context with the tested model and with the number of test iterations. Then, inside the testing function, call experiment.confusion_matrix(output, target) test_iterations times:

    Parameter descriptions

    • output: A 2D torch.autograd.Variable. The output of the model for a single test batch.
    • target: A 1D torch.autograd.Variable or Array-Like. The targets (labels) of a single test batch.
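
    A minimal sketch of such a manual test function (hypothetical test_loader; model, experiment, and Variable come from the surrounding script), calling experiment.confusion_matrix once per test batch:

    def test():
        model.eval()
        for data, target in test_loader:  # hypothetical test DataLoader providing test_iterations batches
            data, target = Variable(data), Variable(target)
            output = model(data)  # 2D Variable: batch size x number of classes
            experiment.confusion_matrix(output, target)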


You have now successfully integrated the MissingLink SDK.

  • Inspect the resulting integrated script.
  • Run the new script and see how the MissingLink.ai dashboard helps with monitoring the experiment. A description follows.

Web dashboard monitoring

You can monitor your experiment on your MissingLink dashboard.

Click on the experiment to view your metric graphs.

Next steps

Learn more about integrating with PyTorch to enable the following MissingLink features: