Integration with a scikit-learn Project

This topic shows you how to integrate the MissingLink SDK with a scikit-learn project.

The following steps are covered:

  • Instantiate an SkLearnProject.
  • Create a new experiment.
  • Give the experiment a name and description.
  • Define the different stages of the experiment in a context.
  • Add a test scope.


You can also try the step-by-step tutorial for integrating the MissingLink SDK with an existing scikit-learn example.


  • You must have scikit-learn installed in the same working environment as the MissingLink SDK. The SDK does not list scikit-learn as one of its dependencies, so you need to install it yourself.

    Use pip install scikit-learn to get the latest version.

  • You must have created a new project. If not, follow the instructions in Creating a project.


Ensure that you can successfully run the basic training script. In the following steps, the basic script is integrated with the MissingLink SDK to enable remote monitoring of the training, validation, and testing process.

Compare the basic script with the integrated script.
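
For orientation, a basic training script of the kind assumed here could look like the sketch below. The dataset (iris), the classifier (a decision tree), and the split parameters are assumptions chosen for illustration. Only the variable names data_train, target_train, data_test, target_test, and model follow the snippets used in this topic.

```python
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Load a small built-in dataset (an assumption; any labeled dataset works).
data, target = load_iris(return_X_y=True)

# Hold out 30% of the samples for the test stage.
data_train, data_test, target_train, target_test = train_test_split(
    data, target, test_size=0.3, random_state=0, stratify=target
)

model = DecisionTreeClassifier(random_state=0)

# Train stage: fit the model, then measure accuracy on the training set.
model.fit(data_train, target_train)
data_train_pred = model.predict(data_train)
train_accuracy = accuracy_score(target_train, data_train_pred)
print("Training set accuracy: %f" % train_accuracy)

# Test stage: evaluate the fitted model on the held-out samples.
data_test_pred = model.predict(data_test)
test_accuracy = accuracy_score(target_test, data_test_pred)
print("Test set accuracy: %f" % test_accuracy)
print("Confusion matrix:")
print(confusion_matrix(target_test, data_test_pred))
```

The train and test stages are kept as separate blocks so that each one can later be wrapped in its own scope.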

Write code

  1. Import the SDK and define your credentials at the beginning of the file (before any function definition).

    import missinglink
  2. Instantiate an SkLearnProject before defining any function:

    project = missinglink.SkLearnProject()
  3. Define the different stages of the experiment in a context.

    with project.train(model) as train:
      model.fit(data_train, target_train)
      data_train_pred = model.predict(data_train)
      accuracy = accuracy_score(target_train, data_train_pred)
      # Report the accuracy metric (optional)
      train.add_metric('accuracy', accuracy)
      print("Training set accuracy: %f" % accuracy)


    At this point, you have completed the train stage of the experiment. The test stage is optional and is covered in the next step.

  4. (Optional) Add a test scope. SkLearnProject provides a test scope; calling add_test_data inside it gives you an enhanced visual, normalized confusion matrix, which you can view in the Test tab of your experiment.

    with project.test() as test:
      data_test_pred = model.predict(data_test)
      accuracy = accuracy_score(target_test, data_test_pred)
      test.add_metric('accuracy', accuracy)
      # Report the test data to enable the confusion matrix
      test.add_test_data(target_test, data_test_pred)
      print("Test set accuracy: %f" % accuracy)
      print("Confusion matrix:")
      print(confusion_matrix(target_test, data_test_pred))

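
To make "normalized" concrete, the sketch below computes a confusion matrix and a row-normalized version of it locally. Row-wise normalization (each true class sums to 1) is an assumption here; it is one common convention, and this topic does not state which normalization the dashboard applies. The label lists are made-up illustration data.

```python
import numpy as np
from sklearn.metrics import confusion_matrix

# Hand-made labels for illustration (hypothetical data).
target_test = [0, 0, 1, 1, 1, 2]
data_test_pred = [0, 1, 1, 1, 0, 2]

# Raw counts: rows are true classes, columns are predicted classes.
counts = confusion_matrix(target_test, data_test_pred)

# Divide each row by its total so that every row sums to 1.
normalized = counts / counts.sum(axis=1, keepdims=True)

print(counts)
print(normalized.round(2))
```

Reading across a row of the normalized matrix shows what fraction of a true class was assigned to each predicted class, which makes classes of different sizes comparable.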

You should have integrated MissingLink's SDK successfully.

  • Inspect the resulting integrated script.
  • Run the new script and see how the MissingLink dashboard helps with monitoring the experiment. A description follows.

Web dashboard monitoring

You can monitor your experiment on your MissingLink dashboard.

Click on the experiment to view your metric graphs.


Scikit-learn exposes only the end result of training and no intermediate metrics from the training process itself. For this reason, MissingLink can display only one data point per metric on the metric chart.
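
One way to picture why a single point appears is the sketch below. The StageRecorder class and the stage helper are hypothetical stand-ins written for this illustration; they are not part of the MissingLink SDK and make no claim about how it is implemented. The accuracy value is a placeholder, not a real result.

```python
from contextlib import contextmanager


class StageRecorder:
    """Hypothetical stand-in that collects metrics reported inside a stage."""

    def __init__(self):
        # Maps each metric name to the list of values reported for it.
        self.points = {}

    def add_metric(self, name, value):
        self.points.setdefault(name, []).append(value)


@contextmanager
def stage(recorders, stage_name):
    """Open a named stage and store its recorder when the block exits."""
    recorder = StageRecorder()
    try:
        yield recorder
    finally:
        recorders[stage_name] = recorder


recorders = {}

with stage(recorders, "train") as train:
    # A fit() call would run here and return only once training is complete,
    # so there is exactly one accuracy value available to report.
    accuracy = 0.95  # placeholder value for illustration
    train.add_metric("accuracy", accuracy)

print(recorders["train"].points)  # {'accuracy': [0.95]}
```

Because the metric is reported once per stage, the recorded series has length one, which corresponds to a single point on a chart.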

Next steps

Learn more about integrating with scikit-learn to enable the following MissingLink features: