
Integration with a Scikit-learn Project

This topic shows you how to integrate the MissingLink SDK with a scikit-learn project.

The following steps are covered:

  • Instantiate an SkLearnProject.
  • Create a new experiment.
  • Give the experiment a name and description.
  • Define the different stages of the experiment in a context.
  • Add a test scope.
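
The train and test stages listed above are written as Python `with` blocks. As a rough illustration of that pattern only, the sketch below uses a hypothetical `StageRecorder` class; it is not the MissingLink SDK and says nothing about how the SDK is implemented. The metric values are made up.

```python
from contextlib import contextmanager


class StageRecorder:
    """Hypothetical stand-in, NOT the MissingLink SDK: groups metrics by stage."""

    def __init__(self):
        self.metrics = {}
        self._stage = None

    def add_metric(self, name, value):
        # Store the metric under whichever stage is currently open.
        self.metrics.setdefault(self._stage, {})[name] = value

    @contextmanager
    def stage(self, name):
        previous, self._stage = self._stage, name
        try:
            yield self
        finally:
            # Leaving the with block closes the stage.
            self._stage = previous


recorder = StageRecorder()
with recorder.stage("train") as train:
    train.add_metric("accuracy", 0.97)  # made-up value
with recorder.stage("test") as test:
    test.add_metric("accuracy", 0.95)  # made-up value

print(recorder.metrics)
```

Scoping each stage in its own block keeps the metrics reported inside it attached to that stage, which is the same shape the steps in this topic follow.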

Note

You can also consider trying the step-by-step tutorial for integrating the MissingLink SDK with an existing scikit-learn example.

Preparation

  • scikit-learn must be installed in the same working environment as the MissingLink SDK. The SDK does not declare scikit-learn as a dependency, so it is not installed automatically.

    Use pip install scikit-learn to get the latest version.

  • You must have created a new project. If not, follow the instructions in Creating a project.
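
Because the SDK does not install scikit-learn for you, a quick pre-flight check can confirm that both packages are importable in the current environment. This is a minimal sketch using only the standard library; `missing_modules` is a hypothetical helper name, `sklearn` is scikit-learn's import name, and `missinglink` is the module name used in the steps below.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    absent = missing_modules(["sklearn", "missinglink"])
    if absent:
        print("Not installed in this environment:", ", ".join(absent))
    else:
        print("scikit-learn and the MissingLink SDK are both importable.")
```

Run it with the same interpreter that will run the training script, so that the check covers the right environment.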

Note

Ensure that you can successfully run the basic mnist.py training script. In the steps that follow, the basic script is integrated with the MissingLink SDK to enable remote monitoring of the training, validation, and testing process.

Compare the basic script with the integrated script.

Write code

  1. Import the SDK and define your credentials at the beginning of the file (before any function definition).

    import missinglink
    
  2. Instantiate an SkLearnProject before defining any function:

    project = missinglink.SkLearnProject()
    
  3. Define the different stages of the experiment in a context.

    # accuracy_score is assumed to be imported from sklearn.metrics
    with project.train(model) as train:
      print("fit")
      model.fit(data_train, target_train)
      data_train_pred = model.predict(data_train)
      accuracy = accuracy_score(target_train, data_train_pred)

      # Report the accuracy metric (optional)
      train.add_metric('accuracy', accuracy)
      print("Training set accuracy: %f" % accuracy)
    

    Note

    At this point, you have defined the train stage of the experiment. The test stage is optional and is covered in the next step.

  4. (Optional) The SkLearnProject also provides a test scope. Passing the true and predicted labels to add_test_data enables an enhanced, normalized confusion matrix visualization, which you can view in the Test tab of your experiment.

    with project.test() as test:
      print("test")
      data_test_pred = model.predict(data_test)
      accuracy = accuracy_score(target_test, data_test_pred)
    
      # Report the accuracy metric
      test.add_metric('accuracy', accuracy)
      # Enable the confusion matrix on the Test tab
      test.add_test_data(target_test, data_test_pred)
      print("Test set accuracy: %f" % accuracy)
      print("Confusion matrix:")
      print(confusion_matrix(target_test, data_test_pred))
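
To make the confusion matrix from step 4 concrete, the sketch below builds one in plain Python and then normalizes each row. It follows scikit-learn's layout, where rows are true labels and columns are predicted labels. Row normalization is one common convention and is an assumption here; this topic does not state which normalization the dashboard applies. The function names and the sample labels are hypothetical.

```python
def confusion_counts(y_true, y_pred, labels):
    """Count predictions: rows are true labels, columns are predicted labels."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for actual, predicted in zip(y_true, y_pred):
        matrix[index[actual]][index[predicted]] += 1
    return matrix


def normalize_rows(matrix):
    """Divide each row by its total so every non-empty row sums to 1."""
    normalized = []
    for row in matrix:
        total = sum(row)
        normalized.append([count / total if total else 0.0 for count in row])
    return normalized


# Hypothetical labels, for illustration only.
target_test = [0, 0, 1, 1, 1, 2]
data_test_pred = [0, 1, 1, 1, 2, 2]

counts = confusion_counts(target_test, data_test_pred, labels=[0, 1, 2])
print(counts)
print(normalize_rows(counts))
```

In the normalized form, each diagonal entry is the fraction of that class predicted correctly, which makes classes of different sizes directly comparable.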
    


You should now have successfully integrated the MissingLink SDK.

  • Inspect the resulting integrated script.
  • Run the new script and see how the MissingLink.ai dashboard helps with monitoring the experiment. A description follows.

Web dashboard monitoring

You can monitor your experiment on your MissingLink dashboard.

Click on the experiment to view your metric graphs.

Next steps

Learn more about integrating with scikit-learn to enable the following MissingLink features: