
Graceful Shutdown for TensorFlow

This topic shows you how to gracefully shut down an experiment. It builds on Getting Started for TensorFlow with steps.

The following steps are covered:

  • Create a stopped callback function.
  • Register the stopped callback function with MissingLink's callback.
  • Stop the experiment from MissingLink's dashboard.

Preparation

Go through Getting Started for TensorFlow with steps.

Note

Ensure that you can successfully run the mnist.py training script that resulted from integrating the MissingLink SDK. In the steps that follow, the script is further developed to include graceful shutdown.

Write code

  1. Create a stopped callback function.

    Directly above the declaration of MissingLink's callback, define your stopped callback:

    def start_new_experiment():
        # Write code here that starts a new experiment
        pass
    
    def log_experiment_to_internal_log():
        # Write code here that logs important information
        # to your internal logs
        pass
    
    def stopped_callback():
        start_new_experiment()
        log_experiment_to_internal_log()
    
    # Create a project manager with credentials to
    # communicate with MissingLinkAI's backend
    missinglink_project = missinglink.TensorFlowProject(OWNER_ID, PROJECT_TOKEN)
    
  2. Set the stopped callback to be called.

    In the base script, modify the declaration of MissingLink's callback:

    # Create a project manager with credentials to
    # communicate with MissingLinkAI's backend
    missinglink_project = \
        missinglink.TensorFlowProject(
            OWNER_ID, PROJECT_TOKEN, stopped_callback=stopped_callback)
    

    You can find the full updated script here.

    You have added the graceful shutdown callback to your experiment. Now run the script and stop it using the dashboard.

  3. Stop the experiment through the dashboard while it is still running.

    Go to the MissingLink dashboard, navigate to the TensorFlow experiment, and click Stop.

    You are prompted to confirm the action. Click Stop to confirm.

    The experiment should now have shut down gracefully. Notice that, instead of an exception being raised in the environment where you ran the experiment, the message "Experiment stopped from the web" appears.
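
The placeholder functions in step 1 can be filled in many ways. The following is a minimal sketch, assuming the SDK calls the stopped callback with no arguments, as the definition in step 1 suggests. The logger name, the run_state dictionary, and the bodies of the helper functions are hypothetical and not part of the MissingLink SDK; the sketch uses only the Python standard library.

```python
import logging
import time

# Send INFO-level messages to the console (stderr by default).
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("experiment")  # hypothetical logger name

# Hypothetical bookkeeping for the current run; not part of the SDK.
run_state = {"started_at": time.time(), "stopped": False}


def start_new_experiment():
    # Placeholder: replace with whatever launches your next run.
    logger.info("Starting a follow-up experiment")


def log_experiment_to_internal_log():
    # Record how long the run lasted before it was stopped.
    elapsed = time.time() - run_state["started_at"]
    logger.info("Experiment stopped after %.1f seconds", elapsed)


def stopped_callback():
    # Assumed to be called with no arguments when the experiment is stopped.
    run_state["stopped"] = True
    # Guard each step so that a failure in one does not skip the other.
    for step in (start_new_experiment, log_experiment_to_internal_log):
        try:
            step()
        except Exception:
            logger.exception("Stopped-callback step %s failed", step.__name__)
```

Keeping the callback short and wrapping each step in its own try block means that an error in your own cleanup code is logged rather than raised while the experiment is shutting down. The function is passed to the project exactly as shown in step 2, through the stopped_callback argument.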

View the addition in the dashboard

You can see the experiment has been marked Stopped in the dashboard.
