Data Version Control

MissingLink's version control lets you manage the versions of your data volumes.

When working with datasets, you might need to check out a specific, earlier data volume version, especially when modifications to a dataset did not produce the desired results.

Creating and committing versions of your data volumes makes this possible.

Data volume versions are immutable: to change the dataset, you have to commit a new version. Each time you sync and commit, the new data is appended to the data volume and assigned a unique ID, which you can use in later queries. Think of this as "Git for your data." Over time, you can track changes to the data by viewing the data volume history.

Staging versions

When you perform a regular sync to the data volume, that is, without explicitly committing it, you create a staging version of the data. The data is shown in the dashboard under the data volume and marked Staging. Staging lets you review the changes that will occur before you accept the sync, and gives you the opportunity to add a comment that clarifies what is changing.

Once you are satisfied with the staging version, you can then commit it as a new version.

Note

Unless you specify otherwise, MissingLink references the staging version by default.

Committing a staged version

You can commit a version either by using the MissingLink dashboard, or by running a command from the CLI.

  1. In the dashboard, inside the Staging area, click Commit.

  2. Provide a description for the commit and click Commit.

    The new commit is added to the list of versions.

To commit from the CLI, run the ml data commit command:

ml data commit yourDataVolumeID --message "your commit message"

A new commit is added to the web dashboard under the data volume.

You can also commit by running the ml data sync command with the --commit flag:

ml data sync yourDataVolumeID --data-path yourDataPath --commit "your commit message"

For a complete description of the commands and the available flags, see the commit and sync commands in the CLI reference.

Note

A commit puts all uncommitted changes into the same version, not only the changes made by the current sync command.
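
The behavior described in the note can be sketched as a small shell helper. This is an illustration only: the function name sync_then_commit and the ML variable are hypothetical and not part of the MissingLink CLI. The sketch assumes that each sync without --commit adds to the current staging version, and it uses only the commands and flags shown above.

```shell
#!/bin/sh
# ML is the CLI executable to call. Setting ML=echo gives a dry run
# that only prints the commands instead of running them.
ML="${ML:-ml}"

# Hypothetical helper (not part of the MissingLink CLI).
# Usage: sync_then_commit VOLUME_ID MESSAGE PATH [PATH...]
sync_then_commit() {
    volume_id="$1"
    message="$2"
    shift 2
    for data_path in "$@"; do
        # A sync without --commit only adds to the staging version.
        $ML data sync "$volume_id" --data-path "$data_path" || return 1
    done
    # A single commit puts everything staged so far into one new version.
    $ML data commit "$volume_id" --message "$message"
}
```

For example, calling sync_then_commit yourDataVolumeID "your commit message" firstDataPath secondDataPath would run two syncs followed by one commit, so both sets of changes should end up in the same version. Running it first with ML=echo is a cheap way to check the commands before touching the data volume.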

You can unstage a version that has been staged, but not yet committed.

  1. Inside the Staging area, click Unstage.

  2. Confirm that you want to unstage your changes.

    Note

    When you unstage, all of the data in the staging version is lost.
