Data Version Control
This topic describes data version control in MissingLink, which you use to manage the versions of your data volumes.
When working with datasets, you might need to check out a specific, earlier version of a data volume, especially when modifications to a dataset did not produce the desired results.
Creating and committing versions of your data volumes lets you do that.
Each data volume version is immutable; to change the dataset, you must commit a new version.
Unless otherwise specified, the default version that MissingLink references in each case is the staging version.
Option 1 - Committing a Version Using MissingLink's Web Dashboard
Provide a description for the commit and click Commit.
Option 2 - Committing a Version Using MissingLink CLI
Run the commit version command:
ml data commit yourDataVolumeID --message "your commit message"
After committing a version using MissingLink CLI, a new commit is added to the web dashboard under the data volume.
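As a minimal sketch, the commit command above can be wrapped in a small shell function for use in scripts. The function name commit_volume and the DRY_RUN variable are hypothetical and not part of the MissingLink CLI; the only CLI call used is the ml data commit form shown above. By default the function prints the command instead of running it, so you can check it before touching a data volume.

```shell
#!/bin/sh
# commit_volume is a hypothetical wrapper, not part of the MissingLink CLI.
# It composes the `ml data commit <id> --message "<text>"` call shown above.
commit_volume() {
    volume_id="$1"
    message="$2"

    # Refuse to run without both a data volume ID and a commit message.
    if [ -z "$volume_id" ] || [ -z "$message" ]; then
        echo "usage: commit_volume <dataVolumeID> <message>" >&2
        return 2
    fi

    if [ "${DRY_RUN:-1}" = "1" ]; then
        # Dry run (the default): print the command rather than executing it.
        printf 'ml data commit %s --message "%s"\n' "$volume_id" "$message"
    else
        # Real run: requires the ml CLI to be installed and authenticated.
        ml data commit "$volume_id" --message "$message"
    fi
}
```

Calling commit_volume yourDataVolumeID "your commit message" prints the command; set DRY_RUN=0 to execute it.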
Parameters for committing a version
Run the following command to view the parameters that are available with the ml data commit command:
ml data commit --help
message: Message that should be attached to the commit.
Running the sync command with the --commit flag
ml data sync yourDataVolumeID --dataPath yourDataPath --commit "your commit message"
Note that the commit includes all uncommitted changes in the new version, not only the changes made by this sync command.
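The following sketch illustrates that behavior under stated assumptions. It syncs several paths and passes --commit only on the last one, so the resulting version would contain the changes from every earlier sync as well. The helper names run and sync_batches and the DRY_RUN variable are hypothetical. The sketch also assumes that ml data sync accepts --dataPath without --commit, which the --commit heading suggests but this page does not show. By default the commands are only printed.

```shell
#!/bin/sh
# run: print the command when DRY_RUN=1 (the default), otherwise execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# sync_batches is a hypothetical helper, not part of the MissingLink CLI.
# Usage: sync_batches <dataVolumeID> <message> <path> [<path> ...]
sync_batches() {
    if [ "$#" -lt 3 ]; then
        echo "usage: sync_batches <dataVolumeID> <message> <path>..." >&2
        return 2
    fi
    volume_id="$1"
    message="$2"
    shift 2

    # Sync every path except the last without committing (assumed to be valid).
    while [ "$#" -gt 1 ]; do
        run ml data sync "$volume_id" --dataPath "$1"
        shift
    done

    # The final sync commits; per the note above, the commit picks up all
    # uncommitted changes, including those from the earlier syncs.
    run ml data sync "$volume_id" --dataPath "$1" --commit "$message"
}
```

For example, sync_batches yourDataVolumeID "your commit message" ./batchA ./batchB prints two sync commands, and only the second carries --commit. In dry-run output the quotes around the message are not shown.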
Unstaging a version using MissingLink's web dashboard
Approve that you want to unstage your changes.
Unstaging is destructive: all of your staged, uncommitted data will be lost.