Cloning Data Volumes

In this section, you'll learn how to clone data from your data volumes on MissingLink.

You can also do this using native iterators.

  1. Click Copy Clone Command in the query page.

  2. Clone the specific queried data.

    Whenever you use the special commands denoted by the $ sign, the query string must be enclosed in single quotes. It is recommended to enclose the whole query in single quotes. If a value in the query or in the destination path contains spaces, wrap that value in double quotes to avoid conflicts or errors.

    ml data clone  --query 'queryString' \
        --dest destinationPath
    

    The implicit command above is shorthand for the following explicit command, which splits the destination into a folder and a file name. Reserved keywords such as $phase and $name can appear in the destinationPath string, as detailed below:

    ml data clone  --query 'queryString' \
        --destFolder 'destinationPath' --destName 'destinationFileName'
    

    Note

    The clone command must be executed on the specific machine where you wish to access the cloned data. If you move to another machine, you must execute the command again on the other machine to gain access to the cloned data.
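
The quoting rule in step 2 follows from how the shell itself treats quotes, and it can be checked without running the clone command at all. The sketch below is only an illustration: the variable name and the attribute:"value with spaces" text are hypothetical placeholders, not real query syntax taken from the MissingLink documentation.

```shell
# Hypothetical shell variable, defined only to show what the shell would substitute.
name="oops"

# Single quotes: the shell leaves $name untouched, so the keyword reaches the CLI as written.
single_quoted='/destinationPath/$name'

# Double quotes: the shell expands $name before the CLI ever sees the argument.
double_quoted="/destinationPath/$name"

# Double quotes nested inside single quotes stay literal and keep a value with spaces together.
# The attribute name and value here are placeholders, not real query syntax.
nested='attribute:"value with spaces"'

printf '%s\n' "$single_quoted" "$double_quoted" "$nested"
```

The first line printed still contains the literal $name keyword, while the second has already been rewritten by the shell, which is why single quotes are the safer choice around the query and destination strings.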

Flags for cloning data from your data volume

Run the following command to view the flags available for the clone command:

ml data clone --help

There are several system variables that the MissingLink CLI clone command can translate automatically. These keywords can be used in the --destFolder and --destName flags.

For more information, see System variables with special meaning for cloning.

Examples

In the examples below, the dataset contains data points with a single metadata attribute named type_of_animal, whose values are Dog, Cat, and Fish.

1) Running the following command:

ml data clone  --query '@version:versionID 
    AND @sample:0.2 AND @split:0.5:0.25:0.25 @seed:1337' \
    --destFolder '/destinationPath/$@/'

creates three folders under the destinationPath named train, test, and validation and copies the data points according to the @split ratio to each folder.
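
After a clone like the one in example 1, one way to sanity-check the result is to count the files that landed in each split folder. The helper below is a minimal sketch that uses only standard shell tools; count_split_files is a hypothetical name, not part of the MissingLink CLI, and it assumes the three folders are named train, test, and validation as described above.

```shell
# count_split_files ROOT
# Prints "<folder> <file count>" for each of the three split folders under ROOT.
count_split_files() {
    root=$1
    for phase in train test validation; do
        if [ -d "$root/$phase" ]; then
            # Count regular files at any depth below the split folder.
            count=$(find "$root/$phase" -type f | wc -l | tr -d ' ')
        else
            # A missing folder is reported as empty rather than as an error.
            count=0
        fi
        printf '%s %s\n' "$phase" "$count"
    done
}

# Usage: pass the folder given to --destFolder, without the $@ part, for example:
# count_split_files /destinationPath
```

The counts should roughly follow the ratio given to @split; exact proportions depend on how many data points the query selected.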

2) Running the following command:

ml data clone  --query '@version:versionID
    AND @sample:0.2 AND @split:0.5:0.25:0.25 @seed:1337' \
    --destFolder '/destinationPath/$@/' --destFile '$name' 

saves each copied data point under its original filename.

3) Running the following command:

ml data clone  --query '@version:versionID
    AND @sample:0.2 AND @split:0.5:0.25:0.25 @seed:1337' \
    --destFolder '/destinationPath/$@/$dir' --destFile '$name' 

recreates, under the train, test, and validation folders, the original folder structure that the data points had when they were added with the sync command.

4) Running the following command:

ml data clone  --query '@version:versionID
    AND @sample:0.2 AND @split:0.5:0.25:0.25 @seed:1337' \
    --destFolder '/destinationPath/$@/$type_of_animal' --destFile '$name'

creates 'Dog', 'Cat' and 'Fish' subfolders under the train, test, and validation folders and copies the relevant data points for each subfolder according to the type_of_animal attribute.
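
To see how the data points from example 4 were distributed, the second-level folders can be walked with a plain shell loop. This is a sketch under the assumption that the layout is destinationPath/split/attribute value, as described above; count_by_label is a hypothetical helper name and is not provided by the MissingLink CLI.

```shell
# count_by_label ROOT
# Prints "<split>/<label> <file count>" for every second-level folder under ROOT.
count_by_label() {
    root=$1
    for dir in "$root"/*/*/; do
        # Skip the unexpanded pattern when no second-level folders exist.
        [ -d "$dir" ] || continue
        # Strip the ROOT prefix and the trailing slash to get "split/label".
        rel=${dir#"$root"/}
        rel=${rel%/}
        count=$(find "$dir" -type f | wc -l | tr -d ' ')
        printf '%s %s\n' "$rel" "$count"
    done
}

# Usage: pass the folder given to --destFolder, without the keyword parts, for example:
# count_by_label /destinationPath
```

Each output line pairs a split and attribute value, such as train/Dog, with the number of files copied into that folder.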