Confidential Data

MissingLink understands that your data is your most sensitive and valuable asset.

You can be confident in the MissingLink security protocol, which we lay out in the following paragraphs. Furthermore, because the MissingLink CLI is written in pure Python, you can always inspect the code and verify exactly what data leaves your computer, and when.

Our default approach when developing the Resource Management feature was similar to that of other CI/CD tools: we aimed to work with a single SSH key for all of our Git operations. You provide the key during the initial cloud initialization process. The key is encrypted using your cloud KMS key, and only the encrypted version is stored on MissingLink servers. When MissingLink creates a server in your cloud, the server is granted decryption permissions on the cloud encryption key, decrypts the SSH key, and uses it as its default SSH key. If needed, you then authorize that key with your Git hosting provider. This approach ensures that we cannot access your key, while still giving us a way to provision the SSH key inside your cloud.

While this method is standard, it requires the default organization key to be authorized to access all the relevant repositories in your organization, and it does not support per-job encryption of sensitive data such as environment variables and Docker credentials.

To address this drawback, MissingLink provides a two-level encryption mechanism:

  • The default organization key is now used for asymmetric encryption as well as serving as the "default" SSH key. Because SSH keys are just RSA asymmetric keys, and we already encrypt the private key using your KMS key, we can rely on the public key as the encryption key. Since we cannot decrypt data encrypted with the public key, we can store such encrypted data without risking its exposure. All sensitive data is encrypted using your default organization public key before it leaves the initiating computer. This ensures the data can only be decrypted inside the cloud instances that were launched by the Resource Management feature.
  • MissingLink implements a second-layer KMS: we run a lightweight KMS backed by Google's KMS. Every user in every organization has a separate Google KMS key, and access to encrypt/decrypt calls for that key requires MissingLink authentication for the user. All sensitive data is encrypted using that key before being saved in our databases. Currently, this encryption (of the already encrypted data) is done on our backend, but in the future it will also be moved to the computer that is running the job. When the job starts, it has to authenticate and actively request decryption before it can access the previous layer of encryption, the one encrypted with your organization public key. To visualize, the data at rest looks like this: ML_KMS_ENCRYPT(CUSTOMER_PUBLIC_KEY_ENCRYPT(sensitive data)), where CUSTOMER_PRIVATE_KEY_DECRYPT requires decryption from your cloud KMS. Through the second level of encryption, we can provide both an additional level of isolation between users (sensitive data encrypted by one user is not available to another) and a fast and extremely reliable way to expire encrypted information, because we can deprecate and delete specific user keys from our KMS. While a key is deprecated, or from 24 hours after it is deleted, the encrypted data is virtually unusable.
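
The order in which the two layers are applied and removed can be sketched with a toy model. This is only an illustration of the access rules described above and performs no real encryption. The class ToyKeyService, the key names ("org-key", "user-key:alice") and the principals ("cloud-instance", "alice") are hypothetical and are not part of the MissingLink CLI or API.

```python
from dataclasses import dataclass


class AccessDenied(Exception):
    """Raised when a principal may not use a key, or the key is disabled."""


@dataclass(frozen=True)
class Sealed:
    """Stand-in for ciphertext: remembers which key sealed the payload."""
    key_id: str
    payload: object


class ToyKeyService:
    """Toy key service: enforces access rules only, NO real cryptography."""

    def __init__(self):
        self._decrypt_grants = {}  # key_id -> principals allowed to decrypt
        self._disabled = set()     # key_ids that were deprecated or deleted

    def grant_decrypt(self, key_id, principal):
        self._decrypt_grants.setdefault(key_id, set()).add(principal)

    def disable(self, key_id):
        self._disabled.add(key_id)

    def encrypt(self, key_id, payload):
        return Sealed(key_id, payload)

    def decrypt(self, sealed, principal):
        allowed = self._decrypt_grants.get(sealed.key_id, set())
        if sealed.key_id in self._disabled or principal not in allowed:
            raise AccessDenied(f"{principal} cannot decrypt with {sealed.key_id}")
        return sealed.payload


# Layer 1 (assumed names): the organization key pair. Only instances running
# in the customer's cloud can unlock the private key through the cloud KMS.
customer_cloud = ToyKeyService()
customer_cloud.grant_decrypt("org-key", "cloud-instance")

# Layer 2 (assumed names): one key per user in the MissingLink-side KMS.
ml_kms = ToyKeyService()
ml_kms.grant_decrypt("user-key:alice", "alice")


def submit(secret, user):
    """Models ML_KMS_ENCRYPT(CUSTOMER_PUBLIC_KEY_ENCRYPT(secret))."""
    inner = customer_cloud.encrypt("org-key", secret)   # before leaving the computer
    return ml_kms.encrypt(f"user-key:{user}", inner)    # before being stored


def run_job(stored, user):
    """Peels the layers in reverse order, as a job would when it starts."""
    inner = ml_kms.decrypt(stored, principal=user)      # needs user authentication
    return customer_cloud.decrypt(inner, principal="cloud-instance")
```

In this model, run_job(submit(secret, "alice"), "alice") returns the secret, another user is rejected at the outer layer, and calling ml_kms.disable("user-key:alice") makes the stored value unreadable even for the original user.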

The result of this mechanism is that users can encrypt data when submitting jobs. Because we have a means to encrypt sensitive data in a manner that we cannot decrypt, and a means to encrypt that (pre-encrypted) sensitive data in a manner that other users within the same organization cannot decrypt, we can now collect sensitive data per job. In particular, the user can actively provide an SSH key that will be used for Git (and any other SSH) operations, Docker credentials, and secure environment variables.

Note

All data is encrypted using AES-256 envelope encryption. Envelope DEKs (data encryption keys) are encrypted using the native cloud KMS, the native Google KMS, or PKCS1_OAEP with the SHA-512 hash algorithm.
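
As a rough illustration of envelope encryption with an RSA public key, the sketch below uses the third-party Python cryptography package. It is not MissingLink's implementation. The AES mode (GCM), the nonce size, the RSA key size in the demo, and the function names envelope_encrypt and envelope_decrypt are all assumptions made for the example. Only AES-256 and OAEP with SHA-512 come from the note above.

```python
import os

from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import padding, rsa
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

# OAEP padding using SHA-512 for both the digest and the MGF1 mask function.
OAEP_SHA512 = padding.OAEP(
    mgf=padding.MGF1(algorithm=hashes.SHA512()),
    algorithm=hashes.SHA512(),
    label=None,
)


def envelope_encrypt(public_key, plaintext: bytes) -> dict:
    """Encrypt data with a fresh AES-256 DEK, then wrap the DEK with RSA-OAEP."""
    dek = AESGCM.generate_key(bit_length=256)  # one data encryption key per message
    nonce = os.urandom(12)                     # 96-bit nonce (assumed GCM mode)
    ciphertext = AESGCM(dek).encrypt(nonce, plaintext, None)
    wrapped_dek = public_key.encrypt(dek, OAEP_SHA512)
    return {"wrapped_dek": wrapped_dek, "nonce": nonce, "ciphertext": ciphertext}


def envelope_decrypt(private_key, envelope: dict) -> bytes:
    """Unwrap the DEK with the private key, then decrypt the data."""
    dek = private_key.decrypt(envelope["wrapped_dek"], OAEP_SHA512)
    return AESGCM(dek).decrypt(envelope["nonce"], envelope["ciphertext"], None)


if __name__ == "__main__":
    # Throwaway key pair for the demo; a real key would be the organization key.
    demo_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)
    sealed = envelope_encrypt(demo_key.public_key(), b"DOCKER_PASSWORD=example")
    print(envelope_decrypt(demo_key, sealed) == b"DOCKER_PASSWORD=example")
```

Only the holder of the private key can recover the DEK, so whoever stores the envelope (wrapped DEK, nonce, and ciphertext) cannot read the data. When a KMS protects the DEK instead, the RSA step would be replaced by that service's encrypt and decrypt calls.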