I’m a big fan of PyCharm after years of using other JetBrains tools such as IntelliJ, ReSharper and, more recently, Rider. While a lot of the data scientists we talk to use the command line or Jupyter notebooks, some don’t know about the advantages of using an IDE. The biggest ones that come to mind are project management, code completion, and built-in code history, to name a few. So if you are new to using MissingLink in PyCharm, I thought I would show a few tips on how I set up my projects.
First off, you’ll want to create a new project. In this case, I was working on processing the ChestXray14 dataset for my MissingLink Data Volume blog post. Here you can see I have the new project and I’ve created a data folder for the X-ray images and a separate folder for the metadata I needed to generate for the MissingLink Data Volume.
When you are working with a lot of data, such as the 45 GB of X-ray images in the ChestXray14 dataset, you’ll want to tell PyCharm to ignore the directory (right-click the folder and choose Mark Directory as, then Excluded). If you skip this step, PyCharm tries to index the folder, and this could take a long time.
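If you are not sure which folders are big enough to be worth excluding, a short script can list them. This is a minimal sketch using only the standard library; the function names and the 1 GB default threshold are my own choices for illustration, not part of PyCharm or the MissingLink SDK.

```python
import os
from pathlib import Path


def directory_size_bytes(root):
    """Return the total size in bytes of all regular files under root."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symlinks so linked files are not counted.
            if os.path.islink(path):
                continue
            try:
                total += os.path.getsize(path)
            except OSError:
                # The file vanished or is unreadable; ignore it.
                pass
    return total


def large_subdirectories(project_root, threshold_bytes=1_000_000_000):
    """Return (name, size) pairs for top-level folders at or above the
    threshold, largest first. The threshold is a hypothetical default."""
    results = []
    for entry in Path(project_root).iterdir():
        if entry.is_dir() and not entry.is_symlink():
            size = directory_size_bytes(entry)
            if size >= threshold_bytes:
                results.append((entry.name, size))
    return sorted(results, key=lambda item: item[1], reverse=True)
```

Calling large_subdirectories(".") from the project root returns the folders that are candidates for exclusion, such as the data folder holding the X-ray images.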
The next thing I like to do is create a self-contained dev environment for each project. As you can see, I am setting up a custom virtual environment (venv) and using Python 3 as the interpreter.
You can access this menu by clicking on the Project Interpreter icon at the bottom of the window.
After you create the new virtual environment, you’ll want to go into your preferences and make sure you are using the new virtual environment and interpreter you created for this project.
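A quick way to confirm the project is really running on the new environment is to ask the interpreter itself. The sketch below assumes an environment created with Python 3's venv module (or a recent virtualenv), where sys.prefix differs from sys.base_prefix; the helper names are hypothetical.

```python
import sys


def in_virtual_environment():
    """Return True when the running interpreter belongs to a virtual environment."""
    # Inside a venv, sys.prefix points at the environment, while
    # sys.base_prefix still points at the Python installation it was created from.
    return sys.prefix != sys.base_prefix


def interpreter_summary():
    """Collect the details worth checking after switching interpreters."""
    return {
        "executable": sys.executable,
        "version": "{}.{}.{}".format(*sys.version_info[:3]),
        "in_venv": in_virtual_environment(),
    }
```

Running print(interpreter_summary()) in PyCharm's Python Console should show an executable path inside the project's venv folder if the preferences were applied.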
At this point, you can go ahead and start installing any Python libraries you may need from PyCharm’s built-in terminal or a requirements file. Here I am using pip install missinglink -U to install the MissingLink SDK.
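To check that the install landed in the virtual environment rather than the system Python, you can look up the installed version from the Python Console. This is a rough sketch that assumes Python 3.8 or newer (for importlib.metadata) and assumes the distribution is registered under the same name used in the pip command, missinglink; the helper name is my own.

```python
from importlib import metadata


def installed_version(distribution_name):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(distribution_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # "missinglink" is assumed to match the name used with pip install.
    version = installed_version("missinglink")
    if version is None:
        print("missinglink is not installed in this interpreter")
    else:
        print("missinglink version:", version)
```

If this prints that the package is not installed, the terminal and the project are probably pointing at different interpreters.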
When the installation finishes, you’ll need to authenticate it. However, before that step, you’ll need to make one more change to the project’s preferences. Here you’ll see I am going into the console settings and have checked “Use existing console for ‘Run with Python console’.”
Running in the Python Console allows you to interact with the terminal window, which is critical when authenticating the MissingLink SDK. It’s also essential when running an experiment and selecting the project it should run in.
That’s all you need to do to have a new virtual environment ready to work with MissingLink’s SDK in PyCharm. Once you’ve configured it, you can run any of the standard SDK commands in the terminal window.
What We’ve Learned
In this post, we covered how to configure PyCharm and the MissingLink SDK by doing the following:
- Creating a new project.
- Configuring the virtual environment.
- Installing the MissingLink SDK in PyCharm’s terminal window.
- Enabling “Run with Python console” to interact with MissingLink menu options in the terminal window.
In a future post, we’ll go into more detail about how to run code from PyCharm and use Experiment Management to track progress in real time from MissingLink’s web dashboard.