Zencity’s Deep Learning Mission
Zencity develops algorithms to help cities understand citizens better and be proactive about improving their quality of life. The product enables cities to make data-driven decisions with respect to their residents by mining data about what people feel about their city and their surroundings from social media platforms and a myriad of other sources.
They provide a dashboard that lets city managers see what citizens are talking about with regard to the city and its services. The platform is used by over 60 cities, from small cities with a population under 15,000 to large cities like Houston and San Antonio.
The data science team uses deep learning to build algorithms that analyze millions of social media interactions per month per city, including both text and images. Many of the algorithms leverage transfer learning, using existing sentiment analysis models and extending them to understand whether a citizen is satisfied with city services or not.
Zencity has over 60 cities and 90 categories that were trained using LDA. Multiply that to understand how many models we needed.
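To make the scale concrete, the multiplication in the quote can be worked through in a few lines. This is only a rough sketch: it assumes one model per city and category pair, which the quote implies but does not state, and it treats "over 60" as exactly 60, so the result is a lower bound.

```python
# Figures quoted in the text; "over 60 cities" makes this a lower bound.
cities = 60
categories = 90

# Assumption (not confirmed by the source): one model per (city, category) pair.
models_needed = cities * categories

print(f"At least {models_needed} models")
```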
Lacking an Infrastructure for Large-Scale Deep Learning
Zencity has a team of three data scientists, which for a long time had been spending 30% of its time on automation and DevOps tasks. They developed three Python libraries to help them manage deep learning experiments on the cloud:
- A library to upload and download artifacts to and from the cloud
- A library that manages hierarchical configuration data on cloud machines
- A library to save datasets and experiment artifacts to an Azure blob
Ido Ivri, CTO and co-founder of Zencity, and Dr. Ori Cohen, head of the data science team, faced a major challenge: they lacked an infrastructure which would allow them to run deep learning at scale.
While Zencity attempted to automate the process of uploading and managing data and configuration, they still had to run experiments manually:
- Experiments were executed by hand, with a loop to run the same experiment on multiple city datasets
- Experiments required “babysitting”—each time an experiment ended or failed, a data scientist had to manually run another experiment
- Expensive GPU machines experienced idle time, wasting resources
- When a production model needed to run on some or all cities, it would run sequentially on one machine, which would take 20-30 hours
It was clear to Ido and Ori that they needed to find the right tools that would help them automate and manage DevOps tasks. Their current infrastructure provided only partial automation, and required a heavy investment in development and ongoing maintenance. A true deep learning automation platform would help their team to focus on data science work instead of DevOps.
In addition, parallelization across more machines was a crucial capability which would allow Zencity to run more experiments, achieve results faster and improve time to market for new models.
About 30% of data scientist time was invested in building infrastructure, which isn’t an optimal use of their time.
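The gap between the hand-written loop and the parallel setup Zencity wanted can be sketched roughly as follows. This is an illustration only, not Zencity's code: `CITIES`, `run_experiment`, `run_sequential` and `run_parallel` are hypothetical names, the experiment body is a stub, and a local thread pool stands in for a cluster of separate machines.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical city datasets; the real list is not given in the source.
CITIES = ["city_a", "city_b", "city_c", "city_d"]


def run_experiment(city: str) -> dict:
    """Placeholder for one training run on one city's dataset."""
    # A real run would load data, train and evaluate; this stub only reports back.
    return {"city": city, "status": "done"}


def run_sequential(cities):
    """The manual approach: one experiment at a time on a single machine."""
    return [run_experiment(city) for city in cities]


def run_parallel(cities, max_workers=4):
    """Fan the same experiments out to a pool of workers."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map returns results in the same order as the inputs.
        return list(pool.map(run_experiment, cities))


if __name__ == "__main__":
    print(run_parallel(CITIES))
```

Both functions return the same results; the difference is that the parallel version does not wait for one city to finish before starting the next, which is where the time savings on a multi-city run would come from.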
Zencity Turns to MissingLink
Why did Ido and Ori decide to stop maintaining their home-grown deep learning infrastructure and turn to MissingLink instead?
- Provides a ready-made AI infrastructure which would save data scientist time wasted on DevOps tasks
- Enables easy parallelization of experiments across multiple machines, while improving resource utilization
- Provides solid privacy and security which is essential when working with city governments
- Supports Microsoft Azure, enabling easy integration with Zencity clients who are also running on Azure
Zencity Uses MissingLink to Scale Up Experiments by 20X and Eliminate Manual DevOps Tasks
MissingLink allows Zencity data scientists to add experiments to a queue of jobs, and run those jobs transparently on a cluster of machines in the Microsoft Azure cloud. This enabled scaling training from 5 to 100 experiments per project.
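The general idea of a job queue that needs no babysitting can be sketched as below. This is not MissingLink's API, which the source does not describe; `worker`, `run_queue` and the two demo jobs are hypothetical names, and local threads stand in for cluster machines. Each worker pulls the next job as soon as it is free, and a failed job is recorded rather than stopping the run.

```python
import queue
import threading


def worker(jobs: queue.Queue, results: list, lock: threading.Lock) -> None:
    """Pull jobs until the queue is empty; record failures instead of stopping."""
    while True:
        try:
            name, fn = jobs.get_nowait()
        except queue.Empty:
            return  # nothing left to run, so this worker can shut down
        try:
            outcome = ("ok", fn())
        except Exception as exc:  # a failed experiment must not block the rest
            outcome = ("failed", repr(exc))
        with lock:
            results.append((name, *outcome))


def run_queue(job_list, num_workers=3):
    """Run (name, callable) jobs on a small pool of workers; names must be unique."""
    jobs = queue.Queue()
    for job in job_list:
        jobs.put(job)
    results, lock = [], threading.Lock()
    threads = [
        threading.Thread(target=worker, args=(jobs, results, lock))
        for _ in range(num_workers)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sorted(results)  # sort by job name for a stable report


def good_experiment():
    """Hypothetical experiment that finishes and returns a metric."""
    return 0.9


def bad_experiment():
    """Hypothetical experiment that crashes."""
    raise RuntimeError("out of memory")


if __name__ == "__main__":
    report = run_queue(
        [("exp-1", good_experiment), ("exp-2", bad_experiment), ("exp-3", good_experiment)]
    )
    for row in report:
        print(row)
```

Because workers exit once the queue is empty, the same pattern also shows where idle machines could be released instead of sitting unused.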
MissingLink supercharged our training, helping us accelerate from 5 to 100 experiments per project.
MissingLink also lets Ori’s team easily manage experiments. The team runs many different neural network architectures and hyperparameter variations, and in the past there was no central record of experiment results.
Using MissingLink Experiment Management, the team can see exactly which experiments ran, on which dataset, and which was the most successful. They can see experiments run by the entire team on one dashboard, and can also filter and search for specific experiments, and drill down to see detailed metrics.
MissingLink is trying to solve one of the biggest problems in the industry. Data scientists can focus on what they do best instead of doing DevOps.
Why MissingLink Puts City Governments at Ease
MissingLink was designed with data privacy and security in mind. Zencity deals with sensitive personal data taken from social networks. Although the team anonymizes the data to remove personally identifiable information, it must still ensure that the data remains private.
MissingLink ensures the data stays with Zencity and is never accessed or transferred by any third party, including the MissingLink platform. MissingLink doesn’t have direct access to the deep learning datasets—it only runs the experiments, with data always staying within Zencity’s cloud account.
Data privacy is where MissingLink shines, because they don’t touch our data. They only run the experiments.
Another reason Ido and Ori chose MissingLink is its integration with the Microsoft Azure cloud. City governments make heavy use of Azure cloud apps like Office, Outlook, PowerBI and Dynamics 365, and Zencity integrates with these tools.
MissingLink lets Zencity manage clusters of Azure machines, define jobs and automatically run them on the machines. Data scientists can deploy successful experiments with the click of a button.
MissingLink provides two other benefits on Azure:
- Automation—saving time by running multiple experiments in the cloud seamlessly, with no manual labor by data scientists.
- Resource optimization—experiments run on GPU machines which are expensive, and MissingLink utilizes all machines to the max, and shuts them down cleanly when experiments end.
Being able to automatically run more experiments or shut down the virtual machine saves us a lot of money.
Delivering on the Promise of Data-Driven Decisions for Cities
In only a few months, MissingLink helped Ido and Ori from Zencity achieve these results:
- Accelerating training from 5 to 100 experiments per project
- Freeing up the 30% of the data science team’s bandwidth previously spent on DevOps
- Saving 30% of costs when running experiments on Azure
Zencity is trying to make the world better, helping cities uncover insights that improve citizen life. MissingLink has helped Ido and Ori supercharge the deep learning process, dramatically improve time to market of new models and product features, and slash costs.
These results show that managing deep learning DevOps manually is simply not a viable option, and that automation and scalability are key capabilities that any AI company should adopt to meet customer expectations.
“MissingLink enables us to supercharge the learning process and deliver our promise of data-driven decision making to cities everywhere.”