I recently had the great fortune of presenting a lunch & learn session to the Capsule8 team. In this presentation I discussed how to effectively leverage machine learning to build intelligent products as efficiently as possible. Rather than focus on a single type of audience, I included information relevant to multiple levels including executive leadership, middle management, and individual contributors building machine learning solutions. The slides for the talk can be accessed here.
Below is a brief write up of the talk.
The Main Question
If you’re a leader at an organization seeking to leverage machine learning within your company, the main question to focus on is how to deliver value through machine learning as efficiently as possible.
It’s easy to get caught up in the hype surrounding AI, especially if you follow popular tech media outlets. Tech juggernauts like Google, Amazon, and Facebook have realigned themselves as AI-first companies. Startups promising machine learning enabled scale are raising enormous amounts of venture capital. But from my experience building ML powered products as an individual contributor, manager, and consultant, I can confidently say that much of what’s written online about AI is overly hyped.
I’d like to share a 5-step process for maximizing business results through the use of machine learning and AI. Before we dig into the individual steps, let’s examine a well-known product powered by ML.
Smart Compose is a feature in Gmail that uses machine learning to interactively offer sentence completion suggestions as you type. Although the feature doesn’t generate revenue for Google directly, it does provide a magical user experience that no other email client offers today. And it’s the Smart Compose experience, not the machine learning model behind Smart Compose, that generates value for Google.
When Smart Compose generates text to finish your sentence, the predicted words appear in light gray font to distinguish the predictions from what you’ve written. The predictions are suggested, but you’re not forced to use them. If you decide to accept the suggestions, you can integrate the words by swiping right on mobile or hitting Tab on your laptop. This experience is like magic! It feels totally natural. Imagine how frustrating it would be if instead of suggesting words, Smart Compose automatically filled in the text with its predictions and you had to manually delete all the incorrect ones. Or if designers chose to introduce an additional button to accept suggestions.
On the right-hand side of the slide we see the neural network architecture that powers Smart Compose by predicting the next few words given an initial phrase. This is where data scientists spend the bulk of their time. But it’s not where the value generation is realized. The cost and time of the data scientists’ effort is repaid many times over when you’re able to deliver an experience like Smart Compose.
Remember to keep this dichotomy in mind. An intelligent experience isn’t possible without intelligence, typically in the form of a trained machine learning model. But on its own that intelligence doesn’t generate user value until it’s delivered in the form of a product or feature through an experience. Always remember there’s a difference between machine learning powered products and machine learning models.
The 5 Step Process
1. Focus on formulating the business problem
The first step in delivering value is to focus on the business problem you wish to solve. This is more product management than it is machine learning. Can you clearly state the problem? Who are the users whose lives you want to improve? Do these users want whatever you’re planning to build? Objectively answering this last question requires conducting user research through interviews, surveys, and other means.
While this first step is part of any product ideation process, it’s important to keep in mind that machine learning enables entirely new kinds of experiences that traditional software development cannot. Hence a product manager for AI does everything a traditional PM does, and much more.
It’s important to ask yourself whether you can build the experience without machine learning before committing to building the intelligence. If you don’t need a complex model, you’ll avoid significant capital investment and deliver user value much faster. Even if you do need intelligence, starting with a simple solution lets you gather customer feedback incrementally and collect more data, which leads to better models. For example, Smart Compose built off of Smart Reply, a feature that suggested replies to emails.
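To make this concrete, here is a minimal sketch of what a "no machine learning yet" baseline for a suggestion-style feature might look like: a lookup table of common phrase completions mined from historical text. The phrases and function names are illustrative assumptions, not from any real product.

```python
from typing import Optional

# Hypothetical table of frequent phrase completions, e.g. mined from
# historical emails. A real baseline would be built from your own data.
COMMON_COMPLETIONS = {
    "thanks for your": "prompt reply",
    "please let me": "know if you have any questions",
    "i hope this": "email finds you well",
}

def suggest(text: str) -> Optional[str]:
    """Return a canned completion if the text ends with a known prefix."""
    lowered = text.lower().rstrip()
    for prefix, completion in COMMON_COMPLETIONS.items():
        if lowered.endswith(prefix):
            return completion
    return None

print(suggest("Thanks for your"))   # -> prompt reply
print(suggest("Random sentence"))   # -> None
```

A baseline like this can validate the experience (do users accept inline suggestions at all?) and start generating labeled interaction data before any model is trained.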
2. Assemble a team
Once you’ve clearly formulated the problem and decided that intelligence is needed, you need to assemble a team. Data scientists can’t build products on their own. There is a false belief floating around that if you hire a data scientist with a PhD, this person can do everything required to build an intelligent product. In reality, this PhD may only know how to write MATLAB code. I’m being facetious here, but my point is that you need to assemble a team with diverse skills in order to build a great product.
Deepak Agarwal, the VP of Artificial Intelligence at LinkedIn, made this same point in his keynote at TWIMLcon 2019:
At LinkedIn, we have a machine learning engineer sitting with product designers at the design stage when building new products. If you want to get the AI right, these different roles need to work together from the planning stage all the way through implementation. (Deepak Agarwal, VP of Artificial Intelligence at LinkedIn)
And it’s not enough just to include folks like designers. Hussein Mehanna, the Head of AI at self-driving car company Cruise, says that in order to attract and keep high quality UX designers on ML projects, you need to treat them as first class citizens along with your data scientists and machine learning engineers.
Deepak Agarwal summarized it perfectly: “It takes a village to get AI right.”
3. Quantify the cost of a model error
After defining the problem and assembling a team, it’s time to begin scoping the technical work. For a machine learning problem, the most important part of this preparation is planning for incorrect predictions. Machine learning is inherently non-deterministic. Data scientists work hard to estimate how models will perform on unseen data, but it’s impossible to plan for every combination of input data that a model might see. Instead, the best thing to do is quantify the cost of a model error.
For instance, a binary classifier can produce two types of errors: false positives and false negatives. The cost of each type of error is domain dependent. A song recommender that incorrectly guesses you like a particular song is a nuisance. A medical diagnostic that predicts a patient has cancer and requires chemotherapy is life altering.
When possible, teams should assign dollar values to each type of error and factor these into the model’s loss function. During the model building process, data scientists should move beyond aggregate metrics and perform error analysis to assess performance on critical subpopulations in the data.
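One lightweight way to act on these dollar values is to choose the classifier’s decision threshold by minimizing expected cost rather than maximizing accuracy. The sketch below assumes made-up costs (a $5 false positive, a $100 false negative) and toy scores purely for illustration.

```python
def expected_cost(y_true, scores, threshold, cost_fp, cost_fn):
    """Total dollar cost of operating a binary classifier at a threshold."""
    fp = sum(1 for y, s in zip(y_true, scores) if y == 0 and s >= threshold)
    fn = sum(1 for y, s in zip(y_true, scores) if y == 1 and s < threshold)
    return fp * cost_fp + fn * cost_fn

# Toy labeled examples with model scores (illustrative, not real data).
y_true = [0, 0, 1, 1, 0, 1]
scores = [0.1, 0.4, 0.35, 0.8, 0.7, 0.9]

# Hypothetical costs: a false positive is a $5 nuisance,
# a false negative a $100 loss.
best = min((t / 100 for t in range(1, 100)),
           key=lambda t: expected_cost(y_true, scores, t, 5, 100))
print(f"best threshold: {best:.2f}, "
      f"cost: ${expected_cost(y_true, scores, best, 5, 100)}")
```

With asymmetric costs like these, the chosen threshold shifts away from the default 0.5 to avoid the expensive error type, which is exactly the behavior the domain demands.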
4. Build out and automate the delivery pipeline
Data scientists are driven by their desire to build the most accurate model possible. This often translates into training increasingly complex models. Avoid this temptation and start with basic models.
In his popular Rules of Machine Learning: Best Practices for ML Engineering, Google engineer Martin Zinkevich advises to “Keep the first model simple and get the infrastructure right” (Rule #4). ML systems require pipelines for extracting, transforming, and loading data, oftentimes in real time with strict SLAs. It’s vital to ensure that these pipelines are robust before introducing other sources of uncertainty like model complexity. And like all software systems, these data pipelines need to be well tested. As Zinkevich states in his next rule, “Test the infrastructure independently from the machine learning.”
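One way to follow that rule is to exercise the pipeline end to end with a deterministic stub in place of the real model, so that pipeline bugs aren’t masked by model variance. The feature names and transform below are hypothetical, just to show the shape of such a test.

```python
def build_features(event: dict) -> list:
    """Transform a raw event into the model's feature vector
    (hypothetical features for illustration)."""
    return [
        len(event.get("subject", "")),          # subject length
        1.0 if event.get("is_reply") else 0.0,  # reply flag
    ]

class StubModel:
    """Deterministic stand-in for the real model."""
    def predict(self, features):
        return sum(features)  # any fixed, easily checked function

def test_pipeline_end_to_end():
    event = {"subject": "hello", "is_reply": True}
    features = build_features(event)
    assert features == [5, 1.0]                  # feature extraction is correct
    assert StubModel().predict(features) == 6.0  # data flows through the plumbing

test_pipeline_end_to_end()
print("pipeline test passed")
```

Because the stub’s output is a fixed function of its input, any test failure points at the infrastructure, not the model.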
Beyond moving data through pipelines, ML systems need to deploy models in a continuous manner. Deployment is a multi-step process with its own set of challenges, including the need to A/B Test models to ensure predictions are driving the right product metrics.
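For the A/B testing step, a common approach is a two-proportion z-test comparing a product metric (say, suggestion acceptance rate) between the champion and challenger models. The counts below are made up for illustration; real experiments also need pre-registered sample sizes and guardrail metrics.

```python
import math

def two_proportion_z(success_a, n_a, success_b, n_b):
    """Z statistic for H0: the two variants have equal conversion rates."""
    p_a, p_b = success_a / n_a, success_b / n_b
    p_pool = (success_a + success_b) / (n_a + n_b)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    return (p_b - p_a) / se

# Champion model: 1,200 acceptances out of 10,000 impressions.
# Challenger:     1,350 acceptances out of 10,000 impressions.
z = two_proportion_z(1200, 10_000, 1350, 10_000)
print(f"z = {z:.2f}")  # |z| > 1.96 indicates significance at the 5% level
```

A significant lift on the product metric, not just offline accuracy, is what justifies promoting the challenger.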
5. Monitor, monitor, monitor
Once you’ve deployed your models to production, the real work begins. Models must be continuously monitored to detect and combat deviations in model quality such as concept drift. Early, automated detection of these deviations lets you take corrective action, such as retraining models, auditing upstream systems, or fixing data quality issues, before your users feel the impact.
Standard tools for monitoring software systems are not sufficient for monitoring machine learning systems. Besides standard software metrics like uptime and latency, monitoring ML systems requires tracking input and output data distributions along with model accuracy metrics. What good is a model that’s operational 99.999% of the time but returns inaccurate predictions?
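As a sketch of what input-distribution monitoring can look like, here is the Population Stability Index (PSI) computed between a training-time feature histogram and a window of live traffic. The 0.2 alert threshold is a common rule of thumb, not a universal constant, and the histograms are illustrative.

```python
import math

def psi(expected_fracs, actual_fracs, eps=1e-6):
    """Population Stability Index over pre-binned distributions.
    Higher values indicate more drift from the expected distribution."""
    total = 0.0
    for e, a in zip(expected_fracs, actual_fracs):
        e, a = max(e, eps), max(a, eps)  # guard against empty bins
        total += (a - e) * math.log(a / e)
    return total

training_dist = [0.25, 0.25, 0.25, 0.25]   # feature histogram at training time
live_dist     = [0.10, 0.20, 0.30, 0.40]   # same histogram on live traffic

score = psi(training_dist, live_dist)
if score > 0.2:  # rule-of-thumb alert threshold
    print(f"ALERT: possible input drift, PSI = {score:.3f}")
```

A check like this runs on every feature on a schedule; an alert triggers the corrective actions above, such as auditing the upstream source or retraining the model.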
Building machine learning products is a relatively new endeavour for many companies. The tooling landscape is changing quickly and new best practices are emerging every day. If you’re planning to build intelligent products, one piece of advice is to start small. Avoid lofty goals and work on small projects that help create momentum within your organization and build confidence amongst your team. Major impact requires major investment (millions of dollars). Starting small will help you learn iteratively along the way to big impact.
Luigi Patruno is a Data Scientist and the Founder of MLinProduction.com. He’s currently the Director of Data Science at 2U, where he leads a team of data scientists and ML engineers in developing machine learning models and infrastructure to predict student success outcomes. Luigi founded MLinProduction.com to educate data scientists, ML engineers, and ML product managers about best practices for running machine learning systems in production. As a consultant for Fortune 500s and start-ups, Luigi helps companies utilize data science to create competitive advantages. He’s taught graduate level courses in Statistics and Big Data Engineering and holds a Masters in Computer Science and a BS in Mathematics.