
Managing Machine Learning Projects Wk 1 - Identifying Opportunities for Machine Learning

  • Writer: Muxin Li
  • Jun 17, 2024
  • 6 min read

Updated: Jun 25, 2024

Welcome to Managing Machine Learning Projects! This post begins the second of three courses in the AI Product Management Specialization series. Learn why most machine learning projects fail and how to de-risk them with small experiments - many of which follow the same process Product Managers use in discovery to validate product ideas.


What's Covered:

  • Finding good machine learning problems


Key Takeaways

  • Find real problems that have a large impact

  • Machine Learning does not solve all problems well, and it needs relevant quality data

  • Employ good old fashioned product management principles in problem definition and idea discovery

  • Start with a simpler set of rules (heuristics) before machine learning


Technical terms:

  • Heuristic

  • Augmenting



 

Managing Machine Learning Projects covers what potential pitfalls to avoid and how to set up an ML project for success.


At the conclusion of this course, you should be able to:

1) Identify opportunities to apply ML to solve problems for users

2) Apply the data science process to organize ML projects

3) Evaluate the key technology decisions to make in ML system design

4) Lead ML projects from ideation through production using best practices


Below is an overview of Week 1:


Identifying Opportunities

Most machine learning projects fail because they never targeted a problem that is both worth solving and something machine learning can improve in a cost-effective way.

  • 87% of machine learning projects fail

Sometimes it's due to technical challenges - but most of the time it's because of:

  • Pressure to use AI in any way possible

  • Jumping right into model building instead of problem definition and validation


Really, you can argue that many AI projects fail for the same reasons many products fail.


3 Questions to Ask Before Starting an AI Project

  1. Is there a real user problem? This is usually where things fail. Internal pressure pushes initiatives without first validating there's a market need or problem to solve.

  2. Is Machine Learning a good fit for solving the problem? Machine Learning models are good at certain things, but they're not capable of everything. There's also the question of whether you have enough reliable data to train the model on.

  3. Is there enough of an impact to pursue a solution? Like any good product, pursuing opportunities that can create a lot of impact gives a much better return on your time and energy.


Identifying Problems

Finding user pain points for AI projects is the same as finding pain points for any product - here's a good overview of different methods:

  • 1:1 interviews or focus groups with 6-8 users

  • Identify patterns, ask how they're solving their problem today

  • Find gaps in their current solution for opportunities to provide a better one


Observing problems in context

  • Field studies where you job shadow someone in their environment and find opportunities to improve

  • Users may not be aware of problems or of opportunities to improve

  • Can also be a fly on the wall to listen, observe, and find ways to make someone's job easier

  • Testing platforms like dscout diary studies also allow for remote field research, but being able to observe someone directly is likely to give the best results


Problems Good for Machine Learning (with Current Tech)

Not all problems can be solved with the current state of AI and machine learning. Examples of problems that are easy vs hard/nearly impossible to solve with ML:

Easy for ML:

  • Classifying objects e.g. image identification, spam detection, identifying diseases from medical scans

  • Recommendations and personalization

  • Predictions

Hard for ML:

  • Handling very long-term dependencies - where data from much earlier context is important e.g. writing the next best-selling novel with consistent characters and plotlines

  • Handling multimodal inputs (e.g. image, audio and text) at the same time

  • Ethical concerns

It also depends on whether we have the data needed to solve the problem.


Impact vs Costs of ML Project

How much return or business impact can this ML solution deliver? Would it be worth the computational and maintenance costs to do it, or can a simpler solution be good enough?
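To make the impact vs cost question concrete, here is a minimal Python sketch of the arithmetic. The function name `net_value` and every figure in it are hypothetical, made up purely for illustration - plug in your own estimates.

```python
def net_value(annual_benefit, build_cost, annual_run_cost, years):
    """Benefit over the time horizon minus one-off build cost and recurring run costs."""
    return annual_benefit * years - build_cost - annual_run_cost * years


# Hypothetical estimates over a 2-year horizon (not real data).
heuristic_value = net_value(
    annual_benefit=100_000, build_cost=5_000, annual_run_cost=2_000, years=2
)
ml_value = net_value(
    annual_benefit=150_000, build_cost=120_000, annual_run_cost=40_000, years=2
)

print(f"Simple solution: {heuristic_value:,}")
print(f"ML solution:     {ml_value:,}")
print("ML worth it?", ml_value > heuristic_value)
```

With these made-up numbers the ML solution delivers more benefit per year, yet the simpler solution still comes out ahead once build and maintenance costs are counted.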


Takeaway: Find real problems with high impact, that are a good fit for machine learning models, for which you can access the data needed.



Understand the Problem from Google Machine Learning Foundations

The post is worth reading on its own, but here are key takeaways.


Machine Learning can be broadly grouped into Predictive AI vs Generative AI.

  • Predictions should drive action - there's little value in a model that does not drive action


Always start with a heuristic (a simple, quick solution) before attempting to use Machine Learning. For example, a product chart filter or a list of top selling products can stand in for a recommendation engine. The heuristic should be your baseline benchmark - the performance an ML model has to beat.

  • An ML model will usually perform better than a heuristic - but at what cost, and by how much?
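As a rough sketch of what "heuristic as a baseline" can look like, the Python below recommends the top selling products to everyone and scores that against held-out purchases. The function names, the toy purchase data and the hit-rate metric are all assumptions for illustration, not something prescribed by the Google post.

```python
from collections import Counter


def top_sellers(purchase_history, k):
    """Heuristic: recommend the k most purchased products to every user."""
    counts = Counter(item for basket in purchase_history for item in basket)
    return [item for item, _ in counts.most_common(k)]


def hit_rate(recommended, held_out_purchases):
    """Share of held-out purchases that appear in the recommendations."""
    hits = sum(1 for item in held_out_purchases if item in recommended)
    return hits / len(held_out_purchases)


# Toy data (hypothetical): past baskets and later purchases to evaluate on.
history = [["mug", "tea"], ["tea", "kettle"], ["tea", "mug"], ["lamp"]]
held_out = ["tea", "mug", "lamp", "tea"]

baseline_recs = top_sellers(history, k=2)
baseline_score = hit_rate(baseline_recs, held_out)
print(baseline_recs, baseline_score)
```

Any ML recommender would then be scored on the same held-out data, and the gap between its score and `baseline_score` is the improvement you weigh against the extra cost.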


Data considerations for your ML model:

  • Abundance of relevant and useful examples

  • Consistent and reliable

  • Trustworthy from a credible source

  • Available as inputs for your ML model when needed at prediction time - if not, you're better off without that feature

  • Correctly labeled (no more than a few percent of examples mislabeled)

  • Representative of the real world - real user behaviors or real-world phenomena, as much as possible, for the model to train on

  • Has predictive power (higher correlation means higher predictive power). Test predictive power by removing and adding back in a feature to see how much it changes your model performance.
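The remove-and-add-back test in the last bullet can be sketched in a few lines of Python. This is a minimal illustration under assumptions: the tiny dataset, the feature names `signal` and `noise`, and the nearest-centroid model are all made up here, and a real project would use its own model and a proper validation set.

```python
def fit_centroids(rows, labels, keep):
    """Average the kept feature columns per class (a nearest-centroid model)."""
    grouped = {}
    for row, label in zip(rows, labels):
        grouped.setdefault(label, []).append([row[i] for i in keep])
    return {
        label: [sum(col) / len(col) for col in zip(*members)]
        for label, members in grouped.items()
    }


def predict(centroids, row, keep):
    """Pick the class whose centroid is closest on the kept features."""
    point = [row[i] for i in keep]

    def sq_dist(label):
        return sum((p - c) ** 2 for p, c in zip(point, centroids[label]))

    return min(centroids, key=sq_dist)


def accuracy(train, train_y, test, test_y, keep):
    centroids = fit_centroids(train, train_y, keep)
    correct = sum(predict(centroids, row, keep) == y for row, y in zip(test, test_y))
    return correct / len(test)


# Hypothetical data: column 0 ("signal") separates the classes, column 1 ("noise") does not.
FEATURES = ["signal", "noise"]
train = [(1.0, 5.0), (1.2, 1.0), (0.8, 3.0), (3.0, 1.5), (3.2, 4.5), (2.8, 3.3)]
train_y = [0, 0, 0, 1, 1, 1]
test = [(1.1, 4.0), (0.9, 2.0), (3.1, 2.0), (2.9, 4.0)]
test_y = [0, 0, 1, 1]

full = accuracy(train, train_y, test, test_y, keep=[0, 1])
drop = {}
for i, name in enumerate(FEATURES):
    kept = [j for j in range(len(FEATURES)) if j != i]
    drop[name] = full - accuracy(train, train_y, test, test_y, keep=kept)

print("accuracy with all features:", full)
print("accuracy lost when removed:", drop)
```

A feature whose removal barely moves the score (here `noise`) is contributing little predictive power, while a large drop (here `signal`) suggests the feature is worth the effort of collecting it.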


Takeaway: ML models are not easy to set up and run - start with a heuristic, then evaluate whether you have the right kind of data for an ML solution to work and whether the ROI is there.



Validating Product Ideas

Converge on a solution through many small experiments, validating each idea with the Scientific Method:

  • Start with a hypothesis

  • Test it with users

  • Analyze findings

  • Make a decision - continue or pivot

  • Refine hypothesis and repeat


Testing with mockups

Visualize the solution ASAP - you learn the most when testing low-fidelity mockups with real users. Assume you have technical feasibility for now; at this stage you're testing for viability.


Moving Forward to Product Development

Only after doing all these things should you consider moving into actual development:

  • Identify a real problem with real business impact

  • Understand how it's being solved today, and the gaps/opportunities to improve

  • Confirm that ML is a suitable approach, but find a heuristic first if possible

  • Converge on a potential solution via experiments with low-fidelity prototypes

  • Establish initial technical feasibility - even if it's hard, we at least know it's possible


Again, a lot of product management concepts are covered here. Yet 87% of ML projects fail because we tend to skip these key steps when we're pushed to deliver - avoid the temptation. You'll save far more time and resources when you do the initial legwork of correctly identifying a real problem that's a good fit for an ML solution and validating your idea with small experiments.


Takeaway: Don't skip the product validation process.



Benefits of ML in Products

What Machine Learning is good at:

  • Automation

  • Prediction

  • Personalization


Automation via Machine Learning is great at lowering cost and increasing quality in repetitive tasks, but it comes with risks:

  • Cannot adapt to major changes in its environment

  • Has no sense of ethics

  • Leaves it unclear who has accountability if things go wrong


ML is great at ingesting lots of data and finding patterns, allowing it to make personalized recommendations or predictions to drive decisions that would be hard for humans to do on their own. However, automating predictions is not recommended if there is a high cost to being wrong - e.g. medical diagnosis, judging court cases, or job hiring. Keeping a human in the loop, or augmenting human judgement and work quality by pairing them with an AI, can be wiser.
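One common way to keep a human in the loop is to only auto-apply predictions the model is confident about and send everything else to a person. The Python below is a minimal sketch of that idea; the function name `route_prediction`, the 0.9 threshold and the sample predictions are assumptions for illustration, and where the cost of being wrong is very high you might route every case to a human and treat the model's output as a suggestion only.

```python
def route_prediction(label, confidence, threshold=0.9):
    """Auto-apply confident predictions; send the rest to a human reviewer."""
    if confidence >= threshold:
        return {"decision": label, "decided_by": "model"}
    # Below the threshold the model only suggests; a person makes the call.
    return {"decision": None, "decided_by": "human_review", "suggested": label}


# Hypothetical model outputs: (predicted label, confidence score).
predictions = [("approve", 0.97), ("reject", 0.62), ("approve", 0.90)]
routed = [route_prediction(label, conf) for label, conf in predictions]

for item in routed:
    print(item)
```

Raising the threshold sends more cases to people, which trades automation savings for a lower risk of costly mistakes.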


Heuristics vs Machine Learning

Before starting ML, a heuristic (a simpler solution) should be explored. These could be:

  • Business rules that are hard coded

  • Rules of thumb

  • E.g. using averages, recommending the highest rated product, or always predicting the single most common object seen in image classification training
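The heuristics listed above are only a few lines of code each, which is a big part of their appeal. Here is a rough Python sketch of all three; the function names and the toy data are hypothetical and exist only to show the shape of each rule.

```python
from collections import Counter
from statistics import mean


def majority_class(train_labels):
    """Always predict the most common label seen in training."""
    return Counter(train_labels).most_common(1)[0][0]


def highest_rated(ratings):
    """Recommend the product with the best average rating."""
    return max(ratings, key=lambda product: mean(ratings[product]))


# Image classification baseline: guess the most common training label every time.
train_labels = ["cat", "dog", "cat", "cat", "bird"]
test_labels = ["cat", "dog", "cat", "bird"]
guess = majority_class(train_labels)
baseline_accuracy = sum(label == guess for label in test_labels) / len(test_labels)

# Recommendation baseline: highest average rating wins.
ratings = {"kettle": [4, 5, 5], "mug": [3, 4], "lamp": [5, 2]}
top_pick = highest_rated(ratings)

# Forecasting baseline: predict the next period as the historical average.
average_forecast = mean([120, 100, 140])

print(guess, baseline_accuracy, top_pick, average_forecast)
```

Each of these gives you a number to beat before any ML model is built.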


Heuristics have pros and cons. The pros: they're easy to set up, computationally cheap to run and maintain, and easy to understand. The cons: they have to be updated manually if business rules change, they usually don't perform as well as ML models, and they aren't suitable for handling lots of data.


Machine Learning models can be retrained on new data, handle a wider variety of problems, can deal with large amounts of data and often perform better - but they take more effort and resources to get up and running.


Heuristics should always be the first solution you attempt before going after an ML solution, at the very least to establish a benchmark to compare ML against. If there is a valid problem that ML is a good fit for, and there's a huge upside (to offset the higher costs of ML), then ML can be considered.

Takeaway: ML has pros and cons - it's not advisable to automate if the cost of being wrong is very high. For cost effectiveness, start with a heuristic before the ML solution.


 






© 2024 by Muxin Li
