Project: Evaluate Machine Learning Models with Yellowbrick

We will use visualizations to steer our machine learning workflow. The task is to predict whether rooms in apartments are occupied or unoccupied from passive sensor data such as temperature, humidity, light, and CO2 levels. This project continues the previous project on Room Occupancy Detection.




Task List

We will cover the following tasks in 47 minutes:


ROCAUC curves are a good way to see, overall, how well your classifier is performing and generalizing. The Receiver Operating Characteristic (ROC) curve measures a classifier’s predictive quality by visualizing the tradeoff between the model’s sensitivity (true positive rate) and specificity (true negative rate). The area under the curve (AUC) summarizes that tradeoff in a single number.
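To make the curve concrete, here is a minimal, dependency-free sketch of how ROC points and the AUC can be computed from predicted scores. It is not Yellowbrick's implementation; the function names `roc_points` and `auc` and the toy labels and scores are assumptions made for illustration, and the sketch assumes both classes are present.

```python
def roc_points(y_true, scores):
    """Sweep the decision threshold from high to low and collect
    (false positive rate, true positive rate) pairs."""
    pos = sum(y_true)               # number of positive (occupied) samples
    neg = len(y_true) - pos         # number of negative samples
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    points = [(0.0, 0.0)]
    tp = fp = 0
    prev = None
    for i in order:
        # Emit a point each time the score changes, so ties share one point.
        if prev is not None and scores[i] != prev:
            points.append((fp / neg, tp / pos))
        if y_true[i] == 1:
            tp += 1
        else:
            fp += 1
        prev = scores[i]
    points.append((fp / neg, tp / pos))
    return points


def auc(points):
    """Area under the curve using the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Hypothetical labels (1 = occupied) and classifier scores.
y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
curve = roc_points(y_true, scores)
print(curve)
print(auc(curve))
```

An AUC of 1.0 means the scores separate the classes perfectly, while 0.5 is what random scoring would give on average.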

We will also touch on another important topic that is often glossed over: data snooping. Data snooping can significantly harm out-of-sample performance by contaminating our analysis. We will learn to recognize, account for, and avoid it.
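One common form of data snooping is computing preprocessing statistics on the full dataset before splitting it. The sketch below contrasts the two approaches with plain Python; the temperature readings and the helper names `fit_scaler` and `transform` are hypothetical and chosen only for illustration.

```python
from statistics import mean, pstdev


def fit_scaler(values):
    """Learn the mean and standard deviation from the given values only."""
    return mean(values), pstdev(values)


def transform(values, mu, sigma):
    """Standardize values with previously learned statistics."""
    return [(v - mu) / sigma for v in values]


# Hypothetical temperature readings, already split into train and test.
train = [20.0, 21.0, 22.0, 23.0]
test = [30.0, 31.0]

# Correct: statistics come from the training split only.
mu, sigma = fit_scaler(train)
train_scaled = transform(train, mu, sigma)
test_scaled = transform(test, mu, sigma)

# Snooping: the test readings leak into the statistics.
leaky_mu, leaky_sigma = fit_scaler(train + test)

print(mu, leaky_mu)
```

The leaky mean is pulled toward the test readings, so any model trained on data scaled with it has indirectly seen the test set.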

Classification Report and Confusion Matrix

We have a few good options for visual classification scoring. In the last task, we explored the classic ROCAUC, which is a good way to see, overall, how well our classifier is performing. But we can also use Yellowbrick to start diagnosing some of the problems. In this task, we will create a visual classification report heatmap and a visual confusion matrix. They help not only with overall measures such as accuracy and the F1 score, but also with getting a sense of where our model performs better or worse.
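The numbers behind both plots can be sketched in a few lines of plain Python. This is only an illustration of the underlying counts and scores, not Yellowbrick's code; the helper names and the toy predictions are assumptions.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, fn, tn) for a binary problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    return tp, fp, fn, tn


def precision_recall_f1(tp, fp, fn):
    """Per-class scores of the kind shown in a classification report."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical ground truth and predictions (1 = occupied).
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0]

tp, fp, fn, tn = confusion_counts(y_true, y_pred)
precision, recall, f1 = precision_recall_f1(tp, fp, fn)
print(tp, fp, fn, tn)
print(precision, recall, f1)
```

The confusion matrix shows which kind of mistake dominates (missed occupancy versus false alarms), which a single accuracy number hides.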

Cross-validation Scores

Real-world data is often distributed somewhat unevenly, meaning that the fitted model is likely to perform better on some sections of the data than on others.

In this task, we will first visualize our model’s performance on different train/test splits. We will use Yellowbrick’s CVScores visualizer to explore how performance varies under different cross-validation strategies.

Next, we will use a StratifiedKFold cross-validation strategy so that each class is represented in every split in the same proportion as in the full dataset.
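The idea behind stratified splitting can be sketched without any libraries: group the sample indices by class, then deal each class out across the folds in turn. This is a simplified illustration (no shuffling) and not scikit-learn's StratifiedKFold implementation; the function name `stratified_folds` and the toy labels are assumptions.

```python
from collections import defaultdict


def stratified_folds(labels, n_splits):
    """Assign sample indices to folds so that every fold keeps
    roughly the same class proportions as the full dataset."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    folds = [[] for _ in range(n_splits)]
    for indices in by_class.values():
        # Deal the indices of this class round-robin across the folds.
        for position, idx in enumerate(indices):
            folds[position % n_splits].append(idx)
    return folds


# Hypothetical imbalanced labels: 8 unoccupied (0), 4 occupied (1).
labels = [0] * 8 + [1] * 4
folds = stratified_folds(labels, n_splits=4)

for fold in folds:
    occupied = sum(labels[i] for i in fold)
    print(len(fold), occupied)
```

Each fold serves once as the test split while the remaining folds form the training split, so every score is computed on data with the same class mix.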

Evaluating Class Balance

A very common question during model evaluation is, “Why isn’t the model I’ve picked predictive?” After completing this task, you will have a good answer to this question. The idea centers on the imbalance between classes within your data. We will also learn best practices to accommodate such imbalances so that they do not adversely affect model performance.
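Checking class balance amounts to counting the labels. The sketch below also derives inverse-frequency class weights, which is one common way to compensate for imbalance; the task description does not say which remedy is used, so treat the weighting step, the function name `balanced_weights`, and the toy labels as assumptions.

```python
from collections import Counter


def balanced_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count),
    so that rarer classes receive larger weights."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


# Hypothetical labels: 9 unoccupied (0), 3 occupied (1).
labels = [0] * 9 + [1] * 3

counts = Counter(labels)
weights = balanced_weights(labels)
print(counts)
print(weights)
```

With labels this skewed, a model that always predicts "unoccupied" already reaches 75% accuracy, which is why accuracy alone can make an unhelpful model look predictive.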

Discrimination Threshold for Logistic Regression

Our Logistic Regression model is a binary classifier. We can use a discrimination threshold plot to evaluate how our classifier performs on metrics such as precision, recall, F1 score, and queue rate as the decision threshold changes.

More generally, based on your application or business needs, you can use this visualization to quickly home in on where you want to set the threshold.
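A threshold sweep can be sketched in plain Python: for each candidate threshold, turn the predicted probabilities into labels and recompute the metrics. Queue rate is taken here to mean the fraction of instances flagged as positive. The function name `metrics_at` and the toy probabilities are assumptions, and this is not Yellowbrick's implementation.

```python
def metrics_at(y_true, probs, threshold):
    """Precision, recall, F1, and queue rate at one decision threshold."""
    preds = [1 if p >= threshold else 0 for p in probs]
    tp = sum(1 for t, p in zip(y_true, preds) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, preds) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, preds) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    queue_rate = sum(preds) / len(preds)   # share of samples flagged positive
    return {"precision": precision, "recall": recall,
            "f1": f1, "queue_rate": queue_rate}


# Hypothetical labels and predicted probabilities of "occupied".
y_true = [0, 0, 1, 0, 1, 1]
probs = [0.1, 0.3, 0.4, 0.6, 0.7, 0.9]

for threshold in (0.2, 0.5, 0.8):
    print(threshold, metrics_at(y_true, probs, threshold))

m = metrics_at(y_true, probs, 0.5)
```

Raising the threshold lowers the queue rate and typically trades recall for precision, which is the tradeoff the plot lets you read off at a glance.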


Snehan Kekre

About the Host (Snehan Kekre)

Snehan Kekre is a Machine Learning and Data Science Instructor at Coursera. He studied Computer Science and Artificial Intelligence at Minerva Schools at KGI, based in San Francisco. His interests include AI safety, EdTech, and instructional design. He recognizes that building a deep, technical understanding of machine learning and AI among students and engineers is necessary in order to grow the AI safety community. This passion drives him to design hands-on, project-based machine learning courses on Rhyme.

