Scraping Data with Rvest: Webscraping, Cleaning, and Sentiment Analysis

We will cover a lot in this course, and by the end of it you will have a basic framework for scraping data, cleaning it, and performing simple sentiment analysis. We will have fun along the way.
Join me as we find out whether I (or you) should try red light therapy.

Available On Coursera

Task List

We will cover the following tasks in 36 minutes:


In this section we will go over the course objectives and the Rhyme interface, and I will share a short bio. Come in and see what this one is all about.

Story and Scraping Time

Really? Red light therapy? OK. In this section we will discuss how this analysis came about, and then we will scrape the data (326 reviews) using rvest and the tools from the previous course, Data Harvesting using Rvest.
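As a rough illustration of the scraping step, the sketch below pulls review text out of a page with rvest. The HTML snippet and the `.review-text` selector are made up for this example; the site and selectors used in the course are not given here, so treat both as placeholders.

```r
library(rvest)

# Hypothetical stand-in for a review page. A real run would pass a URL
# to read_html() instead of a literal HTML string.
page <- read_html(paste0(
  "<html><body>",
  "<div class='review'><span class='review-text'>Helped my sore knee.</span></div>",
  "<div class='review'><span class='review-text'>Did nothing for me.</span></div>",
  "</body></html>"
))

# ".review-text" is an assumed CSS selector, not the one from the course.
reviews <- page %>%
  html_elements(".review-text") %>%
  html_text2()

print(reviews)
```

On a real site, the selector would come from inspecting the page, and multi-page review listings would need one request per page.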

Cleaning the Data

Smoke is to fire what dirty is to data. Clean data is a dream; more often than not, we have to clean it ourselves. That skill is what makes us invaluable to our teams and our analyses. In this lesson I will show you some down-and-dirty base R ways of cleaning data up.
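To give a flavor of base R cleaning, here is a minimal sketch that tidies a few scraped strings with `gsub()` and `trimws()`. The sample strings, including the leftover "Read more" link text, are invented for illustration and are not taken from the course data.

```r
# Hypothetical raw strings as they might come back from a scrape
raw <- c("  Works great!\n\n", "\tRead more  ", "Helped my   knee pain ")

clean <- gsub("[\r\n\t]+", " ", raw)   # turn line breaks and tabs into spaces
clean <- gsub(" {2,}", " ", clean)     # collapse runs of spaces
clean <- trimws(clean)                 # drop leading and trailing whitespace

# Drop entries that are page furniture rather than review text (assumed label)
clean <- clean[clean != "Read more"]

print(clean)
```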

More Cleaning with a For Loop

When you can’t think of a function offhand, or don’t know of one, use a for loop. We will use a for loop in this section to strip out empty columns. Sure, there are other ways, but this is good practice. If you’re not familiar with for loops, that’s OK. See what you can learn, and google what you can’t.
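One way such a loop could look is sketched below. The data frame and its column names are hypothetical, and "empty" is assumed here to mean that every value is `NA` or a blank string; the course may define it differently.

```r
# Hypothetical data frame standing in for the scraped reviews
reviews <- data.frame(
  title  = c("Great", "Meh"),
  blank1 = c(NA, NA),
  text   = c("Loved it", "Did nothing"),
  blank2 = c("", ""),
  stringsAsFactors = FALSE
)

keep <- character(0)
for (col in names(reviews)) {
  values <- reviews[[col]]
  # A column counts as empty when every value is NA or only whitespace
  is_empty <- all(is.na(values) | trimws(values) == "")
  if (!is_empty) {
    keep <- c(keep, col)
  }
}

cleaned <- reviews[, keep, drop = FALSE]
print(names(cleaned))
```

The loop collects the names of the columns worth keeping and subsets once at the end, which avoids modifying the data frame while iterating over it.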


It’s time to take the reviews and break them into single words. We do this so we can strip out extraneous words and those that don’t add much meaning. We will use the reshape2 package to get this done. Come in and join me on this one.
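The course's own code is not shown here, so the following is only a sketch of one way to do this: split each review with base R's `strsplit()`, then use `reshape2::melt()` to stack the resulting list into one word per row. The sample reviews and the short stop-word list are assumptions made for the example.

```r
library(reshape2)

# Hypothetical sample reviews
reviews <- c("The light really helped my knee!", "It did nothing for me")

# Lowercase, replace anything that is not a letter, apostrophe, or space,
# then split on spaces
tokens <- strsplit(tolower(gsub("[^A-Za-z' ]", " ", reviews)), " +")

# melt() on a list yields a data frame with columns `value` and `L1`,
# where L1 is the index of the list element (here, the review number)
melted <- melt(tokens)
words <- data.frame(
  review = melted$L1,
  word   = as.character(melted$value),
  stringsAsFactors = FALSE
)

# Assumed, deliberately tiny stop-word list for the example
stop_words <- c("the", "my", "it", "did", "for", "me")
words <- words[!(words$word %in% stop_words) & words$word != "", ]

print(words)
```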

Joining Sentiments

In this lesson we will go over the sentiment lexicons. Cool stuff. We will then join our words with the bing sentiment lexicon using inner_join from the dplyr package. Once you see how this is done, it will open all sorts of doors for your text analysis.
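The join itself can be sketched as follows. To keep the example self-contained, it uses a tiny made-up lexicon with the same two columns as the bing lexicon (`word`, `sentiment`); in practice the full lexicon is available through `tidytext::get_sentiments("bing")`. The sample words are also invented.

```r
library(dplyr)

# Hypothetical tokenized reviews, one word per row
words <- data.frame(
  review = c(1, 1, 2, 2),
  word   = c("helped", "knee", "useless", "pain"),
  stringsAsFactors = FALSE
)

# Stand-in lexicon (assumption): same shape as bing, far fewer rows
lexicon <- data.frame(
  word      = c("helped", "useless", "pain"),
  sentiment = c("positive", "negative", "negative"),
  stringsAsFactors = FALSE
)

# inner_join keeps only words that appear in both tables
scored <- inner_join(words, lexicon, by = "word")

print(scored)
```

Because it is an inner join, words with no entry in the lexicon (here "knee") drop out, leaving only words that carry a sentiment label.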

Comparison Cloud

In this lesson we will build our comparison cloud using the wordcloud package. But first we need to separate the negative words from the positive ones so we can build the comparison part of the cloud.
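A minimal sketch of that separation step is shown below, assuming the scored words are in a data frame with `word` and `sentiment` columns (hypothetical sample data). `comparison.cloud()` expects a matrix with one row per word and one column per group, so the example builds that with base R's `table()` rather than whatever method the course uses.

```r
# Hypothetical scored words, one row per occurrence
scored <- data.frame(
  word      = c("helped", "helped", "relief", "pain", "useless", "pain", "pain"),
  sentiment = c("positive", "positive", "positive",
                "negative", "negative", "negative", "negative"),
  stringsAsFactors = FALSE
)

# Rows are words, columns are sentiments, cells are counts
term_matrix <- unclass(table(scored$word, scored$sentiment))
print(term_matrix)

# Draw the cloud only if the wordcloud package is installed.
# Colors follow column order: negative first, then positive.
if (requireNamespace("wordcloud", quietly = TRUE)) {
  wordcloud::comparison.cloud(
    term_matrix,
    colors = c("firebrick", "darkgreen"),
    max.words = 100
  )
}
```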

Final Thoughts

Our analysis was simple, but it provides a framework for future tutorials and your future projects. Bring a pen and paper to write down the title of a very important book by Julia Silge, which is free to read online. And of course you are learning a ton, and for that I want to congratulate you!

About the Host

Rhyme Authors

