We will cover the following tasks in 26 minutes:
In this task, I will discuss the goal of the course: using the k-means clustering algorithm to group three species of flowers given only their Petal Length and Petal Width. I will also describe the dataset and walk through the Rhyme interface, then turn on my webcam and give you a short bio.
I suggest always doing some exploratory analysis on a dataset before embarking on any project. I will show you a few ways to go about it in this task. We will use the summary() call to get per-column statistics, and then use ggplot to look at the data graphically. There are many more techniques, and which ones you use will depend on the project.
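As a rough sketch of this exploratory step, the snippet below assumes the course data matches R's built-in iris dataset (with columns Petal.Length, Petal.Width and Species) and that the ggplot2 package is installed. The exact calls used in the course may differ.

```r
library(ggplot2)  # assumption: ggplot2 is installed

# Assumption: the course data matches R's built-in iris dataset
data(iris)

# Numeric summaries for every measurement, plus counts per species
print(summary(iris))

# Scatter plot of the two petal measurements, colored by the true species
p <- ggplot(iris, aes(x = Petal.Length, y = Petal.Width, color = Species)) +
  geom_point()
print(p)
```

Coloring by species is only possible here because the labels are known; it gives a reference picture to compare the clusters against later.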
You can always look up a function's documentation in the Help section by putting a question mark before its name. We will do that in this section with the kmeans function and explore the meaning of the arguments we provide to the algorithm. We will pass it the data frame, three centers, and an nstart of 20, which tells it to try 20 random sets of starting centers and keep the best result. We will then run the algorithm and store the fitted k-means model in an object.
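The call described above might look like the following minimal sketch. The object names petals and iris_km and the seed value are placeholders chosen for illustration, and the data is again assumed to be the built-in iris dataset.

```r
# Assumption: the course data matches R's built-in iris dataset
petals <- iris[, c("Petal.Length", "Petal.Width")]

# In an interactive session, this opens the help page for kmeans:
# ?kmeans

set.seed(20)  # arbitrary seed so the random starts are repeatable

# Three clusters, 20 random sets of starting centers
iris_km <- kmeans(petals, centers = 3, nstart = 20)

print(iris_km$centers)  # coordinates of the three cluster centers
print(iris_km$size)     # number of flowers assigned to each cluster
```

Only the two petal columns are passed in, so the species labels play no part in the clustering.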
k-means Grouping (Graphically)
To see graphically how the algorithm grouped the data, we first need to convert the cluster assignments stored in the model object into a factor. Then it is as easy as using ggplot on the original data and coloring the points by that cluster factor. I will show you how this is done in this task.
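One way this plotting step could look is sketched below, under the same assumptions as before (built-in iris data, ggplot2 installed). The names iris_clustered and iris_km are hypothetical.

```r
library(ggplot2)  # assumption: ggplot2 is installed

# Refit the model so this snippet runs on its own
petals <- iris[, c("Petal.Length", "Petal.Width")]
set.seed(20)  # arbitrary seed for repeatability
iris_km <- kmeans(petals, centers = 3, nstart = 20)

# The cluster component is an integer vector; converting it to a factor
# makes ggplot use discrete colors instead of a continuous gradient
iris_clustered <- iris
iris_clustered$cluster <- as.factor(iris_km$cluster)

p <- ggplot(iris_clustered,
            aes(x = Petal.Length, y = Petal.Width, color = cluster)) +
  geom_point()
print(p)
```

The cluster numbers are arbitrary labels, so the colors will not necessarily line up with those in a plot colored by species.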
Judging accuracy by eye from a plot usually gives a decent impression. If we need hard numbers, however, we have to go a little further. This can be done with the table function, which cross-tabulates the cluster assignments against the actual species. I will show you how to find out exactly how the algorithm performed using this function.
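A minimal sketch of that comparison, assuming the built-in iris dataset and a model fitted as in the earlier tasks (the name confusion is a placeholder):

```r
# Refit the model so this snippet runs on its own
petals <- iris[, c("Petal.Length", "Petal.Width")]
set.seed(20)  # arbitrary seed for repeatability
iris_km <- kmeans(petals, centers = 3, nstart = 20)

# Rows are cluster numbers, columns are the true species
confusion <- table(cluster = iris_km$cluster, species = iris$Species)
print(confusion)
```

Because the cluster numbers are arbitrary, read the table column by column: a species that was grouped well has almost all of its 50 flowers in a single row.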