We will cover the following tasks in 42 minutes:
Loading the Data
In this task, we will first demo Hans Rosling’s visualization of the Gapminder data set. The interactive, animated visualization was shown to the audience of Rosling’s TED talk in 2007. It has gone on to become one of the most watched TED talks of all time, and is a testament to the power of beautiful and informative data visualizations.
We will then be introduced to the project goals and learning outcomes. Once we are familiar with the Rhyme interface, we will begin working in Jupyter Notebook, a web-based interactive computational environment for creating notebook documents.
Quick Visualizations with Custom Bar Charts
Now that we have imported the data, we are free to use Plotly Express to explore various facets of this rich data set.
Plotly Express functions take as a first argument a tidy pandas.DataFrame. In this task, we will graph the population of Canada by year using a bar plot. In a bar plot, each row of the DataFrame is represented as a rectangular mark. We will also customize the bar plot using keyword arguments to color the bars according to the average life expectancy.
Plot Life Expectancy vs GDP per Capita
We will create a basic scatter plot showing life expectancy vs GDP per capita by country for 2007.
Next, we will break that down by continent, by coloring the points using the color argument, while px takes care of assigning the default colors, setting up the legend, etc.
Customize Interactive Bubble Charts
Each point in our last plot is a country. So to scale the points by the country population we simply pass in the
If we’re curious about the identity of a particular point, we can add a hover_name to display the country name. We will never again have to worry about what a particular outlier represents. We can simply mouse over the point we’re interested in and the hover_name will identify it for us!
Create Interactive Animations and Facet Plots
We can easily interact with the plots we have created so far. Try mousing over points, clicking or double-clicking on legend items, or using the “modebar” that appears when you move your mouse into the frame to control the behaviour of click-drag interactions (zoom, pan, select).
We can also facet our plots to pick apart the continents, just as easily as coloring our points, with facet_col="continent". While we’re at it, let’s make the x-axis logarithmic to see things more clearly.
Maybe we’re interested in more than just 2007 and want to see how this chart evolved over time. We can animate it across the years, setting animation_group="country" to identify which circles match which ones across frames. We can provide prettier labels that get applied throughout the figure, in legends, axis titles and hovers. We can also provide some manual bounds so the animation looks nice throughout.
Represent Geographic Data as Animated Maps
As this is geographic data, we can also represent it as an animated map, which makes it clear that px can make way more than just scatter plots, and that this dataset is missing data for the former Soviet Union.
About the Host (Snehan Kekre)
Snehan hosts Machine Learning and Data Sciences projects at Rhyme. He is in his senior year of university at the Minerva Schools at KGI, studying Computer Science and Artificial Intelligence. When not applying computational and quantitative methods to identify the structures shaping the world around him, he can sometimes be seen trekking in the mountains of Nepal.