Background

For this project, I decided to recreate Hans Rosling’s population and GDP graph. The graph can be found here.

Since the video did not share the dataset, I downloaded another dataset from Our World in Data.

Data Overview and Wrangling

Although the data start in 1870, I decided to use only the data after 1950 because most countries are missing population and GDP values for the years before 1950.
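A minimal sketch of that cutoff, assuming a long-format table with columns named entity, year, population, and gdp. Those column names, the toy rows, and the choice to keep 1950 itself are all assumptions for illustration, not the actual Our World in Data file.

```r
library(dplyr)

# Toy stand-in for the downloaded dataset; values are made up for illustration.
gdp_pop <- tibble(
  entity     = c("Japan", "Japan", "Japan", "Kenya", "Kenya"),
  year       = c(1870, 1950, 2000, 1949, 1975),
  population = c(NA, 100, 150, NA, 20),
  gdp        = c(NA, 10, 400, NA, 5)
)

# Compare how much is missing before the cutoff and from the cutoff onward.
missing_by_period <- gdp_pop %>%
  mutate(period = if_else(year < 1950, "before 1950", "1950 onward")) %>%
  group_by(period) %>%
  summarise(share_missing_gdp = mean(is.na(gdp)), .groups = "drop")

# Keep only the rows at or after the cutoff year (assumed to include 1950).
recent <- gdp_pop %>%
  filter(year >= 1950)
```

Checking the share of missing values per period first makes the cutoff easier to justify than picking a year by eye.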

There are 166 countries in the dataset, but it does not say which continent each country belongs to.

I used this website as a reference for continents and manually labeled each country with the case_when function, but I should have just used left_join against a lookup table for the sake of efficiency…
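For comparison, here is roughly what the left_join version could look like. The lookup table below is a hypothetical hand-typed one with an assumed entity column; in practice it would come from whatever continent reference is being used.

```r
library(dplyr)

# Toy list of countries; "Atlantis" is included to show what an unmatched name looks like.
countries <- tibble(entity = c("Japan", "Kenya", "Brazil", "Atlantis"))

# Hypothetical lookup table: one row per country, one continent each.
continent_lookup <- tribble(
  ~entity,  ~continent,
  "Japan",  "Asia",
  "Kenya",  "Africa",
  "Brazil", "South America"
)

# left_join keeps every row of `countries` and fills continent where a match exists.
labeled <- countries %>%
  left_join(continent_lookup, by = "entity")

# Countries with no match end up with NA, so they are easy to find and fix.
unmatched <- labeled %>%
  filter(is.na(continent))
```

Compared with a long case_when, the mapping lives in data rather than in code, and the unmatched rows show exactly which names still need a label.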

Data Visualization

I used plotly and crosstalk to make the visualization. I think crosstalk does not work well with animated plots, because the continents/countries shown in the plot differ from the ones selected in the filter.
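The general setup is sketched below on toy data. The column names, the made-up values, and the "continent" filter id are assumptions for illustration; this is not the exact code behind the plots in this post.

```r
library(plotly)
library(crosstalk)

# Toy data: three countries, two years each (values are made up).
toy <- data.frame(
  entity     = rep(c("Canada", "Mexico", "Japan"), each = 2),
  continent  = rep(c("North America", "North America", "Asia"), each = 2),
  year       = rep(c(1950, 2000), times = 3),
  gdp        = c(10, 80, 5, 40, 8, 300),
  population = c(14, 31, 28, 98, 83, 127)
)

# Wrap the data so the filter widget and the plot share one selection state.
shared <- SharedData$new(toy, key = ~entity)

# Animated scatter plot: one frame per year, one color per country.
fig <- plot_ly(
  shared,
  x = ~gdp, y = ~population,
  color = ~entity, frame = ~year,
  type = "scatter", mode = "markers"
)

# Place a continent dropdown next to the plot.
widget <- bscols(
  widths = c(3, 9),
  filter_select("continent", "Continent", shared, ~continent),
  fig
)
```

Printing widget in an R Markdown chunk or at the console renders the dropdown and the plot side by side.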

Show countries in North America

Better One

Improvement

As an improvement, I would like to make a graph similar to Figure 13.10 in Wilke’s Fundamentals of Data Visualization.

The code was taken and modified from Wilke’s GitHub repo.

The graph below shows how life expectancy and GDP per capita change over time in Afghanistan. The transparency of the line represents the year. In the future, I would like to combine this graph with crosstalk, although crosstalk does not seem to work well with geom_text_repel. I also need to figure out how to show only the selected country in the graph. In the previous graphs, color = entity does this for me, but the graph below has no aesthetic that differentiates countries.
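A stripped-down sketch of the idea, without labels or crosstalk: a path through (GDP per capita, life expectancy) points with alpha mapped to year. The data frame, its column names, and its values are made up for illustration, and the static filter on a hypothetical selected variable is only one possible way to show a single country.

```r
library(ggplot2)

# Toy trajectories for two countries (values are made up, not real statistics).
trajectories <- data.frame(
  entity          = rep(c("Afghanistan", "Japan"), each = 4),
  year            = rep(c(1950, 1970, 1990, 2010), times = 2),
  gdp_per_capita  = c(1000, 1200, 900, 1500, 3000, 15000, 30000, 35000),
  life_expectancy = c(30, 37, 45, 60, 60, 72, 79, 83)
)

# Keep one country by filtering before plotting (a static stand-in for a selection).
selected <- "Afghanistan"
one_country <- trajectories[trajectories$entity == selected, ]

# geom_path connects the points in row order; alpha fades the earlier years.
p <- ggplot(one_country, aes(x = gdp_per_capita, y = life_expectancy)) +
  geom_path(aes(alpha = year)) +
  geom_point(aes(alpha = year)) +
  scale_x_log10() +
  labs(
    x = "GDP per capita (log scale)",
    y = "Life expectancy",
    alpha = "Year",
    title = selected
  )
```

Because geom_path follows row order, the rows need to be sorted by year within each country before plotting.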

Reflection