Background

I’ve been following COVID cases, deaths, and hospitalizations in Michigan and the US closely since the beginning of the pandemic. There’s a lot of data to work with, which provides plenty of opportunities for analysis and visualization. This particular visualization is quite simple: it plots cases and deaths as a function of time. What’s unique about this plot is that it shows how deaths usually follow cases by about three weeks.

The link to the data is scraped using Python. The data is then downloaded using R, cleaned in Python, and visualized with R’s Plotly interface.

This data set contains cumulative cases, new cases, cumulative deaths, and new deaths for each county on each date over roughly the last two years. For my visualization, I only want statewide totals, so the data is grouped by date and the case and death counts are summed across counties.
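The grouping step described above can be sketched in a few lines of Python. This is only an illustration, not the code used for this page: the row layout, the county names, and the numbers are all made up, and the helper `state_totals` is a hypothetical name.

```python
from collections import defaultdict

# Hypothetical county-level rows: (date, county, new_cases, new_deaths).
# The values are invented for illustration only.
county_rows = [
    ("2021-01-01", "Wayne", 120, 3),
    ("2021-01-01", "Kent", 80, 1),
    ("2021-01-02", "Wayne", 100, 2),
    ("2021-01-02", "Kent", 90, 4),
]


def state_totals(rows):
    """Sum new cases and new deaths across counties for each date."""
    totals = defaultdict(lambda: [0, 0])
    for date, _county, cases, deaths in rows:
        totals[date][0] += cases
        totals[date][1] += deaths
    # ISO-formatted dates sort chronologically as plain strings.
    return [(date, c, d) for date, (c, d) in sorted(totals.items())]


statewide = state_totals(county_rows)
print(statewide)
```

Each output row holds one date with the statewide case and death totals, which is the shape needed for a simple time-series plot.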

Initial Data Visualization:

With 7-day moving average and deaths:
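A 7-day moving average smooths out day-of-week reporting swings. The sketch below assumes a trailing window (each point averages the current day and the six days before it); the page does not say whether the original chart uses a trailing or centered window, so treat that choice, the function name `moving_average`, and the sample numbers as assumptions.

```python
from collections import deque


def moving_average(values, window=7):
    """Trailing moving average; entries before a full window are None."""
    buf = deque(maxlen=window)
    running = 0
    out = []
    for v in values:
        if len(buf) == window:
            running -= buf[0]  # oldest value is evicted by the append below
        buf.append(v)
        running += v
        out.append(running / window if len(buf) == window else None)
    return out


daily_cases = [10, 20, 30, 40, 50, 60, 70, 80]  # made-up numbers
smoothed = moving_average(daily_cases)
print(smoothed)
```

The first six entries are left empty rather than averaged over a shorter window, so the smoothed line starts only once a full week of data is available.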

Log10 chart:
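A log10 scale makes it easier to compare two series whose magnitudes differ a lot, as cases and deaths do. Plotting libraries usually offer a log-scaled axis directly, so the sketch below is only meant to show what the transformation does to the values. The helper name `log10_or_none` and the sample counts are hypothetical.

```python
import math


def log10_or_none(values):
    """Return log10 of each positive value; zeros and negatives become None."""
    # log10 is undefined for non-positive numbers, so those days are
    # marked as missing instead of raising an error.
    return [math.log10(v) if v > 0 else None for v in values]


log_cases = log10_or_none([0, 1, 10, 1000])  # made-up counts
print(log_cases)
```

Days with zero reported counts have no defined logarithm, which is why they show up as gaps on a log-scaled chart.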

A version of this page showing code is available here.

Reflection

What ideas/suggestions from Claus Wilke’s work helped shape your visualization?

Storytelling and comparison drove what I was after. I’m trying to show how COVID deaths follow COVID cases, and the choices in this visualization were made to highlight that comparison.

Is there anything more you wish you could do with this data?

Nothing more with the data itself. I do wish I could have made cases and deaths more distinguishable on the plot, but Plotly made it frustrating to differentiate the two while keeping the line and point color the same within each category.

What were the most interesting or frustrating technical aspects of doing this?

As mentioned above, I would have liked to make the difference between deaths and cases clearer.