streamgraph is an R htmlwidget for building streamgraph charts with JavaScript/D3.

Installation

devtools::install_github("hrbrmstr/streamgraph")

Usage

The streamgraph package is an htmlwidget [1] based on the D3.js [2] JavaScript library.

“Streamgraphs are a generalization of stacked area graphs where the baseline is free. By shifting the baseline, it is possible to minimize the change in slope (or wiggle) in individual series, thereby making it easier to perceive the thickness of any given layer across the data. Byron & Wattenberg describe several streamgraph algorithms in ‘Stacked Graphs – Geometry & Aesthetics’ [3].” [4]

Even though streamgraphs can be controversial [5], they make for very compelling visualizations, especially when displaying very large datasets. They work even better with an interactive component that lets the reader follow each “flow” or filter the view in some way. This makes R a great choice for streamgraph creation & exploration, since it excels at data manipulation and has libraries such as Shiny [6] that simplify building interactive interfaces.

Making a streamgraph

The first example mimics the streamgraphs in the Name Voyager [7] project. We’ll use the R babynames package [8] as the data source and use the streamgraph package to see the ebb & flow of “Kr-” and “I-” names in the United States over the years (1880-2013).

library(dplyr)
library(babynames)
library(streamgraph)

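# Total the counts for each "Kr-" name per year, then draw one stream per name
babynames %>%
  filter(grepl("^Kr", name)) %>%      # keep names that start with "Kr"
  group_by(year, name) %>%
  tally(wt=n) %>%                     # sum n within each year/name group
  streamgraph("name", "n", "year")    # key, value and date columns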

You create streamgraphs with the streamgraph function. This first example uses the default values for the aesthetic properties of the streamgraph, and passes “name”, “n” and “year” as the key, value and date parameters: the column that identifies each stream, the column that sets its thickness, and the column plotted along the x-axis. If your data already has column names in the expected format, you do not need to specify any values for those parameters.
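As a minimal sketch of that last point, the data frame below already uses the column names key, value and date, so the call omits all three parameters. The toy data is made up for illustration, and the assumption that these are the names the function expects should be checked against ?streamgraph.

```r
library(streamgraph)

# Hypothetical toy data. The column names key/value/date are assumed to be
# the ones streamgraph() looks for when the parameters are omitted.
toy <- data.frame(
  key   = rep(c("a", "b"), each = 3),
  value = c(1, 3, 2, 2, 1, 4),
  date  = rep(as.Date(c("2020-01-01", "2021-01-01", "2022-01-01")), times = 2)
)

# No key, value or date arguments are passed here
sg <- streamgraph(toy)
sg
```

If the columns have other names, pass them explicitly as in the babynames example.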

The current version of streamgraph requires a date-based x-axis. If the values in the date column are years, the function detects this and converts them under the covers into the format required by the underlying D3 processing.
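To make the idea concrete, here is a rough sketch of what such a year-to-date conversion could look like in base R. The helper years_to_dates is hypothetical and not part of the streamgraph package, and the package's internal logic may differ.

```r
# Hypothetical helper, not part of the streamgraph package.
years_to_dates <- function(x) {
  # Treat the values as years only if every one is exactly four digits
  if (all(grepl("^[0-9]{4}$", as.character(x)))) {
    # Map each year to January 1 of that year
    as.Date(sprintf("%04d-01-01", as.integer(x)))
  } else {
    # Anything else is returned unchanged
    x
  }
}

years_to_dates(c(1880, 1950, 2013))
# [1] "1880-01-01" "1950-01-01" "2013-01-01"
```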

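By default, the streamgraph function centers the graph on the y-axis and draws smoothed “streams”. The “I-” names example overrides both defaults, using offset="zero" for a zero baseline and interpolate="linear" for straight line segments, and adds a legend with sg_legend.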

library(dplyr)
library(babynames)
library(streamgraph)

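# Same pipeline for "I-" names, with non-default offset and interpolation
babynames %>%
  filter(grepl("^I", name)) %>%
  group_by(year, name) %>%
  tally(wt=n) %>%
  streamgraph("name", "n", "year", offset="zero", interpolate="linear") %>%
  sg_legend(show=TRUE, label="I- names: ")    # show a legend with this label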
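For comparison, the centered, smoothed look described earlier can also be requested explicitly. This sketch assumes the defaults are offset="silhouette" and interpolate="cardinal"; confirm these values in ?streamgraph before relying on them.

```r
library(dplyr)
library(babynames)
library(streamgraph)

sg_default <- babynames %>%
  filter(grepl("^Kr", name)) %>%
  group_by(year, name) %>%
  tally(wt=n) %>%
  # Assumed default values, written out explicitly
  streamgraph("name", "n", "year", offset="silhouette", interpolate="cardinal")

sg_default
```

If the assumption holds, this renders the same chart as the first “Kr-” example.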