streamgraph is an htmlwidget JavaScript/D3 chart library.
devtools::install_github("hrbrmstr/streamgraph")
The streamgraph package is an htmlwidget [1] that is based on the D3.js [2] JavaScript library.
“Streamgraphs are a generalization of stacked area graphs where the baseline is free. By shifting the baseline, it is possible to minimize the change in slope (or wiggle) in individual series, thereby making it easier to perceive the thickness of any given layer across the data. Byron & Wattenberg describe several streamgraph algorithms in ‘Stacked Graphs – Geometry & Aesthetics’ [3]” [4]
Even though streamgraphs can be controversial [5], they make for very compelling visualizations, especially when displaying very large datasets. They work even better with an interactive component that lets the viewer follow each “flow” or filter the view in some way. This makes R a great choice for creating and exploring streamgraphs: it excels at data manipulation and has libraries such as Shiny [6] that simplify building interactive interfaces.
The first example mimics the streamgraphs in the Name Voyager [7] project. We’ll use the R babynames package [8] as the data source and the streamgraph package to see the ebb & flow of “Kr-” and “I-” names in the United States over the years (1880-2013).
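library(dplyr)
library(babynames)
library(streamgraph)

# Keep names starting with "Kr", total the counts per year and name,
# then draw one stream per name.
babynames %>%
  filter(grepl("^Kr", name)) %>%
  group_by(year, name) %>%
  tally(wt=n) %>%
  streamgraph("name", "n", "year")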
You create streamgraphs with the streamgraph function. This first example uses the default values for the aesthetic properties of the streamgraph, but we have passed in “name”, “n” and “year” for the key, value and date parameters. If your data already has column names in the expected format, you do not need to specify any values for those parameters.
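As a minimal sketch of that last point, the data can be renamed up front so that no column arguments are needed. This assumes the expected column names are key, value and date, mirroring the parameter names; check ?streamgraph to confirm. The toy data frame and its numbers are invented for illustration.

```r
# Toy data, invented for illustration only.
toy <- data.frame(
  year = c(2000, 2000, 2001, 2001),
  name = c("Kris", "Krista", "Kris", "Krista"),
  n    = c(10, 20, 15, 25)
)

# Rename to the column names streamgraph is ASSUMED to look for by default.
names(toy)[match(c("name", "n", "year"), names(toy))] <- c("key", "value", "date")

# Only draw the chart if the package is installed.
if (requireNamespace("streamgraph", quietly = TRUE)) {
  streamgraph::streamgraph(toy)
}
```

If the assumption about the default names does not hold for your installed version, pass the column names explicitly as in the first example.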
The current version of streamgraph requires a date-based x-axis, but it is smart enough to notice when the values in the date column are years, and it automatically converts them under the covers into the format the underlying D3 processing requires.
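To make the idea concrete, here is a rough base-R sketch of what a year-to-date conversion could look like. The helper years_to_dates is hypothetical and not part of the streamgraph package, and the package's actual internal logic may differ. It treats the input as years only when every value is a four-digit number.

```r
# Hypothetical helper for illustration; NOT part of the streamgraph package.
years_to_dates <- function(x) {
  # Treat the input as years only if every value is a four-digit number.
  looks_like_year <- all(grepl("^[0-9]{4}$", as.character(x)))
  if (looks_like_year) {
    # Anchor each year at January 1 so it becomes a proper Date.
    as.Date(paste0(x, "-01-01"))
  } else {
    # Otherwise assume the values are already dates or ISO date strings.
    as.Date(x)
  }
}

years_to_dates(c(1880, 1881, 2013))
```

Doing this conversion yourself is only needed if you want explicit control over which day of the year each value maps to.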
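The default behavior of the streamgraph function is to have the graph centered in the y-axis, with smoothed “streams”. The example below overrides both defaults by setting offset="zero" and interpolate="linear", and it adds a legend.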
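library(dplyr)
library(babynames)
library(streamgraph)

# Keep names starting with "I" and total the counts per year and name.
# offset="zero" and interpolate="linear" replace the centered, smoothed defaults.
babynames %>%
  filter(grepl("^I", name)) %>%
  group_by(year, name) %>%
  tally(wt=n) %>%
  streamgraph("name", "n", "year", offset="zero", interpolate="linear") %>%
  sg_legend(show=TRUE, label="I- names: ")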
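The code above switches from the default centered layout to offset="zero". The difference between the two baselines can be sketched numerically in base R. This is an illustration of one common way to center a stack (shifting it down by half its total height); the exact formula used by the underlying D3 layout may differ. The values are invented.

```r
# Toy values, invented for illustration: rows are time points, columns are series.
values <- matrix(c(1, 2, 3,
                   2, 4, 4,
                   1, 1, 2),
                 nrow = 3, byrow = TRUE,
                 dimnames = list(c("t1", "t2", "t3"), c("a", "b", "c")))

# Total stack height at each time point.
totals <- rowSums(values)

# Zero baseline: every stack starts at y = 0 (a traditional stacked area chart).
zero_baseline <- rep(0, nrow(values))

# Centered baseline: shift each stack down by half its total height,
# so the stack is symmetric around y = 0.
centered_baseline <- -totals / 2

# Bottom and top edges of the whole stack under each layout.
cbind(zero_bottom     = zero_baseline,
      zero_top        = zero_baseline + totals,
      centered_bottom = centered_baseline,
      centered_top    = centered_baseline + totals)
```

With a zero baseline, all variation in the totals shows up along the top edge; with a centered baseline, it is split evenly between the top and bottom edges.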