class: center, middle, inverse, title-slide .title[ # Interactive Graphs with Plotly and Dash ] .subtitle[ ## IDS Workshop on interactive data visualization Plotly and Dash in R ] .author[ ### Giorgio Coppola, Gayatri Shejwal, Lonny Chen ] .institute[ ### Hertie School ] --- class: inverse, center, middle name: welcome # Welcome! <html><div style='float:left'></div><hr color='#EB811B' size=1px style="width:1000px; margin:auto;"/></html> --- # Introduction Data visualization is a powerful mean to understand complex data and communicate insights effectively, revealing the hidden patterns and trends that wouldn't be obvious in tabular data, turning raw data into information. The tools we are going to introduce today serve exactly this purpose, enhancing the visualization possibilities offered by other static visualization packages such as `ggplot2`. <br> -- **What tools allow to do so?** Plotly and Dash! And we will introduce both! Indeed, `plotly` and `dash` are 💫 interactive 💫, in the sense that the user can interact with the data through the graph directly. This allows possibilities for higher understanding of the data. --- # Introduction ### Why interactive graphs? 📉 -- <br> * *Engagement*: Makes data come alive (interactive) and is more engaging for users. This increase understanding and retention. -- * *Depth*: Allows users to drill down and explore nuances, as interaction allows to create narratives more easily. Users can ask 'why' and get answers instantly by exploring the chart or graph. -- * *Decision-making*: Facilitates more informed decisions. <br> Both `plotly` and `dash` are interactive, but we can say that the level of interaction is different: `plotly` appears very similar to a static plot, with the difference that we can hover your cursor over it, accessing further data. `dash` on the other hand, creates proper dashboards, namely HTML apps, allowing even more interaction. --- # Plotly ### What is Plotly? -- Plotly is actually a company headquartered in Montreal, Canada, providing online graphing, analytics, and statistics tools, as well as scientific graphing libraries for Python, R, MATLAB, Perl, Julia, Arduino, JavaScript and REST. In this contex, with Plotly we refer to the R package for to create interactive graphs. -- # Plotly ### Advantages of using Plotly -- - Seamless integration with R and `ggplot2`, through the `ggplotly` function, that converts your static plots to an interactive web-based version! -- - Diverse range of plots and customization (wide array of chart types with extensive customization options), as well as the possibility to zoom in and out into the plot, or select a certain part of the plot and explore it specifically. Users can pan, zoom, and hover over the plots to get more information. -- - HTML web integration of graphs, makes visualization more accessible and intuitive. --- # Plotly ### Main distinctive features: the building blocks! 🔨 <br> Plotly's main feature is interactivity! It is achieved by a couple of operations within the `plot_ly` function: -- * **data**: a data frame (what a surprise!) * **x and y**: the variables for the x and y axes, respectively * **type**: the type of plot you want to create (e.g. "scatter", "bar", "box", etc). * **mode**: determines whether the scatter plot should show "markers", "lines", "text", or a combination thereof * **color**: specifies the color of data points or lines. This can be used to distinguish between different groups or represent a continuous variable -- Nothing really interactive until now... --- # Plotly ### Main distinctive features: the building blocks! 🔨 <br> ... What enables interaction: * **Hovertext & Tooltip**: - *Hovertext*: This is the text that appears when a user hovers over a specific data point or element in a Plotly plot. It's essentially a string or array of strings associated with each data point. - *Tooltip*: This is the box or interface that displays the hovertext. While "hovertext" refers specifically to the text content, "tooltip" encompasses the overall appearance and context of the hover information. <br> -- These can be highly customized: more info [here](https://plotly.com/r/hover-text-and-formatting/) for `plot_ly` and [here](https://plotly.com/ggplot2/hover-text-and-formatting/) for `ggplot2`, using `ggplotly`. --- # Plotly ### Some visualization examples. Some cool, non-trivial, yet not too complex, visualization that encapsulate the power of Plotly: <br> Let's play around a bit with the [Breweries and Beers by Style and US State Data](https://github.com/plotly/datasets/blob/master/beers.csv) 🍺! For the variables (ref: [Growler Guys](https://thegrowlerguys.com/understanding-abv-and-ibu/)): * `abv` - Alcohol by Volume (ABV), a standard measurement of alcohol content * `ibu` - International Bitterness Units (IBU), ranges from 5 to 100+. A beer over 60 is considered bitter. * `style_group` - eight beer types and "other" in the dataset (pre-processed) * `state` - US state We will create: 1. Histogram of ABV 2. Bar chart of beer types 3. Boxplot of ABV 4. Scatter plot of ABV vs. IBU --- class: custom-slide <style> .custom-slide .remark-code, .custom-slide .remark-inline-code { font-size: 14px; } </style> # Plotly more in-depth: Examples ### Example 1: Histogram Let's create a plotly graph plotting the 'alcohol by volume' `abv` variable against the beer 'count', to see the distribution of beer production in terms of alcohol level. We can do this with one line using the `plot_ly` function and specifying type [`histogram`](https://plotly.com/r/histograms/)! <br> -- .pull-left[ ```r fig1a <- beers %>% plot_ly(x = ~abv, type = "histogram") ``` ] .pull-right[
] --- class: custom-slide <style> .custom-slide .remark-code, .custom-slide .remark-inline-code { font-size: 14px; } </style> # Plotly more in-depth: Examples ### Example 1: Histogram with formatting A few more arguments to `plot_ly` lets us specify `xbins` and `color`. Piping the `plot_ly` output into the `layout` function lets us add title, and axis formatting to the plot. <br> -- .pull-left[ ```r fig1b <- beers %>% plot_ly(x = ~abv, type = "histogram", xbins = list(start = 0, size = 0.005), color = "orange") %>% layout( title = "ABV Histogram", xaxis = list(title = "Alcohol by Volume (ABV)", tick0 = 0, dtick = 0.01), yaxis = list(title = "Count of beers"), bargap = 0.05 ) ``` ] .pull-right[
] --- class: custom-slide <style> .custom-slide .remark-code, .custom-slide .remark-inline-code { font-size: 14px; } </style> # Plotly more in-depth: Examples ### Example 2: Bar chart And which different types of beer are included and in which amounts? Let's see using a [`bar`](https://plotly.com/r/bar-charts/) chart of the 'count' of the `style_group` variable.<br> -- .pull-left[ ```r fig2 <- beers %>% count(style_group) %>% plot_ly(x = ~style_group, y = ~n, type = "bar") %>% layout(xaxis = list(categoryorder = "total descending")) ``` ] .pull-right[
] --- class: custom-slide <style> .custom-slide .remark-code, .custom-slide .remark-inline-code { font-size: 14px; } </style> # Plotly more in-depth: Examples ### Example 3: Boxplot What about a boxplot of type [`box`](https://plotly.com/r/box-plots/) showing how alcohol level varies for different types of beer using the `style_group` variable?<br> * What patterns do you see in the data? -- .pull-left[ ```r fig3 <- beers %>% plot_ly(x = ~abv, color = ~style_group, type = "box", showlegend = F) ``` ] .pull-right[
] --- class: custom-slide <style> .custom-slide .remark-code, .custom-slide .remark-inline-code { font-size: 14px; } </style> # Plotly more in-depth: Examples ### Example 4: Scatter plot And what about comparing alcohol level against bitterness? We need a [`scatter`](https://plotly.com/r/line-and-scatter/) plot! * Notice how the `color` and the powerful interactive `hovertext` arguments work to enhance the plot * What patterns do you see now? (Tip: filter out beer types in the legend) -- .pull-left[ ```r beer_hover = ~paste('Beer: ', beer, '</br>Style: ', style, '</br>Style Group: ', style_group) fig4 <- beers %>% plot_ly(x = ~abv, y = ~ibu, type = "scatter", mode = "markers", color = ~style_group, hovertext = beer_hover) %>% layout( title = "Beers scatter plot: ABV vs. IBU", xaxis = list(title = "Alcohol by Volume (ABV)"), yaxis = list(title = "International Bitterness Units (IBU)") ) ``` ] .pull-right[
] --- class: custom-slide <style> .custom-slide .remark-code, .custom-slide .remark-inline-code { font-size: 14px; } </style> # ggplot - plotly interaction ### ggplotly(): Transforming a ggplot graph into an interactive plotly graph! Good news if you are already experienced with `ggplot`! You can use it to create `plotly` graphs! <br> -- <br> **Tooltip** and **Hovertext** in `ggplotly`: The default behavior (without specifications) is to display the mapped aesthetics in the tooltip. For instance, if you have mapped the x and y aesthetics, then hovering over a point will display the x and y values in the tooltip... a bit redundant... -- So, how to customize? -- Only by combining the *text* aesthetic in ggplot2 with the *tootlip* argument (and [other customization options](https://plotly.com/ggplot2/)) in ggplotly, you can achieve a wide range of tooltip customizations in your interactive plots! ```r p <- ggplot(mtcars, aes(x = mpg, y = wt, text = paste("Car:", rownames(mtcars)))) + geom_point() p_interactive <- ggplotly(p, tooltip = "text") ``` --- class: custom-slide <style> .custom-slide .remark-code, .custom-slide .remark-inline-code { font-size: 12px; } </style> # ggplot - plotly interaction ### ggplotly(): Transforming a ggplot graph into an interactive plotly graph! We have a `ggplot` histogram that we want to transform into a `plotly` with `ggplotly` . For example, we want to explore the beer production in California (the biggest beer producer in the US). -- .pull-left[ ```r # Filter data for California california_beers <- beers %>% filter(state == "CA") # Plot ggplot_ca_breweries <- ggplot(california_beers, aes(x = brewery, fill = style_group)) + geom_bar(position = "stack") + scale_fill_viridis_d() + theme_minimal() + labs(title = "Types of Beer Produced by Breweries in California", x = "Brewery", y = "Count") + theme(axis.text.x = element_text(angle = 90, hjust = 1, vjust = 0.5)) ``` ] .pull-right[ <!-- --> ] --- class: custom-slide <style> .custom-slide .remark-code, .custom-slide .remark-inline-code { font-size: 10px; } </style> # ggplot - plotly interaction ### ggplotly(): Transforming a ggplot graph into an interactive plotly graph! Let's see: Here, when you hover over the graph, you will be able to see the top 3 more alchoolic ALE style produced in every US state. -- .pull-left[ ```r california_beers_percentage <- california_beers %>% group_by(brewery, style_group) %>% tally() %>% group_by(brewery) %>% mutate(percentage = round(n / sum(n) * 100, 2)) ggplot_ca_breweries_percentage <- ggplot(california_beers_percentage, aes(x = brewery, fill = style_group, text = paste(style_group, ":", percentage, "%"))) + geom_bar(aes(y = n), stat = "identity", position = "stack") + scale_fill_viridis_d() + theme_minimal() + labs(title = "Types of Beer Produced by Breweries in California", x = "Brewery", y = "Count") + theme(axis.text.x = element_text(angle = 90, hjust = 1, vjust = 0.5)) p_plotly_ca_breweries <- ggplotly(ggplot_ca_breweries_percentage, tooltip = "text") ``` ] .pull-right[
] --- # Finally, Dash! <br> <div style="text-align:center;"> <img src="https://raw.githack.com/intro-to-data-science-23-workshop/10-plotlydash-coppola-shejwal-chen/main/others/plotly-dash-meme.jpg", width="600" height="500"/> </div> --- # Finally, Dash! ### What is Dash? A web application framework for R (and Python and other programming languages) that allows creating interactive, web-based data dashboards (indeed), turning your interactive plots into full-blown web applications. -- ### Key features -- - **HTML layout**: Design your dashboard layout using simple HTML components, without having to write any actual HTML. Indeed, it structure dashboard components using a set of pre-defined HTML tags within R. -- - Interactivity through **callbacks**: callbacks define interactive behavior by specifying how input changes should modify output components. Essentially, when changing a dropdown or move a slider, a callback updates a chart, text, or any other component in real-time. -- - **Integration with Plotly** visualizations: Plotly charts can be easily included in Dash, making them part of an interactive dashboard where each component can talk to others. --- # Plotly-Dash! ### Some visualization examples. A cool, non-trivial, yet not too complex, visualization that encapsulate the power of Dash: <br> Let's create a dash with a map of the public toilets in Berlin using the [Berlin Public Toilet Data](https://www.kaggle.com/datasets/ryanjt/berlin-public-toilets-location/) 🚽. Each included toilet has data for its: * Location: `Latitude`, `Longitude`, `PostalCode` mutated to `District` * Cost: mutated to "Free" or "Paid" * Services available: handicap accessible, changing table, etc. We will: 1. "Plot" the toilets to a location-based Plotly **Scatter Map** 2. Add **input component** to filter by cost and services and an *output component* 3. Use **callbacks** to connect the inputs to the map (as output) 4. Design the **layout** and run the Dash App! --- class: custom-slide <style> .custom-slide .remark-code, .custom-slide .remark-inline-code { font-size: 12px; } </style> # Plotly-Dash ### Step 1: Scatter Map We will first create a Plotly [`scattermapbox`](https://plotly.com/r/scattermapbox/) using the location variables of the Toilets dataset. Important to note: * In the `mapbox` layout options, specify `style = 'open-street-map'` to use the free [OpenStreetMap](https://www.openstreetmap.org/) map layer, instead of the pay-for-tokens [Mapbox](https://www.mapbox.com/). .pull-left[ ```r map = toilets |> filter(District != "NA") |> plot_ly(lat = ~Latitude, lon = ~Longitude, type = "scattermapbox", split = ~District, size = 5, hoverinfo = "text+name", hovertext = ~Street) |> layout( title = "Berlin public toilets", mapbox = list( style = 'open-street-map', zoom = 9, center = list(lat = ~median(Latitude), lon = ~median(Longitude))) ) ``` ] .pull-right[ <div align="center"> <br>
</div> ] --- class: custom-slide <style> .custom-slide .remark-code, .custom-slide .remark-inline-code { font-size: 14px; } </style> # Plotly-Dash ### Step 2: Input and Output Components Let's create our first [Dash Core Components](https://dash.plotly.com/r/dash-core-components) to interact with our Plotly scatter map! * Input: a `dccChecklist` to filter among free and paid toilets * Output: just a placeholder `dccGraph` that can hold any Plotly figure .pull-left[ ```r #Input: dccChecklist (interactive) price_options <- c(unique(toilets$Free)) price_filter <- dccChecklist( id = 'price-filter', options = price_options, value = price_options #default, choose all ) ``` ] .pull-right[ ```r #Output: dccGraph (Plotly figures) toilets_map <- dccGraph(id = 'toilets-map') ``` ] --- class: custom-slide <style> .custom-slide .remark-code, .custom-slide .remark-inline-code { font-size: 12px; } </style> # Plotly-Dash ### Step 3: Callbacks Dash interaction is driven by [callbacks](https://dash.plotly.com/r/basic-callbacks) which require input and output component(s), and a callback function that links them. .pull-left[ ```r app |> add_callback( output('toilets-map', 'figure'), input('price-filter', 'value'), function(price_input) { toilets_filtered <- toilets |> filter(Free %in% price_input) fig = toilets_filtered |> filter(District != "NA") |> plot_ly(lat = ~Latitude, lon = ~Longitude, type = "scattermapbox", split = ~District, size = 5, hoverinfo = "text+name", hovertext = ~Street) |> layout( title = "Berlin public toilets", mapbox = list( style = 'open-street-map', zoom = 9, center = list(lat = ~median(Latitude), lon = ~median(Longitude))) ) return(fig) #not needed but just to be explicit } ) ``` ] .pull-right[ <br> ◀️ 1. Specify the interactive `input`, `output` components<br> ◀️ 2. Write a function that triggers on any `input` change<br> ◀️ 3. Filter the dataset based on `price_input`<br><br><br> ◀️ 4. Call the same Plotly `scattermapbox`<br> <br><br><br><br><br><br> ◀️ 5. Return the output figure to fill in the placeholder `dccGraph` from step #2 ] --- class: custom-slide <style> .custom-slide .remark-code, .custom-slide .remark-inline-code { font-size: 14px; } </style> # Plotly-Dash ### Step 4: Layout and Run Now we put it alll together for the App display [layout](https://dash.plotly.com/r/layout) using standard HTML components, and finally run the App to your browser. .pull-left[ ```r layout_elements <- list( h1("IDS Workshop - Plotly-Dash presentation"), div("Using the Berlin Public Toilets dataset to learn Scatter Map functionality."), br(), div("Cost filter:"), price_filter, br(), toilets_map ) #4 Main app app |> set_layout(layout_elements) app |> run_app() ``` ] .pull-right[  ] --- class: inverse, center, middle, custom-hr name: thank you # Thank you! <style type="text/css"> .custom-hr hr { border-color: #EB811B; width: 1000px; margin: auto; border-width: 1px 0 0 0; } </style> p.s. this is how public toilets in Berlin look like... <img src="https://raw.githack.com/intro-to-data-science-23-workshop/10-plotlydash-coppola-shejwal-chen/main/others/toilet.png" alt="How actually public toilet looks like in Berlin" width="500" style="display:block; margin:auto;"> <br> Now... interactive tutorial! You find the .Rmd file in the repository. Download it and open it in your R Studio, so that you can follow us directly and complete the exercises we prepared for you directly there! <br>