
Presented as part of ResBaz 2018 in Sydney.


- 1. The Grammar of Graphics Darya Vanichkina
- 2. … a personal commentary/interpretation of … http://vita.had.co.nz/papers/layered-grammar.html
- 3. Grammar “the fundamental principles or rules of an art or science” (OED Online 1989)
- 4. Simple case
- 5. Simple case
- 6. Components of the grammar. Data: variables of interest. Aesthetics: x-axis, y-axis, colour, fill, size, labels, alpha, shape, line width, line type. Geometries: point, line, histogram, bar, boxplot. Facets: columns, rows. Statistics: binning, smoothing, descriptive, inferential. Coordinates: cartesian, fixed, polar, limits. Themes: not data, but important for overall impact. Adapted from DataCamp.
- 7. Exploring the diamonds data
- 8. Relationship between table (width of top of diamond relative to widest point) and price? diamonds %>% ggplot(aes(x = table, y = price)) + geom_point()
- 9. Relationship between carat (weight) and price? diamonds %>% ggplot(aes(x = carat, y = price)) + geom_point()
- 10. Relationship between carat (weight) and price? [How many diamonds?] diamonds %>% ggplot(aes(x = carat, y = price)) + geom_point(alpha = 0.1)
- 11. Relationship between carat (weight) and price? [Lots of small diamonds!] diamonds %>% ggplot(aes(x = carat, y = price)) + geom_hex()
- 12. Components of the grammar. Data: variables of interest. Aesthetics: x-axis, y-axis, colour, fill, size, labels, alpha, shape, line width, line type. Geometries: point, line, histogram, bar, boxplot. Facets: columns, rows. Statistics: binning, smoothing, descriptive, inferential. Coordinates: cartesian, fixed, polar, limits. Themes: not data, but important for overall impact. Adapted from DataCamp.
- 13. Relationship between clarity and price? [x-axis: worse -> better] diamonds %>% ggplot(aes(x = clarity, y = price)) + geom_boxplot()
- 14. How many diamonds with each clarity? [x-axis: worse -> better] diamonds %>% ggplot(aes(x = clarity, fill = clarity)) + geom_bar()
- 15. Relationship between cut and price? [x-axis: worse -> better] diamonds %>% ggplot(aes(x = cut, y = price)) + geom_boxplot()
- 16. Relationship between cut and price? [x-axis: worse -> better] diamonds %>% ggplot(aes(x = cut, y = price)) + geom_boxplot()
- 17. Relationship between cut and price? [x-axis: worse -> better] diamonds %>% ggplot(aes(x = cut, y = price)) + geom_violin()
- 18. Components of the grammar. Data: variables of interest. Aesthetics: x-axis, y-axis, colour, fill, size, labels, alpha, shape, line width, line type. Geometries: point, line, histogram, bar, boxplot. Facets: columns, rows. Statistics: binning, smoothing, descriptive, inferential. Coordinates: cartesian, fixed, polar, limits. Themes: not data, but important for overall impact. Adapted from DataCamp.
- 19. Relationship between price and carat grouped by cut and clarity? diamonds %>% ggplot(aes(x = carat, y = price)) + geom_point() + facet_grid(cut~clarity)
- 20. Relationship between price and carat grouped by cut and clarity (with trend)? diamonds %>% ggplot(aes(x = carat, y = price)) + geom_point() + facet_grid(cut~clarity) + geom_smooth(method = "lm")
- 21. Relationship between table parameter (width of top of diamond relative to widest point, converted on the fly into a factor) and price, faceted by cut? diamonds %>% mutate(TableIsSmall = factor(table < median(table))) %>% ggplot(aes(x = carat, y = price, colour = TableIsSmall)) + geom_point() + facet_grid(.~cut)
- 22. Relationship between table parameter (width of top of diamond relative to widest point, converted on the fly into a factor) and price, faceted by cut? diamonds %>% mutate(TableIsSmall = factor(table < median(table))) %>% ggplot(aes(x = carat, y = price, colour = TableIsSmall)) + geom_point() + facet_grid(.~cut) Most diamonds with a small table (TableIsSmall == TRUE) have an “Ideal” cut…
- 23. Components of the grammar. Data: variables of interest. Aesthetics: x-axis, y-axis, colour, fill, size, labels, alpha, shape, line width, line type. Geometries: point, line, histogram, bar, boxplot. Facets: columns, rows. Statistics: binning, smoothing, descriptive, inferential. Coordinates: cartesian, fixed, polar, limits. Themes: not data, but important for overall impact. Adapted from DataCamp.
- 24. Relationship between cut and price? [x-axis: worse -> better] diamonds %>% ggplot(aes(x = cut, y = price)) + geom_boxplot()
- 25. Relationship between cut and price? [log-scale] [x-axis: worse -> better] diamonds %>% ggplot(aes(x = cut, y = price)) + geom_boxplot() + scale_y_log10() [x-axis: worse -> better] diamonds %>% ggplot(aes(x = cut, y = price)) + geom_boxplot()
- 26. How many diamonds with each clarity? [x-axis: worse -> better] diamonds %>% ggplot(aes(x = clarity, fill = clarity)) + geom_bar()
- 27. How many diamonds with each clarity? [x-axis: worse -> better] diamonds %>% ggplot(aes(x = clarity, fill = clarity)) + geom_bar() diamonds %>% ggplot(aes(x = clarity, fill = clarity)) + geom_bar(width = 1) + coord_polar()
- 28. Components of the grammar. Data: variables of interest. Aesthetics: x-axis, y-axis, colour, fill, size, labels, alpha, shape, line width, line type. Geometries: point, line, histogram, bar, boxplot. Facets: columns, rows. Statistics: binning, smoothing, descriptive, inferential. Coordinates: cartesian, fixed, polar, limits. Themes: not data, but important for overall impact. Adapted from DataCamp.
- 29. Finishing up with the diamonds data diamonds %>% ggplot(aes(x = carat, y = price)) + geom_point() + facet_grid(cut~clarity, scales = "free_x") + geom_smooth(method = "lm") diamonds %>% ggplot(aes(x = cut, y = price)) + geom_boxplot() + scale_y_log10()
- 30. Finishing up with the diamonds data diamonds %>% ggplot(aes(x = carat, y = price)) + geom_point() + facet_grid(cut~clarity, scales = "free_x") + geom_smooth(method = "lm") + theme_bw() diamonds %>% ggplot(aes(x = cut, y = price)) + geom_boxplot()+ scale_y_log10() + theme_minimal()
- 31. Just changing the theme diamonds %>% ggplot(aes(x = carat, y = price)) + geom_point(col = "red") + facet_grid(cut~clarity, scales = "free_x") + geom_smooth(method = "lm") + theme_black() diamonds %>% ggplot(aes(x = cut, y = price)) + geom_boxplot() + scale_y_log10() + theme_dark()
- 32. The Grammar of Graphics. Data: variables of interest. Aesthetics: x-axis, y-axis, colour, fill, size, labels, alpha, shape, line width, line type. Geometries: point, line, histogram, bar, boxplot. Facets: columns, rows. Statistics: binning, smoothing, descriptive, inferential. Coordinates: cartesian, fixed, polar, limits. Themes: not data, but important for overall impact.
- 33. Limitations
- 34. Limitations https://www.andrewheiss.com/blog/2017/08/10/exploring-minards-1812-plot-with-ggplot2/
- 35. The Grammar of Graphics. Data: variables of interest. Aesthetics: x-axis, y-axis, colour, fill, size, labels, alpha, shape, line width, line type. Geometries: point, line, histogram, bar, boxplot. Facets: columns, rows. Statistics: binning, smoothing, descriptive, inferential. Coordinates: cartesian, fixed, polar, limits. Themes: not data, but important for overall impact. https://github.com/dvanic/18ResBazPresentation
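
The slides introduce the grammar one component at a time. As a rough sketch of how those components stack into a single plot, the block below combines them on the diamonds data. The 2,000-row subsample, the seed, and the object names `diamonds_small` and `p` are assumptions made here to keep the example quick; they are not from the deck.

```r
library(ggplot2)  # supplies the diamonds data set and all plotting layers
library(dplyr)    # supplies the %>% pipe and sample_n()

# Assumption: a 2,000-row subsample keeps the sketch quick to render.
set.seed(2018)
diamonds_small <- diamonds %>% sample_n(2000)

p <- diamonds_small %>%
  ggplot(aes(x = carat, y = price)) +             # data + aesthetics
  geom_point(alpha = 0.3) +                       # geometry
  geom_smooth(method = "lm", formula = y ~ x) +   # statistics: linear trend
  facet_grid(. ~ cut) +                           # facets: one column per cut
  scale_y_log10() +                               # log-scaled y-axis, as on slide 25
  theme_bw()                                      # theme

# print(p) draws the plot; the layers can also be inspected without drawing
length(p$layers)
```

Each `+` adds one component, so dropping or swapping a line (for example `theme_bw()` for `theme_minimal()`) changes only that aspect of the plot.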
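
Slides 10 and 11 use transparency and hexagonal binning to show where the points pile up along the carat axis. One way to check the same thing numerically, sketched here under the assumption that 0.5-carat bins are a reasonable width, is to count diamonds per bin with `cut_width()` from ggplot2 and `count()` from dplyr. The name `carat_counts` is illustrative.

```r
library(ggplot2)  # diamonds data set and cut_width()
library(dplyr)    # mutate(), count() and the %>% pipe

carat_counts <- diamonds %>%
  # Assumption: bins 0.5 carat wide, with bin edges aligned to 0
  mutate(carat_bin = cut_width(carat, width = 0.5, boundary = 0)) %>%
  count(carat_bin)   # one row per bin, with the number of diamonds in column n

carat_counts
```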
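
Slide 22 reads the link between table size and cut off a faceted scatter plot. A minimal sketch of a tabular cross-check, reusing the slide's `TableIsSmall` definition, is given below. The summary name `table_by_cut` and the `share` column are assumptions introduced for illustration.

```r
library(ggplot2)  # diamonds data set
library(dplyr)    # mutate(), count(), group_by() and the %>% pipe

table_by_cut <- diamonds %>%
  mutate(TableIsSmall = table < median(table)) %>%  # same split as on the slide
  count(TableIsSmall, cut) %>%                      # diamonds per (split, cut) pair
  group_by(TableIsSmall) %>%
  mutate(share = n / sum(n)) %>%                    # proportion of each cut within a split
  ungroup()

table_by_cut
```

The `share` column within the `TableIsSmall == TRUE` rows shows how the small-table diamonds are distributed across cuts.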
