SlideShare a Scribd company logo
1 of 35
The Grammar of Graphics
Darya Vanichkina
… a personal commentary/interpretation of …
http://vita.had.co.nz/papers/layered-grammar.html
Grammar
“the fundamental principles or rules of an art or science”
(OED Online 1989)
Simple case
Simple case
Components of the grammar
Data Variables of interest
Aesthetics
x-axis

y-axis
colour

fill
size

labels
alpha

shape
line width

line type
Geometries point line histogram bar boxplot
Facets columns rows
Statistics binning smoothing descriptive inferential
Coordinates cartesian fixed polar limits
Themes Not data, but important for overall impact
Adapted from DataCamp
Exploring the diamonds data
Relationship between table (width of top of
diamond relative to widest point) and price?
diamonds %>% ggplot(aes(x = table, y = price))
+ geom_point()
Relationship between carat (weight) and price?
diamonds %>% ggplot(aes(x = carat, y = price))
+ geom_point()
Relationship between carat (weight) and price?
[How many diamonds?]
diamonds %>% ggplot(aes(x = carat, y = price))
+ geom_point(alpha = 0.1)
Relationship between carat (weight) and price?
[Lots of small diamonds!]
diamonds %>% ggplot(aes(x = carat, y = price)) +
geom_hex()
Components of the grammar
Data Variables of interest
Aesthetics
x-axis

y-axis
colour

fill
size

labels
alpha

shape
line width

line type
Geometries point line histogram bar boxplot
Facets columns rows
Statistics binning smoothing descriptive inferential
Coordinates cartesian fixed polar limits
Themes Not data, but important for overall impact
Adapted from DataCamp
Relationship between clarity and price?
Worse ——————————————————> Better
diamonds %>% ggplot(aes(x = clarity, y = price)) +
geom_boxplot()
How many diamonds with each clarity?
Worse ——————————————————> Better
diamonds %>% ggplot(aes(x = clarity, fill = clarity)) +
geom_bar()
Relationship between cut and price?
Worse ——————————————————> Better
diamonds %>% ggplot(aes(x = cut, y = price)) +
geom_boxplot()
Relationship between cut and price?
Worse ——————————————————> Better
diamonds %>% ggplot(aes(x = cut, y = price)) +
geom_boxplot()
Relationship between cut and price?
Worse ——————————————————> Better
diamonds %>% ggplot(aes(x = cut, y = price)) +
geom_violin()
Components of the grammar
Data Variables of interest
Aesthetics
x-axis

y-axis
colour

fill
size

labels
alpha

shape
line width

line type
Geometries point line histogram bar boxplot
Facets columns rows
Statistics binning smoothing descriptive inferential
Coordinates cartesian fixed polar limits
Themes Not data, but important for overall impact
Adapted from DataCamp
Relationship between price and carat grouped by cut and clarity?
diamonds %>% ggplot(aes(x = carat, y = price)) + geom_point() + facet_grid(cut~clarity)
Relationship between price and carat grouped by cut and clarity (with trend)?
diamonds %>% ggplot(aes(x = carat, y = price)) + geom_point() + facet_grid(cut~clarity) + geom_smooth(method =
"lm")
Relationship between table parameter (width of top of diamond relative to
widest point, converted on the fly into a factor) and price, faceted by cut?
diamonds %>%
mutate(TableIsSmall = factor(table < median(table)))%>%
ggplot(aes(x = carat, y = price, colour = TableIsSmall)) + geom_point() + facet_grid(.~cut)
Relationship between table parameter (width of top of diamond relative to
widest point, converted on the fly into a factor) and price, faceted by cut?
diamonds %>%
mutate(TableIsSmall = factor(table < median(table)))%>%
ggplot(aes(x = carat, y = price, colour = TableIsSmall)) + geom_point() + facet_grid(.~cut)
Most diamonds with a small table (TableIsBig == FALSE) have an “Ideal” cut…
Components of the grammar
Data Variables of interest
Aesthetics
x-axis

y-axis
colour

fill
size

labels
alpha

shape
line width

line type
Geometries point line histogram bar boxplot
Facets columns rows
Statistics binning smoothing descriptive inferential
Coordinates cartesian fixed polar limits
Themes Not data, but important for overall impact
Adapted from DataCamp
Relationship between cut and price?
Worse ——————————————————> Better
diamonds %>% ggplot(aes(x = cut, y = price)) +
geom_boxplot()
Relationship between cut and price?
[log-scale]
Worse ——————————————————> Better
diamonds %>% ggplot(aes(x = cut, y = price)) +
geom_boxplot()+ scale_y_log10()
Worse ——————————————————> Better
diamonds %>% ggplot(aes(x = cut, y = price)) +
geom_boxplot()
How many diamonds with each clarity?
Worse ——————————————————> Better
diamonds %>% ggplot(aes(x = clarity, fill = clarity))+geom_bar()
How many diamonds with each clarity?
Worse ——————————————————> Better
diamonds %>% ggplot(aes(x = clarity, fill = clarity))+geom_bar() diamonds %>% ggplot(aes(x = clarity, fill = clarity))
+geom_bar(width=1) + coord_polar()
Components of the grammar
Data Variables of interest
Aesthetics
x-axis

y-axis
colour

fill
size

labels
alpha

shape
line width

line type
Geometries point line histogram bar boxplot
Facets columns rows
Statistics binning smoothing descriptive inferential
Coordinates cartesian fixed polar limits
Themes Not data, but important for overall impact
Adapted from DataCamp
Finishing up with the diamonds data
diamonds %>% ggplot(aes(x = carat, y = price)) +
geom_point() + facet_grid(cut~clarity,
scales = "free_x") + geom_smooth(method = “lm")
diamonds %>% ggplot(aes(x = cut, y = price))
+ geom_boxplot()+ scale_y_log10()
Finishing up with the diamonds data
diamonds %>% ggplot(aes(x = carat, y = price)) +
geom_point() + facet_grid(cut~clarity,
scales = "free_x") + geom_smooth(method = "lm") + theme_bw()
diamonds %>% ggplot(aes(x = cut, y = price))
+ geom_boxplot()+ scale_y_log10() + theme_minimal()
Just changing the themediamonds %>% ggplot(aes(x = carat, y = price)) +
geom_point(col = "red") + facet_grid(cut~clarity,
scales = "free_x") + geom_smooth(method = "lm") +
theme_black()
diamonds %>% ggplot(aes(x = cut, y = price))
+ geom_boxplot()+ scale_y_log10() + theme_dark()
The Grammar of Graphics
Data Variables of interest
Aesthetics
x-axis

y-axis
colour

fill
size

labels
alpha

shape
line width

line type
Geometries point line histogram bar boxplot
Facets columns rows
Statistics binning smoothing descriptive inferential
Coordinates cartesian fixed polar limits
Themes Not data, but important for overall impact
Limitations
Limitations
https://www.andrewheiss.com/blog/2017/08/10/exploring-minards-1812-plot-with-ggplot2/
The Grammar of Graphics
Data Variables of interest
Aesthetics
x-axis

y-axis
colour

fill
size

labels
alpha

shape
line width

line type
Geometries point line histogram bar boxplot
Facets columns rows
Statistics binning smoothing descriptive inferential
Coordinates cartesian fixed polar limits
Themes Not data, but important for overall impact
https://github.com/dvanic/18ResBazPresentation

More Related Content

What's hot

Simple Linear Regression (simplified)
Simple Linear Regression (simplified)Simple Linear Regression (simplified)
Simple Linear Regression (simplified)Haoran Zhang
 
Chapter 15 Marketing Research Malhotra
Chapter 15 Marketing Research MalhotraChapter 15 Marketing Research Malhotra
Chapter 15 Marketing Research MalhotraAADITYA TANTIA
 
Data Quality & Data Governance
Data Quality & Data GovernanceData Quality & Data Governance
Data Quality & Data GovernanceTuba Yaman Him
 
Storytelling with data and data visualization
Storytelling with data and data visualizationStorytelling with data and data visualization
Storytelling with data and data visualizationFrehiwot Mulugeta
 
Covariance and correlation(Dereje JIMA)
Covariance and correlation(Dereje JIMA)Covariance and correlation(Dereje JIMA)
Covariance and correlation(Dereje JIMA)Dereje Jima
 
Discriminant analysis
Discriminant analysisDiscriminant analysis
Discriminant analysisWansuklangk
 
Data Visualization1.pptx
Data Visualization1.pptxData Visualization1.pptx
Data Visualization1.pptxqwtadhsaber
 
Modern Data Stack for Game Analytics / Dmitry Anoshin (Microsoft Gaming, The ...
Modern Data Stack for Game Analytics / Dmitry Anoshin (Microsoft Gaming, The ...Modern Data Stack for Game Analytics / Dmitry Anoshin (Microsoft Gaming, The ...
Modern Data Stack for Game Analytics / Dmitry Anoshin (Microsoft Gaming, The ...DevGAMM Conference
 
Business Value Through Reference and Master Data Strategies
Business Value Through Reference and Master Data StrategiesBusiness Value Through Reference and Master Data Strategies
Business Value Through Reference and Master Data StrategiesDATAVERSITY
 
Linear Regression With R
Linear Regression With RLinear Regression With R
Linear Regression With REdureka!
 
Master Your Data. Master Your Business
Master Your Data. Master Your BusinessMaster Your Data. Master Your Business
Master Your Data. Master Your BusinessDLT Solutions
 
Exploratory data analysis
Exploratory data analysisExploratory data analysis
Exploratory data analysisVishwas N
 
Apache Spark Introduction
Apache Spark IntroductionApache Spark Introduction
Apache Spark Introductionsudhakara st
 
DATA WAREHOUSING
DATA WAREHOUSINGDATA WAREHOUSING
DATA WAREHOUSINGKing Julian
 
Data quality management Basic
Data quality management BasicData quality management Basic
Data quality management BasicKhaled Mosharraf
 
Emerging Trends in Data Architecture – What’s the Next Big Thing?
Emerging Trends in Data Architecture – What’s the Next Big Thing?Emerging Trends in Data Architecture – What’s the Next Big Thing?
Emerging Trends in Data Architecture – What’s the Next Big Thing?DATAVERSITY
 
Introduction to Data Visualization
Introduction to Data Visualization Introduction to Data Visualization
Introduction to Data Visualization Ana Jofre
 
Applications of regression analysis - Measurement of validity of relationship
Applications of regression analysis - Measurement of validity of relationshipApplications of regression analysis - Measurement of validity of relationship
Applications of regression analysis - Measurement of validity of relationshipRithish Kumar
 

What's hot (20)

Simple Linear Regression (simplified)
Simple Linear Regression (simplified)Simple Linear Regression (simplified)
Simple Linear Regression (simplified)
 
Chapter 15 Marketing Research Malhotra
Chapter 15 Marketing Research MalhotraChapter 15 Marketing Research Malhotra
Chapter 15 Marketing Research Malhotra
 
Data Quality & Data Governance
Data Quality & Data GovernanceData Quality & Data Governance
Data Quality & Data Governance
 
Storytelling with data and data visualization
Storytelling with data and data visualizationStorytelling with data and data visualization
Storytelling with data and data visualization
 
Covariance and correlation(Dereje JIMA)
Covariance and correlation(Dereje JIMA)Covariance and correlation(Dereje JIMA)
Covariance and correlation(Dereje JIMA)
 
Discriminant analysis
Discriminant analysisDiscriminant analysis
Discriminant analysis
 
Regression
RegressionRegression
Regression
 
Data Visualization1.pptx
Data Visualization1.pptxData Visualization1.pptx
Data Visualization1.pptx
 
Modern Data Stack for Game Analytics / Dmitry Anoshin (Microsoft Gaming, The ...
Modern Data Stack for Game Analytics / Dmitry Anoshin (Microsoft Gaming, The ...Modern Data Stack for Game Analytics / Dmitry Anoshin (Microsoft Gaming, The ...
Modern Data Stack for Game Analytics / Dmitry Anoshin (Microsoft Gaming, The ...
 
Business Value Through Reference and Master Data Strategies
Business Value Through Reference and Master Data StrategiesBusiness Value Through Reference and Master Data Strategies
Business Value Through Reference and Master Data Strategies
 
Tableau
TableauTableau
Tableau
 
Linear Regression With R
Linear Regression With RLinear Regression With R
Linear Regression With R
 
Master Your Data. Master Your Business
Master Your Data. Master Your BusinessMaster Your Data. Master Your Business
Master Your Data. Master Your Business
 
Exploratory data analysis
Exploratory data analysisExploratory data analysis
Exploratory data analysis
 
Apache Spark Introduction
Apache Spark IntroductionApache Spark Introduction
Apache Spark Introduction
 
DATA WAREHOUSING
DATA WAREHOUSINGDATA WAREHOUSING
DATA WAREHOUSING
 
Data quality management Basic
Data quality management BasicData quality management Basic
Data quality management Basic
 
Emerging Trends in Data Architecture – What’s the Next Big Thing?
Emerging Trends in Data Architecture – What’s the Next Big Thing?Emerging Trends in Data Architecture – What’s the Next Big Thing?
Emerging Trends in Data Architecture – What’s the Next Big Thing?
 
Introduction to Data Visualization
Introduction to Data Visualization Introduction to Data Visualization
Introduction to Data Visualization
 
Applications of regression analysis - Measurement of validity of relationship
Applications of regression analysis - Measurement of validity of relationshipApplications of regression analysis - Measurement of validity of relationship
Applications of regression analysis - Measurement of validity of relationship
 

Similar to Grammar of Graphics - Darya Vanichkina

Data visualization-2.1
Data visualization-2.1Data visualization-2.1
Data visualization-2.1RenukaRajmohan
 
Data Visualization with ggplot2.pdf
Data Visualization with ggplot2.pdfData Visualization with ggplot2.pdf
Data Visualization with ggplot2.pdfCarlosTrujillo199971
 
{tidygraph}と{ggraph}によるモダンなネットワーク分析
{tidygraph}と{ggraph}によるモダンなネットワーク分析{tidygraph}と{ggraph}によるモダンなネットワーク分析
{tidygraph}と{ggraph}によるモダンなネットワーク分析Takashi Kitano
 
Stata cheat sheet: data visualization
Stata cheat sheet: data visualizationStata cheat sheet: data visualization
Stata cheat sheet: data visualizationTim Essam
 
r for data science 2. grammar of graphics (ggplot2) clean -ref
r for data science 2. grammar of graphics (ggplot2)  clean -refr for data science 2. grammar of graphics (ggplot2)  clean -ref
r for data science 2. grammar of graphics (ggplot2) clean -refMin-hyung Kim
 
Stata cheat sheet: Data visualization
Stata cheat sheet: Data visualizationStata cheat sheet: Data visualization
Stata cheat sheet: Data visualizationLaura Hughes
 
{tidygraph}と{ggraph}による モダンなネットワーク分析(未公開ver)
{tidygraph}と{ggraph}による モダンなネットワーク分析(未公開ver){tidygraph}と{ggraph}による モダンなネットワーク分析(未公開ver)
{tidygraph}と{ggraph}による モダンなネットワーク分析(未公開ver)Takashi Kitano
 
Lectures r-graphics
Lectures r-graphicsLectures r-graphics
Lectures r-graphicsetyca
 
Exploratory data analysis using r
Exploratory data analysis using rExploratory data analysis using r
Exploratory data analysis using rTahera Shaikh
 
ggplot2: An Extensible Platform for Publication-quality Graphics
ggplot2: An Extensible Platform for Publication-quality Graphicsggplot2: An Extensible Platform for Publication-quality Graphics
ggplot2: An Extensible Platform for Publication-quality GraphicsClaus Wilke
 
Q plot tutorial
Q plot tutorialQ plot tutorial
Q plot tutorialAbhik Seal
 

Similar to Grammar of Graphics - Darya Vanichkina (20)

data-visualization.pdf
data-visualization.pdfdata-visualization.pdf
data-visualization.pdf
 
Ggplot2 cheatsheet-2.1
Ggplot2 cheatsheet-2.1Ggplot2 cheatsheet-2.1
Ggplot2 cheatsheet-2.1
 
Ggplot
GgplotGgplot
Ggplot
 
RBootcamp Day 4
RBootcamp Day 4RBootcamp Day 4
RBootcamp Day 4
 
Data visualization-2.1
Data visualization-2.1Data visualization-2.1
Data visualization-2.1
 
Data Visualization with ggplot2.pdf
Data Visualization with ggplot2.pdfData Visualization with ggplot2.pdf
Data Visualization with ggplot2.pdf
 
VISIALIZACION DE DATA.pdf
VISIALIZACION DE DATA.pdfVISIALIZACION DE DATA.pdf
VISIALIZACION DE DATA.pdf
 
{tidygraph}と{ggraph}によるモダンなネットワーク分析
{tidygraph}と{ggraph}によるモダンなネットワーク分析{tidygraph}と{ggraph}によるモダンなネットワーク分析
{tidygraph}と{ggraph}によるモダンなネットワーク分析
 
Stata cheat sheet: data visualization
Stata cheat sheet: data visualizationStata cheat sheet: data visualization
Stata cheat sheet: data visualization
 
r for data science 2. grammar of graphics (ggplot2) clean -ref
r for data science 2. grammar of graphics (ggplot2)  clean -refr for data science 2. grammar of graphics (ggplot2)  clean -ref
r for data science 2. grammar of graphics (ggplot2) clean -ref
 
R graphics
R graphicsR graphics
R graphics
 
Stata cheat sheet: Data visualization
Stata cheat sheet: Data visualizationStata cheat sheet: Data visualization
Stata cheat sheet: Data visualization
 
MATHS SYMBOLS.pdf
MATHS SYMBOLS.pdfMATHS SYMBOLS.pdf
MATHS SYMBOLS.pdf
 
{tidygraph}と{ggraph}による モダンなネットワーク分析(未公開ver)
{tidygraph}と{ggraph}による モダンなネットワーク分析(未公開ver){tidygraph}と{ggraph}による モダンなネットワーク分析(未公開ver)
{tidygraph}と{ggraph}による モダンなネットワーク分析(未公開ver)
 
R training5
R training5R training5
R training5
 
ML-CheatSheet (1).pdf
ML-CheatSheet (1).pdfML-CheatSheet (1).pdf
ML-CheatSheet (1).pdf
 
Lectures r-graphics
Lectures r-graphicsLectures r-graphics
Lectures r-graphics
 
Exploratory data analysis using r
Exploratory data analysis using rExploratory data analysis using r
Exploratory data analysis using r
 
ggplot2: An Extensible Platform for Publication-quality Graphics
ggplot2: An Extensible Platform for Publication-quality Graphicsggplot2: An Extensible Platform for Publication-quality Graphics
ggplot2: An Extensible Platform for Publication-quality Graphics
 
Q plot tutorial
Q plot tutorialQ plot tutorial
Q plot tutorial
 

More from Darya Vanichkina

Sharing & Reusing Training Materials
Sharing & Reusing Training MaterialsSharing & Reusing Training Materials
Sharing & Reusing Training MaterialsDarya Vanichkina
 
Jumping into digital: Lessons learned while moving live coding machine learni...
Jumping into digital: Lessons learned while moving live coding machine learni...Jumping into digital: Lessons learned while moving live coding machine learni...
Jumping into digital: Lessons learned while moving live coding machine learni...Darya Vanichkina
 
Big data biology for pythonistas: getting in on the genomics revolution
Big data biology for pythonistas: getting in on the genomics revolutionBig data biology for pythonistas: getting in on the genomics revolution
Big data biology for pythonistas: getting in on the genomics revolutionDarya Vanichkina
 
Activity-dependent transcriptional dynamics in mouse primary cortical and hum...
Activity-dependent transcriptional dynamics in mouse primary cortical and hum...Activity-dependent transcriptional dynamics in mouse primary cortical and hum...
Activity-dependent transcriptional dynamics in mouse primary cortical and hum...Darya Vanichkina
 
Comparing the early ciRNA papers
Comparing the early ciRNA papers Comparing the early ciRNA papers
Comparing the early ciRNA papers Darya Vanichkina
 

More from Darya Vanichkina (6)

Sharing & Reusing Training Materials
Sharing & Reusing Training MaterialsSharing & Reusing Training Materials
Sharing & Reusing Training Materials
 
Jumping into digital: Lessons learned while moving live coding machine learni...
Jumping into digital: Lessons learned while moving live coding machine learni...Jumping into digital: Lessons learned while moving live coding machine learni...
Jumping into digital: Lessons learned while moving live coding machine learni...
 
ANDS_TrainingTheTrainer
ANDS_TrainingTheTrainerANDS_TrainingTheTrainer
ANDS_TrainingTheTrainer
 
Big data biology for pythonistas: getting in on the genomics revolution
Big data biology for pythonistas: getting in on the genomics revolutionBig data biology for pythonistas: getting in on the genomics revolution
Big data biology for pythonistas: getting in on the genomics revolution
 
Activity-dependent transcriptional dynamics in mouse primary cortical and hum...
Activity-dependent transcriptional dynamics in mouse primary cortical and hum...Activity-dependent transcriptional dynamics in mouse primary cortical and hum...
Activity-dependent transcriptional dynamics in mouse primary cortical and hum...
 
Comparing the early ciRNA papers
Comparing the early ciRNA papers Comparing the early ciRNA papers
Comparing the early ciRNA papers
 

Recently uploaded

Identifying Appropriate Test Statistics Involving Population Mean
Identifying Appropriate Test Statistics Involving Population MeanIdentifying Appropriate Test Statistics Involving Population Mean
Identifying Appropriate Test Statistics Involving Population MeanMYRABACSAFRA2
 
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdfKantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdfSocial Samosa
 
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改yuu sss
 
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort servicejennyeacort
 
PKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptxPKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptxPramod Kumar Srivastava
 
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...dajasot375
 
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一F sss
 
原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档
原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档
原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档208367051
 
2006_GasProcessing_HB (1).pdf HYDROCARBON PROCESSING
2006_GasProcessing_HB (1).pdf HYDROCARBON PROCESSING2006_GasProcessing_HB (1).pdf HYDROCARBON PROCESSING
2006_GasProcessing_HB (1).pdf HYDROCARBON PROCESSINGmarianagonzalez07
 
INTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTDINTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTDRafezzaman
 
Predicting Salary Using Data Science: A Comprehensive Analysis.pdf
Predicting Salary Using Data Science: A Comprehensive Analysis.pdfPredicting Salary Using Data Science: A Comprehensive Analysis.pdf
Predicting Salary Using Data Science: A Comprehensive Analysis.pdfBoston Institute of Analytics
 
Advanced Machine Learning for Business Professionals
Advanced Machine Learning for Business ProfessionalsAdvanced Machine Learning for Business Professionals
Advanced Machine Learning for Business ProfessionalsVICTOR MAESTRE RAMIREZ
 
ASML's Taxonomy Adventure by Daniel Canter
ASML's Taxonomy Adventure by Daniel CanterASML's Taxonomy Adventure by Daniel Canter
ASML's Taxonomy Adventure by Daniel Cantervoginip
 
RABBIT: A CLI tool for identifying bots based on their GitHub events.
RABBIT: A CLI tool for identifying bots based on their GitHub events.RABBIT: A CLI tool for identifying bots based on their GitHub events.
RABBIT: A CLI tool for identifying bots based on their GitHub events.natarajan8993
 
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样vhwb25kk
 
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...soniya singh
 
Multiple time frame trading analysis -brianshannon.pdf
Multiple time frame trading analysis -brianshannon.pdfMultiple time frame trading analysis -brianshannon.pdf
Multiple time frame trading analysis -brianshannon.pdfchwongval
 
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一办理学位证纽约大学毕业证(NYU毕业证书)原版一比一
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一fhwihughh
 
Defining Constituents, Data Vizzes and Telling a Data Story
Defining Constituents, Data Vizzes and Telling a Data StoryDefining Constituents, Data Vizzes and Telling a Data Story
Defining Constituents, Data Vizzes and Telling a Data StoryJeremy Anderson
 

Recently uploaded (20)

Identifying Appropriate Test Statistics Involving Population Mean
Identifying Appropriate Test Statistics Involving Population MeanIdentifying Appropriate Test Statistics Involving Population Mean
Identifying Appropriate Test Statistics Involving Population Mean
 
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdfKantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
 
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
 
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
 
PKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptxPKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptx
 
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
 
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
 
原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档
原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档
原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档
 
2006_GasProcessing_HB (1).pdf HYDROCARBON PROCESSING
2006_GasProcessing_HB (1).pdf HYDROCARBON PROCESSING2006_GasProcessing_HB (1).pdf HYDROCARBON PROCESSING
2006_GasProcessing_HB (1).pdf HYDROCARBON PROCESSING
 
INTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTDINTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTD
 
Predicting Salary Using Data Science: A Comprehensive Analysis.pdf
Predicting Salary Using Data Science: A Comprehensive Analysis.pdfPredicting Salary Using Data Science: A Comprehensive Analysis.pdf
Predicting Salary Using Data Science: A Comprehensive Analysis.pdf
 
Advanced Machine Learning for Business Professionals
Advanced Machine Learning for Business ProfessionalsAdvanced Machine Learning for Business Professionals
Advanced Machine Learning for Business Professionals
 
ASML's Taxonomy Adventure by Daniel Canter
ASML's Taxonomy Adventure by Daniel CanterASML's Taxonomy Adventure by Daniel Canter
ASML's Taxonomy Adventure by Daniel Canter
 
RABBIT: A CLI tool for identifying bots based on their GitHub events.
RABBIT: A CLI tool for identifying bots based on their GitHub events.RABBIT: A CLI tool for identifying bots based on their GitHub events.
RABBIT: A CLI tool for identifying bots based on their GitHub events.
 
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
 
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
 
Multiple time frame trading analysis -brianshannon.pdf
Multiple time frame trading analysis -brianshannon.pdfMultiple time frame trading analysis -brianshannon.pdf
Multiple time frame trading analysis -brianshannon.pdf
 
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
 
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一办理学位证纽约大学毕业证(NYU毕业证书)原版一比一
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一
 
Defining Constituents, Data Vizzes and Telling a Data Story
Defining Constituents, Data Vizzes and Telling a Data StoryDefining Constituents, Data Vizzes and Telling a Data Story
Defining Constituents, Data Vizzes and Telling a Data Story
 

Grammar of Graphics - Darya Vanichkina

  • 1. The Grammar of Graphics Darya Vanichkina
  • 2. … a personal commentary/interpretation of … http://vita.had.co.nz/papers/layered-grammar.html
  • 3. Grammar “the fundamental principles or rules of an art or science” (OED Online 1989)
  • 6. Components of the grammar Data Variables of interest Aesthetics x-axis y-axis colour fill size labels alpha shape line width line type Geometries point line histogram bar boxplot Facets columns rows Statistics binning smoothing descriptive inferential Coordinates cartesian fixed polar limits Themes Not data, but important for overall impact Adapted from DataCamp
  • 8. Relationship between table (width of top of diamond relative to widest point) and price? diamonds %>% ggplot(aes(x = table, y = price)) + geom_point()
  • 9. Relationship between carat (weight) and price? diamonds %>% ggplot(aes(x = carat, y = price)) + geom_point()
  • 10. Relationship between carat (weight) and price? [How many diamonds?] diamonds %>% ggplot(aes(x = carat, y = price)) + geom_point(alpha = 0.1)
  • 11. Relationship between carat (weight) and price? [Lots of small diamonds!] diamonds %>% ggplot(aes(x = carat, y = price)) + geom_hex()
  • 12. Components of the grammar Data Variables of interest Aesthetics x-axis y-axis colour fill size labels alpha shape line width line type Geometries point line histogram bar boxplot Facets columns rows Statistics binning smoothing descriptive inferential Coordinates cartesian fixed polar limits Themes Not data, but important for overall impact Adapted from DataCamp
  • 13. Relationship between clarity and price? Worse ——————————————————> Better diamonds %>% ggplot(aes(x = clarity, y = price)) + geom_boxplot()
  • 14. How many diamonds with each clarity? Worse ——————————————————> Better diamonds %>% ggplot(aes(x = clarity, fill = clarity)) + geom_bar()
  • 15. Relationship between cut and price? Worse ——————————————————> Better diamonds %>% ggplot(aes(x = cut, y = price)) + geom_boxplot()
  • 16. Relationship between cut and price? Worse ——————————————————> Better diamonds %>% ggplot(aes(x = cut, y = price)) + geom_boxplot()
  • 17. Relationship between cut and price? Worse ——————————————————> Better diamonds %>% ggplot(aes(x = cut, y = price)) + geom_violin()
  • 18. Components of the grammar Data Variables of interest Aesthetics x-axis y-axis colour fill size labels alpha shape line width line type Geometries point line histogram bar boxplot Facets columns rows Statistics binning smoothing descriptive inferential Coordinates cartesian fixed polar limits Themes Not data, but important for overall impact Adapted from DataCamp
  • 19. Relationship between price and carat grouped by cut and clarity? diamonds %>% ggplot(aes(x = carat, y = price)) + geom_point() + facet_grid(cut~clarity)
  • 20. Relationship between price and carat grouped by cut and clarity (with trend)? diamonds %>% ggplot(aes(x = carat, y = price)) + geom_point() + facet_grid(cut~clarity) + geom_smooth(method = "lm")
  • 21. Relationship between table parameter (width of top of diamond relative to widest point, converted on the fly into a factor) and price, faceted by cut? diamonds %>% mutate(TableIsSmall = factor(table < median(table)))%>% ggplot(aes(x = carat, y = price, colour = TableIsSmall)) + geom_point() + facet_grid(.~cut)
  • 22. Relationship between table parameter (width of top of diamond relative to widest point, converted on the fly into a factor) and price, faceted by cut? diamonds %>% mutate(TableIsSmall = factor(table < median(table)))%>% ggplot(aes(x = carat, y = price, colour = TableIsSmall)) + geom_point() + facet_grid(.~cut) Most diamonds with a small table (TableIsBig == FALSE) have an “Ideal” cut…
  • 23. Components of the grammar Data Variables of interest Aesthetics x-axis y-axis colour fill size labels alpha shape line width line type Geometries point line histogram bar boxplot Facets columns rows Statistics binning smoothing descriptive inferential Coordinates cartesian fixed polar limits Themes Not data, but important for overall impact Adapted from DataCamp
  • 24. Relationship between cut and price? Worse ——————————————————> Better diamonds %>% ggplot(aes(x = cut, y = price)) + geom_boxplot()
  • 25. Relationship between cut and price? [log-scale] Worse ——————————————————> Better diamonds %>% ggplot(aes(x = cut, y = price)) + geom_boxplot()+ scale_y_log10() Worse ——————————————————> Better diamonds %>% ggplot(aes(x = cut, y = price)) + geom_boxplot()
  • 26. How many diamonds with each clarity? Worse ——————————————————> Better diamonds %>% ggplot(aes(x = clarity, fill = clarity))+geom_bar()
  • 27. How many diamonds with each clarity? Worse ——————————————————> Better diamonds %>% ggplot(aes(x = clarity, fill = clarity))+geom_bar() diamonds %>% ggplot(aes(x = clarity, fill = clarity)) +geom_bar(width=1) + coord_polar()
  • 28. Components of the grammar Data Variables of interest Aesthetics x-axis y-axis colour fill size labels alpha shape line width line type Geometries point line histogram bar boxplot Facets columns rows Statistics binning smoothing descriptive inferential Coordinates cartesian fixed polar limits Themes Not data, but important for overall impact Adapted from DataCamp
  • 29. Finishing up with the diamonds data diamonds %>% ggplot(aes(x = carat, y = price)) + geom_point() + facet_grid(cut~clarity, scales = "free_x") + geom_smooth(method = “lm") diamonds %>% ggplot(aes(x = cut, y = price)) + geom_boxplot()+ scale_y_log10()
  • 30. Finishing up with the diamonds data diamonds %>% ggplot(aes(x = carat, y = price)) + geom_point() + facet_grid(cut~clarity, scales = "free_x") + geom_smooth(method = "lm") + theme_bw() diamonds %>% ggplot(aes(x = cut, y = price)) + geom_boxplot()+ scale_y_log10() + theme_minimal()
  • 31. Just changing the themediamonds %>% ggplot(aes(x = carat, y = price)) + geom_point(col = "red") + facet_grid(cut~clarity, scales = "free_x") + geom_smooth(method = "lm") + theme_black() diamonds %>% ggplot(aes(x = cut, y = price)) + geom_boxplot()+ scale_y_log10() + theme_dark()
  • 32. The Grammar of Graphics Data Variables of interest Aesthetics x-axis y-axis colour fill size labels alpha shape line width line type Geometries point line histogram bar boxplot Facets columns rows Statistics binning smoothing descriptive inferential Coordinates cartesian fixed polar limits Themes Not data, but important for overall impact
  • 35. The Grammar of Graphics Data Variables of interest Aesthetics x-axis y-axis colour fill size labels alpha shape line width line type Geometries point line histogram bar boxplot Facets columns rows Statistics binning smoothing descriptive inferential Coordinates cartesian fixed polar limits Themes Not data, but important for overall impact https://github.com/dvanic/18ResBazPresentation