GGPLOT IN PYTHON
-By Sarah Masud
INTRODUCTION:
This presentation covers the basic functions of ggplot in Python. It helps
beginners understand what the functions mean and how to use them.
SELF HELP: If you don’t remember a function or wish to know more about it, you can use the
help facility in IPython by typing the function name followed by a ? (in plain Python, call help() on the function).
EXAMPLE: (input and output appear as screenshots on the original slide)
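As a rough illustration of the self-help tip, the sketch below uses only the standard library. The choice of `len` as the function to look up is arbitrary, and `pydoc.render_doc` is used here only so that the help text can be captured as a string.

```python
import pydoc

# Interactive use: help(len) prints the documentation to the console.
# In IPython or Jupyter, typing `len?` shows similar text.

# pydoc.render_doc returns the documentation as a string instead of printing it.
doc_text = pydoc.render_doc(len, renderer=pydoc.plaintext)
print(doc_text)
```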
PREREQUISITES INSTALLED:
● pip
● easy_install
● matplotlib
● pandas
● numpy
● scipy
● statsmodels
INSTALL ggplot UNDER PYTHON:
FOR LINUX
Method 1: pip install ggplot
Method 2: pip install git+git://github.com/yhat/ggplot.git
SOURCE OF DATA:
The Big Diamonds data set is used throughout the presentation. You can
download the data set from:
https://github.com/SolomonMg/diamonds-data
HOW TO OBTAIN THE CSV:
METHOD 1: Read the RDA file in R and write it back as CSV.
METHOD 2: Use rpy2.
WHAT IS ggplot?
Created by H. Wickham, ggplot provides an easy interface to generate state-of-the-art visualizations.
Originally written for R, its success led to a port for Python as well.
COMPONENTS OF ggplot:
● ggplot API- Used to implement the plots.
● Data- Supplied as data frames, as in pandas.
● Aesthetics- How columns of the data map to visual properties such as the axes, color, and size.
● Layer- What information is drawn on top of the basic plot.
TIME TAKEN TO EXECUTE A FUNCTION:
Source: http://stackoverflow.com/questions/6786990/find-out-time-it-took-for-a-python-script-to-complete-execution
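The code from the original slide is not reproduced here. A minimal sketch of one way to time a function call with the standard library follows; the helper name `timed` is hypothetical, and `time.perf_counter` is a choice made for this sketch rather than necessarily what the linked answer uses.

```python
import time

def timed(func, *args, **kwargs):
    """Call func and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = func(*args, **kwargs)
    elapsed = time.perf_counter() - start
    return result, elapsed

# Example: time a simple summation.
total, seconds = timed(sum, range(1_000_000))
print(f"sum = {total}, took {seconds:.4f} s")
```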
INPUT THE DATA:
Import necessary packages.
Read Data:
EXPLORE THE DATA:
1. len()- Number of rows in the dataset
2. columns- Names of the columns (a DataFrame attribute, not a function call).
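The steps above (import, read, count rows, list columns) might look like the following sketch with pandas. The rows are made up for illustration, and the file name mentioned in the comment is an assumption; only the column names follow the list on the next slide.

```python
import io
import pandas as pd

# Made-up rows for illustration only; the real file has many more rows.
# With the real data you would call pd.read_csv("diamonds.csv") instead
# (the file name is an assumption).
sample_csv = io.StringIO(
    "carat,cut,color,clarity,price\n"
    "0.30,Ideal,E,VS1,700\n"
    "0.71,Good,H,SI1,2500\n"
    "1.01,Ideal,G,VS2,6100\n"
)
diamonds = pd.read_csv(sample_csv)

print(len(diamonds))            # number of rows
print(list(diamonds.columns))   # column names
```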
WHAT DO THE COLUMNS CONTAIN:
1. Carat- Weight of the diamond (1 carat=0.2g)
2. Cut- Quality of cut
3. Color- Color of diamond (J-worst D-best)
4. Clarity- A measure of how clear the diamond is.
5. Cert- The level of certification granted.
6. x- Length in mm.
7. y- Breadth in mm.
8. z- Height in mm.
9. Measurement- Volume in terms of x*y*z.
10. Table- Width of top of diamond relative to widest point.
11. Depth- Numerically = (2*z) /(x+y)
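The derived quantities in this list can be checked with a few lines of plain Python. This is only a sketch of the formulas as stated on the slide; the function names and the sample dimensions are hypothetical.

```python
def carat_to_grams(carat):
    """Convert carat weight to grams (1 carat = 0.2 g)."""
    return carat * 0.2

def volume(x, y, z):
    """Volume in cubic mm, approximated as x * y * z."""
    return x * y * z

def depth_ratio(x, y, z):
    """Depth as defined on the slide: (2 * z) / (x + y)."""
    return (2 * z) / (x + y)

# Made-up dimensions (mm) for a hypothetical 1-carat stone.
x, y, z = 6.4, 6.5, 4.0
print(carat_to_grams(1.0))
print(volume(x, y, z))
print(depth_ratio(x, y, z))
```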
3. head()/tail()- To know the first few & last few values, respectively.
NOTE: The dataset contains both quantitative (numeric) and qualitative fields.
4. Random selection- To see data values at random.
5. describe()- Gives summary statistics (count, mean, standard deviation, quartiles) of the numeric fields.
NOTE: The means of x and y are approximately the same. Do diamonds have proportionate length and breadth?
7. unique()- To know the distinct values (one or many) that a column takes.
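A short sketch of these exploration calls with pandas is given below. The data frame holds made-up rows, and using `sample` for the random selection is an assumption about how the slide's step 4 was done.

```python
import pandas as pd

# Made-up rows for illustration only.
diamonds = pd.DataFrame({
    "carat": [0.30, 0.71, 1.01, 0.52, 1.50],
    "cut": ["Ideal", "Good", "Ideal", "V.Good", "Good"],
    "price": [700, 2500, 6100, 1500, 11000],
})

print(diamonds.head(2))                      # first rows
print(diamonds.tail(2))                      # last rows
print(diamonds.sample(n=2, random_state=0))  # random rows, seeded for repeatability
summary = diamonds.describe()                # summary statistics of numeric columns
print(summary)
unique_cuts = diamonds["cut"].unique()       # distinct values in one column
print(unique_cuts)
```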
PREPARING DATA:
1. Check for null values.
2. Check for zero price values.
3. Obtain clean data set by removing null values.
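These three checks could be sketched with pandas as follows. The rows are made up, and dropping zero-price rows in addition to nulls is an optional extra step assumed here, since the slide only mentions removing null values.

```python
import pandas as pd

# Made-up rows: one missing price and one zero price.
diamonds = pd.DataFrame({
    "carat": [0.30, 0.71, 1.01, 0.52],
    "price": [700.0, None, 0.0, 1500.0],
})

null_counts = diamonds.isnull().sum()               # nulls per column
zero_prices = int((diamonds["price"] == 0).sum())   # rows with a zero price

# Drop rows with any null value, then (optionally) rows with a zero price.
clean = diamonds.dropna()
clean = clean[clean["price"] > 0]
print(null_counts, zero_prices, len(clean))
```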
EVALUATION OF DATA:
1. New Statistical Information:
2. Correlations
3. The plot of density of diamond
NOTE:
stats.linregress (from scipy) calculates the components of the line of best fit of the form y=mx+c, where m=slope
and c=y-intercept. The r_value is the correlation coefficient, the p_value is the p-value for the test that the
slope is zero (often very close to zero for a large data set), and std_err is the standard error of the estimated slope.
ggplot(dataset, aes(x, y))- Gives us a blank coordinate system.
geom_point- Plots the dataset on the blank plot.
scale_x_continuous / scale_y_continuous- Used to give the name and range of an axis.
geom_abline- Draws a line of the form y=mx+c.
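To make the meaning of the slope, intercept, and r_value concrete, the sketch below computes them by hand with the standard library instead of calling scipy. The function name `least_squares` and the sample points are hypothetical; it does not compute the p-value or standard error.

```python
import math

def least_squares(xs, ys):
    """Return (slope, intercept, r) for the best-fit line y = m*x + c."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)   # correlation coefficient
    return slope, intercept, r

# Made-up points lying exactly on y = 2x + 1.
m, c, r = least_squares([1, 2, 3, 4], [3, 5, 7, 9])
print(m, c, r)
```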
HOW IT WORKS:
1. ggplot is invoked.
2. A blank coordinate system with labeled axes is set up.
3. The points are plotted.
4. The axes are redefined and cropped.
5. The line is drawn as another layer on top of the points.
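The layering described above might be written roughly as below. This is a sketch that assumes a working install of the yhat ggplot package; the function name `density_plot` and the column names `volume` and `carat` are hypothetical, and the axis rescaling step is left out.

```python
def density_plot(df, slope, intercept):
    """Scatter plot of two columns with a fitted line drawn on top."""
    # Imported here so this module still loads where ggplot is not installed.
    from ggplot import ggplot, aes, geom_point, geom_abline

    return (
        ggplot(df, aes(x="volume", y="carat"))            # coordinate system
        + geom_point()                                     # layer: the points
        + geom_abline(slope=slope, intercept=intercept)    # layer: y = m*x + c
    )
```

Each `+` adds one layer, which is why the line ends up drawn over the points.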
PRICE EVALUATION:
Diamonds are expensive!
Let us try to map what factors make them so.
PRICE VS BREADTH
NOTE:
labs- Used to label the graph and the axes. xlab and ylab can also be used separately.
stat_smooth- Provides a mechanism to plot the line of regression and helps determine the relation between the variables.
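A hedged sketch of such a plot follows, again assuming a working install of the yhat ggplot package. The function name `price_vs_breadth`, the label text, and the use of column `y` for breadth (as in the column list earlier) are assumptions made for this sketch.

```python
def price_vs_breadth(df):
    """Scatter plot of price against breadth with a smoothed trend line."""
    # Imported here so this module still loads where ggplot is not installed.
    from ggplot import ggplot, aes, geom_point, stat_smooth, xlab, ylab, ggtitle

    return (
        ggplot(df, aes(x="y", y="price"))
        + geom_point()
        + stat_smooth(color="black")      # trend line over the points
        + xlab("Breadth (mm)")
        + ylab("Price")
        + ggtitle("Price vs breadth")
    )
```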
PRICE VS LENGTH
PRICE VS HEIGHT
PRICE VS VOLUME
NOTE:
Over-plotting hides the number of points in each neighbourhood. geom_jitter spreads the points out with a little random noise, and we can reduce the problem further by making the points more transparent.
PRICE VS TABLE
NOTE:
A horizontal line of regression means that the value of f(x) can be calculated without much consideration of the value of x.
Thus, price is not considerably affected by table and can be estimated without taking table into account.
PRICE VS DEPTH
PRICE VS CARAT
NOTE:
A quadratic regression curve signifies that the price depends on the value of carat. But is it only carat? Let us look more closely.
PRICE VS CARAT
NOTE:
Since the original price vs carat graph was not detailed enough, we narrow the scale down to a particular section. In the next slide we narrow it down further.
NOTE:
We see that the plot becomes vertical, i.e. for the same value of carat we have varying price. Surely some other factor is controlling it.
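The same observation can be checked numerically by grouping prices by carat and looking at the spread within each group. The sketch below uses only the standard library and made-up (carat, price) pairs.

```python
from collections import defaultdict

# Made-up (carat, price) pairs for illustration only.
rows = [(1.0, 4200), (1.0, 5100), (1.0, 6800), (1.5, 9000), (1.5, 12500)]

prices_by_carat = defaultdict(list)
for carat, price in rows:
    prices_by_carat[carat].append(price)

# Price spread (max - min) at each carat value.
spread = {carat: max(p) - min(p) for carat, p in prices_by_carat.items()}
print(spread)
```

A large spread at a fixed carat is the numeric counterpart of the vertical band in the plot.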
NOTE:
This plots the price against carat with respect to the cut. We see that for a given carat value the quality of the cut changes the price.
Differentiate price VS carat with respect to cut.
Differentiate price VS carat with respect to color.
Differentiate price VS carat with respect to clarity.
FACETS
NOTE:
Facets- They show the same set of data split by a given factor. This helps us determine which value of the factor affects f(x) the most.
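Coloring by a factor and faceting by it could be combined roughly as in the sketch below, assuming a working install of the yhat ggplot package. The function name `price_vs_carat_by` is hypothetical; `factor` is expected to be a column name such as "cut", "color", or "clarity".

```python
def price_vs_carat_by(df, factor):
    """Price vs carat, colored and split into panels by one categorical column."""
    # Imported here so this module still loads where ggplot is not installed.
    from ggplot import ggplot, aes, geom_point, facet_wrap

    return (
        ggplot(df, aes(x="carat", y="price", color=factor))
        + geom_point()
        + facet_wrap(factor)   # one panel per value of the factor
    )
```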
This presentation is a part of the larger pool of learning resources provided by
DecisionStats.org
FURTHER SOURCES:
1. https://decisionstats.org
2. https://github.com/SolomonMg/diamonds-data
3. https://themessier.wordpress.com/2015/06/17/ggplot-in-python-part-1
4. http://nbviewer.ipython.org/gist/sara-02/d5a61234ef32e60bddda
5. http://nbviewer.ipython.org/gist/sara-02/d38da4a2023da169ac13
6. https://gist.github.com/sara-02/4eb520fd1b82521e8a11
CITATION:
1. http://ggplot.yhathq.com/
2. https://github.com/yhat/ggplot
FOR QUERIES:
info@decisionstats.org
sarahmasud02@gmail.com
THANK YOU
