Created by The Curiosity Bits Blog (curiositybits.com)
Download the Python code used in the tutorial
Code provided by Dr. Gregory D. Saxton
Mining Twitter User Profile on Python
1
Prerequisites
Setting up API keys: pg. 4-6
Installing necessary Python libraries: pg. 7-8
Creating a list of Twitter screen names: pg. 9
Setting up a SQLite database to store Twitter data: pg. 10-14
If you are a Python newbie, though, let's start with the very basics.
2
We assume you are a Python newbie, so let's start with the very basics.
• Choosing the right Python platform: Python is a programming
language, and you can use different software packages to write, edit,
and run Python code. We chose Anaconda, which is free to download;
the Python version used here is 2.7.
• Once you install Anaconda, you can experiment with Python code in
Spyder.
3
Setting up API keys
• We need keys to get Twitter data through the Twitter API
(https://dev.twitter.com/). You need four: API Key, API Secret, Access token,
and Access token secret.
• First, go to https://dev.twitter.com/ and sign in to your Twitter account. Go
to the My applications page to create an application.
4
Setting up API keys
• Enter any name that makes sense to you.
• Enter any text that makes sense to you.
• You can enter any legitimate URL; here, I put in the URL of my institution.
• Same as above: you can enter any legitimate URL; here, I put in the URL of
my institution.
5
Setting up API keys
• After creating the app, go to the API Keys page, scroll down to the
bottom, and click Create my access token. Wait a few minutes and
refresh the page, and you will see all your keys.
• You need the API Key, API Secret, Access token, and Access token secret.
6
Installing necessary Python libraries
Think of Python libraries as apps that run on top of Python, much as
apps run on your operating system. To use our code, you need the
following libraries:
• simplejson (https://pypi.python.org/pypi/simplejson)
• sqlite3 (http://sqlite.org/), which ships with Python and needs no separate installation
• SQLAlchemy (http://www.sqlalchemy.org/)
• Twython (https://twython.readthedocs.org/en/latest/index.html)
7
Installing necessary Python libraries
To install the libraries on Windows, open the Start menu, type CMD, and run the Command
Prompt as administrator. At the prompt, type pip install followed by the name of the
Python library. For example, to install Twython, type pip install twython and press
Enter. Use this procedure to install each library that is not already present.
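Before moving on, it can help to confirm that the libraries are actually importable. The following is a minimal sketch, not part of the tutorial's code: the function name missing_libraries is made up for illustration, and the import names are assumed to match the library names listed above.

```python
import importlib

# Import names of the libraries listed in this tutorial.
REQUIRED = ["simplejson", "sqlite3", "sqlalchemy", "twython"]


def missing_libraries(names):
    """Return the names that cannot be imported in this Python environment."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    for name in missing_libraries(REQUIRED):
        print("Missing: run 'pip install " + name + "'")
```

If the script prints nothing, every library on the list was found.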
8
Creating a list of Twitter screen names
• Our Python code gathers profile information for multiple Twitter
users, so first let's create a list of users. The list should be in
.csv format and contain three columns (to match the configuration
in our Python code). Specifically, it looks like this:
• The first column lists sequential numbers.
• The second column lists the Twitter screen names you are interested in.
• For the third column, I entered 1 throughout, but you can leave it blank.
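The list can also be produced with a few lines of Python instead of a spreadsheet. This is a sketch under stated assumptions: the file name accounts.csv, the function name write_accounts_csv, the example screen names, and numbering that starts at 1 are all illustrative choices, and the code is written for Python 3.

```python
import csv


def write_accounts_csv(path, screen_names, user_type="1"):
    """Write rows of (sequential number, screen name, user type) to a CSV file."""
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle)
        for number, name in enumerate(screen_names, start=1):
            writer.writerow([number, name, user_type])


if __name__ == "__main__":
    # Placeholder screen names; replace them with the accounts you want to study.
    write_accounts_csv("accounts.csv", ["example_user_a", "example_user_b"])
```

Each row then holds the three columns described above, with no header row.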
9
Setting up a SQLite database to store Twitter data
You need storage for the incoming data from the Twitter API. That
is what databases are for. We use SQLite, a lightweight relational
database management system (RDBMS) that keeps a whole database
in a single file and is queried with SQL, the standard language for
relational databases. Python talks to SQLite through the sqlite3
library, which comes with Python. On top of that, you can
download a database browser to view and edit the database
much like an Excel file.
Go to http://sqlitebrowser.sourceforge.net/ and download
SQLite Database Browser. It allows you to view and edit
SQLite databases.
10
Setting up a SQLite database to store Twitter data
Once you have downloaded the files, run the following file.
11
Setting up a SQLite database to store Twitter data
Now we need to import the Twitter user list into a SQLite database. To do that,
create a new database. Remember the database file name, because we need to
write it into the Python code.
SQLite database files commonly use the .sqlite extension. To prevent complications
later, add the .sqlite extension when you save a file in SQLite Database Browser.
12
Setting up a SQLite database to store Twitter data
Stay in the database file you just created.
Choose File - Import - Table From CSV File and import the
.csv file you saved. Name the imported table
accounts; this table name corresponds to the
one we will use in the Python code. After you click
Create, the CSV list is loaded into the
database, and you can view it under Browse
Data. Lastly, remember to save the database.
13
Setting up a SQLite database to store Twitter data
Now we need to modify the imported table.
Go to Edit - Modify Tables, then use Edit field
to change the column names. To match our
Python code, name the first column rowid
with field type Integer, the second column
screen_name with field type String, and the
third column user_type, also String. In the end, the
database table should be defined as shown in the screenshot.
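The same table can be built without the database browser. The sketch below uses Python's built-in sqlite3 library to create the accounts table and load the user list into it. It is an illustration rather than the tutorial's own code: the function name import_accounts and the file names are hypothetical, the first column is written as rowid on the assumption that this is the name the Python code expects, and SQLite's TEXT type stands in for String.

```python
import csv
import sqlite3


def import_accounts(conn, csv_path):
    """Load the three-column user list into a table named accounts."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS accounts "
        "(rowid INTEGER, screen_name TEXT, user_type TEXT)"
    )
    with open(csv_path, newline="") as handle:
        # Keep the first three fields of every non-empty line.
        rows = [row[:3] for row in csv.reader(handle) if row]
    conn.executemany("INSERT INTO accounts VALUES (?, ?, ?)", rows)
    conn.commit()
    return len(rows)


if __name__ == "__main__":
    # Hypothetical file names; use the ones you chose earlier.
    connection = sqlite3.connect("twitter_data.sqlite")
    print(import_accounts(connection, "accounts.csv"))
    connection.close()
```

The function returns the number of rows imported, which should equal the number of users on the list.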
14
Now, moving on to the actual Python code…
Download the Python code and open it in Anaconda.
15
There are only a few places you need to change, but let's
walk through the code first…
The first block of code imports the necessary Python libraries.
Make sure you have installed all of them.
16
The second block is where you enter the keys we obtained at the
beginning. Just copy and paste each key inside the quotation marks:
• API Key
• API secret
• Access token
• Access token secret
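In code, this block might look roughly like the sketch below. The variable names and the helper functions unfilled_keys and make_client are hypothetical and may differ from those in the downloaded file; the only library call used is Twython's constructor, which takes the four keys in this order.

```python
# Placeholder values: paste your own keys between the quotation marks.
APP_KEY = "YOUR_API_KEY"
APP_SECRET = "YOUR_API_SECRET"
OAUTH_TOKEN = "YOUR_ACCESS_TOKEN"
OAUTH_TOKEN_SECRET = "YOUR_ACCESS_TOKEN_SECRET"


def unfilled_keys(keys):
    """Return the names of keys that are empty or still hold a placeholder."""
    return [name for name, value in keys.items()
            if not value or value.startswith("YOUR_")]


def make_client(keys):
    """Build a Twython client; requires the twython library to be installed."""
    from twython import Twython  # imported here so the check above works without it
    return Twython(keys["APP_KEY"], keys["APP_SECRET"],
                   keys["OAUTH_TOKEN"], keys["OAUTH_TOKEN_SECRET"])
```

Calling unfilled_keys before make_client catches the common mistake of running the code with a key left unpasted.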
17
The third block defines the columns of the SQLite database table. For now, we do not
need to edit anything here.
18
The fourth block is where we ask the Python code to get Twitter user profile
information based on the list of users already saved in the SQLite database. Here, you will
see that the table name and column names correspond to the ones we previously
saved in SQLite.
19
The fifth block is where we make the specific request through the Twitter API to
get data.
Here, we ask Python to get one recent status from each listed user. The
response also includes the user's profile information. We will
discuss what profile information is available later on.
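The request described here can be sketched as a small function. This is an assumption about how such a request may be written, not a copy of the tutorial's code: it relies on Twython's get_user_timeline method with the count parameter set to 1, and the function name fetch_latest_status is made up for illustration.

```python
def fetch_latest_status(client, screen_name):
    """Return the most recent status for screen_name, or None if there is none."""
    # client is expected to be a Twython instance created with your four keys.
    statuses = client.get_user_timeline(screen_name=screen_name, count=1)
    return statuses[0] if statuses else None
```

The returned status is a dictionary; the profile information sits under its "user" key.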
20
The raw output from the Twitter API is in JSON format. JSON is a standardized way of
storing information. Now we need to map the information in JSON format to the
columns of the database table. Notice that each column in the database represents a Twitter
output variable.
E.g., a Twitter user's profile description is
stored as description under user in the
JSON. This line of code maps the
profile description in the JSON to the
database column named
from_user_description.
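To make the mapping concrete, here is a minimal sketch that turns the user part of one status into a dictionary keyed by database column names. The function name profile_columns is hypothetical, and the tutorial's code may perform the mapping differently; the column names follow the from_user_ pattern used in this tutorial.

```python
def profile_columns(status):
    """Map the user part of one status (a dict parsed from JSON) to column names."""
    user = status.get("user", {})
    fields = ["screen_name", "followers_count", "friends_count", "listed_count",
              "favourites_count", "statuses_count", "description", "location",
              "created_at"]
    # A field that is absent from the JSON becomes None (NULL in the database).
    return {"from_user_" + field: user.get(field) for field in fields}
```

For example, the value stored under user and then description in the JSON ends up under the key from_user_description.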
21
You need to change the file path and file name here
(RECOMMENDED).
If the Python file and your SQLite database are in the
same folder, just paste your database file name here.
22
Now you are ready to run the code. Go to Run and choose Execute in a new dedicated
Python interpreter. The first option, Execute in current Python or IPython interpreter,
does not work on my end, but it may work on your computer.
23
Now look at the right-hand pane in Anaconda.
Oops, looks like I am getting error messages!
ERRORS!!
Don't panic! It's likely you will hit roadblocks
when you run Python code, so it is important
to learn to debug.
This error is likely because I saved the
Python file in a folder that is not a default
Python folder.
But what is a default Python folder?
24
The simple way to find your default Python folder:
• On a Windows machine, open the Start menu, right-click Computer,
and choose Properties.
25
Folders listed here are your default Python folders.
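Python can also report its own folders. The sketch below assumes that the "default Python folders" correspond to the folders Python searches for libraries (sys.path); that mapping is an assumption, not something the slides state. It also prints the current working directory, because a database file name given without a path is looked up there. The function name python_folders is made up for illustration.

```python
import os
import sys


def python_folders():
    """Return the working directory and the folders Python searches for modules."""
    return {"working_directory": os.getcwd(), "search_path": list(sys.path)}


if __name__ == "__main__":
    folders = python_folders()
    print("Working directory: " + folders["working_directory"])
    for folder in folders["search_path"]:
        print("Search path entry: " + folder)
```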
26
In my case, C:\Anaconda\Lib\site-packages is my default Python folder. So I moved the
Python code there, edited the file path in the code, and ran it. There you go: the code is
running and getting what we want! If you check the database file, you will see that a
new table named typhoon has been created (you can change the table name in the Python
code); it includes the listed users' recent tweets and profile information.
27
Oops! Error again!
The Twitter API has a rate limit.
With the version of the Twitter API used in our
Python code, you can get roughly 300 users per
15 minutes. Once you hit the limit, you
will see the error message shown in the
screenshot.
There are two ways to deal with the
restriction:
1. Wait 15 minutes before the next run.
2. Create multiple Twitter apps to get
multiple sets of keys. Once you use up the quota
in one run, paste in a new set of keys to start a
new run.
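The first option, waiting, can be automated. The following is a sketch of one way to do it and is not part of the tutorial's code; the function name call_with_wait and its parameters are hypothetical. The caller supplies a check that recognizes a rate-limit error, which keeps the sketch independent of any particular library.

```python
import time


def call_with_wait(request, is_rate_limit_error, wait_seconds=15 * 60,
                   max_attempts=3, sleep=time.sleep):
    """Call request(); on a rate-limit error, wait and try again."""
    for attempt in range(max_attempts):
        try:
            return request()
        except Exception as error:
            # Re-raise anything that is not a rate-limit error, or the final failure.
            if not is_rate_limit_error(error) or attempt == max_attempts - 1:
                raise
            sleep(wait_seconds)
```

With Twython, the check could be written as lambda error: isinstance(error, TwythonRateLimitError), since Twython raises that exception when the limit is reached.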
28
If you put 0 here, the code starts with the user listed in the first row.
Because we will hit the rate limit, you need to run the code multiple times
to finish crawling all users on the list. Make sure to change the starting
row number each time!
For example, if the first run gets user 0 through user 150 and then hits the rate
limit, put 151 in the second run to start with the next user on
the list.
29
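How a starting row number selects the remaining users can be sketched with a plain SQLite query. The tutorial's code may apply the starting row differently; here it is assumed to be a zero-based offset into the accounts table, and the function name screen_names_from is made up for illustration.

```python
import sqlite3


def screen_names_from(conn, start_row=0):
    """Return screen names from the accounts table, skipping the first start_row rows."""
    cursor = conn.execute(
        # LIMIT -1 means "no upper limit" in SQLite; OFFSET skips the finished rows.
        "SELECT screen_name FROM accounts ORDER BY rowid LIMIT -1 OFFSET ?",
        (start_row,),
    )
    return [row[0] for row in cursor]
```

With a start_row of 0 the whole list is returned; with 151 the first 151 users are skipped.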
A list of Twitter output variables
Go to SQLite Database Browser and select the table typhoon (again, this is the name we
gave in the Python code). You will see the output variables as columns.
30
A list of Twitter output variables
Some key variables related to the user profile:
• from_user_screen_name: the user's Twitter screen name
• from_user_followers_count: how many people are following the user
• from_user_friends_count: how many people this user is following
• from_user_listed_count: how many times the user is listed in other users' public
lists
• from_user_favourites_count: how many tweets the user has favorited (liked)
• from_user_statuses_count: how many tweets the user has sent
• from_user_description: the user's profile bio
• from_user_location: the location given in the user's profile
• from_user_created_at: when the account was created
31
A list of Twitter output variables
Choose File - Export - Table as CSV to export the data in .csv format. Make sure to
add the .csv file extension.
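The export can also be scripted. This is a minimal sketch using only Python's built-in sqlite3 and csv libraries, written for Python 3; the function name export_table and the file names are hypothetical, and typhoon is the table name used in this tutorial.

```python
import csv
import sqlite3


def export_table(conn, table, csv_path):
    """Write every row of a table, with a header row, to a CSV file."""
    # Table names cannot be passed as SQL parameters, so quote the identifier.
    cursor = conn.execute('SELECT * FROM "{}"'.format(table.replace('"', '""')))
    with open(csv_path, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow([column[0] for column in cursor.description])
        writer.writerows(cursor)
    return csv_path


if __name__ == "__main__":
    # Hypothetical file names; use your own database and output names.
    connection = sqlite3.connect("twitter_data.sqlite")
    export_table(connection, "typhoon", "typhoon.csv")
    connection.close()
```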
32
Please send your questions and comments to
weiaixu [at] buffalo dot edu
33

Curiosity Bits Tutorial: Mining Twitter User Profile on Python V2

  • 1. Created by The Curiosity Bits Blog (curiositybits.com) Download the Python code used in the tutorial Codes provided by Dr. Gregory D. Saxton Mining Twitter User Profile on Python 1
  • 2. Prerequisite Setting up API keys: pg.4-6 Installing necessary Python libraries: pg.7-8 Creating a list ofTwitter screen-names: pg.9 Setting up a SQLite Database to storeTwitter data: pg.10-14 But, if you are a Python newbie, so let’s start with the very basics. 2
  • 3. We assume you are a Python newbie, so let’s start with the very basics. • Choosing the right Python platform: Python is a programing language, but you can use different software packages to write, edit and run Python codes. We choose Anaconda which is free to download, and the Python version is 2.7. • Once you install Anaconda, you can play around Python codes in Spyder 3
  • 4. Setting up API keys • We need keys to getTwitter data throughTwitter API (https://dev.twitter.com/).You need: API Key, API Secret, Access token, Access token secret. • First, go to https://dev.twitter.com/, and sign in yourTwitter account. Go to my applications page to create an application. 4
  • 5. Enter any name that makes sense to you Enter any text that makes sense to you you can enter any legitimate URL, here, I put in the URL of my institution. Same as above, you can enter any legitimate URL, here, I put in the URL of my institution. Setting up API keys 5
  • 6. • After creating the app, go to API Keys page, scroll down to the bottom and click Create my access token. Wait for a few minutes and refresh the page, then you get all your keys! Setting up API keys you need API Key, API Secret, Access token, Access token secret. 6
  • 7. Installing necessary Python libraries Think of Python libraries as the apps running on your operating system.To use our code, you need the following libraries: • Simplejson (https://pypi.python.org/pypi/simplejson) • Sqlite3 (http://sqlite.org/) • Sqlalchemy (http://www.sqlalchemy.org/) • Twython (https://twython.readthedocs.org/en/latest/index.html) 7
  • 8. Installing necessary Python libraries To install the libraries, go to the Start menu, type in CMD, and run the CMD file as administrator. At the command prompt, type pip install followed by the name of the Python library. For example, to install Twython, type pip install twython and press Enter. Use this procedure to install all the necessary libraries.
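Before moving on, it can help to confirm that the libraries can actually be imported. The following is a minimal sketch, written for Python 3 rather than the Python 2.7 used in the slides. It assumes the import names simplejson, sqlite3, sqlalchemy, and twython; the helper name missing_libraries is made up for this example.

```python
import importlib.util

# Import names of the libraries the tutorial relies on (assumed names).
REQUIRED = ["simplejson", "sqlite3", "sqlalchemy", "twython"]


def missing_libraries(names=REQUIRED):
    """Return the names from `names` that Python cannot find for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_libraries()
    if missing:
        print("Still to install with pip:", ", ".join(missing))
    else:
        print("All required libraries are available.")
```

Any name the script reports is one that still needs a pip install.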
  • 9. Creating a list of Twitter screen names • Our Python code gathers profile information for multiple Twitter users, so first let’s create a list of users. The list should be in .csv format and contain three columns (in accordance with the configuration in our Python code). Specifically, it looks like this: the first column lists sequential numbers; the second column lists the Twitter screen names you are interested in; for the third column, I entered 1 throughout, but you can leave it blank.
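If you would rather generate the list than type it by hand, the three-column layout described above can be sketched in Python 3 as follows. The screen names are placeholders, the function name write_accounts_csv is hypothetical, and numbering the first column from 1 is an assumption (the slide only says the numbers are sequential).

```python
import csv

# Placeholder screen names, used only to illustrate the layout.
SCREEN_NAMES = ["example_user_a", "example_user_b", "example_user_c"]


def write_accounts_csv(path, screen_names, user_type="1"):
    """Write one row per account: sequence number, screen name, user type."""
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle)
        # Assumption: the sequence starts at 1.
        for number, name in enumerate(screen_names, start=1):
            writer.writerow([number, name, user_type])


if __name__ == "__main__":
    write_accounts_csv("accounts.csv", SCREEN_NAMES)
```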
  • 10. Setting up a SQLite Database to store Twitter data You need storage for the incoming data from the Twitter API. That is what databases are for. We use SQLite, a lightweight relational database management system (RDBMS) that keeps a whole database in a single file and is queried with SQL, the standard query language for relational databases. Python talks to SQLite through the sqlite3 library, which ships with Python, so no separate installation is needed. On top of that, you can download a database browser to view and edit the database much like an Excel file. Go to http://sqlitebrowser.sourceforge.net/ and download SQLite Database Browser.
  • 11. Setting up a SQLite Database to store Twitter data Once you have the files downloaded, run the following file.
  • 12. Setting up a SQLite Database to store Twitter data Now we need to import the Twitter user list into a SQLite database. To do that, create a new database. Remember the database file name, because we need to write it into the Python code. The default file extension for SQLite is .sqlite; to prevent future complications, add the extension .sqlite when you save a file in SQLite Database Browser.
  • 13. Setting up a SQLite Database to store Twitter data Stay on the database file you just created. Go to File-Import-Table From CSV File and import the .csv file you saved. Name the imported table accounts. This table name corresponds to the one we will use in the Python code. After you click Create, the CSV list will be loaded into the database, and you can view it in Browse Data. Lastly, remember to save the database.
  • 14. Setting up a SQLite Database to store Twitter data Now we need to modify the imported table. Go to Edit-Modify Tables, then use Edit field to change the column names. To correspond to our Python code, name the first column rowid, with field type Integer; the second column screen_name, with field type String; and the third user_type, also with field type String. In the end, the database table is defined as shown in the screenshot.
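The same accounts table can also be created without the database browser. This is a rough Python 3 sketch using the built-in sqlite3 module, not the tutorial's own code. The function name load_accounts is hypothetical, the first column name is written here as rowid (adjust it if your copy of the code expects a different name), and TEXT stands in for the browser's String field type.

```python
import csv
import sqlite3


def load_accounts(db_path, csv_path):
    """Create the accounts table and fill it from a three-column CSV file."""
    with open(csv_path, newline="") as handle:
        rows = [(int(r[0]), r[1], r[2]) for r in csv.reader(handle) if r]
    conn = sqlite3.connect(db_path)
    try:
        # Column names follow the slide; the types are an assumption.
        conn.execute(
            "CREATE TABLE IF NOT EXISTS accounts ("
            "rowid INTEGER, screen_name TEXT, user_type TEXT)"
        )
        conn.executemany("INSERT INTO accounts VALUES (?, ?, ?)", rows)
        conn.commit()
    finally:
        conn.close()
    return len(rows)
```

Calling load_accounts("twitter.sqlite", "accounts.csv") would return the number of rows loaded; both file names are examples only.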
  • 15. Now, moving on to the actual Python code… Download the Python code and open it in Anaconda.
  • 16. There are only a few places you need to change, but let’s walk through the code first… The first block of code imports the necessary Python libraries. Make sure you have installed all of them.
  • 17. The second block is where you enter the keys we obtained at the beginning: API Key, API secret, Access token, and Access token secret. Just copy and paste each key inside the quotation marks.
  • 18. The third block is where we define the columns in the SQLite database. For now, we do not need to edit anything here.
  • 19. The fourth block is where we ask the Python code to get Twitter user profile information based on the list of users already saved in the SQLite database. Here you will see that the table name and the column names correspond to the ones we previously saved in SQLite.
  • 20. The fifth block is where we make a specific request through the Twitter API to get data. Here we ask Python to get one recent status from each listed user. This request returns the user’s profile information along with the status. We will discuss what profile information is available later on.
  • 21. The raw output from the Twitter API is in JSON format. JSON is a standardized way of storing information. Now we need to map the information in JSON format to the columns of the database table. Notice that each column in the database represents a Twitter output variable. For example, a Twitter user’s profile description is stored as description under user in JSON. This line of code maps the profile description in JSON to the database column named from_user_description.
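To make the mapping concrete, here is a small Python 3 sketch. The JSON text is a made-up sample shaped like a Twitter status, not real API output, and the function name profile_row is hypothetical. The column names are the ones listed later in this tutorial.

```python
import json

# Made-up sample shaped like a Twitter status object; not real API output.
SAMPLE_STATUS = """
{
  "text": "An example tweet",
  "user": {
    "screen_name": "example_user_a",
    "description": "An example profile bio",
    "followers_count": 10,
    "friends_count": 20
  }
}
"""


def profile_row(status):
    """Map nested fields under "user" to flat database column names."""
    user = status["user"]
    return {
        "from_user_screen_name": user["screen_name"],
        "from_user_description": user["description"],
        "from_user_followers_count": user["followers_count"],
        "from_user_friends_count": user["friends_count"],
    }


row = profile_row(json.loads(SAMPLE_STATUS))
print(row["from_user_description"])  # An example profile bio
```

Each key in the returned dictionary corresponds to one column in the output table, which is the same idea the tutorial code applies to every variable it stores.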
  • 22. You need to change the file path and file name here (RECOMMENDED). If the Python file and your SQLite database are in the same folder, just paste your database name here.
  • 23. Now you are ready to run the code. Go to Run and choose Execute in a new dedicated Python interpreter. The first option, Execute in current Python or IPython interpreter, does not work on my end but may work on your computer.
  • 24. Now, look at the right-side panel in Anaconda. Oops, looks like I am getting error messages! ERRORS!! Don’t panic! It’s likely you will hit roadblocks when you run Python code, so it is important to learn to debug. This error most likely occurs because I saved the Python file in a folder that is not a default Python folder. But what is a default Python folder?
  • 25. The simple way to find your default Python folders is: • On a Windows machine, in the Start menu, right-click Computer and choose Properties.
  • 26. Folders listed here are your default Python folders.
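A related check can be done from inside Python. The sketch below (Python 3) lists both the folders on the system PATH and the folders Python searches when importing modules. Which of the two the screenshot corresponds to is an assumption, and the function name search_folders is made up for this example.

```python
import os
import sys


def search_folders():
    """Return folders on the system PATH and on Python's module search path."""
    system_path = os.environ.get("PATH", "").split(os.pathsep)
    return {"PATH": system_path, "sys.path": list(sys.path)}


if __name__ == "__main__":
    for label, folders in search_folders().items():
        print(label)
        for folder in folders:
            print("   ", folder)
```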
  • 27. In my case, C:\Anaconda\Lib\site-packages is my default Python folder. So I moved the Python code there, edited the file path in the code, and ran it. There you go: the code is running and getting what we want! If you check the database file, you will see that a new table named typhoon has been created (you can change the table name in the Python code), and it includes the listed users’ recent tweets and profile information.
  • 28. Oops! Error again! The Twitter API has a rate limit. With the version of the Twitter API used in our Python code, you can get roughly 300 users per 15 minutes. Once you hit the limit, you will see the error message shown in the screenshot. There are two ways to deal with the restriction: 1. wait 15 minutes before another run; 2. create multiple Twitter apps and get multiple sets of keys. Once you use up the quota in one run, paste in a new set of keys to start a new run!
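The first option, waiting out the 15-minute window, can also be automated. The following is only a sketch of the idea in Python 3, not part of the tutorial code: RateLimitReached is a hypothetical stand-in for whatever rate-limit error your Twitter library raises, and fetch is any function you supply that requests one user.

```python
import time


class RateLimitReached(Exception):
    """Hypothetical stand-in for the error raised when the quota is used up."""


def fetch_with_wait(fetch, screen_name, wait_seconds=15 * 60, sleep=time.sleep):
    """Call fetch(screen_name); on a rate-limit error, wait once and retry."""
    try:
        return fetch(screen_name)
    except RateLimitReached:
        sleep(wait_seconds)  # wait out the 15-minute window
        return fetch(screen_name)
```

The sleep argument is there so the waiting can be replaced with a dummy function when trying the code out.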
  • 29. If you put 0 here, the code starts with the user listed in the first row. Because we will hit the rate limit, you will need to run the code multiple times to finish crawling all users on the list. Make sure to change the starting row number! For example, if the first run gets user (0) through user (150) and then hits the rate limit, put 151 in the second run to start with the next user on the list.
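The arithmetic behind the starting row number can be sketched as follows (Python 3). The helper names and the list of screen names are made up for illustration; rows are counted from 0, as in the example above.

```python
def rows_from(accounts, start_row):
    """Return the accounts still to process, counting rows from 0."""
    return accounts[start_row:]


def next_start_row(start_row, processed):
    """Row to start from on the next run, after `processed` users succeeded."""
    return start_row + processed


accounts = ["user_%d" % i for i in range(300)]  # placeholder screen names
processed_in_first_run = 151                    # users 0 through 150
start = next_start_row(0, processed_in_first_run)
print(start)                          # 151
print(rows_from(accounts, start)[0])  # user_151
```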
  • 30. A list of Twitter output variables Go to SQLite Database Browser and select the table typhoon (again, this is the name we gave it in the Python code). You will see the output variables across the columns.
  • 31. A list of Twitter output variables Some key variables related to the user profile: • from_user_screen_name: the user’s Twitter screen name • from_user_followers_count: how many people are following the user • from_user_friends_count: how many people this user is following • from_user_listed_count: how many public lists the user appears on • from_user_favourites_count: how many tweets this user has favorited (liked) • from_user_statuses_count: how many tweets the user has sent • from_user_description: the user’s profile bio • from_user_location: the location given in the user’s profile • from_user_created_at: when the account was created
  • 32. A list of Twitter output variables Use File-Export-Table as CSV to export the data in .csv format. Make sure to add the .csv file extension to the file name.
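The export can also be scripted. This is a minimal Python 3 sketch using the built-in sqlite3 and csv modules; the function name export_table and the file names in the usage note are examples, not part of the tutorial code.

```python
import csv
import sqlite3


def export_table(db_path, table, csv_path):
    """Write every row of `table`, with a header row, to a CSV file."""
    conn = sqlite3.connect(db_path)
    try:
        # Table names cannot be bound as parameters, so quote the identifier.
        cursor = conn.execute('SELECT * FROM "%s"' % table.replace('"', '""'))
        header = [column[0] for column in cursor.description]
        with open(csv_path, "w", newline="", encoding="utf-8") as handle:
            writer = csv.writer(handle)
            writer.writerow(header)
            writer.writerows(cursor)
    finally:
        conn.close()
    return header
```

For instance, export_table("twitter.sqlite", "typhoon", "typhoon.csv") would write the typhoon table to typhoon.csv, assuming the database file is named twitter.sqlite.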
  • 33. Please send your questions and comments to weiaixu [at] buffalo dot edu