These are the slides for the COSCUP[1] 2013 hands-on session[2], "Learning Python from Data".
The deck uses examples to show the world of Python. I hope it helps you learn Python.
[1] COSCUP: http://coscup.org/
[2] COSCUP Hands-on: http://registrano.com/events/coscup-2013-hands-on-mosky
2. THIS SLIDE
• The online version is at https://speakerdeck.com/mosky/learning-python-from-data.
• The examples are at https://github.com/moskytw/learning-python-from-data-examples.
8. MOSKY
• I am working at Pinkoi.
• I've taught Python for 100+ hours.
• A speaker at COSCUP 2014, PyCon SG 2014, PyCon APAC 2014, OSDC 2014, PyCon APAC 2013, ...
• The author of the Python packages: MoSQL, Clime, ZIPCodeTW, ...
• http://mosky.tw/
15. SCHEDULE
• Warm-up
• Packages - Install the packages we need.
• CSV - Download a CSV file from the Internet and handle it.
• HTML - Parse HTML source code and write a Web crawler.
• SQL - Save the data into a SQLite database.
• The End
23. 2 OR 3?
• Use Python 3!
• But it actually depends on the libraries you need.
• https://python3wos.appspot.com/
• We will go ahead with Python 2.7, but I will also introduce the changes in Python 3.
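Since the deck promises to point out the Python 3 changes along the way, here is a minimal sketch (in Python 3 syntax) of the most visible differences from the 2.7 code used in these slides:

```python
# Python 3: print is a function, not a statement.
print('hello')

# / is true division; use // for the old floor behavior.
print(2 / 3)   # 0.666..., where Python 2 gives 0
print(2 // 3)  # 0

# range() is lazy in Python 3; wrap it in list() if you need a list.
print(list(range(5)))  # [0, 1, 2, 3, 4]
```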
26. THE ONLINE RESOURCES
• The Python Official Doc
• http://docs.python.org
• The Python Tutorial
• The Python Standard Library
• My Past Slides
• Programming with Python - Basic
• Programming with Python - Adv.
34. PREPARATION
• Did you say "hello" to Python?
• If not, visit http://www.slideshare.net/moskytw/programming-with-python-basic.
• If yes, open your Python shell.
36. MATH & VARS
2 + 3
2 - 3
2 * 3
2 / 3, -2 / 3

(1+10)*10 / 2

2.0 / 3

2 % 3

2 ** 3

x = 2
y = 3
z = x + y
print z

'#' * 10
37. FOR
for i in [0, 1, 2, 3, 4]:
    print i

items = [0, 1, 2, 3, 4]
for i in items:
    print i

for i in range(5):
    print i

chars = 'SAHFI'
for i, c in enumerate(chars):
    print i, c

words = ('Samsung', 'Apple',
         'HP', 'Foxconn', 'IBM')
for c, w in zip(chars, words):
    print c, w
38. IF
for i in range(1, 10):
    if i % 2 == 0:
        print '{} is divisible by 2'.format(i)
    elif i % 3 == 0:
        print '{} is divisible by 3'.format(i)
    else:
        print '{} is not divisible by 2 or 3'.format(i)
39. WHILE
while 1:
    n = int(raw_input('How big a pyramid do you want? '))
    if n <= 0:
        print 'It must be greater than 0: {}'.format(n)
        continue
    break
40. TRY
while 1:

    try:
        n = int(raw_input('How big a pyramid do you want? '))
    except ValueError as e:
        print 'It must be a number: {}'.format(e)
        continue

    if n <= 0:
        print 'It must be greater than 0: {}'.format(n)
        continue

    break
41. LOOP ... ELSE
for n in range(2, 100):
    for i in range(2, n):
        if n % i == 0:
            break
    else:
        print '{} is a prime!'.format(n)
55. GET PIP - WIN *
• Follow the steps in http://stackoverflow.com/questions/4750806/how-to-install-pip-on-windows.
• Or just use easy_install to install it. easy_install should be found at C:\Python27\Scripts.
• Or find the Windows installer on the Python Package Index.
60. 3RD-PARTY PACKAGES
• requests - Python HTTP for Humans
• lxml - Pythonic XML processing library
• uniout - Print object representations in readable characters.
• clime - Convert a module into a CLI program w/o any config.
64. FILE
import requests

save_path = 'school_list.csv'

with open(save_path, 'w') as f:
    f.write(requests.get(url).content)

with open(save_path) as f:
    print f.read()

with open(save_path) as f:
    for line in f:
        print line,
65. DEF
import requests
from os.path import basename

def save(url, path=None):

    if not path:
        path = basename(url)

    with open(path, 'w') as f:
        f.write(requests.get(url).content)
66. CSV
import csv
from os.path import exists

if not exists(save_path):
    save(url, save_path)

with open(save_path) as f:
    for row in csv.reader(f):
        print row
67. + UNIOUT
import csv
from os.path import exists
import uniout  # You want this!

if not exists(save_path):
    save(url, save_path)

with open(save_path) as f:
    for row in csv.reader(f):
        print row
68. NEXT
with open(save_path) as f:
    next(f)  # skip the unwanted lines
    next(f)
    for row in csv.reader(f):
        print row
69. DICT READER
with open(save_path) as f:
    next(f)
    next(f)
    for row in csv.DictReader(f):
        print row

# We now have a great output. :)
70. DEF AGAIN
def parse_to_school_list(path):
    school_list = []
    with open(path) as f:
        next(f)
        next(f)
        for school in csv.DictReader(f):
            school_list.append(school)

    return school_list[:-2]
71. + COMPREHENSION
def parse_to_school_list(path='schools.csv'):
    with open(path) as f:
        next(f)
        next(f)
        school_list = [school for school in
                       csv.DictReader(f)][:-2]

    return school_list
73. PYTHONIC
school_list = parse_to_school_list(save_path)

# hmmm ...

for school in school_list:
    print school['School Name']

# It is more Pythonic! :)

print [school['School Name'] for school in school_list]
74. GROUP BY
from itertools import groupby

# You MUST sort it first.
keyfunc = lambda school: school['County']
school_list.sort(key=keyfunc)

for county, schools in groupby(school_list, keyfunc):
    for school in schools:
        print '%s %r' % (county, school)
    print '---'
75. DOCSTRING
'''It contains some useful functions for parsing data
from the government.'''

def save(url, path=None):
    '''It saves data from `url` to `path`.'''
    ...

--- Shell ---

$ pydoc csv_docstring
76. CLIME
if __name__ == '__main__':
    import clime.now

--- Shell ---

$ python csv_clime.py
usage: basename <p>
   or: parse-to-school-list <path>
   or: save [--path] <url>

It contains some useful functions for parsing data from
the government.
83. XPATH
titles = root.xpath('/html/head/title')
print titles[0].text

title_texts = root.xpath('/html/head/title/text()')
print title_texts[0]

as_ = root.xpath('//a')
print as_
print [a.get('href') for a in as_]
84. MD5
from hashlib import md5

message = ('There should be one-- and preferably '
           'only one --obvious way to do it.')

print md5(message).hexdigest()

# Actually, it has nothing to do with HTML.
85. DEF GET
from os import makedirs
from os.path import exists, join

def get(url, cache_dir_path='cache/'):

    if not exists(cache_dir_path):
        makedirs(cache_dir_path)

    cache_path = join(cache_dir_path,
                      md5(url).hexdigest())

    ...
86. DEF FIND_URLS
def find_urls(content):
    root = etree.HTML(content)
    return [
        a.attrib['href'] for a in root.xpath('//a')
        if 'href' in a.attrib
    ]
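If lxml is not available, the same link extraction can be sketched with the standard library alone — this alternative (in Python 3 syntax, where the module is `html.parser`; in Python 2 it is `HTMLParser`) is not from the slides:

```python
from html.parser import HTMLParser

class LinkParser(HTMLParser):
    '''Collect the href of every <a> tag seen while parsing.'''

    def __init__(self):
        super().__init__()
        self.urls = []

    def handle_starttag(self, tag, attrs):
        if tag == 'a':
            attrs = dict(attrs)
            if 'href' in attrs:
                self.urls.append(attrs['href'])

def find_urls(content):
    parser = LinkParser()
    parser.feed(content)
    return parser.urls

print(find_urls('<a href="http://a/">A</a> <a name="x">no href</a>'))
```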
93. TABLE
CREATE TABLE schools (
    id TEXT PRIMARY KEY,
    name TEXT,
    county TEXT,
    address TEXT,
    phone TEXT,
    url TEXT,
    type TEXT
);

DROP TABLE schools;
94. CRUD
INSERT INTO schools (id, name) VALUES ('1', 'The First');
INSERT INTO schools VALUES (...);

SELECT * FROM schools WHERE id='1';
SELECT name FROM schools WHERE id='1';

UPDATE schools SET id='10' WHERE id='1';

DELETE FROM schools WHERE id='10';
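The statements above can be driven from Python with the built-in sqlite3 module. A minimal sketch (Python 3 syntax) against an in-memory database, using a trimmed-down schema for brevity:

```python
import sqlite3

conn = sqlite3.connect(':memory:')
cur = conn.cursor()

cur.execute('CREATE TABLE schools (id TEXT PRIMARY KEY, name TEXT)')

# Always pass values as ? parameters rather than formatting them
# into the SQL string yourself.
cur.execute('INSERT INTO schools (id, name) VALUES (?, ?)',
            ('1', 'The First'))

cur.execute('SELECT name FROM schools WHERE id = ?', ('1',))
print(cur.fetchone())  # ('The First',)

cur.execute('UPDATE schools SET id = ? WHERE id = ?', ('10', '1'))
cur.execute('DELETE FROM schools WHERE id = ?', ('10',))

cur.execute('SELECT count(*) FROM schools')
print(cur.fetchone())  # (0,)

conn.commit()
conn.close()
```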
99. FETCH
...
cur.execute('select * from schools')

print cur.fetchone()

# or
print cur.fetchall()

# or
for row in cur:
    print row
...
100. TEXT FACTORY
# SQLite only: Lets you pass 8-bit strings as parameters.

...

conn = sqlite3.connect(db_path)
conn.text_factory = str

...
101. ROW FACTORY
# SQLite only: Lets you convert each tuple into a dict. It is
# `DictCursor` in some other connectors.

def dict_factory(cursor, row):
    d = {}
    for idx, col in enumerate(cursor.description):
        d[col[0]] = row[idx]
    return d

...
conn.row_factory = dict_factory
...
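For name-based access, sqlite3 also ships a ready-made factory, sqlite3.Row, so you do not have to hand-write a dict_factory. A short sketch (Python 3 syntax):

```python
import sqlite3

conn = sqlite3.connect(':memory:')
conn.row_factory = sqlite3.Row  # built-in: supports index and name access

cur = conn.cursor()
cur.execute("SELECT '1' AS id, 'The First' AS name")
row = cur.fetchone()

print(row['name'])  # The First
print(row[0])       # 1 -- positional access still works
```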
107. MORE
• Python DB API 2.0
• MySQLdb - MySQL connector for Python
• Psycopg2 - PostgreSQL adapter for Python
• SQLAlchemy - the Python SQL toolkit and ORM
• MoSQL - Build SQL from common Python data structures.
115. THE END
• You learned how to ...
• make an HTTP request
• load a CSV file
• parse an HTML file
• write a Web crawler
• use SQL with SQLite
• and a lot of techniques today. ;)