Python & Stuff

               All the things I like about Python, plus a bit more.




Friday, November 4, 11
Jacob Perkins
                         Python Text Processing with NLTK 2.0 Cookbook

                         Co-Founder & CTO @weotta

                         Blog: http://streamhacker.com

                         NLTK Demos: http://text-processing.com

                         @japerk

                         Python user for > 6 years



What I use Python for

                          web development with Django

                          web crawling with Scrapy

                          NLP with NLTK

                          argparse based scripts

                          processing data in Redis & MongoDB



Topics
                         functional programming

                         I/O

                         Object Oriented programming

                         scripting

                         testing

                         remoting

                         parsing

                         package management

                         data storage

                         performance
Functional Programming
                         list comprehensions

                         slicing

                         iterators

                         generators

                         higher order functions

                         decorators

                         default & optional arguments

                         switch/case emulation
List Comprehensions

                         >>> [i for i in range(10) if i % 2]
                         [1, 3, 5, 7, 9]
                         >>> dict([(i, i*2) for i in range(5)])
                         {0: 0, 1: 2, 2: 4, 3: 6, 4: 8}
                         >>> s = set(range(5))
                         >>> [i for i in range(10) if i in s]
                         [0, 1, 2, 3, 4]




Slicing

                         >>> range(10)[:5]
                         [0, 1, 2, 3, 4]
                         >>> range(10)[3:5]
                         [3, 4]
                         >>> range(10)[1:5]
                         [1, 2, 3, 4]
                         >>> range(10)[::2]
                         [0, 2, 4, 6, 8]
                         >>> range(10)[-5:-1]
                         [5, 6, 7, 8]



Iterators

                         >>> i = iter([1, 2, 3])
                         >>> i.next()
                         1
                         >>> i.next()
                         2
                         >>> i.next()
                         3
                         >>> i.next()
                         Traceback (most recent call last):
                           File "<stdin>", line 1, in <module>
                         StopIteration



Generators
                         >>> def gen_ints(n):
                         ...     for i in range(n):
                         ...         yield i
                         ...
                         >>> g = gen_ints(2)
                         >>> g.next()
                         0
                         >>> g.next()
                         1
                         >>> g.next()
                         Traceback (most recent call last):
                           File "<stdin>", line 1, in <module>
                         StopIteration


Higher Order Functions

                          >>> def hof(n):
                          ...     def addn(i):
                          ...         return i + n
                          ...     return addn
                          ...
                          >>> f = hof(5)
                          >>> f(3)
                          8




Decorators
               >>> def print_args(f):
               ...     def g(*args, **kwargs):
               ...         print args, kwargs
               ...         return f(*args, **kwargs)
               ...     return g
               ...
               >>> @print_args
               ... def add2(n):
               ...     return n+2
               ...
               >>> add2(5)
               (5,) {}
               7
               >>> add2(3)
               (3,) {}
               5
Default & Optional Args
               >>> def special_arg(special=None, *args, **kwargs):
               ...     print 'special:', special
               ...     print args
               ...     print kwargs
               ...
               >>> special_arg(special='hi')
               special: hi
               ()
               {}
               >>>
               >>> special_arg('hi')
               special: hi
               ()
               {}

switch/case emulation


                             OPTS = {
                                 "a": all,
                                 "b": any
                             }

                             def all_or_any(lst, opt):
                                 return OPTS[opt](lst)




Object Oriented

                         classes

                         multiple inheritance

                         special methods

                         collections

                         defaultdict



Classes
               >>> class A(object):
               ...     def __init__(self):
               ...         self.value = 'a'
               ...
               >>> class B(A):
               ...     def __init__(self):
               ...         super(B, self).__init__()
               ...         self.value = 'b'
               ...
               >>> a = A()
               >>> a.value
               'a'
               >>> b = B()
               >>> b.value
               'b'
Multiple Inheritance
               >>> class B(object):
               ...     def __init__(self):
               ...         self.value = 'b'
               ...
               >>> class C(A, B): pass
               ...
               >>> C().value
               'a'
               >>> class C(B, A): pass
               ...
               >>> C().value
               'b'
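
Which `__init__` runs depends on the method resolution order (MRO): base classes are searched left to right, and the first match wins. A minimal standalone sketch of that idea; the class names `AB` and `BA` are illustrative and not from the slides, and the code should run on Python 2.7 or 3:

```python
class A(object):
    def __init__(self):
        self.value = 'a'

class B(object):
    def __init__(self):
        self.value = 'b'

class AB(A, B):
    pass

class BA(B, A):
    pass

# __mro__ lists the classes in the order Python searches them for attributes.
ab_order = [cls.__name__ for cls in AB.__mro__]
ba_order = [cls.__name__ for cls in BA.__mro__]
print(ab_order)  # ['AB', 'A', 'B', 'object']
print(ba_order)  # ['BA', 'B', 'A', 'object']
```

Neither `__init__` here calls `super()`, so only the first one found in the MRO ever runs.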


Special Methods

                         __init__

                         __len__

                         __iter__

                         __contains__

                         __getitem__
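
A small sketch of how these special methods hook into built-in syntax. `Playlist` is a hypothetical class invented for illustration; the code should run on Python 2.7 or 3:

```python
class Playlist(object):
    """Hypothetical container used only to illustrate the special methods."""

    def __init__(self, songs):
        # Called when Playlist(...) is constructed
        self.songs = list(songs)

    def __len__(self):
        # Called by len(playlist)
        return len(self.songs)

    def __iter__(self):
        # Called by iter(playlist) and by for loops
        return iter(self.songs)

    def __contains__(self, song):
        # Called by the `in` operator
        return song in self.songs

    def __getitem__(self, index):
        # Called by playlist[index], including slices
        return self.songs[index]


p = Playlist(['intro', 'verse', 'outro'])
print(len(p))        # 3
print('verse' in p)  # True
print(p[0])          # intro
print(list(p))       # ['intro', 'verse', 'outro']
```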



collections

                         high performance containers

                         Abstract Base Classes

                         Iterable, Sized, Sequence, Set, Mapping

                         multi-inherit from ABC to mix & match

                         implement only a few special methods, get the rest for free
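
A sketch of the "get the rest for free" idea using the `Mapping` ABC. `LowerCaseMapping` is a hypothetical class made up for this example. The import is written to work on both Python 2.6+ (`collections`) and Python 3.3+ (`collections.abc`):

```python
try:
    from collections.abc import Mapping  # Python 3.3+
except ImportError:
    from collections import Mapping      # Python 2.6 / 2.7


class LowerCaseMapping(Mapping):
    """Hypothetical read-only mapping that lower-cases keys on lookup."""

    def __init__(self, data):
        self._data = dict((k.lower(), v) for k, v in data.items())

    # Mapping requires only these three methods...
    def __getitem__(self, key):
        return self._data[key.lower()]

    def __iter__(self):
        return iter(self._data)

    def __len__(self):
        return len(self._data)


m = LowerCaseMapping({'Python': 1, 'NLTK': 2})
print(m['PYTHON'])          # 1
# ...and __contains__, get, keys, items, values come from the ABC.
print('nltk' in m)          # True
print(m.get('missing', 0))  # 0
print(sorted(m.keys()))     # ['nltk', 'python']
```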


defaultdict
               >>> d = {}
               >>> d['a'] += 2
               Traceback (most recent call last):
                 File "<stdin>", line 1, in <module>
               KeyError: 'a'
               >>> import collections
               >>> d = collections.defaultdict(int)
               >>> d['a'] += 2
               >>> d['a']
               2
               >>> l = collections.defaultdict(list)
               >>> l['a'].append(1)
               >>> l['a']
               [1]

I/O


                         context managers

                         file iteration

                         gevent / eventlet




Context Managers



               >>> with open('myfile', 'w') as f:
               ...     f.write('hello\nworld')
               ...
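
Files are not special here: any object with `__enter__` and `__exit__` can be used in a `with` block. A minimal sketch, where `Tracker` is a hypothetical class made up for illustration (runs on Python 2.7 or 3):

```python
class Tracker(object):
    """Hypothetical context manager that records when the block is entered and left."""

    def __init__(self):
        self.events = []

    def __enter__(self):
        self.events.append('enter')
        return self  # this is the value bound by `as`

    def __exit__(self, exc_type, exc_value, traceback):
        # Runs even when the block raises, like a file being closed
        self.events.append('exit')
        return False  # do not swallow the exception


tracker = Tracker()
try:
    with tracker as t:
        t.events.append('body')
        raise ValueError('boom')
except ValueError:
    pass

print(tracker.events)  # ['enter', 'body', 'exit']
```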




File Iteration


               >>> with open('myfile') as f:
               ...     for line in f:
               ...         print line.strip()
               ...
               hello
               world




gevent / eventlet
                         coroutine networking libraries

                         greenlets: “micro-threads”

                         fast event loop

                         monkey-patch standard library

                         http://www.gevent.org/

                         http://www.eventlet.net/


Scripting


                         argparse

                         __main__

                         atexit




argparse
   import argparse

   parser = argparse.ArgumentParser(
       description='Train a NLTK Classifier')

   parser.add_argument('corpus', help='corpus name/path')
   parser.add_argument('--no-pickle', action='store_true',
     default=False, help="don't pickle")
   parser.add_argument('--trace', default=1, type=int,
     help='How much trace output you want')

   args = parser.parse_args()

   if args.trace:
       print 'have args'
__main__


                         if __name__ == '__main__':
                             do_main_function()




atexit

        def goodbye(name, adjective):
            print 'Goodbye, %s, it was %s to meet you.' % (
                name, adjective)

        import atexit
        atexit.register(goodbye, 'Donny', 'nice')




Testing

                         doctest

                         unittest

                         nose

                         fudge

                         py.test



doctest
                         def fib(n):
                             '''Return the nth fibonacci number.
                             >>> fib(0)
                             0
                             >>> fib(1)
                             1
                             >>> fib(2)
                             1
                             >>> fib(3)
                             2
                             >>> fib(4)
                             3
                             '''
                             if n == 0: return 0
                             elif n == 1: return 1
                             else: return fib(n - 1) + fib(n - 2)
doctesting modules



                           if __name__ == '__main__':
                               import doctest
                               doctest.testmod()




unittest


                         anything more complicated than function I/O

                         clean state for each test

                         test interactions between components

                         can use mock objects
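
A minimal sketch of the "clean state for each test" point: `setUp` runs before every test method, so no test sees state left behind by another. `Counter` is a hypothetical class under test, invented for this example (runs on Python 2.7 or 3):

```python
import unittest


class Counter(object):
    """Hypothetical class under test."""

    def __init__(self):
        self.count = 0

    def increment(self):
        self.count += 1
        return self.count


class CounterTest(unittest.TestCase):

    def setUp(self):
        # Runs before every test method, so each test gets a fresh Counter
        self.counter = Counter()

    def test_starts_at_zero(self):
        self.assertEqual(self.counter.count, 0)

    def test_increment(self):
        self.assertEqual(self.counter.increment(), 1)
        self.assertEqual(self.counter.increment(), 2)


# Build and run the suite explicitly; in a script you would more commonly
# call unittest.main() under `if __name__ == '__main__':`.
suite = unittest.TestLoader().loadTestsFromTestCase(CounterTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```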




nose

                         http://readthedocs.org/docs/nose/en/latest/

                         test runner

                         auto-discovery of tests

                         easy plugin system

                         plugins can generate XML for CI (Jenkins)



fudge


                         http://farmdev.com/projects/fudge/

                         make fake objects

                         mock through monkey-patching




py.test


                         http://pytest.org/latest/

                         similar to nose

                         distributed multi-platform testing




Remoting Libraries



                         Fabric

                         execnet




Fabric


                         http://fabfile.org

                         run commands over ssh

                         great for “push” deployment

                         not parallel yet




fabfile.py
   from fabric.api import run

   def host_type():
       run('uname -s')


                         fab command
   $ fab -H localhost,linuxbox host_type
   [localhost] run: uname -s
   [localhost] out: Darwin
   [linuxbox] run: uname -s
   [linuxbox] out: Linux


execnet
                         http://codespeak.net/execnet/

                         open python interpreters over ssh

                         spawn local python interpreters

                         shared-nothing model

                         send code & data over channels

                         interact with CPython, Jython, PyPy

                         py.test distributed testing

execnet example
   >>> import execnet, os
   >>> gw = execnet.makegateway("ssh=codespeak.net")
   >>> channel = gw.remote_exec("""
   ...      import sys, os
   ...      channel.send((sys.platform, sys.version_info,
   ...                    os.getpid()))
   ... """)
   >>> platform, version_info, remote_pid = channel.receive()
   >>> platform
   'linux2'
   >>> version_info
   (2, 4, 2, 'final', 0)


Parsing


                         regular expressions

                         NLTK

                         SimpleParse
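
Regular expressions get no slide of their own, so here is a brief sketch with the standard `re` module. The log line format and the pattern are assumptions made up for illustration (runs on Python 2.7 or 3):

```python
import re

# Hypothetical log line format, used only for illustration
line = '2011-11-04 ERROR disk full'

# Named groups make the parsed fields self-describing
pattern = re.compile(
    r'(?P<date>\d{4}-\d{2}-\d{2}) (?P<level>[A-Z]+) (?P<message>.*)')

match = pattern.match(line)
fields = match.groupdict()
print(fields['date'])     # 2011-11-04
print(fields['level'])    # ERROR
print(fields['message'])  # disk full

# A crude regex tokenizer: runs of word characters only
words = re.findall(r'\w+', "Jacob's presentation")
print(words)              # ['Jacob', 's', 'presentation']
```

The regex tokenizer drops the apostrophe entirely, which is one reason to reach for a purpose-built tokenizer when the text is natural language.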




NLTK Tokenization


          >>> from nltk import tokenize
          >>> tokenize.word_tokenize("Jacob's presentation")
          ['Jacob', "'s", 'presentation']
          >>> tokenize.wordpunct_tokenize("Jacob's presentation")
          ['Jacob', "'", 's', 'presentation']




nltk.grammar


                         CFGs

                         Chapter 9 of NLTK Book:
                         http://nltk.googlecode.com/svn/trunk/doc/book/ch09.html




more NLTK


                         stemming

                         part-of-speech tagging

                         chunking

                         classification




SimpleParse

                         http://simpleparse.sourceforge.net/

                         Parser generator

                         EBNF grammars

                         Based on mxTextTools (C extensions):
                         http://www.egenix.com/products/python/mxBase/mxTextTools/



Package Management


                         import

                         pip

                         virtualenv

                         mercurial




import
                 import module
                 from module import function, ClassName
                 from module import function as f




                         always make sure package directories have __init__.py




pip
                          http://www.pip-installer.org/en/latest/

                          easy_install replacement

                          install from requirements files

                         $ pip install simplejson
                         [... progress report ...]
                         Successfully installed simplejson




virtualenv


                         http://www.virtualenv.org/en/latest/

                         create self-contained python installations

                         dependency silos

                         works great with pip (same author)




mercurial

                         http://mercurial.selenic.com/

                         Python based DVCS

                         simple & fast

                         easy cloning

                         works with Bitbucket, GitHub, Google Code



Flexible Data Storage



                         Redis

                         MongoDB




Redis
                         in-memory key-value storage server

                         most operations O(1)

                         lists

                         sets

                         sorted sets

                         hash objects


MongoDB
                         memory mapped document storage

                         arbitrary document fields

                         nested documents

                         index on multiple fields

                         easier (for programmers) than SQL

                         capped collections (good for logging)


Python Performance



                         CPU

                         RAM




CPU


                         probably fast enough if I/O or DB bound

                         try PyPy: http://pypy.org/

                         use CPython optimized libraries like numpy

                         write a CPython extension




RAM


                         don’t keep references longer than needed

                         iterate over data

                         aggregate to an optimized DB
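
A rough sketch of the "iterate over data" advice: a generator yields one value at a time, while a list holds everything at once. The function name `squares` and the size `n` are illustrative choices. Note that `sys.getsizeof` measures only the container object itself, not the integers it refers to, so treat the comparison as indicative (runs on Python 2.7 or 3):

```python
import sys


def squares(n):
    """Yield squares one at a time instead of building a list."""
    for i in range(n):
        yield i * i


n = 100000
as_list = [i * i for i in range(n)]  # all n values are alive at the same time
total = sum(squares(n))              # values are produced and discarded one by one

list_size = sys.getsizeof(as_list)      # grows with n
gen_size = sys.getsizeof(squares(n))    # small and independent of n
print(gen_size < list_size)
```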




import this
                     >>> import this
                     The Zen of Python, by Tim Peters

                     Beautiful is better than ugly.
                     Explicit is better than implicit.
                     Simple is better than complex.
                     Complex is better than complicated.
                     Flat is better than nested.
                     Sparse is better than dense.
                     Readability counts.
                     Special cases aren't special enough to break the rules.
                     Although practicality beats purity.
                     Errors should never pass silently.
                     Unless explicitly silenced.
                     In the face of ambiguity, refuse the temptation to guess.
                     There should be one-- and preferably only one --obvious way to do it.
                     Although that way may not be obvious at first unless you're Dutch.
                     Now is better than never.
                     Although never is often better than *right* now.
                     If the implementation is hard to explain, it's a bad idea.
                     If the implementation is easy to explain, it may be a good idea.
                     Namespaces are one honking great idea -- let's do more of those!

"ML in Production",Oleksandr BaganFwdays
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsMark Billinghurst
 

Recently uploaded (20)

CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piece
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?
 
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptxE-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck Presentation
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easy
 
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostLeverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding Club
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An Introduction
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 
DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 

Python & Stuff

  • 1. Python & Stuff
      All the things I like about Python, plus a bit more.
      Friday, November 4, 11
  • 2. Jacob Perkins
      Python Text Processing with NLTK 2.0 Cookbook
      Co-Founder & CTO @weotta
      Blog: http://streamhacker.com
      NLTK Demos: http://text-processing.com
      @japerk
      Python user for > 6 years
  • 3. What I use Python for
      web development with Django
      web crawling with Scrapy
      NLP with NLTK
      argparse based scripts
      processing data in Redis & MongoDB
  • 4. Topics
      functional programming
      I/O
      Object Oriented programming
      scripting
      testing
      remoting
      parsing
      package management
      data storage
      performance
  • 5. Functional Programming
      list comprehensions
      slicing
      iterators
      generators
      higher order functions
      decorators
      default & optional arguments
      switch/case emulation
  • 6. List Comprehensions
      >>> [i for i in range(10) if i % 2]
      [1, 3, 5, 7, 9]
      >>> dict([(i, i*2) for i in range(5)])
      {0: 0, 1: 2, 2: 4, 3: 6, 4: 8}
      >>> s = set(range(5))
      >>> [i for i in range(10) if i in s]
      [0, 1, 2, 3, 4]
  • 7. Slicing
      >>> range(10)[:5]
      [0, 1, 2, 3, 4]
      >>> range(10)[3:5]
      [3, 4]
      >>> range(10)[1:5]
      [1, 2, 3, 4]
      >>> range(10)[::2]
      [0, 2, 4, 6, 8]
      >>> range(10)[-5:-1]
      [5, 6, 7, 8]
  • 8. Iterators
      >>> i = iter([1, 2, 3])
      >>> i.next()
      1
      >>> i.next()
      2
      >>> i.next()
      3
      >>> i.next()
      Traceback (most recent call last):
        File "<stdin>", line 1, in <module>
      StopIteration
  • 9. Generators
      >>> def gen_ints(n):
      ...     for i in range(n):
      ...         yield i
      ...
      >>> g = gen_ints(2)
      >>> g.next()
      0
      >>> g.next()
      1
      >>> g.next()
      Traceback (most recent call last):
        File "<stdin>", line 1, in <module>
      StopIteration
  • 10. Higher Order Functions
      >>> def hof(n):
      ...     def addn(i):
      ...         return i + n
      ...     return addn
      ...
      >>> f = hof(5)
      >>> f(3)
      8
  • 11. Decorators
      >>> def print_args(f):
      ...     def g(*args, **kwargs):
      ...         print args, kwargs
      ...         return f(*args, **kwargs)
      ...     return g
      ...
      >>> @print_args
      ... def add2(n):
      ...     return n+2
      ...
      >>> add2(5)
      (5,) {}
      7
      >>> add2(3)
      (3,) {}
      5
  • 12. Default & Optional Args
      >>> def special_arg(special=None, *args, **kwargs):
      ...     print 'special:', special
      ...     print args
      ...     print kwargs
      ...
      >>> special_arg(special='hi')
      special: hi
      ()
      {}
      >>>
      >>> special_arg('hi')
      special: hi
      ()
      {}
  • 13. switch/case emulation
      OPTS = {
          "a": all,
          "b": any
      }
      def all_or_any(lst, opt):
          return OPTS[opt](lst)
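A usage sketch for the dispatch table on the switch/case slide, written for Python 3. The `default` parameter and the error raised for unknown options are assumptions added for illustration; the slide itself only shows the plain dictionary lookup.

```python
# Dispatch table: maps an option key to the built-in that handles it.
OPTS = {
    "a": all,
    "b": any,
}

def all_or_any(lst, opt, default=None):
    """Look up the handler for opt and apply it to lst.

    The default parameter is a hypothetical addition that plays the
    role of a switch statement's default branch.
    """
    handler = OPTS.get(opt, default)
    if handler is None:
        raise KeyError("unknown option: %r" % (opt,))
    return handler(lst)

print(all_or_any([True, False], "a"))  # False
print(all_or_any([True, False], "b"))  # True
```

Adding a new case means adding one dictionary entry, with no change to the function body.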
  • 14. Object Oriented
      classes
      multiple inheritance
      special methods
      collections
      defaultdict
  • 15. Classes
      >>> class A(object):
      ...     def __init__(self):
      ...         self.value = 'a'
      ...
      >>> class B(A):
      ...     def __init__(self):
      ...         super(B, self).__init__()
      ...         self.value = 'b'
      ...
      >>> a = A()
      >>> a.value
      'a'
      >>> b = B()
      >>> b.value
      'b'
  • 16. Multiple Inheritance
      >>> class B(object):
      ...     def __init__(self):
      ...         self.value = 'b'
      ...
      >>> class C(A, B): pass
      ...
      >>> C().value
      'a'
      >>> class C(B, A): pass
      ...
      >>> C().value
      'b'
  • 17. Special Methods
      __init__
      __len__
      __iter__
      __contains__
      __getitem__
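A minimal sketch of a class that defines the five special methods listed on the slide, written for Python 3. The `Deck` class and its contents are hypothetical; the point is only which built-in operation each method hooks into.

```python
class Deck(object):
    """A tiny container (hypothetical example) built on five special methods."""

    def __init__(self, cards):
        # Called when the object is constructed: Deck([...])
        self._cards = list(cards)

    def __len__(self):
        # Called by len(deck)
        return len(self._cards)

    def __iter__(self):
        # Called by for-loops and list(deck)
        return iter(self._cards)

    def __contains__(self, card):
        # Called by the `in` operator
        return card in self._cards

    def __getitem__(self, index):
        # Called by deck[index] and deck[start:stop]
        return self._cards[index]

deck = Deck(["ace", "king", "queen"])
print(len(deck))        # 3
print("king" in deck)   # True
print(deck[0])          # ace
print(list(deck))       # ['ace', 'king', 'queen']
```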
  • 18. collections
      high performance containers
      Abstract Base Classes
      Iterable, Sized, Sequence, Set, Mapping
      multi-inherit from ABC to mix & match
      implement only a few special methods, get rest for free
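A sketch of "implement only a few special methods, get rest for free", assuming Python 3, where the ABCs are imported from `collections.abc` (in the Python 2 versions current when these slides were written they were reachable from `collections` directly). The `Countdown` class is a hypothetical example.

```python
from collections.abc import Sequence

class Countdown(Sequence):
    """Hypothetical read-only sequence n, n-1, ..., 1.

    Only __len__ and __getitem__ are written here. The Sequence ABC
    supplies __iter__, __contains__, __reversed__, index and count.
    """

    def __init__(self, n):
        self._n = n

    def __len__(self):
        return self._n

    def __getitem__(self, index):
        # Only non-negative integer indexes are handled in this sketch.
        if not 0 <= index < self._n:
            raise IndexError(index)
        return self._n - index

c = Countdown(3)
print(list(c))            # [3, 2, 1]
print(2 in c)             # True
print(c.index(1))         # 2
print(list(reversed(c)))  # [1, 2, 3]
```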
  • 19. defaultdict
      >>> d = {}
      >>> d['a'] += 2
      Traceback (most recent call last):
        File "<stdin>", line 1, in <module>
      KeyError: 'a'
      >>> import collections
      >>> d = collections.defaultdict(int)
      >>> d['a'] += 2
      >>> d['a']
      2
      >>> l = collections.defaultdict(list)
      >>> l['a'].append(1)
      >>> l['a']
      [1]
  • 20. I/O
      context managers
      file iteration
      gevent / eventlet
  • 21. Context Managers
      >>> with open('myfile', 'w') as f:
      ...     f.write('hello\nworld')
      ...
  • 22. File Iteration
      >>> with open('myfile') as f:
      ...     for line in f:
      ...         print line.strip()
      ...
      hello
      world
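Both file slides rely on `with` to close the file automatically. As a sketch of how the same protocol can be used for other resources, the standard library's `contextlib.contextmanager` decorator turns a generator into a context manager. The `tracked` function and the `events` list are hypothetical names, and the example is written for Python 3.

```python
from contextlib import contextmanager

events = []

@contextmanager
def tracked(name):
    """Hypothetical context manager that records entry and exit."""
    events.append("enter " + name)
    try:
        # The value yielded here is bound by `as` in the with statement.
        yield name
    finally:
        # Runs even if the body raises, much like a file being closed.
        events.append("exit " + name)

with tracked("job") as label:
    events.append("working on " + label)

print(events)  # ['enter job', 'working on job', 'exit job']
```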
  • 23. gevent / eventlet
      coroutine networking libraries
      greenlets: “micro-threads”
      fast event loop
      monkey-patch standard library
      http://www.gevent.org/
      http://www.eventlet.net/
  • 24. Scripting
      argparse
      __main__
      atexit
  • 25. argparse
      import argparse
      parser = argparse.ArgumentParser(description='Train a NLTK Classifier')
      parser.add_argument('corpus', help='corpus name/path')
      parser.add_argument('--no-pickle', action='store_true',
          default=False, help="don't pickle")
      parser.add_argument('--trace', default=1, type=int,
          help='How much trace output you want')
      args = parser.parse_args()
      if args.trace:
          print 'have args'
  • 26. __main__
      if __name__ == '__main__':
          do_main_function()
  • 27. atexit
      def goodbye(name, adjective):
          print 'Goodbye, %s, it was %s to meet you.' % (name, adjective)
      import atexit
      atexit.register(goodbye, 'Donny', 'nice')
  • 28. Testing
      doctest
      unittest
      nose
      fudge
      py.test
  • 29. doctest
      def fib(n):
          '''Return the nth fibonacci number.
          >>> fib(0)
          0
          >>> fib(1)
          1
          >>> fib(2)
          1
          >>> fib(3)
          2
          >>> fib(4)
          3
          '''
          if n == 0:
              return 0
          elif n == 1:
              return 1
          else:
              return fib(n - 1) + fib(n - 2)
  • 30. doctesting modules
      if __name__ == '__main__':
          import doctest
          doctest.testmod()
  • 31. unittest
      anything more complicated than function I/O
      clean state for each test
      test interactions between components
      can use mock objects
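A sketch of the points on the unittest slide: `setUp` gives each test a clean state, and a mock object stands in for a collaborating component. It assumes Python 3, where `unittest.mock` ships with the standard library; the deck predates that module. The `Greeter` class and its `sender` collaborator are hypothetical.

```python
import unittest
from unittest import mock

class Greeter(object):
    """Hypothetical component that depends on a sender object."""

    def __init__(self, sender):
        self.sender = sender

    def greet(self, name):
        self.sender.send("Hello, %s" % name)

class GreeterTest(unittest.TestCase):
    def setUp(self):
        # Runs before every test method, so no state leaks between tests.
        self.sender = mock.Mock()
        self.greeter = Greeter(self.sender)

    def test_greet_sends_message(self):
        self.greeter.greet("Donny")
        self.sender.send.assert_called_once_with("Hello, Donny")

    def test_nothing_sent_before_greet(self):
        # Passes only because setUp built a fresh mock for this test.
        self.assertFalse(self.sender.send.called)

# Run the tests explicitly instead of calling unittest.main(),
# so the snippet also works when pasted into an interactive session.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(GreeterTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
print(result.wasSuccessful())
```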
  • 32. nose
      http://readthedocs.org/docs/nose/en/latest/
      test runner
      auto-discovery of tests
      easy plugin system
      plugins can generate XML for CI (Jenkins)
  • 33. fudge
      http://farmdev.com/projects/fudge/
      make fake objects
      mock thru monkey-patching
  • 34. py.test
      http://pytest.org/latest/
      similar to nose
      distributed multi-platform testing
  • 35. Remoting Libraries
      Fabric
      execnet
  • 36. Fabric
      http://fabfile.org
      run commands over ssh
      great for “push” deployment
      not parallel yet
  • 37. fabfile.py
      from fabric.api import run
      def host_type():
          run('uname -s')
      fab command
      $ fab -H localhost,linuxbox host_type
      [localhost] run: uname -s
      [localhost] out: Darwin
      [linuxbox] run: uname -s
      [linuxbox] out: Linux
  • 38. execnet
      http://codespeak.net/execnet/
      open python interpreters over ssh
      spawn local python interpreters
      shared-nothing model
      send code & data over channels
      interact with CPython, Jython, PyPy
      py.test distributed testing
  • 39. execnet example
      >>> import execnet, os
      >>> gw = execnet.makegateway("ssh=codespeak.net")
      >>> channel = gw.remote_exec("""
      ...     import sys, os
      ...     channel.send((sys.platform, sys.version_info, os.getpid()))
      ... """)
      >>> platform, version_info, remote_pid = channel.receive()
      >>> platform
      'linux2'
      >>> version_info
      (2, 4, 2, 'final', 0)
  • 40. Parsing
      regular expressions
      NLTK
      SimpleParse
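Regular expressions are listed on the Parsing slide but get no slide of their own, so here is a minimal sketch of tokenizing with the standard `re` module in Python 3. The pattern and the `regex_tokenize` name are assumptions chosen for illustration, not something taken from the deck.

```python
import re

# Hypothetical token pattern: a run of word characters, or any single
# character that is neither a word character nor whitespace.
TOKEN_RE = re.compile(r"\w+|[^\w\s]")

def regex_tokenize(text):
    """Return the list of non-overlapping matches, left to right."""
    return TOKEN_RE.findall(text)

print(regex_tokenize("Jacob's presentation"))
# ['Jacob', "'", 's', 'presentation']
```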
  • 41. NLTK Tokenization
      >>> from nltk import tokenize
      >>> tokenize.word_tokenize("Jacob's presentation")
      ['Jacob', "'s", 'presentation']
      >>> tokenize.wordpunct_tokenize("Jacob's presentation")
      ['Jacob', "'", 's', 'presentation']
  • 42. nltk.grammar
      CFGs
      Chapter 9 of NLTK Book: http://nltk.googlecode.com/svn/trunk/doc/book/ch09.html
  • 43. more NLTK
      stemming
      part-of-speech tagging
      chunking
      classification
  • 44. SimpleParse
      http://simpleparse.sourceforge.net/
      Parser generator
      EBNF grammars
      Based on mxTextTools: http://www.egenix.com/products/python/mxBase/mxTextTools/ (C extensions)
  • 45. Package Management
      import
      pip
      virtualenv
      mercurial
  • 46. import
      import module
      from module import function, ClassName
      from module import function as f
      always make sure package directories have __init__.py
  • 47. pip
      http://www.pip-installer.org/en/latest/
      easy_install replacement
      install from requirements files
      $ pip install simplejson
      [... progress report ...]
      Successfully installed simplejson
  • 48. virtualenv
      http://www.virtualenv.org/en/latest/
      create self-contained python installations
      dependency silos
      works great with pip (same author)
  • 49. mercurial
      http://mercurial.selenic.com/
      Python based DVCS
      simple & fast
      easy cloning
      works with Bitbucket, Github, Googlecode
  • 50. Flexible Data Storage
      Redis
      MongoDB
  • 51. Redis
      in-memory key-value storage server
      most operations O(1)
      lists
      sets
      sorted sets
      hash objects
  • 52. MongoDB
      memory mapped document storage
      arbitrary document fields
      nested documents
      index on multiple fields
      easier (for programmers) than SQL
      capped collections (good for logging)
  • 53. Python Performance
      CPU
      RAM
  • 54. CPU
      probably fast enough if I/O or DB bound
      try PyPy: http://pypy.org/
      use CPython optimized libraries like numpy
      write a CPython extension
  • 55. RAM
      don’t keep references longer than needed
      iterate over data
      aggregate to an optimized DB
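A sketch of the "iterate over data" advice on the RAM slide, written for Python 3: a generator yields one word at a time, so only the running counts stay in memory rather than the whole input. The function names and the sample data are hypothetical, and `io.StringIO` stands in for a large file.

```python
import io
from collections import Counter

def iter_words(lines):
    """Yield words one at a time instead of building a full list."""
    for line in lines:
        for word in line.split():
            yield word

def count_words(lines):
    """Aggregate while iterating; each line can be freed once consumed."""
    counts = Counter()
    for word in iter_words(lines):
        counts[word] += 1
    return counts

# A real file object can be passed in the same way, since files
# iterate line by line.
data = io.StringIO("spam eggs\nspam ham\n")
counts = count_words(data)
print(counts["spam"])  # 2
```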
  • 56. import this
      >>> import this
      The Zen of Python, by Tim Peters
      Beautiful is better than ugly.
      Explicit is better than implicit.
      Simple is better than complex.
      Complex is better than complicated.
      Flat is better than nested.
      Sparse is better than dense.
      Readability counts.
      Special cases aren't special enough to break the rules.
      Although practicality beats purity.
      Errors should never pass silently.
      Unless explicitly silenced.
      In the face of ambiguity, refuse the temptation to guess.
      There should be one-- and preferably only one --obvious way to do it.
      Although that way may not be obvious at first unless you're Dutch.
      Now is better than never.
      Although never is often better than *right* now.
      If the implementation is hard to explain, it's a bad idea.
      If the implementation is easy to explain, it may be a good idea.
      Namespaces are one honking great idea -- let's do more of those!