6. • There are already many frameworks for Git
• Generally, one for each programming language
• Java -> JGit
• Python -> GitPython
• JavaScript -> nodegit
• etc.
11. How many commands does Git have?
• > 20?
• > 50?
• > 100?
• > 150?
154!!
12. PyDriller
• Aim: to ease the extraction of information from Git repositories
• What is supported:
• analysing the history of a project
• retrieving commit information (date, message, authors, etc.)
• retrieving files information (diff, source code)
• What is not supported:
• writing to the repo (git pull, git push, git add, git commit, etc.)
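To make the supported read-only use cases concrete, here is a minimal sketch of walking a project's history and collecting commit and file information. It assumes the PyDriller 1.x names (`RepositoryMining`, `commit.modifications`); newer releases rename these to `Repository` and `commit.modified_files`. The helper names `describe_commits` and `describe_repo` are hypothetical and only used for illustration.

```python
def describe_commits(commits):
    """Return one summary dict per commit-like object."""
    summaries = []
    for commit in commits:
        summaries.append({
            "hash": commit.hash,
            "author": commit.author.name,
            "date": commit.author_date,
            "message": commit.msg,
            # 'modifications' is the PyDriller 1.x attribute name (assumption);
            # PyDriller 2.x exposes the same data as 'modified_files'.
            "files": [m.filename for m in commit.modifications],
        })
    return summaries


def describe_repo(path_to_repo):
    """Summarise every commit of the repository at path_to_repo."""
    # Imported here so describe_commits() can be tried without PyDriller installed.
    from pydriller import RepositoryMining  # assumes PyDriller 1.x
    return describe_commits(RepositoryMining(path_to_repo).traverse_commits())
```

Calling `describe_repo("path/to/repo")` only reads from the repository, which matches the scope above: nothing is pulled, pushed, added or committed.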
14. Statistics
• Everything is lazily evaluated, so you only “pay” for what you use.
1. only commit information:
immediate (as git log)
2. commit and file information:
about 60 commits/s (1240 commits in 22 seconds)
3. commit, file and metrics information:
about 4 commits/s (1240 commits in ~5 min)
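The cost tiers above follow from lazy evaluation: expensive data is only computed when it is first requested. The sketch below illustrates the general pattern in plain Python. It is not PyDriller's actual implementation; `LazyCommit` and the `load_files` callback are hypothetical names.

```python
class LazyCommit:
    """Commit-like object that loads file information only on first access."""

    def __init__(self, sha, load_files):
        self.sha = sha                  # cheap: known up front
        self._load_files = load_files   # expensive callback, not called yet
        self._files = None              # cache for the expensive result

    @property
    def files(self):
        # Run the expensive loader once, then reuse the cached result.
        if self._files is None:
            self._files = self._load_files(self.sha)
        return self._files
```

Reading only `sha` never triggers the loader, which is why a commit-only traversal stays cheap, while touching `files` pays the extra cost once per commit.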
15. Thank you for your support!
• Some numbers:
1. Downloaded approximately 4000 times
2. 100 times in the last 2 weeks alone
• Community driven
• University of Zurich, TU Delft and University of Catania teach
PyDriller in their MSR courses
• SIG uses PyDriller in their quality assessments
16. What’s next?
• A company asked me to implement
RepositoryMining().traverse_files()
• Mercurial support
• Ideas? Talk to me or submit a PR :)
17. PyDriller
• Source code: https://github.com/ishepard/pydriller
• Doc: https://pydriller.readthedocs.io/en/latest/
• Feel free to leave a star! :)