Takayuki Shimizukawa discusses how to generate documentation from Python source code using Sphinx. He introduces Sphinx and its extensions for automating documentation generation from docstrings. He demonstrates setting up a Sphinx project and configuring extensions like autodoc, autosummary, and doctest to generate API documentation and test code examples. The presentation emphasizes best practices for writing informative docstrings and code examples to fully document modules and functions.
6. Docstring
1. def dumps(obj, ensure_ascii=True):
2.     """Serialize ``obj`` to a JSON formatted ``str``.
3.     """
4.
5.     ...
6.     return ...
Lines 2 and 3 are the docstring.
You can see the string with "help(dumps)".
8. Why don't you write docstrings?
I don't know what to write or where to write it.
Is there a docstring format spec?
It's not beneficial.
I'll tell you a good way to write docstrings.
9. Goal of this session
How to generate a doc from Python source code.
Rediscovering the meaning of docstrings.
11. What is Sphinx?
Sphinx is a documentation generator.
Sphinx generates docs in several output formats from reStructuredText (reST) markup.
[Diagram: reST files written in your favorite editor -> Sphinx reST Parser -> HTML Builder (with HTML theme), ePub Builder, LaTeX Builder (texlive)]
12. The history of Sphinx (short ver.)
The father of Sphinx
~2007: Too hard to maintain
2007~: Easy to write, easy to maintain
13. Sphinx Before and After
Before
There was no standard way to write documents
Sometimes we needed to convert markup into other formats
After
Generate multiple output formats from a single source
Integrated HTML themes make docs easier to read
API references can be integrated into narrative docs
Automated doc building and hosting by ReadTheDocs
14. Many docs are written with Sphinx
For example
Python libraries/tools:
Python, Sphinx, Flask, Jinja2, Django, Pyramid,
SQLAlchemy, Numpy, SciPy, scikit-learn,
pandas, fabric, ansible, awscli, …
Non-Python libraries/tools:
Chef, CakePHP(2.x), MathJax, Selenium,
Varnish
16. Sphinx extensions (built-in)
Sphinx provides these extensions to support automated API documentation.
sphinx.ext.autodoc
sphinx.ext.autosummary
sphinx.ext.doctest
sphinx.ext.coverage
[Diagram: Sphinx (docutils reST Parser; HTML, ePub and LaTeX Builders) with the autosummary, autodoc, doctest and coverage extensions]
18. How to install Sphinx
$ pip install sphinx
Your code and Sphinx should be in a single Python environment.
The Python version is also important.
19. How to start a Sphinx project
Create a doc directory:
$ cd /path/to/your-code
$ sphinx-quickstart doc -m
...
Project name: Deep thought
Author name(s): Mice
Project version: 0.7.5
...
...
Finished
Keep pressing the ENTER key.
"-m" generates a minimal Makefile/make.bat.
-m is important for following this session easily.
20. make html once
$ cd doc
$ make html
...
Build finished. The HTML pages are in _build/html.
The "make html" command generates HTML files in _build/html.
23. Set up the autodoc extension
$ tree /path/to/your-code
+- deep_thought
|  +- __init__.py
|  +- api.py
|  +- calc.py
|  +- utils.py
+- doc
|  +- _build/
|  |  +- html/
|  +- _static/
|  +- _templates/
|  +- conf.py
|  +- index.rst
|  +- make.bat
|  +- Makefile
+- setup.py
doc/conf.py
1. import os
2. import sys
3. sys.path.insert(0, os.path.abspath('..'))
4. extensions = [
5.     'sphinx.ext.autodoc',
6. ]
Line 3: add your library path so that Sphinx autodoc can import it.
Line 5: add 'sphinx.ext.autodoc' to use the extension.
24. Add automodule directive to your doc
deep_thought/utils.py
1. "utility functions"
2.
3. def dumps(obj, ensure_ascii=True):
4.     """Serialize ``obj`` to a JSON formatted ``str``.
5.     """
6.     ...
doc/index.rst
1. Deep thought API
2. ================
3.
4. .. automodule:: deep_thought.utils
5.    :members:
6.
Line 4 (index.rst): the automodule directive imports the specified module and inspects it.
Line 5 (index.rst): the :members: option inspects all members of the module, not only the module docstring.
25. make html
$ cd doc
$ make html
...
Build finished. The HTML pages are in _build/html.
26. How does it work?
The autodoc directive generates intermediate reST internally:
Intermediate reST (for doc/index.rst)
Deep thought API
================

.. py:module:: deep_thought.utils

utility functions

.. py:function:: dumps(obj, ensure_ascii=True)
   :module: deep_thought.utils

   Serialize ``obj`` to a JSON formatted :class:`str`.
27. You can see the reST with the -vvv option
$ make html SPHINXOPTS=-vvv
...
...
[autodoc] output:
.. py:module:: deep_thought.utils
utility functions
.. py:function:: dumps(obj, ensure_ascii=True)
   :module: deep_thought.utils
   Serialize ``obj`` to a JSON formatted :class:`str`.
28. Take care!
Sphinx autodoc imports your code to get docstrings.
That means autodoc will execute the code at module global level.
29. Dangerous code
danger.py
import os

def delete_world():
    os.system('sudo rm -Rf /')

delete_world()  # will be executed at "make html"
30. Execution guard on import
danger.py
import os

def delete_world():
    os.system('sudo rm -Rf /')

delete_world()  # will be executed at "make html"
safer.py (with execution guard)
import os

def delete_world():
    os.system('sudo rm -Rf /')

if __name__ == '__main__':
    delete_world()  # doesn't execute at "make html"
32. Lacking necessary information
"Oh, I can't understand the types and meanings of the arguments even after reading this!"
33. "info field lists" for arguments
deep_thought/utils.py
def dumps(obj, ensure_ascii=True):
    """Serialize ``obj`` to a JSON formatted
    :class:`str`.

    :param dict obj: dict type object to serialize.
    :param bool ensure_ascii: Default is True. If
       False, all non-ASCII characters are not ...
    :return: JSON formatted string
    :rtype: str
    """
http://sphinx-doc.org/domains.html#info-field-lists
34. "info field lists" for arguments
deep_thought/utils.py
def dumps(obj, ensure_ascii=True):
    """Serialize ``obj`` to a JSON formatted :class:`str`.

    :param dict obj: dict type object to serialize.
    :param bool ensure_ascii: Default is True. If
       False, all non-ASCII characters are not ...
    :return: JSON formatted string
    :rtype: str
    """
    ...
35. Cross-reference to functions
examples.py
Examples
==========

This is a usage of :func:`deep_thought.utils.dumps` blah blah blah. ...
The :func: role is rendered as a reference (hyperlink).
37. Code example in a docstring
deep_thought/utils.py
def dumps(obj, ensure_ascii=True):
    """Serialize ``obj`` to a JSON formatted
    :class:`str`.

    For example:

    >>> from deep_thought.utils import dumps
    >>> data = dict(spam=1, ham='egg')
    >>> dumps(data)
    '{spam: 1, ham: "egg"}'

    :param dict obj: dict type object to serialize.
    :param bool ensure_ascii: Default is True. If
       False, all non-ASCII characters are not ...
Doctest block: the four red lines (the >>> lines and the expected output).
You can copy & paste the red lines from the Python interactive shell.
39. We expect ...
You can expect that developers will update code examples when the interface is changed.
deep_thought/utils.py
def dumps(obj, ensure_ascii=True):
    """Serialize ``obj`` to a JSON formatted
    :class:`str`.

    For example:

    >>> from deep_thought.utils import dumps
    >>> data = dict(spam=1, ham='egg')
    >>> dumps(data)
    '{spam: 1, ham: "egg"}'
The code example is very close to the implementation!!
43. Result of "make doctest"
$ make doctest
...
Document: api
-------------
********************************************************
File "api.rst", line 11, in default
Failed example:
    dumps(data)
Expected:
    '{spam: 1, ham: "egg"}'
Got:
    'to-be-implemented'
...
make: *** [doctest] Error 1
56. make coverage and check the result
$ make coverage
...
Testing of coverage in the sources finished, look at the results in _build/coverage.
$ ls _build/coverage
c.txt python.txt undoc.pickle
_build/coverage/python.txt
Undocumented Python objects
===========================
deep_thought.utils
------------------
Functions:
 * egg
This function doesn't have a doc!
57. CAUTION!
$ make coverage
...
Testing of coverage in the sources finished, look at the results in _build/coverage.
$ ls _build/coverage
c.txt python.txt undoc.pickle
python.txt
Undocumented Python objects
===========================
deep_thought.utils
------------------
Functions:
 * egg
The command always returns ZERO.
coverage.xml does not exist.
reST format, for whom?
60. Why don't you write docstrings?
I don't know what to write or where to write it.
Write a description, the arguments and doctest blocks on the line right after the function signature.
Is there a docstring format spec?
Yes, you can use "info field lists" for the argument spec and doctest blocks for code examples.
It's not beneficial.
You can use autodoc, autosummary, doctest and coverage to make it beneficial.
63. Options for autodoc
:members: blah
To document just the specified members. Empty means ALL.
:undoc-members: ...
To document members which don't have a docstring.
:private-members: ...
To document private members whose names start with an underscore.
:special-members: ...
To document special members whose names start with a double underscore.
:inherited-members: ...
To document members inherited from a super class.
67. Translation into other languages
$ make gettext
...
Build finished. The message catalogs are in _build/gettext.
$ sphinx-intl update -p _build/gettext -l es
locale/es/LC_MESSAGES/generated.po
#: ../../../deep_thought/utils.py:docstring of deep_thought.
msgid "Serialize ``obj`` to a JSON formatted :class:`str`."
msgstr "Serializar ``obj`` a un formato JSON :class:`str`."
msgid "For example:"
msgstr "Por ejemplo:"
conf.py
language = 'es'
locale_dirs = ['locale']
$ make html
...
Build finished. The HTML pages are in _build/html.
Hi everyone. Thank you for coming to my session.
The title of this session is: Sphinx autodoc – automated API documentation
First, let me introduce myself.
My name is Takayuki Shimizukawa, and I come from Japan.
I do three kinds of open source work.
1. Sphinx co-maintainer since the end of 2011.
2. Organizer of the Sphinx-users.jp user group in Japan.
3. Member of the PyCon JP Committee.
And I work for BePROUD. Our company develops web applications for business customers using Django, Pyramid, SQLAlchemy, Sphinx and other Python-related tools.
Before my main presentation, I'd like to introduce "PyCon JP 2015" in Tokyo, Japan.
We will hold the event this October.
Registration is open. Please join us.
Anyway.
Sphinx autodoc. This is the main topic of this session.
Autodoc is a feature that automatically generates documentation from source code.
Autodoc uses the function definitions and also the docstrings of those functions.
Before we jump into the main topic, I want to know how many people know about docstrings, and how many people write them.
Docstrings are a feature of Python.
Do you know about docstrings? How many people know them?
Please raise your hand.
10, 20, 30.. 55 hands. Thanks.
Hmm, it might be a minor feature of Python.
OK, these red lines are a docstring.
A docstring describes how to use the function, and it is written as the first statement of the function body.
When you type "help(dumps)" in a Python interactive shell, you will see the docstring.
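As a small illustration of that point, here is a minimal sketch that defines the dumps function from the slide with a placeholder body (the real implementation is not shown in the talk) and reads the docstring back. This is the same text that help(dumps) displays.

```python
import inspect


def dumps(obj, ensure_ascii=True):
    """Serialize ``obj`` to a JSON formatted ``str``."""
    ...  # placeholder body; the real implementation is not shown on the slide


# The docstring is stored on the function object itself.
print(dumps.__doc__)

# inspect.getdoc() returns the same text with indentation cleaned up;
# help(dumps) shows this text together with the function signature.
print(inspect.getdoc(dumps))
```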
Have you written API docs as docstrings?
Please raise your hands again.
10, 20.. 22.5 hands.
Thanks.
That's a very small number of hands.
But some people do write docstrings.
So, what is the reason you do not write them?
Someone would say,
* I don't know what to write or where to write it.
* Are there any specific docstring formats?
* It's not beneficial.
For example, sometimes docstrings are not updated even when the function's behavior changes.
Those opinions are understandable.
So then, I'll explain how to write docstrings.
Goal of this session.
The first one is: * How to generate a doc from Python source code.
The second one is: * Rediscovering the meaning of docstrings.
OK, let's move forward.
Sphinx autodoc is the most useful way to put docstrings to work.
So, before talking about docstrings, I'll introduce the basics of Sphinx and how to set it up.
What is Sphinx?
Sphinx is a documentation generator.
Sphinx generates docs in several output formats from reStructuredText, which is an extensible markup.
(Point at the input and the output with the pointer)
The history of Sphinx.
This man, Georg Brandl, is the father of Sphinx.
(click)
Until 2007, the official Python documentation was written in LaTeX.
But it was too hard to maintain.
Georg was trying to change that situation.
(click)
So he created Sphinx in 2007.
Sphinx provides ease of use and maintainability for the official Python documentation.
Sphinx before and after.
Before
There was no standard way to write documents. One example is the official Python documentation. It was written in LaTeX plus a jungle of Python scripts.
And sometimes we needed to convert markup into other formats.
Since Sphinx was released,
* We can generate multiple output formats from a single source.
* Integrated HTML themes make docs easier to read.
* API references can be integrated into narrative docs.
* Doc building and hosting are automated by the ReadTheDocs service.
Nowadays, Sphinx is used by these libraries and tools.
Python libraries/tools: Python, Sphinx, Flask, Jinja2, Django, Pyramid, SQLAlchemy, Numpy, SciPy, scikit-learn, pandas, fabric, ansible, awscli, …
And non-Python libraries/tools also use Sphinx for their docs: Chef, CakePHP(2.x), MathJax, Selenium, Varnish
Sphinx provides these extensions to support automated API documentation.
sphinx.ext.autodoc
sphinx.ext.autosummary
sphinx.ext.doctest
sphinx.ext.coverage
Autodoc is the most important feature of Sphinx.
Almost all Python-related libraries use the autodoc feature.
OK, let's set up a Sphinx project for this code, as an example.
This library will stand in for your code to explain the autodoc feature.
The library name is "Deep Thought".
This is the structure of the library.
The library has three modules: api.py, calc.py and utils.py.
The second box shows the first lines of the program code in utils.py.
If you don't have Sphinx installed in your environment, you need to install it with this command.
pip install sphinx
Please note that your source code and Sphinx should be installed in a single Python environment.
The Python version is also important. If you install Sphinx into a Python 3 environment even though your code is written in Python 2, autodoc will raise an exception when it imports your Python 2 source code.
Once you have installed Sphinx, you can generate your documentation scaffold with the "sphinx-quickstart" command.
Then an interactive wizard is invoked, and it asks for the project name, author name and project version.
The wizard also asks you many other questions, but DON'T PANIC. Usually, all you need to do is keep pressing the Enter key.
Note that the -m option is important.
If you invoke it without the option, you will get a Makefile with hard-coded make targets, which will annoy you. And my presentation slides rely on this option.
This option was introduced in Sphinx-1.3.
And the -m option will become the default from Sphinx-1.5.
So, type "make html" in the doc directory to generate HTML output.
You can see the output in the _build/html directory.
Now you can see the directory and file structure, like this.
Library files are under the deep_thought directory.
Build output is under the doc/_build directory.
Scaffold files are under the doc directory.
In particular, you will see a lot of utils.py, conf.py and index.rst in this session.
Now we are ready to go.
Generate API docs from your Python source code.
Set up the Sphinx autodoc extension.
This is the conf.py file in your Sphinx scaffold.
What is important is the third and fifth lines.
Line 3: add your library path so that Sphinx autodoc can import it.
Line 5: add 'sphinx.ext.autodoc' to use the extension.
Next, let's specify the modules you want to document.
Add the automodule directive to your doc.
The first box is the utils.py file, which is part of the deep_thought example library.
The second box is a reST file. You can see the automodule usage in this box.
automodule is a Sphinx directive provided by the autodoc extension to generate documentation.
Let's look at the second box.
(click)
Line 4: the automodule directive imports the specified module and inspects it.
In this case, the deep_thought.utils module will be imported and inspected.
Line 5: the :members: option inspects all members of the module, not just the module docstring.
OK, we are now all ready. Let's invoke "make html" again.
So, as a result of "make html", you get an automatically generated document from the .py file.
Internally, the automodule directive inspects your module and renders the function signatures, arguments and docstrings.
How does it work?
The autodoc directive generates intermediate reST, like this.
Actually, the intermediate file is not written to your filesystem; it is only created in memory.
If you want to see the intermediate reST lines, you can use the -vvv option, like this.
As you see, the automodule directive is replaced with concrete documentation content.
But please take care.
Sphinx autodoc imports your code to get docstrings.
That means autodoc will execute the code at module global level.
Let me introduce a bad case related to this.
This module will remove all your files.
danger.py was designed as a command-line script, not to be imported from other modules.
If you try to document it using autodoc, the delete_world function will be called.
As a consequence, "make html" will destroy all your files.
On the other hand, safer.py (the lower block) uses an execution guard.
It's a very famous Python idiom.
Because of the execution guard, your files will not be removed by make html.
As a practical matter, you shouldn't try to document your package's setup.py using autodoc.
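To see the difference without any risk, here is a minimal sketch that imports two harmless stand-ins for danger.py and safer.py. The names danger_demo, safer_demo and import_source are made up for this illustration, and delete_world only records that it was called instead of running a command. Importing through importlib is assumed to behave like the import that autodoc performs.

```python
import importlib.util
import tempfile
from pathlib import Path

# Harmless stand-ins for danger.py and safer.py: instead of deleting files,
# delete_world() only records that it was called.
DANGER_SOURCE = '''
calls = []

def delete_world():
    calls.append("delete_world")

delete_world()  # module-level call: runs on every import
'''

SAFER_SOURCE = '''
calls = []

def delete_world():
    calls.append("delete_world")

if __name__ == "__main__":
    delete_world()  # runs only when executed as a script
'''


def import_source(name, source):
    """Write ``source`` to a temporary file and import it as module ``name``."""
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / (name + ".py")
        path.write_text(source)
        spec = importlib.util.spec_from_file_location(name, str(path))
        module = importlib.util.module_from_spec(spec)
        spec.loader.exec_module(module)  # executes module-level code, as an import does
        return module


danger = import_source("danger_demo", DANGER_SOURCE)
safer = import_source("safer_demo", SAFER_SOURCE)

print(danger.calls)  # ['delete_world']
print(safer.calls)   # []
```

The unguarded module runs delete_world as a side effect of the import, while the guarded one does nothing because its __name__ is the module name, not "__main__".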
Now let's return to the docstring and its output.
This output lacks necessary information.
That is, information about the arguments.
If you are looking for the API reference and you find this, you will say:
"Oh, I can't understand the types and meanings of the arguments even after reading this!"
In this case, you can use the "info field list" syntax to describe the arguments.
A real docstring should have a description for each function argument, like this.
These red parts are a special version of the "field list", called "info field lists".
The specification of info field lists is described at the URL on the slide.
Info field lists are rendered like this.
The output is quite nice.
So you will say, "Oh, I can understand it!", maybe.
Cross-reference to functions.
You can easily make a cross-reference from another location to the dumps function.
Of course, cross-references also work across pages.
So far, I have introduced the basics of autodoc.
The next subject is: detecting deviations between the implementation and the documentation.
By using doctest.
I think a good API has good documentation that illustrates how to use the API with code examples.
If the doc has a code example, you can grasp the API usage quickly and exactly.
I added a code example, these 4 red lines, to the earlier docstring.
(click)
It's called a "doctest block".
Obviously, it looks like a session in the Python interactive shell.
Actually, you can copy & paste the red lines from the Python interactive shell.
After make html,
you get a syntax-highlighted doctest block, like this.
Library users can grasp the API usage quickly and exactly.
And users can also try it out easily.
And from the library developers' point of view,
the code example is very close to the implementation!
We can expect that library developers themselves will update code examples when the interface is changed.
... Really?
Sorry, I don't believe it.
Even if the code examples are very close to the implementation, developers won't pay attention to them.
Because developers have no spare time to read implicit rules from the code.
Explicit is better than implicit for us.
OK, let's use the doctest builder to detect deviations between the implementation and the documentation.
To use the doctest builder, you need to add the Sphinx doctest extension to conf.py, like this.
Line 5: add 'sphinx.ext.doctest'
With only this, you are ready to use the doctest builder.
OK, let's invoke the "make doctest" command.
After that, you can see that the dumps function gives us a different result from the expected one.
The expected one is: '{spam: 1, ham: "egg"}'
The actual one is: 'to-be-implemented'
It is not implemented properly yet.
Anyway, the doctest builder shows us the differences between the implementation and the sample code in the documentation.
Actually, if your unit tests already run the doctests, you don't need to do this with Sphinx.
However, if you don't write unit tests, "make doctest" would be a good place to start.
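For comparison, the same kind of check can be run outside Sphinx with the standard-library doctest module. The following is only a sketch under stated assumptions: the dumps body delegates to json.dumps as a stand-in (the talk does not show the real implementation), and dumps_unfinished and check_examples are hypothetical names introduced for this illustration.

```python
import doctest
import json


def dumps(obj, ensure_ascii=True):
    """Serialize ``obj`` to a JSON formatted :class:`str`.

    For example:

    >>> dumps({'spam': 1})
    '{"spam": 1}'

    :param dict obj: dict type object to serialize.
    :param bool ensure_ascii: passed through to :func:`json.dumps`.
    :return: JSON formatted string
    :rtype: str
    """
    # Stand-in implementation: delegate to the standard library.
    return json.dumps(obj, ensure_ascii=ensure_ascii)


def dumps_unfinished(obj):
    """Same example, but the code does not do what the docstring says.

    >>> dumps_unfinished({'spam': 1})
    '{"spam": 1}'
    """
    return 'to-be-implemented'


def check_examples(func):
    """Run the doctest blocks in ``func``'s docstring.

    Returns a ``(failed, attempted)`` pair.
    """
    finder = doctest.DocTestFinder(recurse=False)
    runner = doctest.DocTestRunner(verbose=False)
    # module=False: do not try to locate the defining module;
    # the examples only need the function's own name in scope.
    tests = finder.find(func, name=func.__name__, module=False,
                        globs={func.__name__: func})
    for test in tests:
        runner.run(test)
    return runner.failures, runner.tries


print(check_examples(dumps))             # (0, 1)
print(check_examples(dumps_unfinished))  # (1, 1), after a failure report
```

The second call prints a report much like the "make doctest" output above, because the expected string in the docstring no longer matches what the function returns.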
Listing APIs automatically using autosummary.
As already noted, autodoc is very useful.
However, if you have a lot of functions in a lot of modules,
and you want an individual page for each module, you need to prepare a reST file for each module.
(click)
This box is for utils.py.
In this case you should also prepare such .rst files for the api module and the calc module.
If you have 100 modules, you should prepare 100 .rst files.
As you see, each reST file has just 4 lines.
You can get them by repeating copy & paste and modifying a bit.
However ... I believe that you don't want to repeat that.
Don't Repeat Yourself.
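To make the repetition concrete, here is a rough sketch of what those per-module pages amount to. The exact contents of the slide's .rst file are not in this transcript, so the layout below (a title plus an automodule directive with :members:) is an assumption based on the earlier index.rst example, and module_page is a hypothetical helper name.

```python
def module_page(module_name):
    """Return the reST source of a minimal API page for one module."""
    lines = [
        module_name,
        "=" * len(module_name),  # title underline, same length as the title
        "",
        ".. automodule:: " + module_name,
        "   :members:",
    ]
    return "\n".join(lines) + "\n"


# One nearly identical page per module: only the module name changes.
for name in ["deep_thought.api", "deep_thought.calc", "deep_thought.utils"]:
    print(module_page(name))
```

Every page differs only in the module name, which is exactly the kind of chore worth handing over to a tool.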
OK, let's use the autosummary extension to avoid such boring tasks.
Set up the Sphinx autosummary extension.
This is your conf.py again.
Line 6: add 'sphinx.ext.autosummary' to use the extension.
Line 8: use the 'members' option for every autodoc-related directive.
Line 9: generate the reST files that you specify with the autosummary directive.
// Note: line 9 invokes 'sphinx-autogen' internally. The default is False; in that case, you need to invoke 'sphinx-autogen' by hand.
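The conf.py slide itself is not part of this transcript, so the following is only a sketch of what such a configuration might look like, not the exact file from the talk. The option name autodoc_default_options is an assumption about newer Sphinx releases (1.8 and later); releases from the time of this talk spelled the same setting autodoc_default_flags = ['members'].

```python
# doc/conf.py (sketch, not the exact file from the slides)
import os
import sys

# Make the deep_thought package importable by autodoc.
sys.path.insert(0, os.path.abspath('..'))

extensions = [
    'sphinx.ext.autodoc',
    'sphinx.ext.doctest',
    'sphinx.ext.autosummary',
]

# Apply the :members: option to every autodoc directive by default.
# Assumption: this is the Sphinx 1.8+ spelling; older releases used
# autodoc_default_flags = ['members'] instead.
autodoc_default_options = {'members': True}

# Generate the stub .rst files for the entries listed in autosummary directives.
autosummary_generate = True
```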
You can use the autosummary directive in your reST files, as you see.
This sample uses the autosummary directive with the :toctree: option.
The :toctree: option is the directory where the intermediate files generated by autosummary are placed.
And the contents of the autosummary directive, deep_thought.api, calc and utils, are the names of the modules you want to document.
Thanks to autosummary, you will get 100 intermediate .rst files if you have 100 modules.
Then run the "make html" command again.
Finally, you get a documented page for each module without any tedious manual work.
Additionally, the "autosummary" directive you wrote generates a table of contents that links to each module page.
Discovering undocumented APIs using the coverage extension.
So far, we've automated autodoc by using autosummary.
In addition, you can now also find deviations between the documentation and the implementation by using doctest.
But how do you find a function that has no docstring at all?
For such a situation, we can use the coverage extension to find undocumented functions, classes and methods.
To use the coverage extension, you should add it to conf.py.
This is your conf.py again.
Line 7: add 'sphinx.ext.coverage' to use the extension.
That's all!
Let's invoke the "make coverage" command.
After that, you get the result of the coverage measurement.
The coverage report is recorded in "_build/coverage/python.txt", which lists undocumented functions, classes and modules.
As you see, you get the name of the undocumented function.
However, please take care:
The command always returns 0.
So you can't tell whether there are undocumented functions from the return code.
IMO, that's fair enough, because the coverage command shouldn't fail regardless of whether coverage is perfect or not.
However, unfortunately, "make coverage" also doesn't support generating coverage.xml for Jenkins or other CI tools.
In conclusion, you can discover the undocumented functions, but you can't feed that information into CI tools.
Sorry for the inconvenience.
And we are waiting for your contributions to solve the problem. (bow)
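Until such a feature exists, one possible workaround is a small script that reads the text report and turns it into an exit status. This is a sketch, not a Sphinx feature: undocumented_objects and exit_status are hypothetical helper names, and the parser assumes the simple " * name" layout shown on the python.txt slide.

```python
def undocumented_objects(report_text):
    """Return the names listed as ``* name`` entries in a coverage report."""
    names = []
    for line in report_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("* "):
            names.append(stripped[2:].strip())
    return names


def exit_status(report_text):
    """Return 1 if the report lists any undocumented object, else 0."""
    return 1 if undocumented_objects(report_text) else 0


# Sample report copied from the python.txt slide.
SAMPLE_REPORT = """\
Undocumented Python objects
===========================
deep_thought.utils
------------------
Functions:
 * egg
"""

print(undocumented_objects(SAMPLE_REPORT))  # ['egg']
print(exit_status(SAMPLE_REPORT))           # 1
```

In a CI job, one might run "make coverage", read _build/coverage/python.txt, and pass the result of exit_status to sys.exit so that the build fails when anything is undocumented.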
Let's review the reasons for not writing docstrings that were introduced at the beginning.
I don't know what to write or where to write it.
Write a description, the arguments and doctest blocks on the line right after the function signature.
Is there a docstring format spec?
Yes, you can use "info field lists" for the argument spec and doctest blocks for code examples.
It's not beneficial.
You can use autodoc, autosummary, doctest and coverage to make it beneficial.
I think these reasons are resolved by using the Sphinx autodoc features, don't you?
Let's write docstrings, and use autodoc!
At the end, I'd like to introduce some tips.
The first one is options.
Options for autodoc.
:members: blah
To document just the specified members. If you specify the option without a parameter, it means ALL.
:undoc-members: ...
To document members which don't have a docstring. If you specify the option without a parameter, all undocumented members are rendered.
:private-members: ...
To document private members whose names start with an underscore.
:special-members: ...
To document special members whose names start with a double underscore.
:inherited-members: ...
To document members inherited from a super class.
Please refer to the Sphinx reference for details of the options.
The second one is directives for Web APIs.
The third-party extension sphinxcontrib-httpdomain provides an http domain for generating Web API docs.
As you see, you can use the get directive.
Httpdomain also provides:
Other HTTP-related directives
An "http" syntax highlighter
It generates a nice Web API reference page and a well-organized Web API index page.
Httpdomain also contains the sphinxcontrib.autohttp extension, which supports the Flask, Bottle and Tornado web application frameworks and documents Web API methods automatically by using reflection.
The last one is document translation.
You can get translated output without editing the reST and Python code.
For that, you can use the "make gettext" command, which generates gettext-style pot files.
"make gettext" extracts text from the reST files and from the Python source files referenced by autodoc.
That means you can translate them into any language without rewriting the original reST files and Python source files.
If you are interested, please search for my PyCon APAC session slides on SlideShare.