Python Style Guide for BaBar
Table of Contents
Introduction
Code Lay-out
Imports
Whitespace in Expressions and Statements
Comments
Documentation Strings
Version Bookkeeping
Naming Conventions
Guido's Programming Recommendations
BaBar Programming Recommendations
References
Introduction
This document is loosely based on the "official" Python style guide [1], written by Python's author and BDFL [2]. We have tried to keep our style guide short so that it can serve as a concise reference; hence many details and explanations that appear in the original PEP are missing. You may benefit from reading the original document if you need more arguments for accepting these guidelines.
Obligatory quote from PEP8:
A style guide is about consistency. Consistency with this style guide is important. Consistency within a project is more important. Consistency within one module or function is most important.
Where possible we tried to make this style guide consistent with the BaBar guidelines for C++ and other languages [4].
Code Lay-out
Indentation
Use the default of Emacs' Python-mode: 4 spaces for one indentation level.
Tabs or Spaces?
The original guidelines prefer spaces over tabs; in BaBar, spaces are the only option. Check the BaBar Python pages to see how to adjust your editor settings and disable tabs.
Maximum Line Length
It is preferable to limit all lines to a maximum of 79 characters. For flowing long blocks of text (docstrings or comments), limiting the length to 72 characters is recommended.
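As a quick way to check the two layout rules above, here is a minimal sketch of a checker that reports tab characters and over-long lines. The function name find_layout_problems and the sample text are made up for illustration; this is not a BaBar tool, and it is written for a current Python interpreter.

```python
def find_layout_problems(text, max_length=79):
    """Return a list of (line_number, message) pairs for layout problems."""
    problems = []
    for number, line in enumerate(text.splitlines(), 1):
        # Tabs are not allowed anywhere in BaBar Python code.
        if "\t" in line:
            problems.append((number, "contains a tab character"))
        # Lines should not exceed the maximum length.
        if len(line) > max_length:
            problems.append((number, "longer than %d characters" % max_length))
    return problems


# Hypothetical sample: line 2 uses a tab, line 3 is 84 characters long.
sample = "def f():\n\treturn 1\n" + "x = " + "1" * 80 + "\n"
for number, message in find_layout_problems(sample):
    print("line %d: %s" % (number, message))
```

Passing max_length=72 gives the stricter limit recommended for docstrings and comments.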
Line Continuation
The preferred way of wrapping long lines is by using Python's implied line continuation inside parentheses, brackets and braces. If necessary, you can add an extra pair of parentheses around an expression, but sometimes using a backslash looks better. Make sure to indent the continued line appropriately. Some examples:
    class Rectangle(Blob):

        def __init__(self, width, height,
                     color='black', emphasis=None, highlight=0):
            if width == 0 and height == 0 and \
               color == 'red' and emphasis == 'strong' or \
               highlight > 100:
                raise ValueError("sorry, you lose")
            if width == 0 and height == 0 and (color == 'red' or
                                               emphasis is None):
                raise ValueError("I don't think so")
            Blob.__init__(self, width, height,
                          color, emphasis, highlight)
Blank Lines
Blank lines are not syntactic elements in Python; use them to your taste to make your code readable. Read the original guidelines to learn Guido's opinion about how blank lines should be used.
Encodings
Python encoding cookies are not used in BaBar, hence the only accepted encoding in the code is ASCII. Docstrings or comments may include Latin-1 characters.
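One way to check a file against this rule is to look for bytes outside the ASCII range and then confirm by eye that they occur only in comments or docstrings. The sketch below assumes the file content is already available as a byte string; find_non_ascii_lines is a made-up helper name, not part of any BaBar package.

```python
def find_non_ascii_lines(data):
    """Return the 1-based numbers of lines that contain non-ASCII bytes."""
    numbers = []
    for number, line in enumerate(data.split(b"\n"), 1):
        try:
            line.decode("ascii")
        except UnicodeDecodeError:
            # At least one byte on this line is outside the ASCII range.
            numbers.append(number)
    return numbers


# Hypothetical source text with a Latin-1 character in a comment on line 2.
source = b"x = 1\n# caf\xe9 in a Latin-1 comment\ny = 2\n"
print(find_non_ascii_lines(source))
```

The sketch only locates suspicious lines; deciding whether a given line is a comment, a docstring, or code is left to the reader.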
Ordering Things
Within a module, items must be placed in the following order:
- Shebang line (#!@PYTHON@), only for executable scripts
- Module-level comments
- Module-level docstring
- Imports
- Private module variables (names start with an underscore)
- Private module functions and classes (names start with an underscore)
- Public variables
- Public functions and classes
- Optional test suites
New code should be written using templates from the CodeTemplates package.
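The ordering can be pictured with a small skeleton. This is only an illustrative sketch and not one of the CodeTemplates files; the module content (a length conversion) and all names in it are invented for the example.

```python
#!@PYTHON@
# Module-level comment: hypothetical example module for the ordering rules.
# The shebang line above belongs in executable scripts only.

"""Docstring of a hypothetical module that converts lengths."""

import math

# Private module variable (name starts with an underscore).
_CM_PER_INCH = 2.54


# Private module function (name starts with an underscore).
def _round_to(value, digits):
    return round(value, digits)


# Public variable.
DEFAULT_DIGITS = 2


# Public function.
def inches_to_cm(inches, digits=DEFAULT_DIGITS):
    """Convert a length in inches to centimetres."""
    return _round_to(inches * _CM_PER_INCH, digits)


# Optional test suite.
if __name__ == "__main__":
    assert math.isclose(inches_to_cm(1.0), 2.54)
```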
Imports
- Imports should usually be on separate lines, e.g.:
      No:  import sys, os
      Yes: import sys
           import os
  It's okay to say this, though:
      from types import StringType, ListType
- Imports are always put at the top of the file, just after any module comments and docstrings, and before module globals and constants. Imports should be grouped, with the order being
  - standard library imports
  - imports for the base classes, if any
  - all other imports
  You should put a blank line between each group of imports.
- Relative imports for intra-package imports are highly discouraged.
- Always use the absolute package path for all imports.
- When importing a class from a class-containing module, it's usually okay to spell this
      from MyClass import MyClass
      from foo.bar.YourClass import YourClass
  If this spelling causes local name clashes, then spell them
      import MyClass
      import foo.bar.YourClass
  and use "MyClass.MyClass" and "foo.bar.YourClass.YourClass".
Whitespace in Expressions and Statements
The original style guide contains a long list of places where Guido hates whitespace. We find these recommendations too personal and do not include them here. Our own recommendation is to use whitespace where it helps readability.
We agree wholeheartedly with the following:
Compound statements (multiple statements on the same line) are generally discouraged.
    No:  if foo == 'blah': do_blah_thing()
    Yes: if foo == 'blah':
             do_blah_thing()
    No:  do_one(); do_two(); do_three()
    Yes: do_one()
         do_two()
         do_three()
Comments
This section is copied completely from Guido's style guide.
Comments that contradict the code are worse than no comments. Always make a priority of keeping the comments up-to-date when the code changes!
Comments should be complete sentences. If a comment is a phrase or sentence, its first word should be capitalized, unless it is an identifier that begins with a lower case letter (never alter the case of identifiers!).
If a comment is short, the period at the end is best omitted. Block comments generally consist of one or more paragraphs built out of complete sentences, and each sentence should end in a period.
You should use two spaces after a sentence-ending period, since it makes Emacs wrapping and filling work consistently.
When writing English, Strunk and White apply.
Python coders from non-English speaking countries: please write your comments in English, unless you are 120% sure that the code will never be read by people who don't speak your language.
Block Comments
Block comments generally apply to some (or all) code that follows them, and are indented to the same level as that code. Each line of a block comment starts with a # and a single space (unless it is indented text inside the comment). Paragraphs inside a block comment are separated by a line containing a single #. Block comments are best surrounded by a blank line above and below them (or two lines above and a single line below for a block comment at the start of a new section of function definitions).
Inline Comments
An inline comment is a comment on the same line as a statement. Inline comments should be used sparingly. Inline comments should be separated by at least two spaces from the statement. They should start with a # and a single space.
Inline comments are unnecessary and in fact distracting if they state the obvious. Don't do this:
    x = x+1                 # Increment x
But sometimes, this is useful:
    x = x+1                 # Compensate for border
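Both comment forms can be seen together in a short sketch. The function border_width is invented for the example; only the comment layout matters here.

```python
def border_width(width, border=1):
    """Return the total width including a border on both sides."""

    # Block comments sit at the same indentation as the code they
    # describe.  Each line starts with a # and a single space.
    #
    # A line holding only a # separates two paragraphs.

    total = width + 2 * border  # Border is added on both sides
    return total


print(border_width(10))  # prints 12
```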
Documentation Strings
Conventions for writing good documentation strings (a.k.a. "docstrings") are immortalized in PEP 257 [3].
In BaBar we can use our doc-generation tools to produce source documentation (provided that our doc-generation project ever takes off). To be useful for Doxygen, docstrings should be formatted according to Doxygen rules, just as C++ code comments are. Authors should follow these rules when documenting code with docstrings. The code templates show examples of docstrings formatted for Doxygen.
Version Bookkeeping
If you have to have RCS or CVS crud in your source file, do it as follows.
    __version__ = "$Revision: 1.18 $"
    # $Source: /cvsroot/python/python/nondist/peps/pep-0008.txt,v $
These lines should be included after the module's docstring, before any other code, separated by a blank line above and below.
Naming Conventions
Python naming conventions should follow, where possible, the BaBar conventions established for C++ [4]. In some cases the BaBar conventions are not defined, so we still have some freedom.
General Remarks
- Names with double leading and trailing underscores are "magic" names (e.g. __init__, __name__, or __str__). Users should not use this form to define "user names".
- Names with leading double underscores (but without trailing double underscores) define class-private names if they appear inside a class.
- A single leading underscore is a weak "internal use" indicator; e.g. "from M import *" does not import names starting with an underscore.
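The difference between the double and single leading underscore can be seen in a short sketch. The class Counter is invented for the example; the point is that Python stores a class-private name under a mangled name, while a single underscore is a convention only.

```python
class Counter(object):
    def __init__(self):
        self.__count = 0  # Class-private: stored as _Counter__count
        self._step = 1    # Weak "internal use" marker, not mangled

    def increment(self):
        self.__count += self._step
        return self.__count


c = Counter()
c.increment()
print(sorted(vars(c)))  # prints ['_Counter__count', '_step']
```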
Class Names
Python class names should follow the same conventions as C++ class names: they should be CapitalizedWords and have the same Tla prefix as the package they live in. Typically there should be one class per source file. As an exception, you can put one or more "hidden" classes (whose names start with an underscore) alongside the normal class, provided the names of these hidden classes are never exposed to the outside world.
Method and Attribute Names
Class and object methods and attributes should be mixedCase, with a lowercase initial letter. To make an attribute or method private, prefix it with a double underscore. (There is no reason to prefix member variables with a single underscore in Python, because they are always referenced through 'self'.)
Module methods (free functions) can be mixedCase or lowercase_with_underscores; normally they should be prefixed with the lowercase package TLA. Modules should not normally expose their variables directly, except for variables that are "constants". Constants should be named in UPPERCASE_WITH_UNDERSCORES. Names of module variables or functions that should not be exposed start with an underscore.
Module Names
Module names are the same as the file names (minus the .py extension). Modules that contain class definitions should be named after the class (one module per class). Modules containing only functions should be named in lowercase_with_underscores, prefixed with the lowercase package TLA.
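Put together, the naming rules might look like the following sketch. The package TLA "Xyz" and every name in the example are hypothetical; under these rules the file would be called XyzTrackCounter.py.

```python
# Hypothetical module XyzTrackCounter.py in a package with the TLA "Xyz".

MAX_TRACKS = 100  # Module "constant": UPPERCASE_WITH_UNDERSCORES


def xyz_clamp(value, upper=MAX_TRACKS):
    """Free function: lowercase_with_underscores with the lowercase TLA."""
    return min(value, upper)


class _XyzHelper(object):
    """ "Hidden" class: leading underscore, never exposed to users."""


class XyzTrackCounter(object):
    """Public class: CapitalizedWords with the Tla prefix."""

    def __init__(self):
        self.nTracks = 0              # mixedCase attribute
        self.__helper = _XyzHelper()  # Private attribute

    def addTracks(self, count):
        """mixedCase method name with a lowercase initial letter."""
        self.nTracks = xyz_clamp(self.nTracks + count)
        return self.nTracks


counter = XyzTrackCounter()
print(counter.addTracks(150))  # prints 100
```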
Guido's Programming Recommendations
This section is copied entirely from PEP 8.
- Comparisons to singletons like None should always be done with
'is' or 'is not'. Also, beware of writing "if x" when you
really mean "if x is not None" -- e.g. when testing whether a
variable or argument that defaults to None was set to some other
value. The other value might be a value that's false in a
Boolean context!
- Class-based exceptions are always preferred over string-based
exceptions. Modules or packages should define their own
domain-specific base exception class, which should be subclassed
from the built-in Exception class. Always include a class
docstring. E.g.:
    class MessageError(Exception):
        """Base class for errors in the email package."""
- Use string methods instead of the string module unless
backward-compatibility with versions earlier than Python 2.0 is
important. String methods are always much faster and share the
same API with unicode strings.
- Avoid slicing strings when checking for prefixes or suffixes.
Use startswith() and endswith() instead, since they are faster,
cleaner and less error prone. E.g.:
    No:  if foo[:3] == 'bar':
    Yes: if foo.startswith('bar'):
The exception is if your code must work with Python 1.5.2 (but
let's hope not!).
- Object type comparisons should always use isinstance() instead
of comparing types directly. E.g.
    No:  if type(obj) is type(1):
    Yes: if isinstance(obj, int):
When checking if an object is a string, keep in mind that it
might be a unicode string too! In Python 2.3, str and unicode
have a common base class, basestring, so you can do:
    if isinstance(obj, basestring):
In Python 2.2, the types module has the StringTypes type defined
for that purpose, e.g.:
    from types import StringTypes
    if isinstance(obj, StringTypes):
In Python 2.0 and 2.1, you should do:
    from types import StringType, UnicodeType
    if isinstance(obj, StringType) or \
       isinstance(obj, UnicodeType):
- For sequences, (strings, lists, tuples), use the fact that empty
sequences are false, so "if not seq" or "if seq" is preferable
to "if len(seq)" or "if not len(seq)".
- Don't write string literals that rely on significant trailing
whitespace. Such trailing whitespace is visually
indistinguishable and some editors (or more recently,
reindent.py) will trim them.
- Don't compare boolean values to True or False using == (bool
types are new in Python 2.3):
    No:  if greeting == True:
    Yes: if greeting:
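Several of the recommendations above can be combined in one small sketch: a package-specific exception hierarchy, the "is None" test, isinstance(), the truth value of an empty sequence, and startswith(). The package name Xyz, the exception classes and the function describe are all invented for the example, which is written for a current Python interpreter.

```python
class XyzError(Exception):
    """Base class for errors in a hypothetical Xyz package."""


class XyzConfigError(XyzError):
    """Raised when an argument is missing or has the wrong type."""


def describe(items=None, label=None):
    """Return a short description of items."""
    # "is None" tells "not given" apart from an empty (false) list.
    if items is None:
        raise XyzConfigError("no items given")
    # isinstance() instead of comparing types directly.
    if not isinstance(items, (list, tuple)):
        raise XyzConfigError("items must be a list or a tuple")
    # Empty sequences are false, so no len() call is needed.
    if not items:
        return "empty"
    # startswith() instead of slicing the string.
    if label is not None and label.startswith("tmp"):
        return "temporary"
    return "%d items" % len(items)


print(describe([]))                  # prints empty
print(describe([1, 2]))              # prints 2 items
print(describe([1], label="tmp_a"))  # prints temporary
```

Callers can catch XyzError to handle every error of the package at once, or XyzConfigError for this specific case.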
BaBar Programming Recommendations
Code Templates
For new code, always use the template files from the CodeTemplates package. There are two templates: one for "library" modules and another for executable scripts.
Interpreter Version Checking
If you write code that uses features of recent Python releases, you need to insert version-checking code into your module:
    from BaBar.BbrPython import BbrPyVersion
    BbrPyVersion.require("2.3")
The template modules have these lines commented out. Uncomment them if you need them, but do not delete them if you do not need them now.
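For readers without access to the BaBar package, the idea behind such a check can be sketched with the standard sys module alone. This is an assumption about what a version check typically does, not the actual implementation of BbrPyVersion; the function name require_version is made up.

```python
import sys


def require_version(minimum):
    """Raise RuntimeError if the interpreter is older than minimum.

    minimum is a dotted version string such as "2.3".
    """
    wanted = tuple([int(part) for part in minimum.split(".")])
    # Compare only as many components as the caller specified.
    found = tuple(sys.version_info[:len(wanted)])
    if found < wanted:
        raise RuntimeError("Python %s or later is required" % minimum)


require_version("2.3")
```

Placing the call near the top of the module makes an unsuitable interpreter fail early with a clear message instead of a syntax or attribute error later on.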
Learn Python Object Model
The Python object model is simple yet powerful. To people who do not understand it, the following piece of code may look amusing:
    >>> a = [0]
    >>> b = [a,a,a]
    >>> print(b)
    [[0], [0], [0]]
    >>> b[0][0] = 1
    >>> print(b)
    [[1], [1], [1]]
But to people who understand the object model, it makes a lot of sense :)
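The behaviour follows from the fact that a list holds references to objects, not copies of them. The short sketch below, written for a current Python interpreter, shows that the three elements are one and the same object, and one possible way to get independent copies when that is what you want.

```python
a = [0]
b = [a, a, a]        # Three references to the same list object
print(b[0] is b[1])  # prints True

# Build three independent copies of a instead.
c = [list(a) for _ in range(3)]
c[0][0] = 1
print(c)             # prints [[1], [0], [0]]
print(a)             # prints [0]
```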
References
[1] PEP 8, Style Guide for Python Code. Guido van Rossum, Barry Warsaw
[2] BDFL - Benevolent Dictator for Life
[3] PEP 257, Docstring Conventions. David Goodger, Guido van Rossum
[4] Offline Coding Recommendations and Standards
Last significant update: 2003-06-28
Expiry date: 2005-01-01