
Docstrings in Python

Get introduced to Docstrings in Python. Learn about the two forms of Docstrings (one-line and multi-line), the popular Docstring formats and their uses, and Python's built-in Docstrings.

The rules mentioned below are conventions based on the PEP standard (PEP 257). They are not enforced by Python's syntax.

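This tutorial will contain:

  1. Docstring
  2. One-line Docstrings
  3. Multi-line Docstrings
  4. Sphinx Style
  5. Google Style
  6. NumPy Style
  7. Python Built-in Docstrings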

Docstring

A Python Docstring (documentation string) is a string literal that appears as the first statement in a module, class, function, or method definition. Docstrings are accessible through the __doc__ attribute of any Python object, and the built-in help() function displays them as well.

Docstrings are well suited to describing the functionality of larger parts of the code, i.e., the general purpose of a class, module, or function, whereas comments are used for individual lines, statements, and expressions, which tend to be small. Comments are descriptive text written by programmers mainly for themselves, to record what a line of code or an expression does. Documenting your code is an essential part of writing clean, well-written programs. As mentioned already, though, Python enforces no rules for doing so; there are only conventions.

There are two forms of Docstring: one-line Docstrings and multi-line Docstrings. Data scientists and programmers use both in their projects.
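
The difference between a Docstring and a comment can be seen directly on the function object. The following is a minimal sketch; the function names area and area_without_docstring are made up for illustration. Python keeps the Docstring on the object, while a comment is discarded and leaves __doc__ empty.

```python
def area(width, height):
    """Return the area of a rectangle."""
    # Multiply the two sides; this comment is not stored anywhere.
    return width * height


def area_without_docstring(width, height):
    # A comment in the docstring position is still only a comment.
    return width * height


print(area.__doc__)                    # Return the area of a rectangle.
print(area_without_docstring.__doc__)  # None
```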

One-line Docstrings

One-line Docstrings fit entirely on one line. You can use either triple single quotes or triple double quotes, as long as the opening and closing quotes match. In a one-line Docstring, the closing quotes are on the same line as the opening quotes. The standard convention is to use triple double quotes.

def square(a):
    """Return the square of argument a."""
    return a ** 2

print(square.__doc__)
help(square)

Here in the above code, you get the printed result:

Return the square of argument a.
Help on function square in module __main__:

square(a)
    Return the square of argument a.

In the above Docstring, you can observe that:

  1. The line begins with a capital letter, i.e., R in our case, and ends with a period (".").
  2. The closing quotes are on the same line as the opening quotes. This looks better for one-liners.
  3. There is no blank line either before or after the Docstring, which is good practice.
  4. By convention, the line should read as a command ("Return ...") rather than as a description ("Returns ..."), and it ends with a period.
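
Some of these points can be checked mechanically. The sketch below uses a hypothetical helper, follows_one_line_conventions, which is not part of Python or of any PEP tooling. It tests only three things: the Docstring fits on one line, starts with a capital letter, and ends with a period.

```python
def follows_one_line_conventions(func):
    """Return True if func's docstring matches the one-line conventions."""
    doc = func.__doc__
    if not doc:
        return False
    return (
        "\n" not in doc          # fits on a single line
        and doc[0].isupper()     # begins with a capital letter
        and doc.endswith(".")    # ends with a period
    )


def square(a):
    """Return the square of argument a."""
    return a ** 2


def cube(a):
    """
    returns the cube
    """
    return a ** 3


print(follows_one_line_conventions(square))  # True
print(follows_one_line_conventions(cube))    # False
```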

Multi-line Docstrings

A multi-line Docstring starts with the same one-line summary as a one-line Docstring, followed by a single blank line and then the more detailed descriptive text.

The general format for writing a Multi-line Docstring is as follows:

def some_function(argument1):
    """Summary or Description of the Function

    Parameters:
    argument1 (int): Description of arg1

    Returns:
    int: Returning value

    """
    return argument1

print(some_function.__doc__)

The above code outputs:

Summary or Description of the Function

    Parameters:
    argument1 (int): Description of arg1

    Returns:
    int: Returning value

Let's look at an example that shows in detail how multi-line Docstrings can be used:

def string_reverse(str1):
    """Return the reversed string.

    Parameters:
        str1 (str): The string to be reversed.

    Returns:
        str: The reversed string.

    """
    reverse_str1 = ''
    i = len(str1)
    # Walk from the last character to the first, appending each one.
    while i > 0:
        reverse_str1 += str1[i - 1]
        i = i - 1
    return reverse_str1

print(string_reverse('projkal998580'))
085899lakjorp

You can see above that the summary fits on one line and is separated from the rest of the content by a single blank line. Following this convention is useful for automatic indexing tools.

Many Docstring formats are available, but it is always better to use one that is easily recognized both by Docstring parsers and by fellow data scientists and programmers. There are no rules for selecting a Docstring format, but using the same format consistently throughout a project is necessary. It is also preferable to use a format that Sphinx supports. The most common formats are listed below.
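
To illustrate why the one-line summary matters to tools, here is a rough sketch of how an indexing tool might pull it out. The helper summary_line is a made-up name for illustration; inspect.getdoc is the standard-library function that returns a Docstring with its indentation cleaned up.

```python
import inspect


def string_reverse(str1):
    """Return the reversed string.

    Parameters:
        str1 (str): The string to reverse.

    Returns:
        str: The reversed string.
    """
    return str1[::-1]


def summary_line(obj):
    """Return the first line of obj's docstring, or '' if there is none."""
    doc = inspect.getdoc(obj)  # strips the common leading indentation
    return doc.splitlines()[0] if doc else ""


print(summary_line(string_reverse))  # Return the reversed string.
```

Because the summary is always the first line and is followed by a blank line, a tool can take it without parsing the rest of the Docstring.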

Formatting Type         Description
NumPy/SciPy docstrings  Combination of reStructuredText and Google Docstrings; supported by Sphinx
Pydoc                   Standard documentation module for Python; supported by Sphinx
Epydoc                  Renders Epytext as a series of HTML documents; a tool for generating API documentation for Python modules based on their Docstrings
Google Docstrings       Google's style

Several Docstring formats exist, but you do not need to reinvent the wheel or study every one of them. The formats follow nearly the same patterns and differ only in the details. You'll look at examples of the popular Docstring formats and how they are used. You will see the Sphinx style in detail first, after which the other formats are easy to follow.

Sphinx Style

The Sphinx style is the traditional one: it is easy to use but verbose, and Sphinx itself was initially created for the Python documentation. Sphinx uses reStructuredText, which is similar in usage to Markdown.

class Vehicle(object):
    '''
    The Vehicle object contains lots of vehicles

    :param arg: The arg is used for ...
    :type arg: str
    :param `*args`: The variable arguments are used for ...
    :param `**kwargs`: The keyword arguments are used for ...
    :ivar arg: This is where we store arg
    :vartype arg: str
    '''

    def __init__(self, arg, *args, **kwargs):
        self.arg = arg

    def cars(self, distance, destination):
        '''We can't travel a certain distance in vehicles without fuel, so here's the fuel

        :param distance: The amount of distance traveled
        :type distance: int
        :param destination: Should the fuel be refilled to cover the required distance?
        :type destination: bool
        :raises RuntimeError: Out of fuel

        :returns: A car mileage
        :rtype: Cars
        '''
        pass

Just as most programming languages have keywords (reserved words), Sphinx recognizes a fixed set of names in Docstrings; Sphinx calls them fields. In the above code, "param" is a field that documents a parameter, and "type" is a field that gives the data type of that parameter. The "type" field is optional, but "param" is required for every parameter you document. The "returns" field documents the returned object, which makes it different from "param". The "rtype" field gives the type of the object returned from the function. "returns" and "rtype" are independent of each other: either can be used without the other.
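
As a compact illustration of the entries described above, the sketch below documents a hypothetical function, fuel_needed, in Sphinx style. It shows the type given both as a separate "type" entry and inline in the "param" entry, which Sphinx also accepts. The regular expression at the end is only a quick way to list the entry names and is not how Sphinx itself parses Docstrings.

```python
import re


def fuel_needed(distance, mileage):
    """Return the fuel needed to cover a distance.

    :param distance: Distance to travel, in kilometres.
    :type distance: float
    :param float mileage: Kilometres covered per litre (type given inline).
    :returns: Litres of fuel required.
    :rtype: float
    :raises ValueError: If mileage is not positive.
    """
    if mileage <= 0:
        raise ValueError("mileage must be positive")
    return distance / mileage


# List the entry names used in the docstring, in order of appearance.
fields = re.findall(r"^[ \t]*:(\w+)", fuel_needed.__doc__, flags=re.MULTILINE)
print(fields)  # ['param', 'type', 'param', 'returns', 'rtype', 'raises']
```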

Google Style

Google style is easier and more intuitive to use, and it suits shorter documentation. To have Sphinx render it, a Python configuration file must be set up first: add either sphinx.ext.napoleon or sphinxcontrib.napoleon to the extensions list in conf.py.

class Vehicles(object):
    '''
    The Vehicles object contains a lot of vehicles

    Args:
        arg (str): The arg is used for...
        *args: The variable arguments are used for...
        **kwargs: The keyword arguments are used for...

    Attributes:
        arg (str): This is where we store arg
    '''
    def __init__(self, arg, *args, **kwargs):
        self.arg = arg

    def cars(self, distance, destination):
        '''We can't travel a distance in vehicles without fuel, so here is the fuel

        Args:
            distance (int): The amount of distance traveled
            destination (bool): Should the fuel be refilled to cover the distance?

        Raises:
            RuntimeError: Out of fuel

        Returns:
            cars: A car mileage
        '''
        pass
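
For Sphinx to render Google-style Docstrings like the one above, the napoleon extension has to be enabled. A minimal conf.py fragment might look like the following sketch. Including sphinx.ext.autodoc is an assumption here (it is the extension that pulls Docstrings out of the code), and the two napoleon settings are optional because both default to True.

```python
# conf.py (Sphinx configuration file)

extensions = [
    "sphinx.ext.autodoc",   # assumed: extracts docstrings from the code
    "sphinx.ext.napoleon",  # parses Google-style and NumPy-style sections
]

# Optional napoleon switches; both default to True.
napoleon_google_docstring = True
napoleon_numpy_docstring = True
```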

Google style reads more easily than Sphinx style, but it has an inconvenient aspect: in the above code, a multi-line description of distance would look messy. That is why NumPy style can be used for more extended documentation.

NumPy Style

NumPy style puts a lot of detail into the documentation. It is more verbose than the other formats, but it is an excellent choice if you want detailed documentation, i.e., extensive documentation of all the functions and parameters.

class Vehicles(object):
    '''
    The Vehicles object contains lots of vehicles

    Parameters
    ----------
    arg : str
        The arg is used for ...
    *args
        The variable arguments are used for ...
    **kwargs
        The keyword arguments are used for ...

    Attributes
    ----------
    arg : str
        This is where we store arg
    '''
    def __init__(self, arg, *args, **kwargs):
        self.arg = arg

    def cars(self, distance, destination):
        '''We can't travel a distance in vehicles without fuel, so here is the fuel

        Parameters
        ----------
        distance : int
            The amount of distance traveled
        destination : bool
            Should the fuel be refilled to cover the distance?

        Raises
        ------
        RuntimeError
            Out of fuel

        Returns
        -------
        cars
            A car mileage
        '''
        pass

The above example is more verbose than the other formats. It is lengthier and best suited to long, detailed documentation.
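
The payoff of the extra length shows when a description runs over several lines. The sketch below uses a hypothetical function, fuel_needed, to show how a long parameter description continues on further indented lines without disturbing the layout.

```python
def fuel_needed(distance, mileage):
    """Return the fuel needed to cover a distance.

    Parameters
    ----------
    distance : float
        Distance to travel, in kilometres. A longer description can
        simply continue on further indented lines like this one
        without disturbing the layout.
    mileage : float
        Kilometres covered per litre of fuel.

    Returns
    -------
    float
        Litres of fuel required.

    Raises
    ------
    ValueError
        If `mileage` is not positive.
    """
    if mileage <= 0:
        raise ValueError("mileage must be positive")
    return distance / mileage


print(fuel_needed(100, 20))  # 5.0
```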

Python Built-in Docstrings

You can also view Python's built-in Docstrings.

All the built-in functions, classes, and methods have a human-readable description attached to them. You can access it in one of two ways:

  1. The __doc__ attribute
  2. The help() function

For example:

import time
print(time.__doc__)

The above code will give:

This module provides various functions to manipulate time values.

There are two standard representations of time.  One is the number
of seconds since the Epoch, in UTC (a.k.a. GMT).  It may be an integer
or a floating point number (to represent fractions of seconds).
The Epoch is system-defined; on Unix, it is generally January 1st, 1970.
The actual value can be retrieved by calling gmtime(0).
...

Similarly, help() can be used as follows:

help(print)

The above code prints:

Help on built-in function print in module builtins:

print(...)
    print(value, ..., sep=' ', end='\n', file=sys.stdout, flush=False)

    Prints the values to a stream, or to sys.stdout by default.
    Optional keyword arguments:
    file:  a file-like object (stream); defaults to the current sys.stdout.
    sep:   string inserted between values, default a space.
    end:   string appended after the last value, default a newline.
    flush: whether to forcibly flush the stream.
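
Note that help() prints its text and returns nothing. If you need the text as a string, for example to search it, the standard library offers inspect.getdoc and pydoc.render_doc. The following is a small sketch; the exact wording of the output depends on your Python version, so no expected output is shown.

```python
import inspect
import pydoc

# inspect.getdoc() returns the docstring itself, with indentation cleaned up.
doc = inspect.getdoc(print)
print(doc.splitlines()[0])

# pydoc.render_doc() returns the text that help() would display;
# the plaintext renderer avoids terminal bold formatting.
help_text = pydoc.render_doc(print, renderer=pydoc.plaintext)
print(help_text.splitlines()[0])
```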

Congrats!

You have made it to the end of this tutorial! Along the way, you have learned about Docstrings, a fundamental documentation tool for programmers and data scientists. You can learn more on Python's website: Python Docstrings, PEP 257.

If you would like to learn more about Python, take DataCamp's Intermediate Python for Data Science course.
