A Guide to Python's __init__.py

An explanation of the purpose of the __init__.py file in Python. Learn how it's used to mark directories as Python packages and how you can use it to control your package's namespace.

When you browse the source code of a Python project, you will often see a file named __init__.py inside its directories. This file might be empty, or it might contain a few lines of code. But what does it actually do? And is it still necessary?

The __init__.py file has two main purposes:

  1. It marks a directory as a Python package.
  2. It can be used to run initialization code or to control the package's namespace.

Marking a Directory as a Package

This is the historical and most important role of __init__.py. Before Python 3.3, the interpreter would treat a directory as a package from which you can import modules only if it contained an __init__.py file. Today the file is still what marks a directory as a regular package.

Consider this directory structure:

my_project/
    main.py
    my_package/
        __init__.py
        module1.py

Because my_package contains an __init__.py file, Python knows that it's a package. This allows you to write the following in main.py:

# main.py
from my_package import module1

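If you were to delete the __init__.py file, this import statement would fail with an ImportError on Python versions before 3.3. On Python 3.3 and later the import still succeeds, because the directory is then treated as a namespace package (see the note below).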

Note on Modern Python (3.3+):

Python 3.3 introduced a new type of package called a namespace package (PEP 420). Namespace packages can span multiple directories and do not require an __init__.py file. However, for a simple, single-directory package, including an __init__.py file is still the standard and most explicit way to define it. An empty __init__.py file is all that's needed to mark a directory as a regular package.
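
The difference between the two kinds of package can be observed from a running interpreter. The following is a minimal sketch, assuming Python 3.3 or later; the names regular_pkg and implicit_pkg are hypothetical, and the files are written to a temporary directory purely for illustration.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Build a throwaway project: one directory with __init__.py, one without.
# The package names are made up for this example.
root = Path(tempfile.mkdtemp())
(root / "regular_pkg").mkdir()
(root / "regular_pkg" / "__init__.py").write_text("")
(root / "regular_pkg" / "module1.py").write_text("VALUE = 1\n")
(root / "implicit_pkg").mkdir()
(root / "implicit_pkg" / "module1.py").write_text("VALUE = 2\n")

# Make the temporary directory importable.
sys.path.insert(0, str(root))
importlib.invalidate_caches()

regular = importlib.import_module("regular_pkg")
implicit = importlib.import_module("implicit_pkg")

# A regular package is loaded from its __init__.py file.
regular_file = regular.__file__

# A namespace package has no __init__.py to load, so __file__ is
# None or missing, depending on the Python version.
implicit_file = getattr(implicit, "__file__", None)

# Submodules can be imported from both kinds of package.
from regular_pkg import module1 as regular_module1
from implicit_pkg import module1 as implicit_module1

print(Path(regular_file).name)  # __init__.py
print(implicit_file)            # None
```

Both imports succeed, but only the regular package is backed by a concrete file. That is one reason an explicit __init__.py is easier to reason about for a single-directory package.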

Running Initialization Code

When a package is imported, the code inside its __init__.py file is executed. This allows you to perform any necessary initialization for the package.

For example:

# my_package/__init__.py

print("Initializing my_package...")

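Now, when you run import my_package, the message "Initializing my_package..." will be printed to the console. The code runs only on the first import in a given interpreter session; later imports reuse the module already cached in sys.modules.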
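
To see when the initialization code actually runs, the sketch below captures the printed output across two imports and one explicit reload. The package name init_demo_pkg is hypothetical and the package is created in a temporary directory, so nothing here depends on an existing project.

```python
import contextlib
import importlib
import io
import sys
import tempfile
from pathlib import Path

# Create a throwaway package whose __init__.py prints a message.
root = Path(tempfile.mkdtemp())
pkg_dir = root / "init_demo_pkg"
pkg_dir.mkdir()
(pkg_dir / "__init__.py").write_text('print("Initializing init_demo_pkg...")\n')

sys.path.insert(0, str(root))
importlib.invalidate_caches()

buffer = io.StringIO()
with contextlib.redirect_stdout(buffer):
    import init_demo_pkg             # first import: __init__.py runs
    import init_demo_pkg             # second import: reuses sys.modules
    importlib.reload(init_demo_pkg)  # explicit reload: __init__.py runs again

lines = buffer.getvalue().splitlines()
print(len(lines))  # 2
```

The message appears twice rather than three times: once for the first import and once for the reload. Because of this caching, initialization code in __init__.py is best kept cheap and free of surprising side effects.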

Controlling the Package's Namespace

A more advanced and very common use of __init__.py is to control what gets exposed when a user imports your package. It allows you to create a cleaner and more convenient API for your package.

Let's say module1.py contains a useful function:

# my_package/module1.py

def useful_function():
    return "Hello!"

Without modifying __init__.py, a user would have to import this function like this:

from my_package.module1 import useful_function

useful_function()

This can be a bit verbose. We can make useful_function feel like it belongs directly to my_package by importing it inside __init__.py.

# my_package/__init__.py

from .module1 import useful_function

Now, a user of your package can import the function directly from the package:

from my_package import useful_function

useful_function()

This is a much cleaner API. You are effectively promoting the function from a submodule up to the top level of your package's namespace.
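
The promoted name is not a copy: both import paths refer to the same function object. Here is a small self-contained sketch that checks this, using a hypothetical package name (api_demo_pkg) built in a temporary directory with the same two files as above.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Recreate the layout from the text under a made-up package name.
root = Path(tempfile.mkdtemp())
pkg_dir = root / "api_demo_pkg"
pkg_dir.mkdir()
(pkg_dir / "module1.py").write_text(
    'def useful_function():\n    return "Hello!"\n'
)
(pkg_dir / "__init__.py").write_text("from .module1 import useful_function\n")

sys.path.insert(0, str(root))
importlib.invalidate_caches()

import api_demo_pkg
from api_demo_pkg import useful_function as short_path
from api_demo_pkg.module1 import useful_function as long_path

# Both names are bound to the very same function object.
same_object = short_path is long_path
greeting = api_demo_pkg.useful_function()
print(same_object, greeting)  # True Hello!
```

The longer import path keeps working, so adding the re-export to __init__.py does not break code that already imports from the submodule.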

Defining __all__

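You can also use __init__.py to control what is imported when a user performs a wildcard import (from my_package import *). You do this by defining a list of names called __all__. Without __all__, a wildcard import brings in every name in the package's namespace that does not begin with an underscore.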

# my_package/__init__.py

from .module1 import useful_function
from .module2 import another_function

__all__ = ['useful_function']

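Now, if a user writes from my_package import *, only useful_function will be imported, not another_function (assumed here to be defined in a second module, module2.py). An explicit from my_package import another_function still works, because __all__ only affects wildcard imports.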
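
The effect of __all__ can be checked by running a wildcard import into an empty namespace and listing what arrives. This is a sketch under stated assumptions: the package name all_demo_pkg and the body of another_function are hypothetical, and exec is used only so the wildcard import lands in a dictionary that is easy to inspect.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Build a throwaway package with two modules and an __all__ list.
root = Path(tempfile.mkdtemp())
pkg_dir = root / "all_demo_pkg"
pkg_dir.mkdir()
(pkg_dir / "module1.py").write_text(
    'def useful_function():\n    return "Hello!"\n'
)
(pkg_dir / "module2.py").write_text(
    'def another_function():\n    return "Hidden from wildcard imports"\n'
)
(pkg_dir / "__init__.py").write_text(
    "from .module1 import useful_function\n"
    "from .module2 import another_function\n"
    "\n"
    "__all__ = ['useful_function']\n"
)

sys.path.insert(0, str(root))
importlib.invalidate_caches()

# Run the wildcard import inside a fresh namespace dictionary.
namespace = {}
exec("from all_demo_pkg import *", namespace)

# Ignore entries such as __builtins__ that exec adds on its own.
imported = sorted(name for name in namespace if not name.startswith("__"))
print(imported)  # ['useful_function']

# Names left out of __all__ can still be imported explicitly.
from all_demo_pkg import another_function
```

Only the name listed in __all__ reaches the wildcard namespace, while the explicit import at the end still succeeds.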

Conclusion

The __init__.py file is a key part of the Python packaging system. While it can be an empty file, it plays the crucial role of explicitly identifying a directory as a regular Python package. Furthermore, you can use it to run initialization code and to design a clean, user-friendly API for your package by carefully controlling its namespace.