02 - Everyday Project Packaging With pyproject.toml

Snakesay project

Project structure

  • ./projects
  • /snakesay
    • __main__.py
    • snake.py

The __main__.py file

When you pass a folder to the Python interpreter, Python searches that folder for a file named __main__.py and, if it exists, executes it.
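The same lookup can be reproduced from inside Python with the standard-library runpy module, whose run_path function accepts a directory and runs the __main__.py it contains. This is a minimal sketch: the temporary folder and the GREETING variable are made up for illustration and are not part of the snakesay project.

```python
import runpy
import tempfile
from pathlib import Path


def run_folder(folder):
    """Execute <folder>/__main__.py, much like `python <folder>` does."""
    # run_path returns the globals of the module it executed.
    return runpy.run_path(str(folder))


with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical stand-in for the snakesay folder.
    folder = Path(tmp) / "snakesay"
    folder.mkdir()
    (folder / "__main__.py").write_text("GREETING = 'hiss'\n")

    namespace = run_folder(folder)
    print(namespace["GREETING"])
```

If the folder has no __main__.py, run_path raises an error, just as the interpreter refuses to run such a folder.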

Calling with the -m flag

# cwd is projects

# Runs fine
python snakesay

# Prints an error
python -m snakesay
# ModuleNotFoundError: No module named 'snake' (raised by the line "import snake")

With the -m flag, Python adds the current working directory (here projects), not the snakesay folder, to the front of sys.path and runs snakesay as a package. The folder snakesay itself is therefore no longer on the search path, so import snake fails. The module has to be imported by its absolute name instead: import snakesay.snake.
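The difference between the two invocations comes down to the first entry of sys.path. The sketch below builds a throwaway projects/snakesay folder (the file contents are hypothetical), runs it both ways in a subprocess and reports what each run saw in sys.path[0]. It assumes the default interpreter behaviour, that is, no safe-path option in effect.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

# __main__.py body: report the first sys.path entry as an absolute path.
MAIN = "import sys\nfrom pathlib import Path\nprint(Path(sys.path[0]).resolve())\n"


def first_path_entries():
    """Return sys.path[0] for `python snakesay` and `python -m snakesay`."""
    with tempfile.TemporaryDirectory() as tmp:
        projects = Path(tmp).resolve()
        package = projects / "snakesay"
        package.mkdir()
        (package / "__main__.py").write_text(MAIN)

        def run(*args):
            done = subprocess.run(
                [sys.executable, *args],
                cwd=str(projects), capture_output=True, text=True, check=True,
            )
            return Path(done.stdout.strip())

        return projects, run("snakesay"), run("-m", "snakesay")


projects, as_folder, as_module = first_path_entries()
print(as_folder == projects / "snakesay")  # folder run: the folder itself
print(as_module == projects)               # -m run: the working directory
```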

Absolute imports

When you run python snakesay, Python puts the snakesay folder itself at the front of sys.path. A plain import snake then works only because snake.py happens to sit next to the script, which makes it a relative import in all but name. That is bad practice. Running python -m is a good way to emulate how your package should behave when distributed.

The drawback is that absolute imports now oblige us to call our script with the -m flag. That doesn't need to be the case: if we install our package and keep using absolute imports, the issue goes away.
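The trade-off can be checked with a small experiment. The sketch below recreates the project in a temporary folder with an absolute import in __main__.py and runs it both ways. The say function and its output are invented for the example, and the experiment assumes no snakesay package is already installed in the environment.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_with_absolute_import():
    """Run a package that uses `import snakesay.snake` both ways."""
    with tempfile.TemporaryDirectory() as projects:
        package = Path(projects) / "snakesay"
        package.mkdir()
        # Hypothetical module contents, for illustration only.
        (package / "snake.py").write_text(
            "def say(text):\n    return 'sss ' + text\n"
        )
        (package / "__main__.py").write_text(
            "import snakesay.snake\nprint(snakesay.snake.say('hi'))\n"
        )

        def run(*args):
            return subprocess.run(
                [sys.executable, *args],
                cwd=projects, capture_output=True, text=True,
            )

        return run("-m", "snakesay"), run("snakesay")


as_module, as_folder = run_with_absolute_import()
print(as_module.returncode, as_module.stdout.strip())
print(as_folder.returncode, as_folder.stderr.strip().splitlines()[-1])
```

The -m run succeeds, while the folder run fails because the name snakesay cannot be found on the search path.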

python -m

Once the package is installed, python -m snakesay can be started from anywhere, not only from the parent folder of the package.

sys.path.append

Avoid appending entries to sys.path as much as you can. It doesn't scale well.

Packaging History

  • Python appeared at almost the same time as the internet (the nineties), and distribution was not on the radar. At that time, people simply exchanged Python files via email, and all the code was contained in a single file (like shell scripts).
  • Around Python 2, the distutils module appeared with the goal of facilitating the distribution of Python code.

Danger

distutils was deprecated in Python 3.10 (PEP 632) and removed in Python 3.12.

  • setuptools took over from distutils. It is not part of the Python standard library, but it was much more powerful than distutils, and it more or less imposed its style and defined new standards.

  • But then the Python community started to standardize some procedures in the language via PEPs. Packaging is one of these standardized procedures.

Fundamental PEPs for the packaging standard

  • PEP 427: describes how wheels should be packaged.
  • PEP 440: describes how version numbers should be parsed.
  • PEP 508: describes how dependencies should be specified.
  • PEP 517: describes how build frontends and build backends should talk to each other.
  • PEP 518: describes how a build system should be specified.
  • PEP 621: describes how project metadata should be written.
  • PEP 660: describes how editable installs should be performed.

The new packaging standard

We don't need setup.py or setup.cfg anymore. In fact, we need just one file to configure our project for packaging: pyproject.toml.

Hint

We can add configuration of other common python project tools such as black and mypy in pyproject.toml

Build systems (backends)

  • setuptools
  • poetry
  • flit

Frontend build

Build frontends talk to the build backend to generate the source distribution (sdist) from the project source and a binary distribution (wheel) from the sdist. build is an example of a build frontend: install it with pip install build, then run python -m build or pyproject-build.

The build backends offer different features. setuptools is very general and you can do almost anything with it, but it may be more complicated than necessary for a pure Python project (that is, one without C extensions). In that case, flit is worth a look.

Console scripts

You can define an entry point for your package in pyproject.toml.

[project.scripts]
# snakey = "<package>.<module>:<function>"
snakey = "snakesay.__main__:main"
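The "<package>.<module>:<function>" string is an import path followed by an attribute name. The sketch below is a simplified illustration of how such a string can be resolved into a callable; it is not the code that installers actually generate for the snakey command. Since snakesay is only importable after installation, the example resolves a standard-library function instead.

```python
import importlib


def resolve_entry_point(spec):
    """Turn '<package>.<module>:<function>' into the callable it names."""
    module_name, _, attribute = spec.partition(":")
    module = importlib.import_module(module_name)
    target = module
    # The part after the colon may be dotted, e.g. 'Class.method'.
    for part in attribute.split("."):
        target = getattr(target, part)
    return target


# 'json:dumps' stands in for 'snakesay.__main__:main', which is only
# importable once the package is installed.
dumps = resolve_entry_point("json:dumps")
print(dumps({"snake": "hiss"}))
```

For the snakey script to work, snakesay/__main__.py therefore has to define a function named main that can be called without arguments.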

Source distribution (sdist) and Binary distributions (wheel)

For pure Python packages, an sdist is sufficient to distribute your project. To cover all cases, you also need a wheel.

A wheel is a binary distribution of your package. A naming convention exists such that it reflects the system in which it was built. This is important if your project has pieces of code that are written in C and need to be compiled. Since compilation is system-dependent, we have to compile once for every system architecture we want to support.

Info

The build backend is supposed to identify automatically whether you need system-dependent builds. If your project is pure Python, only one wheel needs to be generated, and it will be named something like <name>-<version>-py3-none-any.whl
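The naming convention can be read mechanically. The sketch below splits a wheel filename into the fields laid out in PEP 427 (distribution, version, optional build tag, Python tag, ABI tag, platform tag). Both filenames are hypothetical; in particular the version number and the compiled platform tag are made up.

```python
from typing import NamedTuple, Optional


class WheelName(NamedTuple):
    distribution: str
    version: str
    build: Optional[str]
    python_tag: str
    abi_tag: str
    platform_tag: str


def parse_wheel_filename(filename):
    """Split a wheel filename into the fields defined by PEP 427."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel: {filename!r}")
    parts = filename[: -len(".whl")].split("-")
    if len(parts) not in (5, 6):
        raise ValueError(f"unexpected wheel filename: {filename!r}")
    # The build tag is optional and sits between version and Python tag.
    build = parts[2] if len(parts) == 6 else None
    return WheelName(parts[0], parts[1], build, *parts[-3:])


# Hypothetical filenames; the version numbers are made up.
pure = parse_wheel_filename("snakesay-0.1.0-py3-none-any.whl")
compiled = parse_wheel_filename(
    "snakesay-0.1.0-cp312-cp312-manylinux_2_17_x86_64.whl"
)
print(pure.python_tag, pure.abi_tag, pure.platform_tag)
print(compiled.python_tag, compiled.abi_tag, compiled.platform_tag)
```

A pure Python wheel carries the generic tags py3, none and any, while a compiled wheel names a specific interpreter, ABI and platform.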

.egg-info

Egg is the predecessor of wheel, but an .egg-info folder is still generated by setuptools, for example when a package is built. It holds the package metadata that modules such as importlib.metadata read.
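Reading that metadata takes only a few lines. The sketch below wraps importlib.metadata.version in a helper that returns None when a distribution is not installed; the helper name installed_version is an invention for this example.

```python
from importlib import metadata


def installed_version(distribution):
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


# 'snakesay' only resolves after the project has been installed,
# for example with `pip install .`; otherwise this prints None.
print(installed_version("snakesay"))
```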

Exercise

  • When should someone use TOML, JSON or YAML?
  • Read more about the Python import system here
  • Read more about Python Wheels here