How should I structure my repository to share my code easily? Which one of the bazillion tools out there should I use?
Python has many ways to set up your project, manage your dependencies, and publish your code to PyPI. The Python community is divided on the best way to package and distribute code, leading to a lot of debate and far too many tools. If you've ever explored these options, you've likely encountered setup.py, requirements.txt, setup.cfg, MANIFEST.in, and Pipfile, among others😵.
In this post, I share what I believe is the simplest and most effective way to publish to pypi.org so you can easily share your project with the world🌎. You can always make things more elaborate as your project's needs grow.
Project Structure
First off we need to understand a few key things before we get started. Just like your significant other💘, Python has some expectations if you want your project to be recognized as a Python package and published to pypi.org😄. If you'd like to follow along, create a GitHub repository from this template and clone it to your local machine.
Let's review the project structure from top to bottom. We won't get into much detail for the first two folders for the sake of time: .vscode holds VS Code settings files, and docs is for all your documentation files. There are also many tool configuration files, like .editorconfig, .pre-commit-config.yaml, .python-version, cspell.json, and .markdown-lint.jsonc, which I won't cover in this post. However, I've provided brief descriptions for some of these in the repository's README.md file.
Source Code
Let's talk about organizing your source code folder. It's good practice to keep your code in a folder named src or source so it's clear where everything is located. Python build tools are familiar with the src folder name and know to look there for your code. Adopting this practice is beneficial for any codebase, especially as it grows, because it improves readability and understanding.
Name your Package
The folder inside your src folder should ideally match the package name you plan to publish to PyPI. This package name also needs to match the one you specify in the pyproject.toml file, which we'll discuss shortly.
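If it helps to see the layout spelled out, here is a rough sketch that scaffolds an empty project in a temporary folder. The package name mypackage and the module main.py are just the placeholder names used later in this post, and the scaffold helper is something I made up for illustration, not part of any tool.

```python
import tempfile
from pathlib import Path
from typing import List


def scaffold(root: Path, package: str = "mypackage") -> List[str]:
    """Create an empty src-layout project under root and return its files."""
    files = [
        root / "pyproject.toml",                 # metadata and tool configuration
        root / "README.md",                      # project description
        root / "src" / package / "__init__.py",  # marks the folder as a package
        root / "src" / package / "main.py",      # hypothetical module name
    ]
    for path in files:
        path.parent.mkdir(parents=True, exist_ok=True)
        path.touch()
    # Return paths relative to the project root, with forward slashes.
    return sorted(p.relative_to(root).as_posix() for p in files)


with tempfile.TemporaryDirectory() as tmp:
    layout = scaffold(Path(tmp))
    print("\n".join(layout))
```

The important part is that the folder under src carries the same name as the package you intend to publish.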
Add the __init__.py File
The __init__.py file tells Python to treat a directory as a regular package. You can think of it like a class's def __init__(self): constructor method, as it executes when your package is first imported. It's good software practice to avoid executing code on import, because importing is generally expected to be a computationally cheap operation with minimal side effects. So, be careful not to add too much code to the __init__.py file. I personally leave it empty, as I prefer to have my users explicitly import only what they need.
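To make the "code runs on import" point concrete, here is a small sketch that builds a throwaway package on disk and imports it. The names demo_pkg, tools.py, and greet are hypothetical and exist only for this demo.

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    pkg_dir = Path(tmp) / "demo_pkg"  # hypothetical package name
    pkg_dir.mkdir()
    # Anything at module level in __init__.py runs on the first import.
    (pkg_dir / "__init__.py").write_text(
        "LOADED = True\nprint('demo_pkg/__init__.py ran')\n"
    )
    (pkg_dir / "tools.py").write_text("def greet():\n    return 'hello'\n")

    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    try:
        demo_pkg = importlib.import_module("demo_pkg")  # runs __init__.py
        importlib.import_module("demo_pkg")             # cached, does not run it again
        from demo_pkg.tools import greet                # explicit import of what we need
        loaded, greeting = demo_pkg.LOADED, greet()
    finally:
        sys.path.remove(tmp)

print(loaded, greeting)
```

The message from __init__.py appears only once, because Python caches imported modules. Anything slow or surprising you put there is paid for by every user on their first import.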
Metadata
The pyproject.toml file is where you'll find all the knobs and dials to configure every tool to behave exactly as you want. It also contains all the metadata for your project, such as the name, authors, version, and description. Here are the most important parts you need to understand.
Build System Requirements
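This part is required and depends on your build tool of choice. Since we are using poetry to build our package, it is the standard Poetry declaration shown here. More on poetry in a bit.
[build-system]
requires = ["poetry-core"]
build-backend = "poetry.core.masonry.api"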
Package Information
Now, let's update your package information. This is the minimum data required for PyPI to accept your package. For more options and information, I recommend checking out the Poetry documentation. Here's an example of the necessary fields:
[tool.poetry]
name = "mypackage"
version = "0.1.0"
description = "My package description"
license = "MIT"
authors = ["John Doe <john@doe.com>"]
readme = "README.md"
homepage = "https://github.com/tomoum/python"
repository = "https://github.com/tomoum/python"
documentation = "https://github.com/tomoum/python#readme"
[tool.poetry.dependencies]
python = ">=3.8.1,<4.0"
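Once the package is installed, this metadata travels with it, so you can read it back at runtime instead of duplicating it in your code. Here is a minimal sketch using the standard library's importlib.metadata (available from Python 3.8). The package_version helper and the "unknown" fallback are my own choices for illustration.

```python
from importlib.metadata import PackageNotFoundError, version


def package_version(dist_name: str = "mypackage") -> str:
    """Return the installed version of dist_name, or 'unknown' if not installed."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        # Happens when the package has not been installed into this environment.
        return "unknown"


print(package_version())
```

If mypackage is installed from the metadata above, this should report the version field from pyproject.toml. Otherwise it falls back to "unknown".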
Install Your Executables to the CLI
[tool.poetry.scripts]
mypackage = 'mypackage.main:main'
This is where things get really cool 🙌. By adding these script config lines, you can define CLI aliases that run any function you specify when typed on the command line. You can even add multiple entries to run separate functions. The format is package_name.module_name:function_name. For example, to run a function called potato in a file called mashed.py in mypackage when you type gravy on the command line, you would do this:
[tool.poetry.scripts]
mypackage = 'mypackage.main:main'
gravy = 'mypackage.mashed:potato'
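To demystify the package_name.module_name:function_name format, here is a simplified sketch of how such a string can be turned into a callable: import the part before the colon, then look up the name after it. The load_entry_point helper is hypothetical and is not the code that installers actually generate, which also handles details like dotted attribute paths and exit codes.

```python
import importlib
from typing import Callable


def load_entry_point(spec: str) -> Callable:
    """Resolve a 'package.module:function' string to the function object."""
    module_name, _, func_name = spec.partition(":")
    if not module_name or not func_name:
        raise ValueError(f"expected 'module:function', got {spec!r}")
    module = importlib.import_module(module_name)
    return getattr(module, func_name)


# 'mypackage.mashed:potato' needs the package installed, so this demo
# resolves a standard-library function the same way.
join = load_entry_point("os.path:join")
print(join("src", "mypackage"))
```

With the package installed, load_entry_point("mypackage.mashed:potato") would hand back the potato function, which is roughly what typing gravy ends up calling.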
Develop
In my previous post, we talked about what a development environment is and what you need to consider when setting one up. We used pipenv to manage the virtual environment and our package dependencies. This time we will use poetry, as it combines managing dependencies, virtual environments, building your wheel, and publishing to PyPI all in one tool. On top of that, it doesn't add a bazillion config files to achieve it. It keeps your project clean and simple by moving all the configs to pyproject.toml.
1. To create the virtual environment, you'll first need to install poetry on your machine.
pip install poetry
2. In one command, tell poetry to create your virtual environment and install all of your dependencies from your pyproject.toml file into it. By default, this will also install any developer packages specified under [tool.poetry.group.dev.dependencies].
poetry install
3. Add any packages your project depends on using
poetry add <package-name>
To add development-only packages like linters or formatters run
poetry add --group dev <package-name>
4. Launch the virtual environment we just created so we can access the CLI tools we specified in [tool.poetry.scripts].
poetry shell
Alternatively, if you want to install your package into the global Python interpreter while still developing it and making changes, you can run this in the root directory of your project. Be careful, though: any changes to metadata will require you to run this again.
pip install -e .
Build
Once you are happy with your code and you've tested it, you can build your package wheels. If you don't know what that is, don't worry, nobody does🤔. Just kidding. As I mentioned before, distributing Python packages is a chaotic thing. The reason is that Python was first released in 1991, before the web took off, so sharing packages online was not really baked into the language.
All you need to know is that wheels are a modern binary package format for distributing Python packages. They are intended to make installing packages faster and more reliable, as they can contain pre-compiled code, which means that users don't need to compile the code themselves. This is especially helpful for packages with C extensions or other non-Python code, as users don't need to have a specific build environment set up to install the package.
- To build your package just run this from the root directory:
poetry build
Notice it will create a dist folder with 2 files inside: the mypackage-0.1.0-py3-none-any.whl wheel and the mypackage-0.1.0.tar.gz source archive, which contains your source code.
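The wheel's filename is not random, by the way. It encodes the package name, the version, and three compatibility tags. Here is a small sketch that splits it apart. The parse_wheel_filename helper is my own and ignores the optional build tag that some wheels carry.

```python
from typing import Dict


def parse_wheel_filename(filename: str) -> Dict[str, str]:
    """Split a wheel filename into its parts (ignores the optional build tag)."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel: {filename!r}")
    parts = filename[: -len(".whl")].split("-")
    if len(parts) not in (5, 6):
        raise ValueError(f"unexpected wheel filename: {filename!r}")
    # The last three parts are always the compatibility tags.
    python_tag, abi_tag, platform_tag = parts[-3:]
    return {
        "name": parts[0],
        "version": parts[1],
        "python": python_tag,      # py3: any Python 3 interpreter
        "abi": abi_tag,            # none: no compiled ABI requirement
        "platform": platform_tag,  # any: not tied to an operating system
    }


info = parse_wheel_filename("mypackage-0.1.0-py3-none-any.whl")
print(info)
```

So py3-none-any means a pure-Python wheel that installs on any platform with Python 3. Packages with compiled extensions get more specific tags.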
Publish
One thing to note here is that you don't necessarily need to publish to PyPI. You can host your packages on your company's private servers or set up your own index using, for example, an AWS server. Something to keep in mind.
One thing I always like to do is test everything on TestPyPI first before releasing it to the official index, PyPI, just to make sure everything is in tip-top shape. The last thing you want is to publish a broken package and break things for everyone who depends on it.
1. First, you need to create an account on PyPI and TestPyPI.
2. Add the TestPyPI URL to poetry. The official one is already there by default.
poetry config repositories.test-pypi https://test.pypi.org/legacy/
3. Get a token from https://test.pypi.org/manage/account/token/
4. Add this token to poetry so that it can access your account for publishing
poetry config pypi-token.test-pypi <your-copy-pasted-token>
5. Now test publishing your code to TestPyPI
poetry publish -r test-pypi
6. Test installing that package
pip install --index-url https://test.pypi.org/simple/ <package-name>
7. Once you've confirmed all is good and you are ready to publish to the official index, get a token for your PyPI account from https://pypi.org/manage/account/token/ and add it to your local machine
poetry config pypi-token.pypi <your-copy-pasted-token>
8. You are now ready to publish your package. 🙌🕺💃🎊🥳🍕
poetry publish
Hey, if you run into any issues or find any errors, make sure to leave a comment below or open an issue on the GitHub repo.