Contributing
We are always looking for contributions! Below you can find some relevant information and standards for databooks.
Setup ⚙️
After cloning the repo, make sure to set up the environment.
Poetry 📜
We use Poetry for both managing environments and packaging. That means you only need to install Poetry; from there, you can use it to create the environment.

```shell
pip install poetry==1.1.12
poetry install  # installs prod and dev dependencies
```
Usage
Remember that to use the environment you can either prefix commands with `poetry run <COMMAND>` or initialize a shell with `poetry shell`. For example, to create the coverage report you could run

```shell
poetry run pytest --cov=databooks tests/
```

or alternatively

```shell
poetry shell
pytest --cov=databooks tests/
```
Development 🛠
We welcome new features, bug fixes, and enhancements (whether to code or docs). There are a few standards we adhere to that are required for new features.
Mypy
We use type hints! Not only that, they are enforced and checked (with Mypy). This is actually the reason for supporting Python 3.8+. There are a couple of reasons for using type hints, mainly:
- Better code coverage (avoiding errors at runtime)
- Improved code understanding
- As databooks uses both Typer and Pydantic, types are not only developer hints: they are also used to cast notebook (JSON) values to the correct types, as well as user inputs in the CLI
If you are not familiar with type hints and Mypy, a good starting point is watching the Type-checked Python in the real world - PyCon 2018 talk.
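For illustration, here is a minimal sketch of how Pydantic uses type annotations to cast raw JSON values to the declared types; the `Cell` model below is hypothetical and is not databooks' actual implementation.

```python
from typing import List, Optional

from pydantic import BaseModel


class Cell(BaseModel):
    """Illustrative only -- not databooks' real notebook cell model."""

    cell_type: str
    execution_count: Optional[int] = None
    source: List[str]


# Pydantic casts the raw (JSON-like) values to the annotated types
cell = Cell(cell_type="code", execution_count="3", source=["print('hi')"])
assert cell.execution_count == 3  # the string "3" was cast to an int
```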
Linting
Regarding code quality, we use a couple of linting tools to maintain the same "style" and uphold the same standards.
Docs 📚
The databooks documentation "lives" both in the code itself and in supporting (markdown) documentation files.
Code
Code docs include type hint annotations as well as function docstrings. For those, we use a reStructuredText-like format. Providing docstrings not only gives a clear way to document the code, it is also picked up by MkDocs.
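As a sketch (the function below is hypothetical and not part of databooks' API), a reStructuredText-style docstring looks like this:

```python
from typing import List


def clear_outputs(cells: List[dict]) -> List[dict]:
    """Remove execution outputs from code cells.

    :param cells: Notebook cells as (JSON-like) dictionaries
    :return: Cells with the outputs of code cells stripped
    """
    return [
        {**cell, "outputs": []} if cell.get("cell_type") == "code" else cell
        for cell in cells
    ]
```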
MkDocs
MkDocs gives a simple way to write markdown files that get rendered as HTML (under a certain theme) and served as documentation. We use MkDocs with different extensions, such as mkdocstrings, which links function docstrings with the existing documentation.
You can check the generated documentation by running, from the project root:

```shell
mkdocs serve
```
mike
We also use the mike MkDocs plugin to publish and keep different versions of the documentation.
Cog
We use cog to dynamically generate parts of the documentation. That way, code changes trigger markdown changes as well.
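As a rough sketch of how that works (the snippet below is illustrative and not taken from databooks' docs), cog looks for marker comments in a markdown file and replaces whatever sits between them with the output of the embedded Python code:

```markdown
<!-- [[[cog
import cog

# Hypothetical example: regenerate this line whenever the package version changes
import databooks
cog.outl(f"Current version: `{databooks.__version__}`")
]]] -->
Current version: `x.y.z`
<!-- [[[end]]] -->
```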
Pre-commit
Pre-commit is the tool that automates everything, eases the workflow, and runs checks in CI/CD. It's highly recommended to install pre-commit and the hooks (with `pre-commit install`) during development.
Tests 🗳
We use unit tests to ensure that our package works as expected. We use pytest for testing and Pytest-cov for checking how much of the code is covered by our tests.
The tests should mimic the package directory structure. The tests are also written to serve as examples of how to use the classes and methods, and of the expected outputs.
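For instance, a test module might look like the sketch below; `add_one` is a stand-in helper defined inline to keep the example self-contained, not a databooks function.

```python
def add_one(number: int) -> int:
    """Stand-in for a databooks function (illustrative only)."""
    return number + 1


def test_add_one() -> None:
    """Tests double as usage examples and document the expected output."""
    assert add_one(1) == 2
```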
The coverage is also added to the documentation. For that we use the MkDocs Coverage Plugin, which needs an htmlcov/ directory generated by Pytest-cov. You can create it by running, from the project root:

```shell
pytest --cov-report html --cov=databooks tests/
```
Publishing
Publishing to PyPI is done automatically via GitHub Actions. After publishing, a new tag and release are created. A new docs version is also published if all previous steps are successful.
Contributors 👨‍💻👩‍💻
databooks was created by Murilo Cunha, and is maintained by dataroots.
Acknowledgements
Special thanks to: