Pre-commit hooks

Another option is to catch code quality issues before any code is sent to the remote git repo. Pre-commit hooks are actions that run right before code is committed to your (local) repo.

Pre-commit illustrated

Pre-commit package

There are different ways to add new hooks to your git repo. pre-commit is a package that makes it easy to configure pre-commit hooks and to store them in a readable format.

Installation

To install, simply run:

pip install pre-commit

Usage

Configuration

To use pre-commit, create a .pre-commit-config.yaml file in the root of your project. There, include:

repos:
-   repo: https://github.com/datarootsio/databooks
    rev: 0.1.3
    hooks:
    -   id: databooks
        args: ["--overwrite"]

The databooks repo ships with a minimal hook configuration (such as for the meta command). The rev parameter indicates the version to use, and args lists additional arguments to pass to the tool. In the example above, we opt to overwrite the files.

The pre-commit tool aborts the commit whenever a hook modifies the staged files. Therefore, if there is any unwanted metadata at commit time, the files are modified, no commit is made, and it is up to the developer to inspect the changes, add them, and commit again. That's why we recommend specifying args: ["--overwrite"]: the hook fixes the files for you, and all that remains is to review, stage, and commit.
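The abort-on-modification behaviour described above can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not pre-commit's actual implementation: the names `run_hook_like_precommit` and `strip_trailing_whitespace` are hypothetical, and the real tool works against the git index rather than plain file paths.

```python
import hashlib
import tempfile
from pathlib import Path
from typing import Callable, Iterable


def _digest(path: Path) -> str:
    """Return a content hash used to detect whether a file changed."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def run_hook_like_precommit(hook: Callable[[Path], int], files: Iterable[Path]) -> bool:
    """Return True if the commit may proceed, False if it must be aborted.

    Hypothetical model: the commit is aborted when the hook exits non-zero
    or when it modifies any of the files it was given.
    """
    files = list(files)
    before = {f: _digest(f) for f in files}
    exit_codes = [hook(f) for f in files]
    modified = [f for f in files if _digest(f) != before[f]]
    return all(code == 0 for code in exit_codes) and not modified


def strip_trailing_whitespace(path: Path) -> int:
    """Toy hook (an assumption for illustration) that fixes files in place."""
    text = path.read_text()
    cleaned = "\n".join(line.rstrip() for line in text.splitlines()) + "\n"
    if cleaned != text:
        path.write_text(cleaned)
    return 0


with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "example.txt"
    target.write_text("dirty line   \n")
    # First attempt: the hook rewrites the file, so the commit is aborted.
    first_attempt = run_hook_like_precommit(strip_trailing_whitespace, [target])
    # Second attempt: the file is already clean, so the commit goes through.
    second_attempt = run_hook_like_precommit(strip_trailing_whitespace, [target])
```

In this model the first attempt is rejected and the second succeeds, which mirrors the "inspect, add, commit again" cycle described in the text.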

Alternatively, you can require the developer to remove the metadata manually by specifying args: ["--check"]. In this case the commit is still aborted, but no files are changed. The .pre-commit-config.yaml would then look like:

repos:
-   repo: https://github.com/datarootsio/databooks
    rev: 0.1.3
    hooks:
    -   id: databooks
        args: ["--check"]
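To make the difference between the two modes concrete, here is a minimal sketch of a metadata-cleaning step with an overwrite mode and a check mode. It is an illustration only, not the databooks implementation: the functions `strip_metadata` and `clean_notebook`, the `check` parameter, and the choice of which fields count as metadata (notebook-level `metadata`, cell-level `metadata`, and `execution_count`) are all assumptions.

```python
import copy
import json
import tempfile
from pathlib import Path


def strip_metadata(notebook: dict) -> dict:
    """Return a copy of the notebook with the (assumed) unwanted fields reset."""
    cleaned = copy.deepcopy(notebook)
    cleaned["metadata"] = {}
    for cell in cleaned.get("cells", []):
        cell["metadata"] = {}
        if "execution_count" in cell:
            cell["execution_count"] = None
    return cleaned


def clean_notebook(path: Path, check: bool = False) -> bool:
    """Return True if unwanted metadata was found in the notebook.

    With check=True the file is only inspected; otherwise it is overwritten
    with the cleaned version.
    """
    notebook = json.loads(path.read_text())
    cleaned = strip_metadata(notebook)
    if cleaned == notebook:
        return False
    if not check:
        path.write_text(json.dumps(cleaned, indent=1))
    return True


with tempfile.TemporaryDirectory() as tmp:
    nb_path = Path(tmp) / "notebook.ipynb"
    original = json.dumps({
        "cells": [{
            "cell_type": "code",
            "execution_count": 3,
            "metadata": {"tags": ["example"]},
            "outputs": [],
            "source": [],
        }],
        "metadata": {"kernelspec": {"name": "python3"}},
        "nbformat": 4,
        "nbformat_minor": 5,
    })
    nb_path.write_text(original)

    # Check mode reports the problem but leaves the file alone.
    found_in_check = clean_notebook(nb_path, check=True)
    untouched_after_check = nb_path.read_text() == original

    # Overwrite mode reports the problem and fixes the file.
    found_in_overwrite = clean_notebook(nb_path, check=False)
    still_dirty = clean_notebook(nb_path, check=True)
```

Either mode signals that something was wrong, so the commit is aborted in both cases; the only difference is whether the file on disk has already been fixed when the developer looks at it.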

Running

Once the configuration is in place and the git hook has been installed with pre-commit install, all the user needs to do to trigger pre-commit is commit changes as usual:

$ git add path/to/notebook.ipynb
$ git commit -m 'a clear message'
databooks................................................................Failed
- hook id: databooks
- files were modified by this hook

[20:51:08] WARNING  1 files will be overwritten                        cli.py:81
  Removing metadata ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% 0:00:00
           INFO     The metadata of 1 out of 1 notebooks were         cli.py:114
                    removed!

Alternatively, you can run pre-commit run to manually run the same command that is triggered right before committing changes, or pre-commit run --all-files to run the hooks on all files (regardless of whether they have been staged). The latter is useful as a first run to ensure consistency across the git repo, or in CI.
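For CI, the manual invocation can be wrapped in a small script. The following is a minimal sketch, assuming pre-commit is installed and available on the PATH; the helper names `precommit_command` and `run_hooks` are hypothetical and not part of pre-commit or databooks.

```python
import subprocess
from typing import List


def precommit_command(all_files: bool = False) -> List[str]:
    """Build the pre-commit invocation as an argument list."""
    cmd = ["pre-commit", "run"]
    if all_files:
        # Run the hooks on every file, staged or not.
        cmd.append("--all-files")
    return cmd


def run_hooks(all_files: bool = False) -> int:
    """Run pre-commit and return its exit code (non-zero if a hook failed)."""
    return subprocess.run(precommit_command(all_files), check=False).returncode
```

A CI step could then call `sys.exit(run_hooks(all_files=True))` so that the job fails whenever a hook fails.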