Pre-commit hooks
Another alternative is to try to catch code quality issues before any code is even sent
to the remote git repo. Pre-commit hooks are essentially actions that are taken right
before code is committed to your (local) repo.
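To make the idea concrete, here is a minimal sketch of a hand-written hook, assuming it is saved as an executable file at .git/hooks/pre-commit (where git looks for this hook). Git aborts the commit when the script exits with a non-zero status. The function names and the metadata rule are hypothetical and much simpler than what databooks does.

```python
#!/usr/bin/env python3
"""Hypothetical hand-written pre-commit hook (illustration only, not databooks)."""
import json
import subprocess
import sys
from typing import List


def staged_files() -> List[str]:
    """Return the paths staged for commit (added, copied or modified)."""
    result = subprocess.run(
        ["git", "diff", "--cached", "--name-only", "--diff-filter=ACM"],
        capture_output=True,
        text=True,
        check=True,
    )
    return [line for line in result.stdout.splitlines() if line]


def has_unwanted_metadata(notebook: dict) -> bool:
    """Simplified rule: any cell metadata or execution count is 'unwanted'."""
    for cell in notebook.get("cells", []):
        if cell.get("metadata"):
            return True
        if cell.get("cell_type") == "code" and cell.get("execution_count") is not None:
            return True
    return False


def main() -> int:
    """Return 1 (abort the commit) if any staged notebook has unwanted metadata."""
    dirty = []
    for path in staged_files():
        if not path.endswith(".ipynb"):
            continue
        # Simplification: this reads the working-tree file, not the staged blob.
        with open(path, encoding="utf-8") as handle:
            if has_unwanted_metadata(json.load(handle)):
                dirty.append(path)
    for path in dirty:
        print(f"unwanted metadata in {path}", file=sys.stderr)
    return 1 if dirty else 0
```

To use this as an actual hook, the script would end with sys.exit(main()). Writing and sharing such scripts by hand gets tedious, which is the problem the package described next addresses.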
pre-commit package
There are different ways to add new hooks to your git repo. pre-commit
is a package that makes it easy to configure pre-commit hooks and store them in a very readable manner.
Installation
To install, simply run:
pip install pre-commit
Usage
Configuration
To use pre-commit, create a .pre-commit-config.yaml in the root of your project.
There, include:
repos:
  - repo: https://github.com/datarootsio/databooks
    rev: 0.1.3
    hooks:
      - id: databooks
        args: ["--overwrite"]
The databooks repo ships with a minimal hook configuration (such as which command to run,
namely meta). The rev parameter indicates the version to use, and args lists
additional arguments to pass to the tool. In the example above, we opt to
overwrite the files.
The pre-commit tool aborts the commit if a hook modifies the staged files.
Therefore, if there is any unwanted metadata at the time of committing the changes,
the files are modified, no commit is made, and it is up to the developer to
inspect the changes, add them and commit again. That's why we recommend specifying
args: ["--overwrite"].
You can instead require the developer to remove the metadata manually by specifying args: ["--check"].
In this case the commit would still be aborted, but no files would be changed. The
.pre-commit-config.yaml would look like:
repos:
  - repo: https://github.com/datarootsio/databooks
    rev: 0.1.3
    hooks:
      - id: databooks
        args: ["--check"]
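The difference between the two modes can be sketched as a small simulation. This is a simplified model under stated assumptions, not the databooks implementation: strip_metadata and simulate_hook are hypothetical names, and "metadata" is reduced to cell metadata and execution counts. It relies on the fact that pre-commit treats a hook as failed when the hook exits with a non-zero status or modifies files.

```python
import copy
from typing import Tuple


def strip_metadata(notebook: dict) -> dict:
    """Return a copy with cell metadata and execution counts cleared (simplified)."""
    clean = copy.deepcopy(notebook)
    for cell in clean.get("cells", []):
        if cell.get("metadata"):
            cell["metadata"] = {}
        if cell.get("execution_count") is not None:
            cell["execution_count"] = None
    return clean


def simulate_hook(notebook: dict, mode: str) -> Tuple[dict, bool]:
    """Model one hook run; returns (notebook_on_disk, commit_proceeds)."""
    clean = strip_metadata(notebook)
    already_clean = clean == notebook
    if mode == "overwrite":
        # The file is rewritten; a modified file makes pre-commit abort the commit.
        return clean, already_clean
    if mode == "check":
        # Nothing is written; unwanted metadata is only reported as a failure.
        return notebook, already_clean
    raise ValueError(f"unknown mode: {mode}")
```

In this model, a notebook with metadata blocks the commit in both modes. With "overwrite" the file on disk is already clean afterwards, so staging it and committing again succeeds. With "check" the file is untouched and the developer has to clean it first.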
Running
Once the configuration is in place and the git hook has been set up (by running pre-commit install
once in the repo), all the user needs to do to trigger pre-commit is
to commit changes normally:
$ git add path/to/notebook.ipynb
$ git commit -m 'a clear message'
databooks................................................................Failed
- hook id: databooks
- files were modified by this hook
[20:51:08] WARNING 1 files will be overwritten cli.py:81
Removing metadata ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% 0:00:00
INFO The metadata of 1 out of 1 notebooks were cli.py:114
removed!
Alternatively, one could run pre-commit run to manually run the same command that is
triggered right before committing changes. Or, one could run pre-commit run --all-files
to run the pre-commit hooks on all files (regardless of whether the files have been staged or not).
The latter is useful as a first run to ensure consistency across the git repo, or in CI.
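For CI or scripted use, the commands above can be wrapped in a small helper. This is only a sketch: build_command and run_hooks are hypothetical names, and it assumes pre-commit is installed and on the PATH.

```python
import subprocess
from typing import List, Optional


def build_command(hook_id: Optional[str] = None, all_files: bool = False) -> List[str]:
    """Assemble a `pre-commit run` invocation."""
    cmd = ["pre-commit", "run"]
    if hook_id:
        cmd.append(hook_id)  # restrict the run to a single hook, e.g. "databooks"
    if all_files:
        cmd.append("--all-files")  # include files that are not staged
    return cmd


def run_hooks(hook_id: Optional[str] = None, all_files: bool = False) -> int:
    """Run the hooks and return pre-commit's exit code (non-zero on failure)."""
    return subprocess.run(build_command(hook_id, all_files)).returncode
```

A CI job could call run_hooks(all_files=True) and fail the build when the returned code is non-zero.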