Configuration
Instead of passing the same parameters every time you run a command, you can set up a configuration file that is read and overrides the defaults. The order of priority, from highest to lowest, is:
- User input arguments in the CLI
- Configuration file
- Defaults
So it's still possible to override the configuration file via CLI parameters (as expected).
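The priority order above can be pictured as a layered merge. The following is only a minimal sketch and not the actual databooks implementation: the function name `resolve_options` and the convention that `None` means "not passed on the CLI" are assumptions made for illustration.

```python
from typing import Any, Dict


def resolve_options(
    defaults: Dict[str, Any],
    config: Dict[str, Any],
    cli: Dict[str, Any],
) -> Dict[str, Any]:
    """Merge option sources so that CLI > configuration file > defaults.

    Hypothetical helper: a CLI value of None is treated as "not provided".
    """
    resolved = dict(defaults)  # lowest priority
    resolved.update(config)    # configuration file overrides defaults
    # Only CLI values that were explicitly given override the rest.
    resolved.update({k: v for k, v in cli.items() if v is not None})
    return resolved


defaults = {"verbose": False, "overwrite": False}
config = {"overwrite": True}                # e.g. read from pyproject.toml
cli = {"verbose": True, "overwrite": None}  # only --verbose was passed

resolved = resolve_options(defaults, config, cli)
print(resolved)  # {'verbose': True, 'overwrite': True}
```

Here `overwrite` comes from the configuration file and `verbose` from the CLI, while any option set in neither place keeps its default.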
What can I configure?
All CLI parameters are configurable, so you can specify anything that is
also available to you via the CLI, with one exception: the required PATHS argument.
This is because the PATHS argument is also used for finding your configuration (see
"How can I use it?" for more information).
Info
Remember that flags are parsed as boolean values, so you can specify --verbose in
the configuration as verbose=true.
What does it look like?
The configuration file is a pyproject.toml file that you can place at the root of your
project. There, you can specify values for each command under its [tool.databooks.<command>] table.
So if, for example, the desired behavior is:
databooks meta
- Remove outputs
- Don't remove execution count
- Always overwrite files
databooks fix
- Keep notebook metadata from base (not head)
databooks assert
- Always check that the notebook has fewer than 10 cells
The pyproject.toml file would look like:
[tool.databooks.meta]
rm-outs = true
rm_exec = false
overwrite = true
[tool.databooks.fix]
metadata-head = false
[tool.databooks.assert]
expr = ["len(nb.cells) < 10"]
How can I use it?
There are two ways to specify the configuration file: explicitly and implicitly. You can
explicitly specify the pyproject.toml via the --config parameter. If none is specified,
then databooks will look for a pyproject.toml in your project.
databooks looks for the configuration file by first finding the common directory
of all the target paths and then walking up through the parent directories
until it finds either the configuration file or the root of the git repo. That way, you can
have multiple configuration files, and the correct values will be used depending on where
your notebooks are located (think monorepo).
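The lookup described above can be sketched roughly as follows. This illustrates the search strategy only and is not the databooks source code: the function name `find_config` is hypothetical, and detecting the git root by the presence of a `.git` entry is an assumption.

```python
import os
from pathlib import Path
from typing import Iterable, Optional, Union


def find_config(
    paths: Iterable[Union[str, Path]],
    filename: str = "pyproject.toml",
) -> Optional[Path]:
    """Search upwards from the common directory of `paths` for `filename`.

    Hypothetical helper. The search stops at the first directory that contains
    a `.git` entry (assumed to mark the git repo root) or at the filesystem root.
    """
    resolved = [Path(p).resolve() for p in paths]
    # Deepest directory shared by all target paths.
    common = Path(os.path.commonpath([str(p) for p in resolved]))
    if common.is_file():
        # A single file target: start the search from its directory.
        common = common.parent

    for directory in [common, *common.parents]:
        candidate = directory / filename
        if candidate.is_file():
            return candidate
        if (directory / ".git").exists():
            # Reached the repo root without finding a configuration file.
            return None
    return None
```

In a monorepo with a `pyproject.toml` at the root and another one in a sub-project, passing only notebooks from that sub-project would resolve to the nested file, while passing notebooks from two different sub-projects would resolve to the root one, since the common directory is then the repo root.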
Tip
databooks has a verbose option that prints more information to the terminal
if desired. For debugging purposes, you can increase the verbosity further by setting
the environment variable LOG_LEVEL to DEBUG. That way, you can see,
among many other things, which configuration file was used.