Tracking the runs of the workflow🔗
Since workflows are usually run multiple times, it can be useful to keep a
history of past runs and their results. For this purpose, bandsaw comes with the
bandsaw.tracking.tracker.TrackerExtension.
This extension tracks information about the different parts of a workflow, be it the
run itself or the executed tasks with their results. This information is then sent to
a backend where the data is stored and can be accessed and integrated into one's own
processes and monitoring solutions. Currently, the only available backend is the
bandsaw.tracking.filesystem.FileSystemBackend, which stores the data in JSON format in a
local directory.
Configuration🔗
backend (bandsaw.tracking.backend.Backend)🔗
backend is a required positional argument. It contains the pre-configured backend
instance that is used for storing the tracking information. The object has to
inherit from bandsaw.tracking.backend.Backend.
Example configuration🔗
import bandsaw
from bandsaw.tracking.tracker import TrackerExtension
from bandsaw.tracking.filesystem import FileSystemBackend
tracking_directory = '/path/to/my/tracking/directory'
configuration = bandsaw.Configuration()
configuration.add_extension(TrackerExtension(FileSystemBackend(tracking_directory)))
Backends🔗
All backends need to inherit from the base class bandsaw.tracking.backend.Backend,
which defines the interface expected by the tracker extension.
It defines a set of methods that receive the information about the workflow:
def track_run(self, ids, run_info):
    pass

def track_task(self, ids, task_info):
    pass

def track_execution(self, ids, execution_info):
    pass

def track_session(self, ids, session_info):
    pass

def track_result(self, ids, result_info):
    pass

def track_attachments(self, ids, attachments):
    pass

def track_distribution_archive(self, distribution_archive):
    pass
All *_info arguments passed to these methods are dictionaries that contain the
tracked information.
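As an illustration of this interface, the following is a minimal sketch of a custom backend that keeps everything in memory. The class name InMemoryBackend and its records attribute are hypothetical. A stand-in Backend base class is defined so that the sketch runs without bandsaw installed; in real use one would presumably inherit from bandsaw.tracking.backend.Backend instead. The structure of ids is not described here, so the sketch stores it untouched.

```python
from collections import defaultdict


class Backend:
    """Stand-in for bandsaw.tracking.backend.Backend (assumption, see text)."""

    def track_run(self, ids, run_info):
        pass

    def track_task(self, ids, task_info):
        pass

    def track_execution(self, ids, execution_info):
        pass

    def track_session(self, ids, session_info):
        pass

    def track_result(self, ids, result_info):
        pass

    def track_attachments(self, ids, attachments):
        pass

    def track_distribution_archive(self, distribution_archive):
        pass


class InMemoryBackend(Backend):
    """Hypothetical backend that collects tracked info dictionaries in memory."""

    def __init__(self):
        # Maps a kind such as 'run' or 'result' to a list of (ids, info) tuples.
        self.records = defaultdict(list)

    def track_run(self, ids, run_info):
        self.records['run'].append((ids, run_info))

    def track_result(self, ids, result_info):
        self.records['result'].append((ids, result_info))
```

Only the methods of interest are overridden here; the remaining ones fall back to the no-op defaults of the base class.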
FileSystemBackend🔗
The first available backend is the bandsaw.tracking.filesystem.FileSystemBackend,
which writes the info objects to the file system. It takes the path to a directory
as configuration, where the individual *-info.json files are stored.
The directory layout follows a specific schema:
.tracking/
├── runs
│   ├── 025c8e7e-6992-11ec-8e2d-48f17f64520d
│   │   ├── 0d268ac06a82654e_76560cb43662cc8f_025c8e7e-6992-11ec-8e2d-48f17f64520d
│   │   └── run-info.json
│   └── 6aa525e8-698f-11ec-8e2d-48f17f64520d
│       ├── 0d268ac06a82654e_76560cb43662cc8f_6aa525e8-698f-11ec-8e2d-48f17f64520d
│       └── run-info.json
└── tasks
    └── 0d268ac06a82654e
        ├── 76560cb43662cc8f
        │   ├── 025c8e7e-6992-11ec-8e2d-48f17f64520d
        │   │   ├── attachments
        │   │   │   ├── metrics.json
        │   │   │   └── session.log
        │   │   ├── result-info.json
        │   │   └── session-info.json
        │   ├── 6aa525e8-698f-11ec-8e2d-48f17f64520d
        │   │   ├── attachments
        │   │   │   ├── metrics.json
        │   │   │   └── session.log
        │   │   ├── result-info.json
        │   │   └── session-info.json
        │   └── execution-info.json
        └── task-info.json
The tasks directory contains one directory for each individual task, named with the
task_id of that task. Each task directory contains a single task-info.json file,
which holds a JSON object with information about this specific task. Every execution
of the task is stored in a separate subdirectory named with the execution_id of the
corresponding execution. These directories contain an execution-info.json file with
meta information about the arguments used in this execution, and subdirectories named
by the run_id for every run in which the task was executed with these specific
arguments.
Each of these run directories contains a session-info.json with information about
the session of this execution and a result-info.json, which describes the computed
result. If attachments were created by the advice chain while executing the task,
they are stored in the attachments directory.
Additionally, a runs directory in the root of the tracking directory stores
information about the executed sessions for each individual run. Every run has its
own directory named by its run_id. This directory contains a run-info.json
and empty files named with the session_ids of every session that was computed during
this run.
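The layout described above can be navigated with a few lines of Python. The following is a sketch only: the helper names session_directory and results_of_run are hypothetical and not part of bandsaw, and the code assumes that session ids always have the form <task_id>_<execution_id>_<run_id>, as they do in the example tree.

```python
import json
from pathlib import Path


def session_directory(tracking_root, session_id):
    """Map a session id to its directory below `tasks`."""
    # Assumption: session ids look like <task_id>_<execution_id>_<run_id>.
    task_id, execution_id, run_id = session_id.split('_', 2)
    return Path(tracking_root) / 'tasks' / task_id / execution_id / run_id


def results_of_run(tracking_root, run_id):
    """Return {session_id: parsed result-info.json} for all sessions of a run."""
    run_directory = Path(tracking_root) / 'runs' / run_id
    results = {}
    for entry in sorted(run_directory.iterdir()):
        # Every file except run-info.json is an empty marker named by a session id.
        if entry.name == 'run-info.json':
            continue
        result_file = session_directory(tracking_root, entry.name) / 'result-info.json'
        if result_file.exists():
            results[entry.name] = json.loads(result_file.read_text())
    return results
```

Calling results_of_run('.tracking', '6aa525e8-698f-11ec-8e2d-48f17f64520d') on the example tree would return a dictionary with one entry for the single session of that run.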
Example JSON info file format🔗
All *-info.json files have the same format and share some of their content. An example
can be found below. Most files contain only a subset of this example data.
{
  "task": {
    "id": "0d268ac06a82654e",
    "definition": "demo.greet",
    "advice_parameters": {}
  },
  "execution": {
    "id": "76560cb43662cc8f",
    "arguments": [
      {
        "type": "str",
        "value": "Christoph",
        "size": "9",
        "name": "recipient"
      }
    ]
  },
  "run": {
    "id": "6aa525e8-698f-11ec-8e2d-48f17f64520d"
  },
  "configuration": "bandsaw_config",
  "distribution_archive": {
    "modules": [
      "__main__",
      "bandsaw",
      "bandsaw_config",
      "multimeter"
    ],
    "id": null
  },
  "session": {
    "id": "0d268ac06a82654e_76560cb43662cc8f_6aa525e8-698f-11ec-8e2d-48f17f64520d"
  },
  "result": {
    "value": {
      "type": "int",
      "value": "16421"
    }
  }
}
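Because most files contain only a subset of these sections, code that reads them should not rely on every key being present. The following sketch shows one way to handle this; the function summarize is hypothetical and only uses the keys that appear in the example above.

```python
def summarize(info):
    """Build a one-line summary from a parsed info dictionary.

    Sections that are missing from the dictionary are skipped.
    """
    parts = []
    task = info.get('task')
    if task:
        parts.append(f"task={task['definition']}")
    execution = info.get('execution')
    if execution:
        # Each argument is described by its name and a string form of its value.
        arguments = ', '.join(
            f"{argument['name']}={argument['value']!r}"
            for argument in execution['arguments']
        )
        parts.append(f"args=({arguments})")
    result = info.get('result')
    if result and 'value' in result:
        value = result['value']
        parts.append(f"result={value['value']} ({value['type']})")
    return ' '.join(parts)
```

Note that values such as "16421" are stored as strings next to their type name, so the sketch reports them as text instead of converting them.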