# Workflow File Reference
Workflow files are TOML documents that live in the configured workflows directory. Each file defines exactly one workflow. The filename (without .toml) must match the name field.
## File Naming

```text
workflows/
├── my-pipeline.toml      # name = "my-pipeline"
├── nightly-etl.toml      # name = "nightly-etl"
└── deploy-staging.toml   # name = "deploy-staging"
```
## Workflow-Level Fields

These fields appear at the top of the file, outside any `[tasks.*]` section.

| Field | Type | Required | Description |
|---|---|---|---|
| `name` | string | yes | Workflow identifier. Must match the filename without `.toml`. |
| `description` | string | no | Human-readable description shown in `wf list`. |
| `tags` | []string | no | Searchable labels. Used with `wf runs --tag`. |
| `on_failure` | string | no | Task ID of the global forensic handler. Runs when any task fails. |
### Example

```toml
name = "deploy-production"
description = "Full production deployment with smoke tests"
tags = ["deploy", "production", "nightly"]
on_failure = "alert-oncall"
```
## Task Fields

Tasks are declared as TOML tables: `[tasks.<id>]`. The task ID is the key used in `depends_on` references.

### Required

| Field | Type | Description |
|---|---|---|
| `cmd` | string | Shell command to execute. Supports multi-line commands with `"""..."""`. Interpolated with `{{.varname}}` before execution. |
### Identity and Type

| Field | Type | Default | Description |
|---|---|---|---|
| `name` | string | task ID | Display name shown in progress output and reports. |
| `type` | string | `"normal"` | `"normal"` or `"forensic"`. Forensic tasks run only on failure, wired via `on_failure`. |
### Dependencies

| Field | Type | Default | Description |
|---|---|---|---|
| `depends_on` | []string | `[]` | List of task IDs that must complete successfully before this task runs. |
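`depends_on` lists can fan in from several upstream tasks. A minimal sketch (task IDs, URLs, and commands are illustrative):

```toml
[tasks.fetch-a]
cmd = "curl -fsSO https://example.com/a.csv"

[tasks.fetch-b]
cmd = "curl -fsSO https://example.com/b.csv"

[tasks.merge]
cmd = "python merge.py a.csv b.csv"
depends_on = ["fetch-a", "fetch-b"]   # runs only after both fetches succeed
```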
### Execution Environment

| Field | Type | Default | Description |
|---|---|---|---|
| `working_dir` | string | process CWD | Directory in which `cmd` is executed. Validated against path traversal at parse time. |
| `env` | map[string]string | `{}` | Additional environment variables injected into `cmd`. Keys must be valid POSIX names. |
| `clean_env` | bool | `false` | If true, start with an empty environment instead of inheriting the parent process env. `WF_RUN_ID`, `WF_WORKFLOW`, and `WF_TASK_ID` are still injected. |
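A sketch combining the three fields. Note that with `clean_env = true` the command inherits nothing from the parent process, so a usable `PATH` typically has to be supplied explicitly via `env` (paths and values are illustrative):

```toml
[tasks.build]
cmd = "make release"
working_dir = "services/api"
clean_env = true
env = {PATH = "/usr/bin:/bin", CGO_ENABLED = "0"}
# Besides the env map above, only WF_RUN_ID, WF_WORKFLOW, and WF_TASK_ID are visible.
```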
### Variables

| Field | Type | Default | Description |
|---|---|---|---|
| `register` | string | — | Variable name to store the last non-empty stdout line. Available in downstream tasks as `{{.name}}`. |
| `if` | string | — | Boolean expression. The task is skipped (not failed) if it evaluates to false. References registered variables by name. |
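Together, `register` and `if` let one task gate another on captured output. A sketch (names, URL, and commands are illustrative):

```toml
[tasks.check]
cmd = "curl -fsS https://example.com/health"   # last non-empty stdout line, e.g. "healthy"
register = "status"

[tasks.deploy]
cmd = "./deploy.sh"
depends_on = ["check"]
if = 'status == "healthy"'   # skipped, not failed, when the check reports anything else
```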
### Matrix Expansion

| Field | Type | Default | Description |
|---|---|---|---|
| `matrix` | map[string][]string | — | Parameter grid. Each combination of values across the keys produces one expanded node. |
### Resilience

| Field | Type | Default | Description |
|---|---|---|---|
| `ignore_failure` | bool | `false` | If true, non-zero exit codes are treated as success. Dependants continue normally. |
| `retries` | int | `0` | Number of retry attempts after a non-zero exit. Total executions = retries + 1. |
| `retry_delay` | duration | `0s` | Delay between retry attempts. Supports Go duration syntax: `5s`, `1m30s`, `2h`. |
| `timeout` | duration | — | Per-task execution timeout. The task is killed (SIGKILL to the process group) if it exceeds this limit. |
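These fields compose. A sketch of a flaky network fetch retried under a hard deadline (values, URL, and command are illustrative):

```toml
[tasks.fetch]
cmd = "curl -fsS https://example.com/data.json -o data.json"
retries = 3          # up to 4 executions in total
retry_delay = "10s"
timeout = "2m"       # process group killed with SIGKILL if exceeded
```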
### Failure Handling

| Field | Type | Default | Description |
|---|---|---|---|
| `on_failure` | string | — | Task ID of a forensic handler to run when this task fails. The referenced task must have `type = "forensic"`. |
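A per-task handler sketch (task IDs and commands are illustrative):

```toml
[tasks.migrate]
cmd = "alembic upgrade head"
on_failure = "capture-db-state"

[tasks.capture-db-state]
type = "forensic"
cmd = "pg_dump --schema-only mydb > /tmp/schema-at-failure.sql"
```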
## cmd — Command Syntax

`cmd` is executed by `/bin/sh -c` (Unix) or `cmd.exe /C` (Windows). All standard shell syntax is available.
### Multi-line commands

```toml
[tasks.setup]
cmd = """
set -euo pipefail
mkdir -p /tmp/workspace
cd /tmp/workspace
echo "Ready"
"""
```
### Variable interpolation in cmd

Only `{{.varname}}` substitution is supported. Template logic (`{{if}}`, `{{range}}`, `{{call}}`) is not evaluated; it is emitted verbatim, so the shell receives the literal `{{...}}` text and the command fails visibly.
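A sketch of the difference (the variable and commands are illustrative):

```toml
[tasks.ok]
cmd = "aws s3 ls --region {{.region}}"        # {{.region}} is substituted before execution

[tasks.broken]
cmd = "{{range .items}}echo {{.}}; {{end}}"   # not evaluated; the shell receives this text verbatim
```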
## env — Environment Variables

Keys must satisfy POSIX naming rules: letters, digits, and underscores; must not start with a digit.

The following keys are blocked and will cause a parse error:

| Blocked Key | Reason |
|---|---|
| `LD_PRELOAD` | Dynamic-linker preload override |
| `LD_LIBRARY_PATH` | Dynamic-linker library path override |
| `LD_AUDIT` | Dynamic-linker audit module |
| `LD_DEBUG` | Dynamic-linker debug flag |
| `LD_DEBUG_OUTPUT` | Dynamic-linker debug output redirect |
| `DYLD_INSERT_LIBRARIES` | macOS dynamic-linker insert override |
| `DYLD_LIBRARY_PATH` | macOS dynamic-linker library path |
| `DYLD_FRAMEWORK_PATH` | macOS dynamic-linker framework path |
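A sketch of keys that pass validation versus ones rejected at parse time (names and values are illustrative):

```toml
[tasks.run]
cmd = "./job.sh"
env = {APP_ENV = "prod", _CACHE_DIR = "/tmp/cache"}   # valid POSIX names

# Each of the following would be a parse error:
# env = {LD_PRELOAD = "/tmp/hook.so"}   # blocked key
# env = {"9LIVES" = "yes"}              # starts with a digit
# env = {"MY-VAR" = "no"}               # hyphen is not a valid character
```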
## if — Condition Expressions

The condition is evaluated after all `depends_on` tasks complete. If it evaluates to false, the task state is set to skipped and dependants continue as if it had succeeded.

### Operators

The expressions in this reference use the comparison operators `==`, `!=`, `<`, `>`, `>=` and the logical operators `&&`, `||`.

### Examples

```toml
if = 'status == "healthy"'
if = 'cert_days < "30"'
if = 'errors != "0"'
if = 'score >= "0.9" && model_type == "transformer"'
if = 'env == "production" || env == "staging"'
```
## matrix — Expansion

```toml
[tasks.test]
cmd = "go test ./{{.test.pkg}}/..."
matrix = {pkg = ["api", "storage", "executor", "contextmap"]}
```

Produces: `test[pkg=api]`, `test[pkg=storage]`, `test[pkg=executor]`, `test[pkg=contextmap]`.

Matrix variables are referenced in `cmd` as `{{.taskid.paramname}}`.

Multi-dimensional:

```toml
[tasks.build]
cmd = "GOOS={{.build.os}} GOARCH={{.build.arch}} go build -o bin/app-{{.build.os}}-{{.build.arch}} ."
matrix = {os = ["linux", "darwin"], arch = ["amd64", "arm64"]}
```

Produces 4 nodes (2×2).
## Forensic Tasks

```toml
[tasks.rollback-db]
type = "forensic"
cmd = "psql -c 'ROLLBACK TO SAVEPOINT pre_deploy;'"
timeout = "2m"
retries = 1
```

Forensic tasks:

- Must have `type = "forensic"`
- Are excluded from normal DAG execution (not counted in levels)
- Run only when wired via `on_failure`
- Support all standard task fields (`timeout`, `retries`, `retry_delay`, `env`, `clean_env`, `ignore_failure`)
- Have access to the injected variables `{{.failed_task}}` and `{{.error_message}}`
## Duration Syntax

`timeout` and `retry_delay` use Go duration syntax:

| Example | Meaning |
|---|---|
| `30s` | 30 seconds |
| `5m` | 5 minutes |
| `1m30s` | 1 minute 30 seconds |
| `2h` | 2 hours |
| `24h` | 24 hours |
## Complete Example

```toml
name = "data-pipeline"
description = "Nightly data processing pipeline"
tags = ["etl", "nightly", "data"]
on_failure = "notify-failure"

# ── Setup ─────────────────────────────────────────────────
[tasks.init]
name = "Initialise Run"
cmd = """
DATE=$(date +%Y-%m-%d)
echo "Starting pipeline for $DATE"
echo "$DATE"
"""
register = "run_date"

# ── Parallel extraction ────────────────────────────────────
[tasks.extract-api]
name = "Extract from API"
cmd = "python extract_api.py --date={{.run_date}} && echo 12500"
register = "api_rows"
depends_on = ["init"]
ignore_failure = true
retries = 2
retry_delay = "30s"
timeout = "15m"
env = {API_TOKEN = "{{.run_date}}", MAX_RETRIES = "3"}

[tasks.extract-db]
name = "Extract from Database"
cmd = "python extract_db.py --date={{.run_date}} && echo 84200"
register = "db_rows"
depends_on = ["init"]
ignore_failure = true
timeout = "20m"
clean_env = true

# ── Transform ─────────────────────────────────────────────
[tasks.transform]
name = "Transform and Merge"
cmd = """
python transform.py \
  --api-rows={{.api_rows}} \
  --db-rows={{.db_rows}} \
  --date={{.run_date}}
echo 96700
"""
register = "merged_rows"
depends_on = ["extract-api", "extract-db"]
timeout = "30m"
working_dir = "/data/pipeline"

# ── Load ──────────────────────────────────────────────────
[tasks.load]
name = "Load to Warehouse"
cmd = "python load.py --rows={{.merged_rows}}"
depends_on = ["transform"]
retries = 3
retry_delay = "1m"
timeout = "1h"

# ── Validate ──────────────────────────────────────────────
[tasks.validate]
name = "Validate Row Count"
cmd = "echo 'Loaded {{.merged_rows}} rows successfully'"
depends_on = ["load"]
if = 'merged_rows > "0"'

# ── Forensic handler ──────────────────────────────────────
[tasks.notify-failure]
type = "forensic"
cmd = """
echo "Pipeline failed at task: {{.failed_task}}"
echo "Error: {{.error_message}}"
curl -s $SLACK_WEBHOOK -d "{\"text\": \"ETL failed: {{.failed_task}}\"}"
"""
ignore_failure = true
timeout = "30s"
```