# Config reference: evaluation
- **Enterprise tuning surface**: defaults and constraints are rendered directly from Pydantic.
- **Env keys when available**: many fields have an env-style alias (from `TriBridConfig.to_flat_dict()`).
- **Tooltip-level guidance**: if a matching glossary entry exists, you'll see deeper tuning notes.
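The env-style aliasing can be sketched as a simple key transform. This is a hypothetical helper for illustration only; the real mapping comes from `TriBridConfig.to_flat_dict()` and may differ in edge cases:

```python
def env_key(json_key: str) -> str:
    """Sketch of the env-style aliasing: drop the group prefix and
    uppercase the field name, e.g. "evaluation.eval_multi_m" ->
    "EVAL_MULTI_M". Hypothetical; not the actual to_flat_dict() code."""
    return json_key.rsplit(".", 1)[-1].upper()
```

For example, `env_key("evaluation.baseline_path")` yields `"BASELINE_PATH"`, matching the table below.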
Total parameters: 8
Group index

- (root)

### (root)
| JSON key | Env key(s) | Type | Default | Constraints | Summary |
|---|---|---|---|---|---|
| `evaluation.baseline_path` | `BASELINE_PATH` | str | `"data/evals/eval_baseline.json"` | — | Baseline results path |
| `evaluation.eval_dataset_path` | `EVAL_DATASET_PATH` | str | `"data/evaluation_dataset.json"` | — | Evaluation dataset path |
| `evaluation.eval_multi_m` | — | int | `10` | ≥ 1, ≤ 20 | Multi-query variants for evaluation |
| `evaluation.ndcg_at_10_k` | — | int | `10` | ≥ 1, ≤ 200 | K used for ndcg_at_10 metric (default 10). |
| `evaluation.precision_at_5_k` | — | int | `5` | ≥ 1, ≤ 200 | K used for precision_at_5 metric (default 5). |
| `evaluation.recall_at_10_k` | — | int | `10` | ≥ 1, ≤ 200 | K used for recall_at_10 metric (default 10). |
| `evaluation.recall_at_20_k` | — | int | `20` | ≥ 1, ≤ 200 | K used for recall_at_20 metric (default 20). |
| `evaluation.recall_at_5_k` | — | int | `5` | ≥ 1, ≤ 200 | K used for recall_at_5 metric (default 5). |
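The integer constraints above can be checked with a small validator. This is a minimal sketch, with bounds copied from the table; the helper name is hypothetical and the real validation is performed by Pydantic:

```python
# Bounds copied from the config table: (min, max), inclusive.
EVAL_BOUNDS = {
    "eval_multi_m": (1, 20),
    "ndcg_at_10_k": (1, 200),
    "precision_at_5_k": (1, 200),
    "recall_at_5_k": (1, 200),
    "recall_at_10_k": (1, 200),
    "recall_at_20_k": (1, 200),
}

def check_eval_field(name: str, value: int) -> int:
    """Raise ValueError if `value` falls outside the documented range
    for `name`. Hypothetical helper; Pydantic does this for real."""
    lo, hi = EVAL_BOUNDS[name]
    if not lo <= value <= hi:
        raise ValueError(f"{name}={value} outside [{lo}, {hi}]")
    return value
```

For instance, `check_eval_field("eval_multi_m", 10)` passes, while a value of 25 would raise, mirroring the ≤ 20 constraint in the table.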
## Details (glossary)
evaluation.baseline_path (BASELINE_PATH) — Baseline Path
Category: general
Path where the evaluation loop saves baseline results for regression testing and A/B comparison. Each eval run's metrics (Hit@K, MRR, latency) are stored here with timestamps. Use this to ensure retrieval quality doesn't regress after configuration changes, reindexing, or model upgrades; compare the current run against the baseline to detect improvements or degradations.
Links:

- Regression Prevention
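A baseline comparison along the lines described above might look like the following. The function and metric names are illustrative assumptions, not the project's actual API:

```python
def detect_regressions(baseline: dict, current: dict,
                       tolerance: float = 0.02) -> list:
    """Return metric names whose current value dropped more than
    `tolerance` below the stored baseline. Hypothetical helper;
    assumes higher-is-better metrics (latency would need the
    comparison inverted)."""
    regressed = []
    for metric, base_value in baseline.items():
        if metric in current and current[metric] < base_value - tolerance:
            regressed.append(metric)
    return sorted(regressed)

# Example: a baseline loaded from evaluation.baseline_path versus
# the metrics of the current eval run (values are made up).
baseline = {"hit_at_10": 0.82, "mrr": 0.61}
current = {"hit_at_10": 0.83, "mrr": 0.55}
print(detect_regressions(baseline, current))  # ["mrr"]
```

The small tolerance avoids flagging ordinary run-to-run noise as a regression; tighten or loosen it to match how stable your eval dataset is.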