
Config reference: training

  • Enterprise tuning surface

    Defaults and constraints are rendered directly from Pydantic.

  • Env keys when available

    Many fields have an env-style alias (from TriBridConfig.to_flat_dict()).

  • Tooltip-level guidance

    If a matching glossary entry exists, you’ll see deeper tuning notes.
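The env aliases follow a simple pattern: the group prefix is dropped and the field name is upper-cased. A minimal sketch of that mapping (illustrative helper name; not the actual TriBridConfig.to_flat_dict() implementation):

```python
def to_env_key(json_key: str) -> str:
    """Map a JSON config path such as 'training.reranker_train_lr' to its
    env-style alias 'RERANKER_TRAIN_LR': drop the group prefix and
    upper-case the field name. Sketch of the mapping used throughout the
    table below, not the real implementation."""
    field = json_key.split(".")[-1]
    return field.upper()

print(to_env_key("training.reranker_train_lr"))  # RERANKER_TRAIN_LR
```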


Total parameters: 36

Group index
  • (root)

(root)

JSON key Env key(s) Type Default Constraints Summary
training.learning_reranker_backend LEARNING_RERANKER_BACKEND Literal["auto", "mlx_qwen3"] "auto" allowed="auto", "mlx_qwen3" Learning reranker backend: auto (prefer MLX Qwen3 on Apple Silicon), mlx_qwen3 (force). Legacy values 'transformers'/'hf' normalize to 'auto'.
training.learning_reranker_base_model LEARNING_RERANKER_BASE_MODEL str "Qwen/Qwen3-Reranker-0.6B" Base model to fine-tune for MLX Qwen3 learning reranker
training.learning_reranker_grad_accum_steps LEARNING_RERANKER_GRAD_ACCUM_STEPS int 8 ≥ 1, ≤ 128 Gradient accumulation steps per optimizer update for MLX Qwen3 learning reranker training
training.learning_reranker_lora_alpha LEARNING_RERANKER_LORA_ALPHA float 32.0 > 0.0, ≤ 512.0 LoRA alpha for MLX Qwen3 learning reranker
training.learning_reranker_lora_dropout LEARNING_RERANKER_LORA_DROPOUT float 0.05 ≥ 0.0, ≤ 0.5 LoRA dropout for MLX Qwen3 learning reranker
training.learning_reranker_lora_rank LEARNING_RERANKER_LORA_RANK int 16 ≥ 1, ≤ 128 LoRA rank for MLX Qwen3 learning reranker
training.learning_reranker_lora_target_modules list[str] ["q_proj", "k_proj", "v_proj", "o_proj"] min_length=1 Module name suffixes to apply LoRA to (MLX Qwen3)
training.learning_reranker_negative_ratio LEARNING_RERANKER_NEGATIVE_RATIO int 5 ≥ 1, ≤ 20 Negative pairs per positive during learning reranker training
training.learning_reranker_promote_epsilon LEARNING_RERANKER_PROMOTE_EPSILON float 0.0 ≥ 0.0, ≤ 1.0 Minimum improvement required to auto-promote (primary metric delta)
training.learning_reranker_promote_if_improves LEARNING_RERANKER_PROMOTE_IF_IMPROVES int 1 ≥ 0, ≤ 1 Promote trained learning artifact to active path only if primary metric improves
training.learning_reranker_telemetry_interval_steps LEARNING_RERANKER_TELEMETRY_INTERVAL_STEPS int 2 ≥ 1, ≤ 20 Emit trainer telemetry every N optimizer steps (plus first/final)
training.learning_reranker_unload_after_sec LEARNING_RERANKER_UNLOAD_AFTER_SEC int 0 ≥ 0, ≤ 86400 Unload MLX learning reranker model after idle seconds (0 = never)
training.ragweld_agent_backend RAGWELD_AGENT_BACKEND str "mlx_qwen3" Ragweld agent backend (in-process chat model). Currently: mlx_qwen3
training.ragweld_agent_base_model RAGWELD_AGENT_BASE_MODEL str "mlx-community/Qwen3-1.7B-4bit" Shipped base model for the ragweld agent (MLX).
training.ragweld_agent_grad_accum_steps RAGWELD_AGENT_GRAD_ACCUM_STEPS int 8 ≥ 1, ≤ 128 Gradient accumulation steps per optimizer update for ragweld agent training.
training.ragweld_agent_lora_alpha RAGWELD_AGENT_LORA_ALPHA float 32.0 > 0.0, ≤ 512.0 LoRA alpha for ragweld agent MLX fine-tuning.
training.ragweld_agent_lora_dropout RAGWELD_AGENT_LORA_DROPOUT float 0.05 ≥ 0.0, ≤ 0.5 LoRA dropout for ragweld agent MLX fine-tuning.
training.ragweld_agent_lora_rank RAGWELD_AGENT_LORA_RANK int 16 ≥ 1, ≤ 128 LoRA rank for ragweld agent MLX fine-tuning.
training.ragweld_agent_lora_target_modules list[str] ["q_proj", "k_proj", "v_proj", "o_proj"] min_length=1 Module name suffixes to apply LoRA to (ragweld agent; MLX Qwen3).
training.ragweld_agent_model_path RAGWELD_AGENT_MODEL_PATH str "models/learning-agent-epstein-files-1" Active ragweld agent adapter artifact path (directory containing adapter.npz + adapter_config.json).
training.ragweld_agent_promote_epsilon RAGWELD_AGENT_PROMOTE_EPSILON float 0.0 ≥ 0.0, ≤ 10.0 Minimum eval_loss improvement required to auto-promote (baseline_loss - new_loss >= epsilon).
training.ragweld_agent_promote_if_improves RAGWELD_AGENT_PROMOTE_IF_IMPROVES int 1 ≥ 0, ≤ 1 Auto-promote trained ragweld agent adapter only if eval_loss improves.
training.ragweld_agent_reload_period_sec RAGWELD_AGENT_RELOAD_PERIOD_SEC int 60 ≥ 0, ≤ 600 Adapter reload check period (seconds). 0 = check every request.
training.ragweld_agent_telemetry_interval_steps RAGWELD_AGENT_TELEMETRY_INTERVAL_STEPS int 2 ≥ 1, ≤ 20 Emit ragweld agent trainer telemetry every N optimizer steps (plus first/final).
training.ragweld_agent_train_dataset_path RAGWELD_AGENT_TRAIN_DATASET_PATH str "" Training dataset path for the ragweld agent (empty = use evaluation.eval_dataset_path).
training.ragweld_agent_unload_after_sec RAGWELD_AGENT_UNLOAD_AFTER_SEC int 0 ≥ 0, ≤ 86400 Unload ragweld agent model after idle seconds (0 = never).
training.reranker_train_batch RERANKER_TRAIN_BATCH int 16 ≥ 1, ≤ 128 Training batch size
training.reranker_train_epochs RERANKER_TRAIN_EPOCHS int 2 ≥ 1, ≤ 20 Training epochs for reranker
training.reranker_train_lr RERANKER_TRAIN_LR float 2e-05 ≥ 1e-06, ≤ 0.001 Learning rate
training.reranker_warmup_ratio RERANKER_WARMUP_RATIO float 0.1 ≥ 0.0, ≤ 0.5 Warmup steps ratio
training.tribrid_reranker_mine_mode TRIBRID_RERANKER_MINE_MODE str "replace" pattern=^(replace|append)$ Triplet mining mode
training.tribrid_reranker_mine_reset TRIBRID_RERANKER_MINE_RESET int 0 ≥ 0, ≤ 1 Reset triplets file before mining
training.tribrid_reranker_model_path TRIBRID_RERANKER_MODEL_PATH str "models/learning-reranker-epstein-files-1" Active learning reranker artifact path (MLX adapter directory).
training.tribrid_triplets_path TRIBRID_TRIPLETS_PATH str "data/training/triplets__epstein-files-1.jsonl" Training triplets file path
training.triplets_min_count TRIPLETS_MIN_COUNT int 100 ≥ 10, ≤ 10000 Min triplets for training
training.triplets_mine_mode TRIPLETS_MINE_MODE str "replace" pattern=^(replace|append)$ Triplet mining mode

Details (glossary)

training.learning_reranker_backend (LEARNING_RERANKER_BACKEND) — Learning Reranker Backend

Category: reranking

Backend used when RERANKER_MODE="learning". auto selects the MLX Qwen3 LoRA backend (Apple Silicon). mlx_qwen3 forces the same MLX backend. Legacy values (transformers/hf) are accepted for backward compatibility but normalize to auto.

Badges: - Backend selector
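The legacy-value normalization can be sketched as a small validator (hypothetical function; the real Pydantic validator may differ):

```python
def normalize_backend(value: str) -> str:
    """Normalize learning-reranker backend names: legacy 'transformers'
    and 'hf' are accepted but collapse to 'auto'; 'auto' and 'mlx_qwen3'
    pass through unchanged. Illustrative sketch, not the real validator."""
    v = value.strip().lower()
    if v in ("transformers", "hf"):
        return "auto"
    if v in ("auto", "mlx_qwen3"):
        return v
    raise ValueError(f"unsupported learning reranker backend: {value!r}")
```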

training.learning_reranker_base_model (LEARNING_RERANKER_BASE_MODEL) — Learning Reranker Base Model

Category: reranking

Base model identifier to fine-tune FROM when using the MLX Qwen3 learning backend (e.g. Qwen/Qwen3-Reranker-0.6B). Training produces a LoRA adapter artifact written under TRIBRID_RERANKER_MODEL_PATH and inference loads that adapter on top of this base. Changing the base model makes existing adapters incompatible.

Badges: - MLX only

training.learning_reranker_grad_accum_steps (LEARNING_RERANKER_GRAD_ACCUM_STEPS) — Learning Reranker Grad Accum Steps

Category: reranking

Number of micro-batches to accumulate gradients over before applying one optimizer update when training the MLX Qwen3 learning reranker. This increases effective batch size without increasing memory. Typical: 4–16; default 8.

Badges: - MLX only - Affects training speed
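The effective batch size seen by the optimizer is the micro-batch size times the accumulation steps; memory cost stays at the micro-batch size. With the defaults (RERANKER_TRAIN_BATCH=16, grad accum 8) that is 128 examples per update:

```python
def effective_batch(micro_batch: int, grad_accum_steps: int) -> int:
    """Effective examples per optimizer update when accumulating
    gradients over several micro-batches."""
    return micro_batch * grad_accum_steps

print(effective_batch(16, 8))  # 128
```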

training.learning_reranker_lora_alpha (LEARNING_RERANKER_LORA_ALPHA) — Learning Reranker LoRA Alpha

Category: reranking

LoRA scaling (alpha) for MLX Qwen3 fine-tuning. Effective LoRA scale is alpha/rank. Typical: 16–64; start at 32.

Badges: - MLX only
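The alpha/rank relationship mentioned above, as a tiny helper (illustrative):

```python
def lora_scale(alpha: float, rank: int) -> float:
    """Effective LoRA scaling applied to the adapter output: alpha / rank.
    With the defaults alpha=32, rank=16 the scale is 2.0."""
    return alpha / rank

print(lora_scale(32.0, 16))  # 2.0
```

Keeping alpha proportional to rank (e.g. alpha = 2 x rank) holds the effective scale constant when you sweep rank.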

training.learning_reranker_lora_dropout (LEARNING_RERANKER_LORA_DROPOUT) — Learning Reranker LoRA Dropout

Category: reranking

LoRA dropout probability for MLX Qwen3 fine-tuning. Small dropout (0.0–0.1) can reduce overfitting on small mined datasets. Start at 0.05.

Badges: - MLX only

training.learning_reranker_lora_rank (LEARNING_RERANKER_LORA_RANK) — Learning Reranker LoRA Rank

Category: reranking

LoRA rank (r) for MLX Qwen3 fine-tuning. Higher rank increases adapter capacity (and training/inference cost) and can improve quality with enough data. Typical: 8–32; start at 16.

Badges: - MLX only - Affects training cost

training.learning_reranker_negative_ratio (LEARNING_RERANKER_NEGATIVE_RATIO) — Learning Reranker Negative Ratio

Category: reranking

When converting mined triplets into labeled (query, document) pairs for pairwise training, use up to this many negatives per positive. Higher ratios can improve discrimination but increase training time. Typical: 3–5; default 5.

Badges: - Quality vs cost
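The triplet-to-pair conversion described above can be sketched as follows (hypothetical record shapes; the real miner's schema may differ):

```python
def triplets_to_pairs(triplets, negative_ratio=5):
    """Expand (query, positive, negatives) triplets into labeled
    (query, document, label) pairs for pairwise training, keeping at
    most `negative_ratio` negatives per positive. Illustrative sketch."""
    pairs = []
    for query, positive, negatives in triplets:
        pairs.append((query, positive, 1))
        for neg in negatives[:negative_ratio]:
            pairs.append((query, neg, 0))
    return pairs
```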

training.learning_reranker_promote_epsilon (LEARNING_RERANKER_PROMOTE_EPSILON) — Learning Reranker Promotion Epsilon

Category: reranking

Minimum dev-metric improvement required to auto-promote a newly trained learning reranker artifact over the active baseline. Use a small epsilon (e.g., 0.002) to avoid promoting on noise.

Badges: - Prevents noise promotions

training.learning_reranker_promote_if_improves (LEARNING_RERANKER_PROMOTE_IF_IMPROVES) — Learning Reranker Promotion Gate

Category: reranking

If enabled (1), training only promotes the newly trained artifact to TRIBRID_RERANKER_MODEL_PATH if the primary dev metric improves over the active baseline by at least LEARNING_RERANKER_PROMOTE_EPSILON. If disabled (0), successful training always promotes.

Badges: - Safety
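The gate and epsilon combine into a single decision. A sketch assuming a higher-is-better primary metric (hypothetical function; the trainer's actual check may differ):

```python
def should_promote(new_metric: float, baseline_metric: float,
                   gate_enabled: int = 1, epsilon: float = 0.0) -> bool:
    """Promote the freshly trained artifact only if the gate is off, or
    the primary dev metric improves over the baseline by at least
    epsilon. Assumes higher metric is better; illustrative sketch."""
    if not gate_enabled:
        return True  # successful training always promotes
    delta = new_metric - baseline_metric
    return delta > 0 and delta >= epsilon
```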

training.learning_reranker_telemetry_interval_steps (LEARNING_RERANKER_TELEMETRY_INTERVAL_STEPS) — Learning Reranker Telemetry Interval Steps

Category: reranking

Emit trainer telemetry every N optimizer steps (plus first and final events). Default: 2. Range: 1-20. Lower values give smoother live charts but increase event volume.
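The cadence ("every N steps, plus first and final") can be sketched as follows (1-based steps; illustrative, not necessarily the exact trainer logic):

```python
def should_emit(step: int, total_steps: int, interval: int = 2) -> bool:
    """Emit telemetry on the first and final optimizer steps and every
    `interval` steps in between. Sketch of the described cadence."""
    return step == 1 or step == total_steps or step % interval == 0

print([s for s in range(1, 11) if should_emit(s, 10)])  # [1, 2, 4, 6, 8, 10]
```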

training.learning_reranker_unload_after_sec (LEARNING_RERANKER_UNLOAD_AFTER_SEC) — Learning Reranker Idle Unload

Category: reranking

If >0, unload the MLX Qwen3 reranker model from memory after this many seconds of inactivity. Set to 0 to keep the model resident (faster first rerank, higher memory use).

Badges: - MLX only - Affects latency
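The idle-unload policy amounts to a timestamp check. A minimal sketch (not the actual service code):

```python
import time

class IdleUnloader:
    """Track last-use time and decide when to evict the model: unload
    after `unload_after_sec` idle seconds; 0 keeps it resident.
    Illustrative sketch of the described policy."""

    def __init__(self, unload_after_sec: int):
        self.unload_after_sec = unload_after_sec
        self.last_used = time.monotonic()

    def touch(self) -> None:
        """Record a rerank request."""
        self.last_used = time.monotonic()

    def should_unload(self, now=None) -> bool:
        if self.unload_after_sec == 0:
            return False  # keep the model resident
        now = time.monotonic() if now is None else now
        return (now - self.last_used) >= self.unload_after_sec
```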

training.ragweld_agent_backend (RAGWELD_AGENT_BACKEND) — RAGWELD_AGENT_BACKEND

Category: general

No detailed tooltip available yet.

training.ragweld_agent_base_model (RAGWELD_AGENT_BASE_MODEL) — RAGWELD_AGENT_BASE_MODEL

Category: general

No detailed tooltip available yet.

training.ragweld_agent_grad_accum_steps (RAGWELD_AGENT_GRAD_ACCUM_STEPS) — RAGWELD_AGENT_GRAD_ACCUM_STEPS

Category: general

No detailed tooltip available yet.

training.ragweld_agent_lora_alpha (RAGWELD_AGENT_LORA_ALPHA) — RAGWELD_AGENT_LORA_ALPHA

Category: general

No detailed tooltip available yet.

training.ragweld_agent_lora_dropout (RAGWELD_AGENT_LORA_DROPOUT) — RAGWELD_AGENT_LORA_DROPOUT

Category: general

No detailed tooltip available yet.

training.ragweld_agent_lora_rank (RAGWELD_AGENT_LORA_RANK) — RAGWELD_AGENT_LORA_RANK

Category: general

No detailed tooltip available yet.

training.ragweld_agent_model_path (RAGWELD_AGENT_MODEL_PATH) — RAGWELD_AGENT_MODEL_PATH

Category: general

No detailed tooltip available yet.

training.ragweld_agent_promote_epsilon (RAGWELD_AGENT_PROMOTE_EPSILON) — RAGWELD_AGENT_PROMOTE_EPSILON

Category: general

No detailed tooltip available yet.

training.ragweld_agent_promote_if_improves (RAGWELD_AGENT_PROMOTE_IF_IMPROVES) — RAGWELD_AGENT_PROMOTE_IF_IMPROVES

Category: general

No detailed tooltip available yet.

training.ragweld_agent_telemetry_interval_steps (RAGWELD_AGENT_TELEMETRY_INTERVAL_STEPS) — RAGWELD_AGENT_TELEMETRY_INTERVAL_STEPS

Category: general

No detailed tooltip available yet.

training.ragweld_agent_train_dataset_path (RAGWELD_AGENT_TRAIN_DATASET_PATH) — RAGWELD_AGENT_TRAIN_DATASET_PATH

Category: general

No detailed tooltip available yet.

training.reranker_train_batch (RERANKER_TRAIN_BATCH) — Training Batch Size

Category: embedding

Training batch size: examples per gradient step. Larger batches stabilize training but require more memory. For Colima or small GPUs/CPUs, use 1–4. If you see the container exit with code -9 (OOM), reduce this value.

Badges: - Lower = safer on Colima

Links: - Memory Tips (HF) - Colima Resources

training.reranker_train_epochs (RERANKER_TRAIN_EPOCHS) — Training Epochs

Category: reranking

Number of full passes over the training triplets for the learning reranker. More epochs can improve quality but risk overfitting when data is small. Start with 1–2 and increase as your mined dataset grows.

Badges: - Quality vs overfit

training.reranker_train_lr (RERANKER_TRAIN_LR) — Training Learning Rate

Category: reranking

Learning rate for learning-reranker optimization during fine-tuning. This controls the size of weight updates during gradient descent. Typical range is 1e-6 to 5e-5. Higher values (3e-5 to 5e-5) converge faster but can destabilize training; lower values (1e-6 to 1e-5) are safer but slower.

Sweet spot: 2e-5 for most runs. Use 1e-5 for smaller triplet sets or unstable loss curves, and 3e-5 for larger datasets with steady validation metrics.

Combine with RERANKER_WARMUP_RATIO so early steps ramp smoothly from 0 to target LR.

  • Typical range: 1e-6 to 5e-5
  • Conservative: 1e-5
  • Balanced default: 2e-5
  • Aggressive: 3e-5 to 5e-5
  • Too high: loss spikes, NaN values, divergence
  • Too low: slow convergence, limited gains

Badges: - Advanced ML training - Requires tuning

Links: - Learning Rate Explained - Learning Rate Schedules

training.reranker_warmup_ratio (RERANKER_WARMUP_RATIO) — Warmup Ratio

Category: reranking

Fraction of total training steps to use for linear learning-rate warmup. During warmup, LR ramps from 0 to RERANKER_TRAIN_LR to reduce early instability; after warmup, LR follows its normal schedule.

Sweet spot: 0.1 (10%). For short runs (<500 steps), 0.05-0.08 is often enough. For long runs (>1000 steps), 0.1-0.15 can improve stability.

  • No warmup: 0.0
  • Short training: 0.05-0.08
  • Balanced default: 0.1
  • Long training: 0.15-0.2
  • Effect: stabilizes early updates and reduces divergence risk

Badges: - Advanced ML training - Stabilizes training

Links: - Warmup Schedules - Learning Rate Warmup Paper - Scheduler Visualization
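How RERANKER_WARMUP_RATIO and RERANKER_TRAIN_LR interact can be seen in a schedule sketch. This assumes linear warmup followed by linear decay to zero, a common choice; the actual trainer's post-warmup schedule may differ:

```python
def lr_at_step(step: int, total_steps: int,
               base_lr: float = 2e-5, warmup_ratio: float = 0.1) -> float:
    """Linear warmup from 0 to base_lr over warmup_ratio of the run,
    then linear decay back to 0. Illustrative; defaults mirror
    RERANKER_TRAIN_LR=2e-5 and RERANKER_WARMUP_RATIO=0.1."""
    warmup_steps = max(1, int(total_steps * warmup_ratio))
    if step < warmup_steps:
        return base_lr * step / warmup_steps  # ramping up
    remaining = max(1, total_steps - warmup_steps)
    return base_lr * max(0.0, (total_steps - step) / remaining)

# Over a 1000-step run: step 50 is mid-warmup, step 100 hits peak LR.
print(lr_at_step(50, 1000), lr_at_step(100, 1000))
```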

training.tribrid_reranker_mine_mode (TRIBRID_RERANKER_MINE_MODE) — Triplet Mining Mode

Category: general

Controls how newly mined triplets are written: replace overwrites the existing triplets file, append adds to it (matching the field's ^(replace|append)$ constraint). Use append for incremental collection across mining runs; use replace for clean, reproducible runs.

Badges: - Advanced

Links: - Hard Negative Mining

training.tribrid_reranker_mine_reset (TRIBRID_RERANKER_MINE_RESET) — Reset Triplets Before Mining

Category: general

If enabled, deletes existing mined triplets before starting a new mining run. Use with caution to avoid losing curated datasets.

Badges: - Destructive

training.tribrid_reranker_model_path (TRIBRID_RERANKER_MODEL_PATH) — Reranker Model Path

Category: general

Filesystem path to the active learning reranker artifact (relative paths recommended). For the MLX Qwen3 learning reranker this is the active LoRA adapter directory. The service loads from this path on startup or when reloaded.

Links: - Model Checkpoints

training.tribrid_triplets_path (TRIBRID_TRIPLETS_PATH) — Triplets Dataset Path

Category: general

Path to the JSONL triplets dataset used for learning reranker mining and training. Default: data/training/triplets__epstein-files-1.jsonl. Keep this in durable storage for reproducible experiments.

Links: - Triplet Loss

training.triplets_min_count (TRIPLETS_MIN_COUNT) — Triplets Min Count

Category: general

Minimum mined triplets required before training starts. Default: 100. Range: 10-10000. If training skips for insufficient data, mine more triplets or lower this for experimentation.

Badges: - Data quality gate - Production needs 500+

Links: - Triplet Loss for Ranking - Hard Negative Mining - Triplet Mining in RAG (ACL 2025) - Learning to Rank

training.triplets_mine_mode (TRIPLETS_MINE_MODE) — Triplets Mine Mode

Category: general

How mined triplets are written: replace (overwrite dataset) or append (add to existing file). Default: replace. Use append for incremental collection; use replace for clean, reproducible runs.

Badges: - Advanced training control

Links: - Hard Negative Mining - Negative Sampling Strategies - Triplet Mining (ACL 2025)
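The replace-vs-append behavior maps directly onto file-open modes. A sketch of a JSONL writer (hypothetical helper; the real miner's record schema may differ):

```python
import json

def write_triplets(path: str, triplets, mode: str = "replace") -> None:
    """Write mined triplets as JSONL. 'replace' truncates the file for a
    clean, reproducible run; 'append' adds to it for incremental
    collection, mirroring TRIPLETS_MINE_MODE. Illustrative sketch."""
    if mode not in ("replace", "append"):
        raise ValueError(f"mode must be 'replace' or 'append', got {mode!r}")
    file_mode = "w" if mode == "replace" else "a"
    with open(path, file_mode, encoding="utf-8") as f:
        for record in triplets:
            f.write(json.dumps(record) + "\n")
```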