Tracing (local store and external routing)

Local trace store

In-memory tracing with retention control. Safe to reuse run_id across retries; latest always wins.
External tracing compatible

Route events to external systems (for example, LangSmith-style routing) when you outgrow local traces.
Deterministic “latest” lookups

Query by repo_id (corpus) or run_id to fetch the most recent trace; responses are detached copies.
Operator-grade knobs

Control tracing_mode, retention, and routing via Pydantic config. If you’re not sure, start with local + default retention.

See also: Configuration, Tracing config reference, Observability, API health & metrics.

API first, MCP second

ragweld integrates tracing into the API-first lifecycle. You can layer MCP-based tooling on top, but your production contract remains the HTTP API mounted under /api.

What lives where (mental model)

flowchart LR
  A["UI / Agent Runner"] --> B["/api/*"]
  B --> C["Trace events"]
  C --> D["Local Store<br/>(in-memory, retained)"]
  C --> X["External Trace<br/>('routing mode')"]
  D --> E["Latest by 'repo_id' or 'run_id'"]

Local store: in-process, in-memory traces with bounded retention; reads never leave the process.
External route: forward/duplicate events to an external tracer when enabled (for example, LangSmith routing mode).
Latest lookups: fetch most recent trace by repo_id (corpus) or by run_id.

Modes and core settings

Definition list of the knobs that matter:

tracing.tracing_mode: Which tracer backend to use.
- local: keep traces in-process/memory with retention
- external: route to a compatible external tracer (for example, LangSmith-style routing)
- If you’re not sure, choose local.
tracing.trace_retention: Upper bound on how many traces are retained locally. When the limit is exceeded, oldest traces are evicted. See the Tracing config reference for the default and constraints.
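A minimal sketch of how these two knobs might be validated. The real settings live in ragweld's Pydantic models; the `TracingSettings` class below is a hypothetical stdlib stand-in so the example runs without dependencies, and it deliberately sets no defaults because the actual default values are documented in the Tracing config reference, not here.

```python
from dataclasses import dataclass

# The two modes named in this page.
ALLOWED_MODES = ("local", "external")


@dataclass(frozen=True)
class TracingSettings:
    """Hypothetical stand-in for the tracing section of the config."""

    tracing_mode: str      # "local" or "external"
    trace_retention: int   # upper bound on locally retained traces

    def __post_init__(self) -> None:
        # Reject unknown modes early instead of failing at trace time.
        if self.tracing_mode not in ALLOWED_MODES:
            raise ValueError(f"tracing_mode must be one of {ALLOWED_MODES}")
        # Assumption: a retention cap below 1 is meaningless for this sketch.
        if self.trace_retention < 1:
            raise ValueError("trace_retention must be at least 1")


# The retention value here is an arbitrary illustration, not the real default.
settings = TracingSettings(tracing_mode="local", trace_retention=50)
```

Validating at construction time means a typo in the mode surfaces when the config loads rather than when the first trace is recorded.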

Safe defaults

Start with tracing_mode = "local" and the default retention. Bump retention after you baseline memory headroom in production.

Local store semantics (what to expect)

ragweld’s local trace store prioritizes correctness during retries and simplicity at read time:

Run ID reuse is safe
If a caller reuses the same run_id (a common retry pattern), the store first removes any stale index references for that run_id and then records the new trace.
Why it matters: retention eviction can’t accidentally evict your just-started retry; the “latest” trace for that run_id is the one you just started.
Latest by repo_id or by run_id
latest(repo_id=…): returns the most recent trace in that corpus (remember: the codebase uses repo_id for corpus separation).
latest(run_id=…): returns the most recent trace matching that run.
Detached copies
Calls that return a trace object return a detached copy, not a live pointer into the store. Mutating it won’t mutate the store.
Retention-driven eviction
When the retention cap is reached, the oldest traces are removed. This happens across all repos, and is applied after any run-id de-duplication described above.
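The semantics above can be sketched with a small in-memory class. This is an illustration under stated assumptions, not ragweld's implementation: `SketchTraceStore`, `start`, and the trace dictionary layout are hypothetical names chosen for the example, and only the `latest(repo_id=…)` / `latest(run_id=…)` call shapes come from this page.

```python
import copy
from collections import OrderedDict
from typing import Any, Dict, Optional


class SketchTraceStore:
    """Illustrative in-memory store; not ragweld's actual class."""

    def __init__(self, retention: int) -> None:
        if retention < 1:
            raise ValueError("retention must be at least 1")
        self._retention = retention
        # Insertion order doubles as start order: oldest first, newest last.
        self._traces: "OrderedDict[str, Dict[str, Any]]" = OrderedDict()

    def start(self, run_id: str, repo_id: str, payload: Dict[str, Any]) -> None:
        # Reused run_id: drop the stale entry so the retry becomes the newest.
        self._traces.pop(run_id, None)
        self._traces[run_id] = {
            "run_id": run_id,
            "repo_id": repo_id,
            "payload": copy.deepcopy(payload),
        }
        # Evict only after de-duplication, oldest first, across all repos.
        while len(self._traces) > self._retention:
            self._traces.popitem(last=False)

    def latest(
        self, *, repo_id: Optional[str] = None, run_id: Optional[str] = None
    ) -> Optional[Dict[str, Any]]:
        # Walk newest to oldest and return the first match as a detached copy.
        for trace in reversed(self._traces.values()):
            if repo_id is not None and trace["repo_id"] != repo_id:
                continue
            if run_id is not None and trace["run_id"] != run_id:
                continue
            return copy.deepcopy(trace)
        return None


store = SketchTraceStore(retention=2)
store.start("run-1", "corpus-a", {"attempt": 1})
store.start("run-2", "corpus-b", {"attempt": 1})
store.start("run-1", "corpus-a", {"attempt": 2})  # retry reuses run_id

snapshot = store.latest(run_id="run-1")
snapshot["payload"]["attempt"] = 99  # changes only the detached copy
```

Because the retry replaces the earlier `run-1` entry instead of adding a second one, the store still holds two traces after three `start` calls and nothing is evicted.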

Terminology — corpus vs repo_id

The API and internals use repo_id to mean “corpus id.” Plan and size traces per corpus. See Corpus vs repo_id for background.

Practical operations

If your orchestrator uses stable run IDs across retries, keep doing that. ragweld will preserve the newest retry for that run_id under retention pressure.
To investigate a user report:
Check Grafana first (system health).
Use the UI’s trace viewer or a “latest by repo” lookup to see what happened most recently in that corpus.
If you need longer history, either raise trace_retention or switch to an external tracer.

Quick checklist for local mode

tracing.tracing_mode is set to local
Retention sized to your workload (start small, increase after profiling)
You understand that “latest” is per repo_id and per run_id
You are okay with in-memory only storage (export externally if you need durability)

Switching to external tracing

When you need persistent history, team collaboration, or advanced analytics, switch to external routing mode.

Task list:

Set tracing.tracing_mode = "external"
Configure the external tracer credentials and endpoint (provider-specific)
Verify events show up in the external system during a smoke test
Keep local retention modest; you’ll primarily use the external system for deep history
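For the smoke-test step, it can help to picture routing as a choice of sink. The sketch below is hypothetical: `make_router` and the sink callables are illustration names, and it assumes that external mode duplicates each event to the local store as well as forwarding it, which this page suggests ("forward/duplicate") but does not specify in detail.

```python
from typing import Any, Callable, Dict, List

Event = Dict[str, Any]
Sink = Callable[[Event], None]


def make_router(tracing_mode: str, local_sink: Sink, external_sink: Sink) -> Sink:
    """Return a callable that delivers events according to tracing_mode."""
    if tracing_mode == "local":
        return local_sink
    if tracing_mode == "external":
        def route(event: Event) -> None:
            # Assumption: keep a local copy and forward to the external tracer.
            local_sink(event)
            external_sink(event)
        return route
    raise ValueError(f"unknown tracing_mode: {tracing_mode!r}")


# Smoke test with in-memory lists standing in for real sinks.
local_events: List[Event] = []
external_events: List[Event] = []
emit = make_router("external", local_events.append, external_events.append)
emit({"run_id": "smoke-1", "stage": "retrieval"})
```

In a real smoke test the external sink would be the provider's client, and the check would be that the event appears in the provider's UI or API.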

Provider specifics live in config

External tracer configuration is provider-dependent and defined in Pydantic models under server/models/tribrid_config_model.py. Don’t hand-edit docs; see the Tracing config reference and make changes through config.

Failure modes and how to avoid them

Retention set too low
Symptom: expected traces are missing when you open the viewer.
Fix: raise tracing.trace_retention or move to external mode.
Confusing corpus separation
Symptom: you’re looking at “latest by repo” but using the wrong repo_id.
Fix: confirm the corpus id (repo_id) you indexed and queried against are the same. See Corpus vs repo_id.
Assuming mutability on returned traces
Symptom: you “modify” a returned trace object and expect the store to reflect it.
Fact: returned traces are detached copies by design.

Example: behavior when reusing run_id (retries)

If an orchestrator restarts a run and reuses the same run_id, ragweld guarantees the newest run takes precedence in the indexes used for “latest” lookups. Concretely:

Any old index references to that run_id are removed before the new trace is recorded.
Retention eviction is computed after that de-duplication, so the new run won’t be evicted by stale references.

This lets you implement clean retry semantics without inventing new run_id values.

Why it’s implemented this way

Internally, the store maintains ordered deques for:
- Global start order across all runs
- Per-repo_id start order

Reusing a run_id first scrubs stale entries from these deques, then appends the new trace. The result is predictable “latest” lookups that point at the newest retry.
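The scrub-then-append step can be sketched with two kinds of deque. This is a simplified model under stated assumptions, not the actual store: `StartOrderIndex` and its method names are hypothetical, and it tracks only run IDs rather than full trace objects.

```python
from collections import defaultdict, deque
from typing import Deque, Dict, List, Optional


class StartOrderIndex:
    """Illustrative index of run IDs by start order; names are hypothetical."""

    def __init__(self, retention: int) -> None:
        self._retention = retention
        self._global: Deque[str] = deque()                        # all runs
        self._by_repo: Dict[str, Deque[str]] = defaultdict(deque)  # per repo_id
        self._repo_of: Dict[str, str] = {}                        # run_id -> repo_id

    def _scrub(self, run_id: str) -> None:
        # Remove every index reference to run_id, if it is known.
        repo_id = self._repo_of.pop(run_id, None)
        if repo_id is None:
            return
        self._global.remove(run_id)
        self._by_repo[repo_id].remove(run_id)

    def record_start(self, run_id: str, repo_id: str) -> None:
        self._scrub(run_id)                    # 1. drop stale references
        self._global.append(run_id)            # 2. append as the newest run
        self._by_repo[repo_id].append(run_id)
        self._repo_of[run_id] = repo_id
        while len(self._global) > self._retention:
            self._scrub(self._global[0])       # 3. evict the oldest afterwards

    def latest_for_repo(self, repo_id: str) -> Optional[str]:
        order = self._by_repo.get(repo_id)
        return order[-1] if order else None

    def global_order(self) -> List[str]:
        return list(self._global)


index = StartOrderIndex(retention=2)
index.record_start("a", "corpus-1")
index.record_start("b", "corpus-1")
index.record_start("a", "corpus-1")  # retry: "a" moves to the newest slot
index.record_start("c", "corpus-2")  # cap exceeded: "b" is now the oldest
```

Without the scrub, the first `a` entry would still sit at the front of the global deque and the retry's run ID would be the next eviction candidate.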

Where to look in the UI

Chat and RAG tabs generate trace events whenever a request flows through routing, retrieval, and generation.
The trace viewer surfaces the most recent trace per corpus, which is ideal for debugging a single user interaction.
For structured, long-term analysis, pair traces with Grafana dashboards and/or switch to external tracing.

API, URLs, and ports to remember

In dev, the backend is mounted under /api. Examples:
http://127.0.0.1:8012/api/search
fetch("/api/config")
Default dev entrypoints (unless overridden by env vars):
UI: http://127.0.0.1:5173/web
API: http://127.0.0.1:8012/api
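A small helper for building dev URLs against the default API entrypoint listed above. `api_url` and `DEV_API_BASE` are hypothetical names for illustration; if your environment overrides the defaults, pass the overridden base explicitly.

```python
from urllib.parse import urljoin

# Default dev API entrypoint from this page.
DEV_API_BASE = "http://127.0.0.1:8012/api"


def api_url(path: str, base: str = DEV_API_BASE) -> str:
    """Join an endpoint path onto the API base, tolerating stray slashes."""
    # urljoin replaces the last path segment unless the base ends with "/".
    normalized_base = base.rstrip("/") + "/"
    return urljoin(normalized_base, path.lstrip("/"))


search_url = api_url("/search")
config_url = api_url("config")
```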

Pydantic is the law