Troubleshooting Guide
This page collects the most common problems I’ve seen while running AGRO and how to debug them quickly.
Use this as a “first pass” before diving into the code or filing an issue. When in doubt, remember that almost everything in AGRO is just:
- a FastAPI app
- a Pydantic‑validated config registry
- a set of small service modules under server/services
If you know which layer is misbehaving, you can usually fix it in a few minutes.
1. Configuration & Environment Issues
AGRO’s behavior is driven by two main surfaces:
- .env: infrastructure, secrets, and hard overrides
- agro_config.json: tunable RAG behavior, models, retrieval knobs
Under the hood, everything flows through the configuration registry in server/services/config_registry.py.
flowchart TD
A[.env file] -->|highest precedence| R[ConfigRegistry]
B[agro_config.json] -->|validated via Pydantic| R
C[Pydantic defaults] -->|fallback| R
R --> S["Services (rag, indexing, editor, keywords, etc.)"]
If something “mysteriously” ignores your settings, the registry is the first place to look.
1.1 My .env changes aren’t taking effect
AGRO loads .env once, at import time, via python-dotenv (in server/services/config_registry.py).
Common failure modes:
- You edited the wrong .env (e.g. host vs container)
- You changed .env but didn’t restart the server
- You’re setting a key that AGRO doesn’t actually read
Steps to debug:
- Confirm which .env is being used
In Docker, the .env that matters is usually next to docker-compose.yml. On bare metal, it’s whatever is in the current working directory when you start uvicorn/server.app.
- Check the effective value via the config registry
Use the HTTP config API or the UI’s Admin → Settings view to inspect the value. Internally, everything goes through get_config_registry() in server/services/config_registry.py.
- Restart the backend
The registry is process‑local. If you change .env, you must restart the FastAPI process (and the indexer container if you’re running Docker).
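To make these steps concrete, here is a minimal sketch that compares a .env file against the environment of a running process. It is a debugging aid and not AGRO code: the parser is a deliberately simplified stand-in for python-dotenv (no variable expansion, only basic quote stripping), and the names parse_env_file and stale_keys are hypothetical.

```python
import os
from pathlib import Path


def parse_env_file(path):
    """Parse simple KEY=VALUE lines; a simplified stand-in for python-dotenv."""
    values = {}
    for raw in Path(path).read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        # Skip blanks, comments, and lines that are not assignments.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        key = key.strip()
        if key.startswith("export "):
            key = key[len("export "):].strip()
        value = value.strip()
        # Strip one pair of matching surrounding quotes, if present.
        if len(value) >= 2 and value[0] == value[-1] and value[0] in "'\"":
            value = value[1:-1]
        values[key] = value
    return values


def stale_keys(env_path, environ=None):
    """Return {key: (file_value, process_value)} for keys that differ."""
    environ = os.environ if environ is None else environ
    file_values = parse_env_file(env_path)
    return {
        key: (value, environ.get(key))
        for key, value in file_values.items()
        if environ.get(key) != value
    }
```

A non-empty result from stale_keys(".env"), evaluated inside the server process, suggests that the process started before your edit or is reading a different file. It can also mean the variable was already set in the shell, since python-dotenv does not override existing variables by default.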
1.2 agro_config.json validation errors
agro_config.json is validated by Pydantic models in server/models/agro_config_model.py. If the file is malformed or contains unknown keys, the registry will log a ValidationError and fall back to defaults.
Symptoms:
- UI loads, but your changes to agro_config.json don’t seem to apply
- Logs show something like pydantic.ValidationError when starting the server
How to debug:
- Check the logs
Look for messages from the agro.config logger.
- Validate the file manually
Run a quick check in a Python shell (or a small script such as validate_config.py): load agro_config.json and pass its contents to AgroConfigRoot.
If this raises, fix the offending field and retry.
- Use only known keys
The allowed keys are defined in AGRO_CONFIG_KEYS. Unknown keys are ignored by the registry layer that merges config for the web UI (server/services/config_store.py). If you typo a key, it simply won’t show up.
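The manual validation step above can be sketched as a small helper. This is an illustration under assumptions: check_config is a hypothetical name, and the model class is passed in as an argument so the snippet runs without AGRO installed. In a real checkout you would pass AgroConfigRoot, assuming it is importable from server.models.agro_config_model and accepts the file's top-level keys as keyword arguments.

```python
import json
from pathlib import Path


def check_config(path, model_factory):
    """Load a JSON config and try to build the model from it.

    Returns (True, model) on success or (False, error_message) on failure.
    """
    try:
        data = json.loads(Path(path).read_text(encoding="utf-8"))
    except (OSError, json.JSONDecodeError) as exc:
        return False, f"cannot read {path}: {exc}"
    if not isinstance(data, dict):
        return False, f"{path} must contain a JSON object at the top level"
    try:
        return True, model_factory(**data)
    except Exception as exc:  # with a Pydantic model this is a ValidationError
        return False, f"validation failed: {exc}"
```

Calling check_config("agro_config.json", AgroConfigRoot) from the repo root should then report either a parsed model or the first problem to fix, without starting the server.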
1.3 Environment vs config precedence confusion
The registry enforces a clear precedence:
- .env (highest)
- agro_config.json
- Pydantic defaults
There are also legacy aliases and a small set of infrastructure keys that must be overridable via environment variables.
Example: MQ_REWRITES is aliased to MAX_QUERY_REWRITES in server/services/config_registry.py.
If you set MQ_REWRITES in .env and MAX_QUERY_REWRITES in agro_config.json, the .env value wins.
To see where a value came from, use the registry’s source tracking (exposed via the config API / UI). If you’re debugging in code, log both the value and its source.
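The precedence and source tracking described above can be sketched roughly as follows. This is not the registry's actual code: resolve and the ALIASES table are hypothetical, only the MQ_REWRITES alias is taken from this page, and checking the canonical name before its alias is an assumption.

```python
# Hypothetical alias table; this page only confirms MQ_REWRITES -> MAX_QUERY_REWRITES.
ALIASES = {"MQ_REWRITES": "MAX_QUERY_REWRITES"}


def resolve(key, environ, file_config, defaults):
    """Return (value, source) following .env > agro_config.json > defaults."""
    # Canonical name first, then any legacy alias that maps to it (assumed order).
    names = [key] + [alias for alias, target in ALIASES.items() if target == key]
    for name in names:
        if name in environ:
            # Environment values are strings; type coercion is left out of this sketch.
            return environ[name], f"env:{name}"
    if key in file_config:
        return file_config[key], "agro_config.json"
    if key in defaults:
        return defaults[key], "default"
    raise KeyError(key)
```

Logging the returned pair, for example ("4", "env:MQ_REWRITES"), makes it obvious when an environment override is shadowing the value you edited in agro_config.json.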
2. Indexing Problems
Indexing is orchestrated by server/services/indexing.py. It shells out to the indexer using the same Python interpreter and passes a small environment block.
The web UI polls _INDEX_STATUS and _INDEX_METADATA to show progress.
2.1 Indexing never starts or hangs
Symptoms:
- Clicking “Index” in the UI shows “Indexing started…” but nothing else
- No new Qdrant collections or index files appear under data/
Checklist:
- Check the configured repo
The indexer uses REPO from the config registry.
Make sure:
  - REPO points to a valid profile / repo name
  - REPO_ROOT (in the environment) matches the actual checkout path
- Inspect indexer logs
The indexer runs as a separate process. If you’re using Docker, check the indexer container logs. On bare metal, look for logs under data/out/<repo>/logs or wherever you configured logging.
- Verify Python path
The indexer process is started with PYTHONPATH set to repo_root(). If you’ve moved the code or are running from a different working directory, imports inside the indexer may fail.
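The checklist above can be turned into a quick preflight script. This is a sketch under assumptions: indexing_preflight is a hypothetical helper, it reads REPO from the environment even though AGRO itself reads it from the config registry, and it expects PYTHONPATH to include REPO_ROOT because the CLI example in section 2.2 sets both to the same path.

```python
import os
from pathlib import Path


def indexing_preflight(environ=None):
    """Return a list of human-readable problems; an empty list means the basics look fine."""
    environ = os.environ if environ is None else environ
    problems = []
    if not environ.get("REPO", "").strip():
        problems.append("REPO is not set")
    root = environ.get("REPO_ROOT", "").strip()
    if not root:
        problems.append("REPO_ROOT is not set")
    elif not Path(root).is_dir():
        problems.append(f"REPO_ROOT does not exist or is not a directory: {root}")
    else:
        # Compare resolved paths so symlinks and relative entries still match.
        entries = environ.get("PYTHONPATH", "").split(os.pathsep)
        resolved = {str(Path(entry).resolve()) for entry in entries if entry}
        if str(Path(root).resolve()) not in resolved:
            problems.append("PYTHONPATH does not include REPO_ROOT")
    return problems
```

Running it in the same shell or container that launches the indexer narrows down whether a hang is an environment problem before you start reading indexer logs.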
2.2 “Indexing failed: …” in the UI
If _INDEX_STATUS contains a line like Indexing failed: <error>, the exception was caught in run_index().
Steps:
- Open browser dev tools → Network and inspect the /api/index/status response to see the full _INDEX_STATUS list.
- Reproduce from the CLI using the same environment:
REPO=my-repo REPO_ROOT=/path/to/root PYTHONPATH=/path/to/root \
python cli/agro.py index --repo my-repo
This often gives a more complete traceback.
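If you would rather script that reproduction, for example to capture the output in a bug report, something like the following works. It is a sketch and not how server/services/indexing.py launches the indexer: the helper names are hypothetical, and it assumes cli/agro.py is reachable relative to the root you pass in.

```python
import os
import subprocess
import sys


def indexer_env(repo, repo_root, base=None):
    """Copy the base environment and set the three variables from the CLI example."""
    env = dict(os.environ if base is None else base)
    env.update({"REPO": repo, "REPO_ROOT": repo_root, "PYTHONPATH": repo_root})
    return env


def run_index_cli(repo, repo_root, script="cli/agro.py"):
    """Run the index command with the current interpreter; return (exit_code, output)."""
    result = subprocess.run(
        [sys.executable, script, "index", "--repo", repo],
        cwd=repo_root,  # assumption: the script path is relative to this root
        env=indexer_env(repo, repo_root),
        capture_output=True,
        text=True,
    )
    return result.returncode, result.stdout + result.stderr
```

A non-zero exit code together with the captured output is usually enough to see the full traceback that the UI status line truncates.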
3. RAG / Search Issues
The HTTP search and chat endpoints ultimately call server/services/rag.py.
3.1 Empty or obviously wrong results
Before blaming embeddings or rerankers, check the simple stuff:
- BM25 only sanity check
For small repos, BM25 alone is often better than a misconfigured dense stack. In the UI, set the retrieval mode to “BM25 only” (or disable dense search in agro_config.json) and retry.
- Verify FINAL_K / LANGGRAPH_FINAL_K
If FINAL_K is set too low, you may be seeing only a tiny slice of the candidate set. The code falls back to 10 if both keys are missing or invalid.
- Check discriminative keywords
AGRO supports discriminative / semantic keywords via server/services/keywords.py. If you’ve cranked KEYWORDS_BOOST or set KEYWORDS_MAX_PER_REPO to something extreme, BM25 scoring can get skewed.
The module caches config at import time.
If you change these at runtime, call reload_config() in that module or restart the server.
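The FINAL_K fallback mentioned in the checklist above can be sketched like this. The function name, the order in which the two keys are tried, and the rule that only positive integers count as valid are all assumptions; only the default of 10 comes from this page.

```python
def effective_final_k(config, default=10):
    """Pick the first valid positive integer among the two keys, else `default`."""
    for key in ("FINAL_K", "LANGGRAPH_FINAL_K"):  # assumed lookup order
        raw = config.get(key)
        try:
            value = int(raw)
        except (TypeError, ValueError):
            continue  # missing or non-numeric: try the next key
        if value > 0:
            return value
    return default
```

If the number of results you see matches the default rather than your setting, the value you configured is probably being rejected as invalid or never reaching the registry.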
3.2 LangGraph errors or missing graph behavior
AGRO can optionally run a LangGraph‑based orchestration layer (server.langgraph_app). If build_graph() fails, the RAG service (server/services/rag.py) logs a warning and continues without a graph.
If you expect graph‑driven behavior (multi‑step tools, custom nodes) but don’t see it:
- Check logs for build_graph failed
- Import server.langgraph_app in a REPL and call build_graph() manually to see the traceback.
- If you’re iterating on the graph code, remember that _graph is cached at module level; restart the server after changes.
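The warn-and-continue behavior and the module-level cache can be pictured with a sketch like the one below. It is not the code in server/services/rag.py: get_graph and the logger name are hypothetical, and build_graph is passed in as an argument so the snippet runs on its own. In this sketch a failed build is retried on the next call; the real service may behave differently.

```python
import logging

logger = logging.getLogger("agro.rag")  # hypothetical logger name

_graph = None  # module-level cache: survives until the process restarts


def get_graph(build_graph):
    """Build the graph once; on failure, log a warning and return None."""
    global _graph
    if _graph is None:
        try:
            _graph = build_graph()
        except Exception as exc:
            logger.warning("build_graph failed: %s", exc)
    return _graph
```

Because a successfully built graph is reused for every later call, edits to the graph code have no effect until the cache is cleared, which in practice means restarting the server.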
4. Editor & DevTools Integration
AGRO ships with an embedded “editor” / devtools panel, controlled by server/services/editor.py.
4.1 Editor panel not showing up in the UI
The web UI checks read_settings() to decide whether to show the embedded editor.
- Ensure EDITOR_ENABLED=true in .env or agro_config.json
- If you’ve previously written server/out/editor/settings.json, those values may override defaults; delete the file to reset to registry‑only behavior
4.2 Editor server not reachable
If the embedded editor runs as a separate process (e.g. a code‑server container), the UI needs to know where to find it:
- EDITOR_PORT: port the editor listens on
- EDITOR_BIND: local vs public (affects how URLs are constructed)
Check the DevTools network tab for failing requests to /editor/... and cross‑check with read_settings().
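Before digging into URL construction, it helps to confirm that something is listening on the editor port at all. The helper below is a generic TCP check using only the standard library; port_open is a hypothetical name and nothing in it is specific to AGRO.

```python
import socket


def port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, int(port)), timeout=timeout):
            return True
    except OSError:  # covers refused connections, timeouts, and unreachable hosts
        return False
```

Call it with the host you expect the editor on and your EDITOR_PORT value, from the same machine or container as the backend. If it returns False there but True on the host, the problem is most likely port mapping or the EDITOR_BIND setting rather than the editor itself.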
5. Traces & Evaluation
AGRO writes traces for RAG runs and evaluation under out/<repo>/traces. The service layer for listing and fetching traces lives in server/services/traces.py.
5.1 “No traces found” in the UI
If the Evaluation / Tracing tabs show no traces:
- Check the repo name
list_traces() uses the repo query param or falls back to REPO from the environment. If your UI is pointing at repo=agro but you indexed my-repo, you’ll see an empty list.
- Inspect the filesystem
Look under out/<repo>/traces (or whatever out_dir(repo) resolves to). If there are no .json files, tracing may be disabled or the RAG pipeline never wrote any traces.
- Check for exceptions in list_traces
Any filesystem errors are logged via logger.exception. If you’re running inside a container, make sure the out/ directory is writable and mounted correctly.
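The three checks above can be combined into one small diagnostic. This is a sketch, not the logic of list_traces(): diagnose_traces is a hypothetical helper, and it assumes the out/<repo>/traces layout described on this page rather than calling the real out_dir(repo).

```python
import os
from pathlib import Path


def diagnose_traces(out_root, repo_param=None, environ=None):
    """Return a one-line description of what a trace listing would find."""
    environ = os.environ if environ is None else environ
    # Mirror the documented fallback: explicit repo param first, then REPO.
    repo = repo_param or environ.get("REPO")
    if not repo:
        return "no repo: pass a repo name or set REPO"
    traces = Path(out_root) / repo / "traces"
    if not traces.is_dir():
        return f"{traces} does not exist"
    if not os.access(traces, os.W_OK):
        return f"{traces} is not writable"
    count = sum(1 for _ in traces.glob("*.json"))
    return f"{traces}: {count} trace file(s)"
```

Running it once with the repo name the UI is using and once with the name you indexed quickly shows whether the empty list is a naming mismatch or a missing directory.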
5.2 “latest trace” endpoint fails
latest_trace(repo) wraps server.tracing.latest_trace_path and returns a small JSON payload. If latest_trace_path raises, the service logs and returns an empty result.
If the UI shows an error when loading the latest trace:
- Check logs for latest_trace_path failed
- Verify that at least one trace file exists under out/<repo>/traces
- Confirm that trace filenames follow the expected pattern (the helper usually looks for the newest *.json)
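To check by hand which file a "newest *.json" lookup would pick, a sketch like this is enough. It is not the implementation of server.tracing.latest_trace_path: newest_trace is a hypothetical name, and ordering by modification time is an assumption (the real helper may sort by filename instead).

```python
from pathlib import Path


def newest_trace(traces_dir):
    """Return the most recently modified *.json file, or None if there is none."""
    candidates = list(Path(traces_dir).glob("*.json"))
    if not candidates:
        return None
    return max(candidates, key=lambda path: path.stat().st_mtime)
```

If this returns None for out/<repo>/traces, the endpoint has nothing to load, and the fix is upstream in tracing or in the repo name.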
6. File Writes & Docker Volume Quirks
When AGRO writes config or settings files from the API / UI, it uses an atomic write helper with a Docker‑specific fallback in server/services/config_store.py.
6.1 “Device or resource busy” when saving config
On Docker Desktop for macOS, os.replace() on a bind‑mounted file can intermittently fail with EBUSY if something is watching the file.
AGRO already retries and falls back to a non‑atomic write, but if you still see errors:
- Ensure the mount point is not being aggressively watched by external tools
- Consider moving data/ and out/ to a Docker volume instead of a host bind mount
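For context, the retry-then-fallback pattern described above looks roughly like this. It is a sketch of the general technique and not the helper in server/services/config_store.py: the function name, retry count, and delay are assumptions.

```python
import errno
import os
import tempfile
import time


def write_with_fallback(path, text, retries=3, delay=0.1):
    """Write atomically via os.replace; after repeated EBUSY, overwrite in place."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file must live in the same directory so os.replace stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            handle.write(text)
        for attempt in range(retries):
            try:
                os.replace(tmp_path, path)
                return "atomic"
            except OSError as exc:
                if exc.errno != errno.EBUSY:
                    raise
                time.sleep(delay * (attempt + 1))
        # Non-atomic fallback: readers may briefly see a partially written file.
        with open(path, "w", encoding="utf-8") as handle:
            handle.write(text)
        return "fallback"
    finally:
        # After a successful replace the temp file is gone; clean up otherwise.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
```

The fallback trades atomicity for availability, which is why a file watcher that briefly reads the config mid-write can still see inconsistent content on bind mounts.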
6.2 Secrets not persisting or showing up blank
Secrets (API keys, tokens) are treated specially:
- The set of secret field names is in SECRET_FIELDS
- When reading config for the UI, these values are redacted
- When writing, the API will avoid echoing them back
If you save a secret in the UI and then re‑open the page, seeing an empty field is expected. To verify persistence:
- Inspect the underlying config file on disk (e.g. agro_config.json or the profile JSON under web/public/profiles)
- Or call the config API directly and check the raw JSON (outside the UI’s redaction logic)
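The reason a saved secret comes back blank is easier to see with a sketch of read-side redaction. This is illustrative only: redact is a hypothetical function, the field names are made up, and the real set lives in SECRET_FIELDS.

```python
# Made-up field names for illustration; the real set is SECRET_FIELDS.
EXAMPLE_SECRET_FIELDS = {"EXAMPLE_API_KEY", "EXAMPLE_TOKEN"}


def redact(config, secret_fields, placeholder=""):
    """Return a copy of `config` with non-empty secret values replaced by `placeholder`."""
    return {
        key: (placeholder if key in secret_fields and value else value)
        for key, value in config.items()
    }
```

The stored config is left untouched and only the copy sent to the UI is blanked, so an empty field on reload says nothing about whether the value was persisted.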
7. When All Else Fails: Let AGRO Explain Itself
Two meta‑features are worth remembering when debugging:
- Config registry is indexed into AGRO’s own RAG
You can go to the Chat tab and ask things like:
“How does KEYWORDS_AUTO_GENERATE work?”
or
“Where is FINAL_K used in the retrieval pipeline?”
The system will pull from server/services, retrieval/, and the Pydantic models to answer.
- Every knob has documentation attached
The web UI surfaces tooltips for each parameter, often with links to the relevant arXiv papers or provider docs. If a setting behaves differently than you expect, hover it first; if that’s not enough, search for the key name in the repo.
8. Quick Triage Checklist
Use this as a fast path before deeper debugging:
- Config not applying
  - Confirm .env location and restart backend
  - Validate agro_config.json with AgroConfigRoot
  - Check registry values via Admin → Settings
- Indexing issues
  - Verify REPO / REPO_ROOT
  - Run cli/agro.py index manually
  - Inspect indexer logs / data/ contents
- Bad search results
  - Try BM25‑only
  - Check FINAL_K and keyword boosts
  - Look for LangGraph warnings
- Missing traces / evals
  - Confirm out/<repo>/traces exists
  - Check repo param vs REPO env
  - Look for exceptions in server/services/traces.py