Config reference: enrichment

  • Enterprise tuning surface: defaults and constraints are rendered directly from Pydantic.
  • Env keys when available: many fields have an env-style alias (from TriBridConfig.to_flat_dict()).
  • Tooltip-level guidance: if a matching glossary entry exists, you’ll see deeper tuning notes.


Total parameters: 6

Group index
  • (root)

(root)

| JSON key | Env key(s) | Type | Default | Constraints | Summary |
|---|---|---|---|---|---|
| enrichment.chunk_summaries_enrich_default | CHUNK_SUMMARIES_ENRICH_DEFAULT | int | 1 | ≥ 0, ≤ 1 | Enable chunk_summary enrichment by default |
| enrichment.chunk_summaries_max | CHUNK_SUMMARIES_MAX | int | 100 | ≥ 10, ≤ 1000 | Max chunk_summaries to generate |
| enrichment.enrich_code_chunks | ENRICH_CODE_CHUNKS | int | 1 | ≥ 0, ≤ 1 | Enable chunk enrichment |
| enrichment.enrich_max_chars | ENRICH_MAX_CHARS | int | 1000 | ≥ 100, ≤ 5000 | Max chars for enrichment prompt |
| enrichment.enrich_min_chars | ENRICH_MIN_CHARS | int | 50 | ≥ 10, ≤ 500 | Min chars for enrichment |
| enrichment.enrich_timeout | ENRICH_TIMEOUT | int | 30 | ≥ 5, ≤ 120 | Enrichment timeout (seconds) |
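The table above describes a small env-driven tuning surface: each field has a default and a hard range. A minimal sketch of how those defaults and constraints could be applied when reading from the environment (the real TriBridConfig is Pydantic-based; this standalone loader is only an illustration of the documented semantics, not the project's actual code):

```python
import os

# Defaults and allowed ranges copied from the table above.
ENRICHMENT_FIELDS = {
    "CHUNK_SUMMARIES_ENRICH_DEFAULT": (1, 0, 1),
    "CHUNK_SUMMARIES_MAX": (100, 10, 1000),
    "ENRICH_CODE_CHUNKS": (1, 0, 1),
    "ENRICH_MAX_CHARS": (1000, 100, 5000),
    "ENRICH_MIN_CHARS": (50, 10, 500),
    "ENRICH_TIMEOUT": (30, 5, 120),
}

def load_enrichment_config(env=None):
    """Read each field from the environment, fall back to its default,
    and reject values outside the documented constraints."""
    env = os.environ if env is None else env
    config = {}
    for key, (default, lo, hi) in ENRICHMENT_FIELDS.items():
        raw = env.get(key)
        value = default if raw is None else int(raw)
        if not lo <= value <= hi:
            raise ValueError(f"{key}={value} is outside [{lo}, {hi}]")
        config[key] = value
    return config
```

For example, `load_enrichment_config({"ENRICH_TIMEOUT": "60"})` yields the documented defaults with the timeout raised to 60, while a value like `ENRICH_TIMEOUT=300` would be rejected by the range check.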

Details (glossary)

enrichment.chunk_summaries_enrich_default (CHUNK_SUMMARIES_ENRICH_DEFAULT) — Chunk Summaries Enrich Default

Category: general

Controls whether chunk summaries are generated with richer, model-assisted metadata by default. Enriched summaries can add intent, entities, API surface hints, and semantic cues that improve retrieval and reranking beyond raw embeddings alone. The trade-off is higher indexing cost and longer build times, especially on large repositories. Enable enrichment when search quality and explainability matter more than ingestion speed, and disable it for rapid iteration pipelines where you need frequent low-cost reindexing.

Badges: Metadata quality

Links: Code-Craft Summarization (2025), OpenAI Summarization Cookbook, LlamaIndex Vector Store Index, LangChain Retrieval Concepts
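The "by default" wording implies default-plus-override semantics: the flag sets the baseline, and individual requests may still opt in or out. A tiny sketch of that pattern (the helper name and override parameter are hypothetical, not part of the documented API):

```python
def should_enrich(chunk_summaries_enrich_default: int, override=None) -> bool:
    # Hypothetical helper: an explicit per-request override wins;
    # otherwise fall back to CHUNK_SUMMARIES_ENRICH_DEFAULT (0 or 1).
    if override is not None:
        return bool(override)
    return bool(chunk_summaries_enrich_default)
```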

enrichment.chunk_summaries_max (CHUNK_SUMMARIES_MAX) — Max Chunk Summaries

Category: general

Caps how many chunk summaries are produced for a corpus. This is a budget control over indexing cost, storage footprint, and retrieval metadata coverage. A low cap is fast but can miss important modules, while a high cap improves coverage and long-tail recall at the cost of longer ingestion and larger indexes. Choose this value based on corpus size and criticality, then validate with retrieval benchmarks so the limit reflects actual answer quality rather than arbitrary round numbers.

Badges: Coverage budget

Links: MIRAGE Benchmark (2025), T2-RAGBench (2025), OpenAI Summarization Cookbook, LlamaIndex Vector Store Index
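The cap is best read as a selection budget: when a corpus has more chunks than CHUNK_SUMMARIES_MAX, the indexer must choose which ones get summaries. A hedged sketch, where ranking by chunk length stands in for whatever importance signal the real indexer uses:

```python
def select_chunks_for_summaries(chunks, chunk_summaries_max=100):
    """Apply the coverage budget: rank chunks by length (a stand-in
    for an importance score) and keep only the top chunk_summaries_max.
    The ranking heuristic here is an assumption for illustration."""
    ranked = sorted(chunks, key=len, reverse=True)
    return ranked[:chunk_summaries_max]
```

With a low cap, small but important modules can fall below the cut line, which is why the glossary advises validating the chosen value against retrieval benchmarks rather than picking a round number.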

enrichment.enrich_code_chunks (ENRICH_CODE_CHUNKS) — Enrich Code Chunks

Category: chunking

When enabled, each code chunk is augmented with model-generated summaries or semantic descriptors during indexing. This often improves conceptual retrieval because rerankers can match intent signals beyond literal token overlap. The trade-off is extra indexing time, compute cost, and the risk of noisy metadata if prompts or models are weak. Chunk size and model selection both matter: oversized chunks produce vague summaries, while tiny chunks lose architectural context. Evaluate this feature with task-based retrieval metrics to confirm the added metadata improves real query outcomes.

Badges: Slower indexing

Links: EyeLayer: Human Attention for Code Summarization (arXiv 2026), Meta-RAG on Large Codebases Using Code Summarization (arXiv 2025), LlamaIndex Repository, LangChain Repository
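Combined with the character bounds from the table (ENRICH_MIN_CHARS, ENRICH_MAX_CHARS), the gating for a single chunk might look like the following sketch. The function name and exact skip/truncate policy are assumptions; only the three settings and their documented meanings come from this reference:

```python
def prepare_enrichment_prompt(chunk: str, enrich_code_chunks: int = 1,
                              enrich_min_chars: int = 50,
                              enrich_max_chars: int = 1000):
    """Return the text to send for enrichment, or None to skip the chunk.
    Sketch only: skip tiny chunks (too little signal for a useful
    summary) and truncate oversized ones to keep the prompt in budget."""
    if not enrich_code_chunks:
        return None  # feature disabled via ENRICH_CODE_CHUNKS=0
    if len(chunk) < enrich_min_chars:
        return None  # below ENRICH_MIN_CHARS: not worth a model call
    return chunk[:enrich_max_chars]  # cap prompt at ENRICH_MAX_CHARS
```

Under this policy, raising ENRICH_MIN_CHARS trades recall on small helpers for fewer model calls, while ENRICH_MAX_CHARS bounds per-chunk prompt cost, which is one concrete way the "slower indexing" badge can be managed.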