Config reference: scoring
- Enterprise tuning surface: Defaults and constraints are rendered directly from Pydantic.
- Env keys when available: Many fields have an env-style alias (from `TriBridConfig.to_flat_dict()`).
- Tooltip-level guidance: If a matching glossary entry exists, you’ll see deeper tuning notes.
Total parameters: 5
(root)
| JSON key | Env key(s) | Type | Default | Constraints | Summary |
|---|---|---|---|---|---|
| `scoring.chunk_summary_bonus` | `CHUNK_SUMMARY_BONUS` | float | 0.08 | ≥ 0.0, ≤ 1.0 | Bonus score for chunks matched via chunk_summary-based retrieval |
| `scoring.filename_boost_exact` | `FILENAME_BOOST_EXACT` | float | 1.5 | ≥ 1.0, ≤ 5.0 | Score multiplier when filename exactly matches query terms |
| `scoring.filename_boost_partial` | `FILENAME_BOOST_PARTIAL` | float | 1.2 | ≥ 1.0, ≤ 3.0 | Score multiplier when path components match query terms |
| `scoring.path_boosts` | `PATH_BOOSTS` | str | "/gui,/server,/indexer,/retrieval" | — | Comma-separated path prefixes to boost |
| `scoring.vendor_mode` | `VENDOR_MODE` | str | "prefer_first_party" | pattern=`^(prefer_first_party\|prefer_vendor\|neutral)$` | Vendor code preference |
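The defaults and constraints in the table can be mirrored in a few lines of plain Python, which is handy for checking an override before it reaches the service. This is a minimal sketch, not the project's Pydantic model: the `ScoringConfig` class, the `from_env` helper, and the way env strings are parsed are all assumptions made for illustration.

```python
import re
from dataclasses import dataclass, fields
from typing import Mapping

# Pattern copied from the scoring.vendor_mode constraint in the table.
VENDOR_MODE_PATTERN = re.compile(r"^(prefer_first_party|prefer_vendor|neutral)$")


def _check_range(name: str, value: float, low: float, high: float) -> None:
    """Raise ValueError when value falls outside the inclusive range."""
    if not low <= value <= high:
        raise ValueError(f"{name}={value} is outside [{low}, {high}]")


@dataclass(frozen=True)
class ScoringConfig:
    """Hypothetical stand-in for the scoring section (stdlib only)."""

    chunk_summary_bonus: float = 0.08
    filename_boost_exact: float = 1.5
    filename_boost_partial: float = 1.2
    path_boosts: str = "/gui,/server,/indexer,/retrieval"
    vendor_mode: str = "prefer_first_party"

    def __post_init__(self) -> None:
        _check_range("chunk_summary_bonus", self.chunk_summary_bonus, 0.0, 1.0)
        _check_range("filename_boost_exact", self.filename_boost_exact, 1.0, 5.0)
        _check_range("filename_boost_partial", self.filename_boost_partial, 1.0, 3.0)
        if not VENDOR_MODE_PATTERN.match(self.vendor_mode):
            raise ValueError(f"vendor_mode={self.vendor_mode!r} is not an allowed value")

    @classmethod
    def from_env(cls, env: Mapping[str, str]) -> "ScoringConfig":
        """Build a config from env-style keys, falling back to the defaults.

        Assumption: each env key is the upper-cased field name, as listed
        in the "Env key(s)" column, and float fields arrive as strings.
        """
        kwargs = {}
        for field in fields(cls):
            raw = env.get(field.name.upper())
            if raw is not None:
                is_float = isinstance(field.default, float)
                kwargs[field.name] = float(raw) if is_float else raw
        return cls(**kwargs)
```

Calling `ScoringConfig.from_env(os.environ)` would pick up any of the five env keys that are set and reject out-of-range values with a `ValueError`.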
Details (glossary)
scoring.chunk_summary_bonus (CHUNK_SUMMARY_BONUS) — Chunk Summary Bonus
Category: retrieval
Additive weight applied after score fusion when a hit came from chunk-summary retrieval instead of raw chunk text. In practice this controls whether conceptual matches such as intent, behavior, or API purpose can compete with exact-token matches from code. Raise it when summaries are high quality but consistently rank below noisy lexical matches; lower it when vague summaries outrank precise chunks and hurt answer grounding. Tune this together with your fusion method and evaluation set, because the same numeric bonus has very different effects depending on score normalization and corpus size.
Badges: Advanced tuning
Links:
- cAST: Structural chunking for code RAG (arXiv 2025)
- LangChain MultiVector Retriever
- Elasticsearch Reciprocal Rank Fusion
- Weaviate hybrid retrieval
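To make the "additive weight applied after score fusion" concrete, here is a minimal sketch. It assumes Reciprocal Rank Fusion as the fusion method and treats any chunk returned by the summary retriever as a summary hit; the function names and the `k=60` constant are illustrative choices, not the project's implementation.

```python
from collections import defaultdict


def rrf_fuse(rankings: dict[str, list[str]], k: int = 60) -> dict[str, float]:
    """Reciprocal Rank Fusion: sum 1 / (k + rank) over every ranked list."""
    scores: dict[str, float] = defaultdict(float)
    for ranked_ids in rankings.values():
        for rank, chunk_id in enumerate(ranked_ids, start=1):
            scores[chunk_id] += 1.0 / (k + rank)
    return dict(scores)


def apply_chunk_summary_bonus(
    fused: dict[str, float],
    summary_hits: set[str],
    bonus: float = 0.08,
) -> dict[str, float]:
    """Add the bonus to chunks that were found via chunk-summary retrieval."""
    return {
        chunk_id: score + (bonus if chunk_id in summary_hits else 0.0)
        for chunk_id, score in fused.items()
    }


def ranked(scores: dict[str, float]) -> list[str]:
    """Chunk ids ordered from highest to lowest score."""
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical retriever output: "c" is only found through its summary.
rankings = {"dense": ["a", "b"], "sparse": ["a", "b"], "summary": ["c"]}
fused = rrf_fuse(rankings)
boosted = apply_chunk_summary_bonus(fused, summary_hits=set(rankings["summary"]))
```

The sketch also shows why the bonus has to be tuned against the fusion method. With `k=60`, a top-ranked hit contributes only 1/61 ≈ 0.016 per retriever, so a bonus of 0.08 outweighs agreement between two retrievers and moves `c` to the top. On scores normalized to the 0 to 1 range, the same 0.08 would be a modest nudge.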
scoring.filename_boost_exact (FILENAME_BOOST_EXACT) — Filename Exact Match Multiplier
Category: general
Applies a multiplier when query tokens exactly match a filename or full path component, which is especially effective for identifier-driven code search. Exact filename intent often indicates the user already knows the artifact, so this feature can sharply improve rank quality for navigational queries. Set the multiplier high enough to surface true exact hits, but not so high that semantic relevance is overridden for exploratory questions. Validate with a mixed benchmark containing both known-file and concept-search tasks.
Badges: Lexical precision boost
Links:
- Exp4Fuse Rank Fusion (arXiv)
- Elasticsearch Term Query
- Elasticsearch Multi Match Query
- Lucene BM25Similarity
scoring.filename_boost_partial (FILENAME_BOOST_PARTIAL) — Path Component Partial Match Multiplier
Category: general
Applies a weaker multiplier for partial path or filename matches, helping fragment queries like `auth` or `billing` surface relevant areas of the codebase. Because substring matches are noisier than exact matches, this value should stay below `scoring.filename_boost_exact` and be tested against false-positive-heavy queries. Token boundary handling and minimum match length are important to avoid boosting accidental overlaps. This parameter is most effective when combined with semantic and sparse retrieval rather than used alone.
Badges: Lexical recall boost
Links:
- Exp4Fuse Rank Fusion (arXiv)
- Elasticsearch Bool Query
- Elasticsearch Dis Max Query
- PostgreSQL Text Search Controls
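The exact and partial filename multipliers can be sketched together as one function. The matching rules here are assumptions chosen to illustrate token-boundary handling and a minimum match length: a query word that equals the file name or its stem counts as exact, and a query token of at least `min_len` characters that equals a whole path token counts as partial. The function name and `min_len` parameter are hypothetical.

```python
import re
from pathlib import PurePosixPath


def _tokens(text: str) -> set[str]:
    """Lower-case and split on anything that is not a letter or digit."""
    return {t for t in re.split(r"[^a-z0-9]+", text.lower()) if t}


def filename_multiplier(
    query: str,
    path: str,
    exact: float = 1.5,    # default of scoring.filename_boost_exact
    partial: float = 1.2,  # default of scoring.filename_boost_partial
    min_len: int = 3,      # hypothetical minimum token length for partial matches
) -> float:
    """Return the score multiplier for one candidate path."""
    file = PurePosixPath(path.lower())
    words = query.lower().split()

    # Exact: a whole query word equals the file name or the name without suffix.
    if any(word in (file.name, file.stem) for word in words):
        return exact

    # Partial: compare whole tokens, not substrings, so "test" does not
    # match "latest.py"; very short tokens are ignored as too noisy.
    query_tokens = {t for word in words for t in _tokens(word) if len(t) >= min_len}
    if query_tokens & _tokens(path):
        return partial

    return 1.0
```

For example, `filename_multiplier("where is session_store defined", "server/auth/session_store.py")` returns the exact multiplier, while the query `auth` against the same path returns only the partial one.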
scoring.path_boosts (PATH_BOOSTS) — Path Boosts
Category: retrieval
Adds deterministic ranking bonuses for files whose paths match configured prefixes (for example /api, /retrieval, or /infra). This is not a filter; candidates outside boosted paths can still win, but matching paths start with an intentional prior that reflects project structure and ownership patterns. In practice, path boosts are most useful when repositories contain large amounts of generated code, vendor trees, or historical directories that are semantically similar but operationally lower value. Tune this with offline evaluation and query logs: too much boost can hide genuinely relevant files, while too little leaves high-signal code regions under-ranked.
Links:
- RANGER: Repository-Level Retrieval-Augmented Generation for Code Completion (arXiv 2025)
- Elasticsearch Boosting Query
- Elasticsearch Function Score Query
- Vespa Ranking Framework
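The comma-separated format and the "boost, not filter" behavior can be sketched as follows. The helper names, the additive `boost` amount of 0.1, and the convention of matching repo-relative paths against prefixes with a leading slash are assumptions for illustration; the reference does not state how large the bonus is or how paths are normalized.

```python
def parse_path_boosts(raw: str) -> list[str]:
    """Split a PATH_BOOSTS value into clean prefixes, dropping empty entries."""
    prefixes = []
    for part in raw.split(","):
        prefix = part.strip().rstrip("/")
        if prefix:
            prefixes.append(prefix)
    return prefixes


def matches_boosted_path(path: str, prefixes: list[str]) -> bool:
    """True when the path sits under one of the prefixes.

    Matching stops at a directory boundary, so "/server" matches
    "server/app.py" but not "servers/app.py".
    """
    normalized = "/" + path.strip("/")
    return any(
        normalized == prefix or normalized.startswith(prefix + "/")
        for prefix in prefixes
    )


def apply_path_boost(
    score: float,
    path: str,
    prefixes: list[str],
    boost: float = 0.1,  # hypothetical bonus size, not taken from the config
) -> float:
    """Add the bonus to matching paths; other candidates keep their score."""
    return score + boost if matches_boosted_path(path, prefixes) else score


prefixes = parse_path_boosts("/gui,/server,/indexer,/retrieval")
```

A candidate outside the boosted prefixes is returned with its score unchanged, so it can still outrank a boosted file whose base score is low enough.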
scoring.vendor_mode (VENDOR_MODE) — Vendor Mode
Category: general
Controls whether ranking heuristics prioritize first-party project code or third-party/vendor dependencies when scores are close. In large repos, vendor and framework code can dominate candidate lists simply because it is abundant; this setting counterbalances that effect for tasks where users primarily want answers about their own application logic. Use `prefer_first_party` (the default) for product debugging, architecture discovery, and onboarding into your codebase. Use `prefer_vendor` only when your query intent is explicitly about dependency internals. Evaluate with intent-labeled queries to confirm the mode aligns with expected navigation behavior.
Badges: Code priority
Links:
- SaraCoder: Repository-Aware Code Retrieval at Scale (arXiv 2025)
- Sourcegraph Code Search Documentation
- GitHub Code Search Overview
- gitignore Patterns (vendor/exclusion hygiene)
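One way to read "when scores are close" is as a small score margin given to the favored side before sorting, so the preference only changes the order of hits that are within that margin of each other. The sketch below takes that reading. The vendor directory names, the `margin` value, and the interpretation of `neutral` as "no adjustment" are assumptions, not documented behavior.

```python
# Hypothetical directory names treated as vendor code.
VENDOR_DIRS = {"vendor", "node_modules", "third_party"}
ALLOWED_MODES = ("prefer_first_party", "prefer_vendor", "neutral")


def is_vendor(path: str) -> bool:
    """True when any path component is a known vendor directory."""
    return any(part in VENDOR_DIRS for part in path.split("/"))


def rank_with_vendor_mode(
    hits: list[tuple[str, float]],
    mode: str = "prefer_first_party",
    margin: float = 0.05,  # hypothetical: how close scores must be to flip
) -> list[tuple[str, float]]:
    """Sort (path, score) hits, nudging the favored side by the margin."""
    if mode not in ALLOWED_MODES:
        raise ValueError(f"unknown vendor_mode: {mode!r}")

    def adjusted(hit: tuple[str, float]) -> float:
        path, score = hit
        if mode == "neutral":
            return score
        favored = is_vendor(path) == (mode == "prefer_vendor")
        return score + margin if favored else score

    return sorted(hits, key=adjusted, reverse=True)


hits = [
    ("vendor/lib/auth.py", 0.82),
    ("server/auth.py", 0.80),
    ("vendor/lib/crypto.py", 0.95),
]
first_party_order = [path for path, _ in rank_with_vendor_mode(hits)]
```

With these hits, `server/auth.py` moves ahead of the slightly higher-scoring `vendor/lib/auth.py`, while `vendor/lib/crypto.py` stays on top because its lead is larger than the margin.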