
Config reference: scoring

  • Enterprise tuning surface: Defaults + constraints are rendered directly from Pydantic.
  • Env keys when available: Many fields have an env-style alias (from TriBridConfig.to_flat_dict()).
  • Tooltip-level guidance: If a matching glossary entry exists, you’ll see deeper tuning notes.


Total parameters: 5

Group index
  • (root)

(root)

| JSON key | Env key(s) | Type | Default | Constraints | Summary |
| --- | --- | --- | --- | --- | --- |
| scoring.chunk_summary_bonus | CHUNK_SUMMARY_BONUS | float | 0.08 | ≥ 0.0, ≤ 1.0 | Bonus score for chunks matched via chunk_summary-based retrieval |
| scoring.filename_boost_exact | FILENAME_BOOST_EXACT | float | 1.5 | ≥ 1.0, ≤ 5.0 | Score multiplier when filename exactly matches query terms |
| scoring.filename_boost_partial | FILENAME_BOOST_PARTIAL | float | 1.2 | ≥ 1.0, ≤ 3.0 | Score multiplier when path components match query terms |
| scoring.path_boosts | PATH_BOOSTS | str | "/gui,/server,/indexer,/retrieval" | (none) | Comma-separated path prefixes to boost |
| scoring.vendor_mode | VENDOR_MODE | str | "prefer_first_party" | pattern: `^(prefer_first_party\|prefer_vendor\|neutral)$` | Vendor code preference |
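The defaults, constraints, and env-style aliases above can be mirrored in a small model. The sketch below is illustrative only: it is not the actual TriBridConfig implementation, and the validation it performs simply restates the table's constraint column.

```python
from dataclasses import dataclass


@dataclass
class ScoringConfig:
    """Hypothetical mirror of the scoring table above; not the real TriBridConfig."""

    chunk_summary_bonus: float = 0.08    # must satisfy 0.0 <= x <= 1.0
    filename_boost_exact: float = 1.5    # must satisfy 1.0 <= x <= 5.0
    filename_boost_partial: float = 1.2  # must satisfy 1.0 <= x <= 3.0
    path_boosts: str = "/gui,/server,/indexer,/retrieval"
    vendor_mode: str = "prefer_first_party"  # or "prefer_vendor" / "neutral"

    def __post_init__(self) -> None:
        # Re-state the constraints column from the reference table.
        if not 0.0 <= self.chunk_summary_bonus <= 1.0:
            raise ValueError("chunk_summary_bonus must be in [0.0, 1.0]")
        if not 1.0 <= self.filename_boost_exact <= 5.0:
            raise ValueError("filename_boost_exact must be in [1.0, 5.0]")
        if not 1.0 <= self.filename_boost_partial <= 3.0:
            raise ValueError("filename_boost_partial must be in [1.0, 3.0]")
        if self.vendor_mode not in {"prefer_first_party", "prefer_vendor", "neutral"}:
            raise ValueError("vendor_mode must match the allowed pattern")

    def to_flat_dict(self) -> dict:
        # Env-style aliases as listed in the table (hypothetical reimplementation).
        return {
            "CHUNK_SUMMARY_BONUS": self.chunk_summary_bonus,
            "FILENAME_BOOST_EXACT": self.filename_boost_exact,
            "FILENAME_BOOST_PARTIAL": self.filename_boost_partial,
            "PATH_BOOSTS": self.path_boosts,
            "VENDOR_MODE": self.vendor_mode,
        }
```

Constructing the model with defaults reproduces the table; out-of-range values raise at construction time, matching how Pydantic-backed configs typically fail fast.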

Details (glossary)

scoring.chunk_summary_bonus (CHUNK_SUMMARY_BONUS) — Chunk Summary Bonus

Category: retrieval

Additive weight applied after score fusion when a hit came from chunk-summary retrieval instead of raw chunk text. In practice this controls whether conceptual matches such as intent, behavior, or API purpose can compete with exact-token matches from code. Raise it when summaries are high quality but consistently rank below noisy lexical matches; lower it when vague summaries outrank precise chunks and hurt answer grounding. Tune this together with your fusion method and evaluation set, because the same numeric bonus has very different effects depending on score normalization and corpus size.

Badges: - Advanced tuning

Links: - cAST: Structural chunking for code RAG (arXiv 2025) - LangChain MultiVector Retriever - Elasticsearch Reciprocal Rank Fusion - Weaviate hybrid retrieval
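As a sketch of the additive mechanics described above: the bonus is added after fusion, only to hits that came from summary retrieval. The hit shape `(doc_id, fused_score, source)` and the `"chunk_summary"` tag are assumptions for illustration; the real pipeline's hit objects may differ.

```python
def apply_summary_bonus(hits, bonus=0.08):
    """Add a flat bonus to hits retrieved via chunk-summary matching.

    `hits` is a list of (doc_id, fused_score, source) tuples (hypothetical
    shape). The bonus is additive and applied after score fusion.
    """
    return sorted(
        (
            (doc, score + (bonus if source == "chunk_summary" else 0.0), source)
            for doc, score, source in hits
        ),
        key=lambda h: h[1],
        reverse=True,
    )
```

With fused scores roughly in [0, 1], the default 0.08 is enough to flip near-ties toward conceptual matches; as the entry notes, the same number behaves differently under other normalizations.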

scoring.filename_boost_exact (FILENAME_BOOST_EXACT) — Filename Exact Match Multiplier

Category: general

Applies a multiplier when query tokens exactly match a filename or full path component, which is especially effective for identifier-driven code search. Exact filename intent often indicates the user already knows the artifact, so this feature can sharply improve rank quality for navigational queries. Set the multiplier high enough to surface true exact hits, but not so high that semantic relevance is overridden for exploratory questions. Validate with a mixed benchmark containing both known-file and concept-search tasks.

Badges: - Lexical precision boost

Links: - Exp4Fuse Rank Fusion (arXiv) - Elasticsearch Term Query - Elasticsearch Multi Match Query - Lucene BM25Similarity
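A minimal sketch of the exact-match multiplier, assuming the check is "query token equals the filename or a full path component". The helper name and signature are hypothetical, not part of the real scorer.

```python
from pathlib import PurePosixPath


def exact_filename_boost(score, query_tokens, path, multiplier=1.5):
    """Multiply `score` when a query token exactly equals the filename
    or any full path component (hypothetical helper)."""
    parts = set(PurePosixPath(path).parts)
    name = PurePosixPath(path).name
    if any(tok == name or tok in parts for tok in query_tokens):
        return score * multiplier
    return score
```

Because the boost is multiplicative, it preserves ordering among non-matching hits while letting a known-file query like `auth.py` jump past semantically similar but wrong files.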

scoring.filename_boost_partial (FILENAME_BOOST_PARTIAL) — Path Component Partial Match Multiplier

Category: general

Applies a weaker multiplier for partial path or filename matches, helping fragment queries like auth or billing surface relevant areas of the codebase. Because substring matches are noisier than exact matches, this value should stay below filename_boost_exact and be tested against false-positive-heavy queries. Token-boundary handling and a minimum match length are important to avoid boosting accidental overlaps. This parameter is most effective when combined with semantic and sparse retrieval rather than used alone.

Badges: - Lexical recall boost

Links: - Exp4Fuse Rank Fusion (arXiv) - Elasticsearch Bool Query - Elasticsearch Dis Max Query - PostgreSQL Text Search Controls
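The partial-match behavior, including the minimum-length guard mentioned above, can be sketched as follows. The helper and its `min_len` parameter are illustrative assumptions, not the real implementation.

```python
def partial_path_boost(score, query_tokens, path, multiplier=1.2, min_len=3):
    """Weaker multiplier for substring matches inside path components.

    Enforces a minimum token length so short fragments do not boost
    accidental overlaps (hypothetical helper, not the real scorer).
    """
    components = path.lower().strip("/").split("/")
    for tok in (t.lower() for t in query_tokens if len(t) >= min_len):
        if any(tok in comp for comp in components):
            return score * multiplier
    return score
```

Note the default 1.2 stays below the exact-match default of 1.5, matching the guidance that partial matches are noisier.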

scoring.path_boosts (PATH_BOOSTS) — Path Boosts

Category: retrieval

Adds deterministic ranking bonuses for files whose paths match configured prefixes (for example /api, /retrieval, or /infra). This is not a filter; candidates outside boosted paths can still win, but matching paths start with an intentional prior that reflects project structure and ownership patterns. In practice, path boosts are most useful when repositories contain large amounts of generated code, vendor trees, or historical directories that are semantically similar but operationally lower value. Tune this with offline evaluation and query logs: too much boost can hide genuinely relevant files, while too little leaves high-signal code regions under-ranked.

Links: - RANGER: Repository-Level Retrieval-Augmented Generation for Code Completion (arXiv 2025) - Elasticsearch Boosting Query - Elasticsearch Function Score Query - Vespa Ranking Framework
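The prior-not-filter behavior can be sketched as below: matching paths gain an additive bonus, and everything else keeps its score unchanged. The `bonus` magnitude of 0.1 is illustrative; the real weight applied per prefix is not specified here.

```python
def path_boost(score, path, boosts="/gui,/server,/indexer,/retrieval", bonus=0.1):
    """Additive ranking prior for files under configured path prefixes.

    Not a filter: paths outside the boosted prefixes keep their original
    score and can still outrank boosted ones. (Hypothetical helper;
    `bonus` is an assumed illustrative weight.)
    """
    prefixes = [p.strip() for p in boosts.split(",") if p.strip()]
    if any(path.startswith(prefix) for prefix in prefixes):
        return score + bonus
    return score
```

A candidate under `/vendor` with a clearly higher fused score still wins, which is exactly the "intentional prior, not a hard cutoff" behavior described above.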

scoring.vendor_mode (VENDOR_MODE) — Vendor Mode

Category: general

Controls whether ranking heuristics prioritize first-party project code or third-party/vendor dependencies when scores are close. In large repos, vendor and framework code can dominate candidate lists simply because it is abundant; this setting counterbalances that effect for tasks where users primarily want answers about their own application logic. Prefer first-party mode for product debugging, architecture discovery, and onboarding into your codebase. Prefer vendor mode only when your query intent is explicitly about dependency internals. Evaluate with intent-labeled queries to confirm the mode aligns with expected navigation behavior.

Badges: - Code priority

Links: - SaraCoder: Repository-Aware Code Retrieval at Scale (arXiv 2025) - Sourcegraph Code Search Documentation - GitHub Code Search Overview - gitignore Patterns (vendor/exclusion hygiene)
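One way the "when scores are close" behavior could work is a mode-dependent penalty inside the sort key. The mode names come from the config's allowed pattern, but the vendor-path detection and the 0.05 penalty are illustrative assumptions, not the documented mechanism.

```python
def vendor_rank_key(hit, mode="prefer_first_party"):
    """Sort key that breaks near-ties by vendor status.

    `hit` is a (path, score) pair. Vendor detection here is a naive
    prefix check and the 0.05 penalty is an illustrative tie window;
    the real heuristic may differ.
    """
    path, score = hit
    is_vendor = path.startswith(("vendor/", "node_modules/", "third_party/"))
    if mode == "neutral":
        penalty = 0.0
    elif mode == "prefer_first_party":
        penalty = 0.05 if is_vendor else 0.0
    else:  # "prefer_vendor"
        penalty = 0.0 if is_vendor else 0.05
    return score - penalty


# A vendor hit that only narrowly outscores first-party code loses the tie.
hits = [("vendor/flask/app.py", 0.52), ("src/app.py", 0.50)]
ranked = sorted(hits, key=vendor_rank_key, reverse=True)
```

A penalty-based key keeps large score gaps decisive (a vendor hit at 0.9 still beats first-party code at 0.5) while steering only the close calls, which matches the entry's framing.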