Hybrid Retrieval: Combining Vectors and Graphs

Hybrid Retrieval: Combining Vectors and Graphs
Hybrid retrieval: combining vectors and graphs.

Hybrid Retrieval: Combining Vectors and Graphs

Series: Graph RAG | Part: 7 of 10

Neither vector search nor graph traversal alone is sufficient for production AI agents.

Vector search casts a wide net—it finds semantically similar content even when terminology varies. But it misses structured relationships and fails at multi-hop reasoning.

Graph traversal follows precise relationships—it captures dependencies, hierarchies, and causal chains. But it requires exact matches to starting entities and can't handle semantic ambiguity.

The solution isn't choosing one or the other. It's combining them.

Hybrid retrieval uses vector similarity to find starting points, then graph traversal to explore relationships, then vector ranking to score results. It's the best of both approaches: the recall of vector search with the precision of graph structure.

This is how production Graph RAG systems actually work.


The Failure Modes of Pure Approaches

Strength: Recall through semantic similarity

  • Handles paraphrase and terminology variation
  • Works when you don't know exact entity names
  • Covers long tail of related but not explicitly mentioned concepts

Weakness: No structural reasoning

  • Can't follow dependency chains
  • Misses transitive relationships
  • Conflates semantic similarity with logical connection

Example failure:
Query: "What services would break if we remove the auth system?"
Vector search finds documents mentioning auth and services, but can't traverse the dependency graph to find indirect dependencies.

Pure Graph Traversal

Strength: Precise structural reasoning

  • Follows relationship chains correctly
  • Captures transitive dependencies
  • Respects relationship types and directionality

Weakness: Requires exact entity matches

  • Fails if user query doesn't match entity names
  • Can't handle semantic similarity
  • Brittle to terminology variation

Example failure:
Query: "What depends on the login service?"
Graph has an entity "Authentication Service"—not "login service." Pure graph traversal finds nothing because the starting entity doesn't match.


The Hybrid Architecture

Stage 1: Vector-Based Entity Grounding

Use vector similarity to map the user's query to graph entities:

Query: "What happens if we remove the login system?"

1. Embed the query
2. Embed all entity labels + descriptions
3. Find top-K entities by cosine similarity
   → "Authentication Service" (0.92)
   → "Session Management" (0.78)
   → "User Login API" (0.71)

This step grounds the query in the knowledge graph. Even if the user says "login system" and the graph has "Authentication Service," vector similarity bridges the gap.

Key insight: Vector similarity is excellent for entity linking. It handles synonyms, abbreviations, and paraphrase naturally.

Stage 2: Graph-Based Relationship Exploration

Once you have starting entities, traverse the graph to find connected information:

Starting entity: Authentication Service

Graph traversal:
1. Find all services with DEPENDS_ON edges pointing to Authentication Service
2. Find all configurations with USES edges
3. Find all documentation with DESCRIBES edges
4. Optionally: follow transitive dependencies (services depending on services that depend on auth)

This step provides structure. You're not searching for similar text—you're following explicit relationships encoded in the graph.

Key insight: Graph traversal gives you structured paths: entity → relationship → entity. This is what enables multi-hop reasoning.

Stage 3: Vector-Based Result Ranking

You have a subgraph of relevant entities and relationships. Now retrieve the actual text and rank by relevance:

Retrieved subgraph:
- Authentication Service
- 12 dependent services
- 5 configuration files
- 8 documentation pages

For each retrieved entity:
1. Fetch associated text chunks
2. Compute vector similarity to original query
3. Rank by combined score: graph distance + vector similarity

This step ensures the final results are semantically relevant to the user's intent, not just structurally connected.

Key insight: Vector similarity provides final ranking. Not all connections are equally relevant—surface the ones semantically closest to the query.


Scoring and Fusion Strategies

How do you combine graph distance and vector similarity into a single relevance score?

Weighted Linear Combination

score = α × vector_similarity + (1-α) × graph_relevance

where:
  vector_similarity = cosine(query_embedding, chunk_embedding)
  graph_relevance = 1 / (1 + path_distance)

Tune α based on your use case:

  • α = 0.7: Prioritize semantic similarity (good for exploratory queries)
  • α = 0.3: Prioritize graph structure (good for dependency analysis)

Reciprocal Rank Fusion

Rank results separately by vector similarity and graph distance, then combine ranks:

RRF_score = Σ (1 / (k + rank_i))

where:
  rank_i = position in ranked list from method i
  k = constant (usually 60)

This approach doesn't require tuning weights—it treats both signals as equally important and fuses their rankings.

Graph-Weighted Vector Scores

Use graph structure to boost semantically similar results that are also well-connected:

score = vector_similarity × (1 + log(degree(entity)))

where:
  degree(entity) = number of edges connected to the entity

High-degree entities (central nodes in the graph) get a boost. This surfaces well-connected concepts that are also semantically relevant.


Query Patterns and Hybrid Strategies

Different query types benefit from different hybrid strategies:

Exploratory Queries: "What is X?"

Strategy: Vector-heavy with graph expansion

  1. Vector search to find documents about X
  2. Light graph traversal to find directly related entities
  3. Vector rank results

The query is about content, not structure. Use graphs to expand slightly beyond exact matches, but prioritize semantic relevance.

Dependency Queries: "What depends on X?"

Strategy: Graph-heavy with vector grounding

  1. Vector search to identify X in the graph
  2. Graph traversal to follow DEPENDS_ON edges
  3. Return structured results with minimal vector ranking

The answer is structural. Vector search only grounds the starting entity—everything else is pure graph traversal.

Synthesis Queries: "How do X and Y relate?"

Strategy: Balanced hybrid

  1. Vector search to identify entities X and Y
  2. Graph traversal to find paths connecting them
  3. Vector rank paths by semantic relevance to original query

You need structure (the path) and semantics (which path best matches the user's intent).

Impact Analysis: "What happens if we change X?"

Strategy: Graph traversal with vector filtering

  1. Vector search to identify X
  2. Graph traversal to find transitive dependencies
  3. Vector filter results to those mentioning relevant impact terms (failure, break, affect)

Structure drives the answer (what's connected) but semantics filters noise (only surface entities where the documentation discusses impact).


Implementation Patterns

Pattern 1: Two-Database Architecture

Store vectors and graph separately:

  • Vector database (Pinecone, Weaviate, Qdrant): entity embeddings, chunk embeddings
  • Graph database (Neo4j, TigerGraph, Neptune): entities, relationships, structure

Query pipeline:

  1. Vector DB: Find entities similar to query
  2. Graph DB: Traverse from those entities
  3. Vector DB: Rank retrieved chunks
  4. Combine and return

Tradeoff: Two databases means more infrastructure, but each is optimized for its task.

Pattern 2: Unified Embedding-Augmented Graph

Store both graph structure and embeddings in a single graph database:

Node: Authentication Service
Properties:
  - name: "Authentication Service"
  - embedding: [0.23, -0.45, 0.67, ...]
  - description: "Handles user login and session management"

Edges: DEPENDS_ON, CALLS, CONFIGURED_BY

Query directly in graph DB using both structural queries and vector similarity.

Tradeoff: Simpler infrastructure but graph databases aren't optimized for vector similarity search—slower than dedicated vector DBs.

Pattern 3: Vector Index Over Graph Projections

Build vector embeddings for subgraphs or paths:

For each entity:
  Embed(entity description + immediate neighbors + relationship types)

This creates "contextual embeddings" that include graph structure. Vector search now captures some structural information.

Tradeoff: Less precise than pure graph traversal but enables fast approximate structural search.


User query: "How do I configure rate limiting for the API gateway?"

Hybrid pipeline:

1. Vector search over entity embeddings:
   Query: "rate limiting API gateway configuration"
   Results:
   - API Gateway Service (0.89)
   - Rate Limiting Module (0.84)
   - Configuration Management (0.72)

2. Graph traversal from top entities:
   API Gateway Service -[USES]-> Rate Limiting Module
   Rate Limiting Module -[CONFIGURED_BY]-> RateLimitConfig
   RateLimitConfig -[DOCUMENTED_IN]-> ConfigGuide.md

3. Vector search over retrieved documents:
   Query: "configure rate limiting"
   Rank chunks from:
   - ConfigGuide.md
   - API Gateway documentation
   - Rate Limiting module docs

4. Assemble final answer:
   - Graph structure: API Gateway → uses → Rate Limiting → configured by → RateLimitConfig
   - Supporting text: relevant chunks from ConfigGuide.md
   - Generate response citing both structure and documentation

The vector search ensures we find the right starting point ("API Gateway Service") even though the user said "API gateway."

The graph traversal surfaces the configuration relationship that might not appear in any single document.

The final vector ranking ensures the returned documentation chunks are the most relevant sections, not just any mention of rate limiting.


When Hybrid Matters Most

Hybrid retrieval becomes essential when:

  • Terminology varies: Users describe entities differently than documentation does
  • Information is distributed: No single document contains the complete answer
  • Structure drives correctness: Dependencies, hierarchies, and relationships define the answer
  • Semantic nuance matters: Not all structurally connected entities are equally relevant

These are precisely the conditions in enterprise knowledge management, technical documentation, research synthesis, and customer support—the domains where Graph RAG sees production use.


The Engineering Challenge

Building hybrid retrieval is more complex than pure vector search:

  • Two or three databases to maintain
  • Synchronization between vector and graph stores
  • Tuning weights and fusion strategies
  • Monitoring both vector and graph query performance

But the complexity is justified. Hybrid retrieval answers questions that pure approaches can't. It's not a nice-to-have—it's the only viable architecture for AI agents that need to reason about structured knowledge.


Further Reading

  • Xiong, C. et al. (2020). "Answering Complex Open-Domain Questions with Multi-Hop Dense Retrieval." ICLR 2021.
  • Yasunaga, M. et al. (2021). "QA-GNN: Reasoning with Language Models and Knowledge Graphs for Question Answering." NAACL 2021.
  • Craswell, N. et al. (2020). "Overview of the TREC 2020 Deep Learning Track." TREC 2020.

This is Part 7 of the Graph RAG series, exploring how knowledge graphs solve the limitations of naive vector retrieval.

Previous: Microsoft GraphRAG: Architecture and Lessons Learned
Next: Graph RAG at Scale: Production Engineering Challenges