Version: 0.1.0

RAG Patterns

Retrieval-Augmented Generation (RAG) combines LLM generation with relevant context retrieved from your data. RaisinDB suits RAG well because it provides vector search, a hierarchical content graph, and workspace isolation in a single system, so you do not need to stitch together a separate vector database, document store, and graph database.

End-to-End RAG Workflow

User Query
    │
    ▼
Generate query embedding
    │
    ▼
Vector search (VECTOR_SEARCH)  ──►  Candidate nodes
    │
    ▼
Enrich with graph context       ──►  Related nodes via hierarchy/relations
    │
    ▼
Assemble prompt context
    │
    ▼
LLM generates answer
    │
    ▼
(Optional) Store answer as a node

Step 1: Store Knowledge as Nodes

Structure your knowledge base as a content hierarchy. Each piece of knowledge is a node with properties and an automatically generated embedding:

-- Create a knowledge base article
INSERT INTO 'knowledge' (name, path, node_type, properties) VALUES (
  'getting-started',
  '/docs/guides',
  'kb:Article',
  '{
    "title": "Getting Started Guide",
    "content": "RaisinDB is a multi-tenant content database with git-like versioning...",
    "author": "docs-team",
    "tags": ["introduction", "setup"],
    "status": "published"
  }'
);

When the node is created, RaisinDB automatically generates an embedding from the content (if an embedding provider is configured) and indexes it in the HNSW vector store.
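
When articles come from application code rather than hand-written SQL, the statement above can be rendered from a plain object. The following is a minimal sketch, not part of the RaisinDB API: `sqlString` and `buildNodeInsert` are hypothetical helpers, and the quoting assumes the standard SQL convention of doubling a single quote inside a string literal. If your client supports bound parameters for INSERT, prefer those over string building.

```javascript
// Hypothetical helper: wrap a value in single quotes for use as a SQL string literal.
// Assumption: single quotes inside the literal are escaped by doubling them.
function sqlString(value) {
  return "'" + String(value).replace(/'/g, "''") + "'";
}

// Hypothetical helper: render a node object as an INSERT statement
// with the same column list used in the example above.
function buildNodeInsert(workspace, node) {
  const columns = ['name', 'path', 'node_type', 'properties'];
  const values = [
    sqlString(node.name),
    sqlString(node.path),
    sqlString(node.nodeType),
    sqlString(JSON.stringify(node.properties)), // properties travel as a JSON string
  ];
  return `INSERT INTO ${sqlString(workspace)} (${columns.join(', ')}) VALUES (${values.join(', ')});`;
}

const statement = buildNodeInsert('knowledge', {
  name: 'getting-started',
  path: '/docs/guides',
  nodeType: 'kb:Article',
  properties: { title: 'Getting Started Guide', status: 'published' },
});
console.log(statement);
```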

Step 2: Chunk Long Content as Child Nodes

For long documents, split content into child nodes. RaisinDB's hierarchy makes this natural — chunks are children of the source document:

-- Parent document
INSERT INTO 'knowledge' (name, path, node_type, properties) VALUES (
  'architecture-overview',
  '/docs/architecture',
  'kb:Article',
  '{"title": "Architecture Overview", "content": "Introduction to the system architecture..."}'
);

-- Chunks as child nodes
INSERT INTO 'knowledge' (name, path, node_type, properties) VALUES (
  'chunk-1',
  '/docs/architecture/architecture-overview',
  'kb:Chunk',
  '{"content": "The storage layer uses RocksDB with 40+ column families...", "position": 1, "source_doc": "architecture-overview"}'
);

INSERT INTO 'knowledge' (name, path, node_type, properties) VALUES (
  'chunk-2',
  '/docs/architecture/architecture-overview',
  'kb:Chunk',
  '{"content": "The SQL engine parses queries through a multi-stage pipeline...", "position": 2, "source_doc": "architecture-overview"}'
);

Each chunk gets its own embedding, and VECTOR_SEARCH in Documents mode automatically deduplicates results by source document.
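
How you split the text is up to your application. The sketch below shows one simple approach, assuming paragraph boundaries are acceptable split points and that a character limit is a good enough proxy for chunk size. `chunkDocument` is a hypothetical helper; it only produces node objects shaped like the `kb:Chunk` inserts above and does not talk to RaisinDB.

```javascript
// Hypothetical helper: split a document into chunk nodes on paragraph boundaries.
// Paragraphs are packed together until adding the next one would exceed maxChars.
// A single paragraph longer than maxChars is kept whole as one oversized chunk.
function chunkDocument(doc, maxChars) {
  const paragraphs = doc.content
    .split(/\n\s*\n/)
    .map(p => p.trim())
    .filter(p => p.length > 0);

  const pieces = [];
  let current = '';
  for (const paragraph of paragraphs) {
    const candidate = current ? current + '\n\n' + paragraph : paragraph;
    if (current && candidate.length > maxChars) {
      pieces.push(current); // current chunk is full, start a new one
      current = paragraph;
    } else {
      current = candidate;
    }
  }
  if (current) pieces.push(current);

  // Chunks live under the source document, mirroring the paths used above.
  return pieces.map((content, index) => ({
    name: `chunk-${index + 1}`,
    path: `${doc.path}/${doc.name}`,
    node_type: 'kb:Chunk',
    properties: { content, position: index + 1, source_doc: doc.name },
  }));
}
```

Each returned object maps onto one INSERT into the 'knowledge' workspace; tune `maxChars` to the input limit of your embedding provider.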

Step 3: Retrieval Query Patterns

Basic Vector Retrieval

Find the most relevant chunks for a user query:

-- $1 = embedding vector generated from the user's question
SELECT
  id,
  name,
  properties->>'content'::String AS content,
  properties->>'source_doc'::String AS source,
  __distance
FROM 'knowledge'
WHERE VECTOR_SEARCH(embedding, $1, 10)
  AND node_type = 'kb:Chunk'
ORDER BY __distance ASC
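
If several chunks from the same document match, you may want only the closest one per document in the prompt. The sketch below does that grouping in application code on rows shaped like the query result above (`id`, `content`, `source`, `__distance`). `bestChunkPerSource` is a hypothetical helper, shown as an alternative for cases where you handle raw chunk rows yourself.

```javascript
// Hypothetical helper: keep the closest chunk for each source document.
// Assumes a smaller __distance means a better match, as in ORDER BY __distance ASC.
function bestChunkPerSource(rows) {
  const best = new Map();
  for (const row of rows) {
    const seen = best.get(row.source);
    if (!seen || row.__distance < seen.__distance) {
      best.set(row.source, row);
    }
  }
  // Return the survivors ordered from closest to farthest.
  return [...best.values()].sort((a, b) => a.__distance - b.__distance);
}
```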

Scoped Retrieval by Workspace

Use workspaces to separate knowledge domains. A customer support bot searches the support workspace; an engineering bot searches engineering:

-- Support bot searches only support knowledge
SELECT id, properties->>'content'::String AS content, __distance
FROM 'support'
WHERE VECTOR_SEARCH(embedding, $1, 10)
ORDER BY __distance ASC

-- Engineering bot searches only engineering knowledge
SELECT id, properties->>'content'::String AS content, __distance
FROM 'engineering'
WHERE VECTOR_SEARCH(embedding, $1, 10)
ORDER BY __distance ASC
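
Because the workspace appears in the FROM clause rather than as a bound parameter, it helps to choose it from a fixed allow-list instead of interpolating user input. A minimal sketch follows; the bot names and the `retrievalQueryFor` helper are hypothetical, and the generated SQL simply mirrors the queries above.

```javascript
// Hypothetical mapping from bot identity to the workspace it may search.
const WORKSPACE_BY_BOT = new Map([
  ['support-bot', 'support'],
  ['engineering-bot', 'engineering'],
]);

// Hypothetical helper: build the retrieval query for a bot.
// Only allow-listed workspaces and a validated integer limit reach the SQL text;
// the query embedding is still passed separately as $1.
function retrievalQueryFor(bot, limit = 10) {
  const workspace = WORKSPACE_BY_BOT.get(bot);
  if (!workspace) {
    throw new Error(`Unknown bot: ${bot}`);
  }
  if (!Number.isInteger(limit) || limit < 1) {
    throw new Error('limit must be a positive integer');
  }
  return [
    "SELECT id, properties->>'content'::String AS content, __distance",
    `FROM '${workspace}'`,
    `WHERE VECTOR_SEARCH(embedding, $1, ${limit})`,
    'ORDER BY __distance ASC',
  ].join('\n');
}
```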

Scoped Retrieval by Path

Use the content hierarchy to scope retrieval to specific areas:

-- Only search within the API documentation
SELECT id, properties->>'content'::String AS content, __distance
FROM 'knowledge'
WHERE VECTOR_SEARCH(embedding, $1, 10)
  AND PATH_STARTS_WITH(path, '/docs/api/')
ORDER BY __distance ASC

Filtered Retrieval

Combine vector search with property filters:

-- Only retrieve published, recent content
SELECT id, properties->>'content'::String AS content, __distance
FROM 'knowledge'
WHERE VECTOR_SEARCH(embedding, $1, 10)
  AND properties->>'status'::String = 'published'
  AND node_type = 'kb:Article'
ORDER BY __distance ASC

Step 4: Enrich with Graph Context

This is where RaisinDB's content graph adds value beyond a flat vector store. After finding relevant chunks, traverse the hierarchy to gather related context:

Get Parent Document for a Chunk

-- After finding chunk-2 as relevant, get its parent article
SELECT id, properties->>'title'::String AS title, properties->>'content'::String AS content
FROM 'knowledge'
WHERE PARENT(path) = '/docs/architecture'
  AND node_type = 'kb:Article'

Get Sibling Chunks

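-- Get all chunks from the same document for fuller context.
-- Note: position is cast to String here, so the ordering may be lexicographic
-- ("10" before "2"); re-sort numerically in application code for long documents.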
SELECT properties->>'content'::String AS content, properties->>'position'::String AS position
FROM 'knowledge'
WHERE PATH_STARTS_WITH(path, '/docs/architecture/architecture-overview')
  AND node_type = 'kb:Chunk'
ORDER BY properties->>'position'::String ASC
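
Once the sibling rows are back in your function, they can be stitched into one passage. The sketch below sorts by numeric position in application code, which avoids any dependence on how the text-typed `position` column was ordered by the query. `stitchChunks` is a hypothetical helper operating on rows with `content` and `position` fields, as selected above.

```javascript
// Hypothetical helper: join sibling chunks into one passage in document order.
// position arrives as a string from the query, so it is converted before sorting.
function stitchChunks(rows) {
  return rows
    .map(row => ({ content: row.content, position: Number(row.position) }))
    .filter(row => !Number.isNaN(row.position)) // drop rows whose position is NaN
    .sort((a, b) => a.position - b.position)
    .map(row => row.content)
    .join('\n\n');
}
```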

Traverse Related Documents

Use graph queries to find related content:

-- Find documents related to a given document via Cypher
SELECT * FROM cypher('
  MATCH (source:Article {id: "architecture-overview"})-[:REFERENCES]->(related:Article)
  RETURN related.title, related.content
')

Combine Vector + Graph in a Single Pipeline

1. Vector search finds the top-k most relevant chunks
2. Parent traversal fetches the full source document for each chunk
3. Sibling retrieval gets surrounding chunks for context
4. Relation traversal finds linked/referenced documents
5. Assemble all context into the LLM prompt
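
The final assembly step can be sketched as a pure function over the rows returned by the earlier queries. This is an illustration under stated assumptions, not a RaisinDB API: `assembleContext` is a hypothetical helper, and it assumes every row carries an `id` and a `content` field so that a node reached by more than one route (for example, a matched chunk that is also its own sibling) is included only once.

```javascript
// Hypothetical helper: merge the results of the four retrieval steps into one
// ordered list of context sections, skipping nodes that were already added.
function assembleContext({ chunks = [], parents = [], siblings = [], related = [] }) {
  const seen = new Set();
  const sections = [];

  const add = (label, rows) => {
    for (const row of rows) {
      if (seen.has(row.id)) continue; // already included via an earlier step
      seen.add(row.id);
      sections.push({ label, id: row.id, content: row.content });
    }
  };

  // Order reflects priority: direct matches first, looser context last.
  add('chunk', chunks);
  add('parent', parents);
  add('sibling', siblings);
  add('related', related);
  return sections;
}
```

Keeping the label on each section lets the prompt builder decide later which kinds of context to drop first when space runs out.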

Step 5: Assemble and Generate

With retrieved context, build the prompt for your LLM:

// In a RaisinDB function
async function handler(input) {
  // 1. Generate embedding for the user's question
  const queryEmbedding = await raisin.ai.embed(input.question);

  // 2. Vector search for relevant chunks
  const results = await raisin.sql.query(
    `SELECT id, properties->>'content'::String AS content, __distance
     FROM 'knowledge'
     WHERE VECTOR_SEARCH(embedding, $1, 5)
     ORDER BY __distance ASC`,
    [queryEmbedding]
  );

  // 3. Build context from results
  const context = results.map(r => r.content).join('\n\n');

  // 4. Call LLM with context
  const answer = await raisin.ai.generate({
    prompt: `Answer the question based on the following context:\n\n${context}\n\nQuestion: ${input.question}`,
    model: 'claude-sonnet-4-20250514'
  });

  return { answer: answer.text, sources: results.map(r => r.id) };
}
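
The handler above joins every result into the prompt. In practice you will often want to cap the context size and number the sources so the answer can cite them. The sketch below is one way to do that, assuming a character budget is an acceptable stand-in for a token budget. `buildPrompt` is a hypothetical helper that works on rows with `id` and `content` fields and reuses the prompt wording from the handler.

```javascript
// Hypothetical helper: build the prompt from as many rows as fit the budget.
// Rows are expected in relevance order; the first row that would overflow the
// budget ends the loop, so lower-ranked rows never displace higher-ranked ones.
function buildPrompt(question, rows, maxContextChars = 4000) {
  const parts = [];
  const sources = [];
  let used = 0;

  for (const row of rows) {
    const entry = `[${sources.length + 1}] ${row.content}`;
    const cost = entry.length + (parts.length > 0 ? 2 : 0); // 2 chars for the "\n\n" separator
    if (used + cost > maxContextChars) break;
    parts.push(entry);
    sources.push(row.id);
    used += cost;
  }

  const context = parts.join('\n\n');
  const prompt =
    `Answer the question based on the following context:\n\n${context}\n\nQuestion: ${question}`;
  return { prompt, sources };
}
```

The returned `sources` array lists only the rows that were actually included, which keeps the `sources` field of the response consistent with what the model saw.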

Step 6: Store Agent Outputs (Optional)

Store the generated answer as a node so that later queries can retrieve it alongside the original documents, which lets the system reuse its own earlier answers:

INSERT INTO 'knowledge' (name, path, node_type, properties) VALUES (
  'answer-12345',
  '/answers/2026-03',
  'kb:Answer',
  '{
    "question": "How does the storage layer work?",
    "answer": "The storage layer uses RocksDB with 40+ column families...",
    "sources": ["chunk-1", "chunk-2"],
    "confidence": 0.94,
    "generated_at": "2026-03-31T12:00:00Z"
  }'
);
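
The node above can be derived from the handler's output. The sketch below builds the name, the month-based path, and the properties from an answer record. `buildAnswerNode` is a hypothetical helper; the `confidence` field is left out because where that score comes from depends on your pipeline.

```javascript
// Hypothetical helper: turn a generated answer into a kb:Answer node object.
// The path groups answers by year and month, matching '/answers/2026-03' above.
function buildAnswerNode({ id, question, answer, sources, generatedAt }) {
  const iso = generatedAt.toISOString(); // always UTC, e.g. 2026-03-31T12:00:00.000Z
  return {
    name: `answer-${id}`,
    path: `/answers/${iso.slice(0, 7)}`, // first 7 characters are YYYY-MM
    node_type: 'kb:Answer',
    properties: { question, answer, sources, generated_at: iso },
  };
}
```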

Advanced Patterns

Multi-Workspace RAG

Search across multiple knowledge domains and let the LLM synthesize:

-- Search support knowledge
SELECT 'support' AS source_workspace, properties->>'content'::String AS content, __distance
FROM 'support'
WHERE VECTOR_SEARCH(embedding, $1, 5)

UNION ALL

-- Search engineering knowledge
SELECT 'engineering' AS source_workspace, properties->>'content'::String AS content, __distance
FROM 'engineering'
WHERE VECTOR_SEARCH(embedding, $1, 5)

ORDER BY __distance ASC
LIMIT 10
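
The same merge can be done in application code when you query each workspace separately, for example to run the searches in parallel. A minimal sketch, assuming all workspaces use the same embedding model and distance metric so that `__distance` values are comparable across them; `mergeWorkspaceResults` is a hypothetical helper.

```javascript
// Hypothetical helper: combine per-workspace result rows into one ranked list.
// Input is an object keyed by workspace name, each value an array of rows
// that carry a numeric __distance.
function mergeWorkspaceResults(resultsByWorkspace, limit = 10) {
  const merged = [];
  for (const [workspace, rows] of Object.entries(resultsByWorkspace)) {
    for (const row of rows) {
      merged.push({ source_workspace: workspace, ...row });
    }
  }
  merged.sort((a, b) => a.__distance - b.__distance); // closest first
  return merged.slice(0, limit);
}
```

Tagging each row with `source_workspace` mirrors the SQL version and lets the prompt tell the model which domain a passage came from.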

RAG with Branch Isolation

Use agent branches to let a RAG agent build up knowledge over time without affecting the main branch:

-- Create a branch for the RAG agent's session
INSERT INTO 'raisin:branches' (name, from_branch) VALUES ('agent/rag-session-001', 'main');

-- Agent stores generated answers and extracted facts on its branch
-- These can be reviewed and merged later

Versioned Knowledge Base

Because RaisinDB tracks revisions, you can build a RAG system that answers questions about how things used to be:

-- What did the docs say at revision 50?
SELECT properties->>'content'::String AS content
FROM 'knowledge'
WHERE VECTOR_SEARCH(embedding, $1, 5)
  AND __revision = 50
ORDER BY __distance ASC

Next Steps

Embeddings and Vector Search — deeper dive into vector search configuration
Agent Memory with Branches — isolate RAG agents with branches
Function-Based Tool Use — build RAG pipelines as serverless functions
