AI Chat and Semantic Search

Use AI to chat with your vault context via SSE streaming, summarize documents, find similar content, run inline writing assist, and search semantically with pgvector.

Intermediate · 14 min read

What You'll Learn

Lifestream Vault's AI features connect large-language-model capabilities directly to your vault content. Every AI call is vault-aware — you can scope requests to a specific vault so the model answers with knowledge of your actual documents, not generic web data.

This guide covers the full AI surface:

  • Chat with vault context — start a streaming SSE conversation grounded in your vault content
  • Manage chat sessions — list, inspect, and delete conversation history
  • Summarize documents — get a structured summary and key topics from any document
  • Find similar documents — use pgvector cosine similarity to surface related content
  • Inline assist — transform or improve a selected passage with a natural-language instruction
  • Writing suggestions — run grammar, style, expand, or shorten passes on any text
  • Semantic and hybrid search — move beyond keyword matching with vector embeddings

By the end of this guide you will be able to:

  • Start a streaming AI chat session and consume the SSE response in code
  • Retrieve session history and continue an existing conversation
  • Programmatically summarize any document in your vault
  • Discover conceptually related documents even when they share no keywords
  • Request inline edits and writing improvements via the API
  • Choose the right search mode (text / semantic / hybrid) for each use case

Plan Required

Pro subscription required for all AI features. Encrypted vaults cannot use AI — the server cannot read encrypted content to build context. Node.js 18 or later is required for SDK and CLI usage.

Chat with Vault Context

The chat endpoint streams its response as Server-Sent Events (SSE). Each event carries a content chunk of the model's reply. The final event includes "done": true and a sessionId you can use to continue the conversation later.
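
For callers that read the raw stream instead of using the SDK, the event handling described above can be sketched as follows. This is a minimal illustration, not the SDK's implementation: it assumes standard SSE framing (events separated by a blank line, payload on `data:` lines) and a JSON payload carrying `content`, `done`, and `sessionId` fields. The exact wire format, the `ChatEvent` shape, and the helper name `parseChatEvents` are assumptions.

```typescript
// Hypothetical event shape, inferred from the description above.
interface ChatEvent {
  content?: string;
  done?: boolean;
  sessionId?: string;
}

interface ParsedChat {
  text: string;
  sessionId?: string;
  done: boolean;
}

// Parse a buffered SSE body into the full reply text and the session ID.
function parseChatEvents(raw: string): ParsedChat {
  const parsed: ParsedChat = { text: '', done: false };

  // SSE events are separated by a blank line.
  for (const block of raw.split(/\r?\n\r?\n/)) {
    // Collect the payload from every "data:" line in this event.
    const data = block
      .split(/\r?\n/)
      .filter((line) => line.startsWith('data:'))
      .map((line) => line.slice(5).trimStart())
      .join('\n');
    if (!data) continue;

    const chatEvent = JSON.parse(data) as ChatEvent;
    if (chatEvent.content) parsed.text += chatEvent.content;
    if (chatEvent.done) {
      parsed.done = true;
      parsed.sessionId = chatEvent.sessionId;
    }
  }
  return parsed;
}

// Sample stream body (assumed format) with two content chunks and a final event.
const sampleBody =
  'data: {"content":"Hello"}\n\n' +
  'data: {"content":", world"}\n\n' +
  'data: {"done":true,"sessionId":"sess_xyz789"}\n\n';

const parsedChat = parseChatEvents(sampleBody);
console.log(parsedChat.text, parsedChat.sessionId);
```

A production client would parse incrementally as bytes arrive rather than buffering the whole body; the buffered form is used here only to keep the sketch short.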

Passing a vaultId scopes the model's context to that vault — it will draw on your documents when formulating answers. Omit it to chat without vault context.

The SDK provides two methods: client.ai.chat() for a simple Promise-based call that returns the complete response, and client.ai.chatStream() for SSE streaming via an AsyncGenerator.

typescript
import { LifestreamVaultClient } from '@lifestreamdynamics/vault-sdk';

// Authenticate — positional args: baseUrl, email, password
const { client, tokens, refreshToken } = await LifestreamVaultClient.login(
  'https://vault.lifestreamdynamics.com',
  'you@example.com',
  'your-password',
);

const VAULT_ID = 'vlt_abc123';

// --- Option 1: Simple Promise-based call (non-streaming) ---
const result = await client.ai.chat({
  message: 'Summarize the key decisions made in my Q1 planning documents.',
  vaultId: VAULT_ID,
});

console.log('Response:', result.message.content);
console.log('Sources:', result.message.sources);
console.log('Session ID for follow-up:', result.sessionId);
console.log('Tokens used:', result.tokensUsed);

// --- Option 2: Streaming via AsyncGenerator ---
// chatStream() returns an AsyncGenerator — do NOT await it
const stream = client.ai.chatStream({
  message: 'Summarize the key decisions made in my Q1 planning documents.',
  vaultId: VAULT_ID,
});

// Each chunk is { content: string }
let fullResponse = '';

for await (const chunk of stream) {
  process.stdout.write(chunk.content);
  fullResponse += chunk.content;
}

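// The sessionId is the generator's return value, which a for await loop discards.
// Calling stream.return() after the loop has finished does not recover it.
// To capture it, drive the iterator manually: call stream.next() until `done`
// is true, then read `value` from that final result.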
console.log('\n\nFull response length:', fullResponse.length);

The sessionId returned in the response (or the final SSE event) is your handle for this conversation. Pass it as sessionId in subsequent chat calls to continue the conversation with full history context. Sessions are retained for 180 days.
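
Because a `for await` loop discards a generator's return value, capturing the `sessionId` from a stream means stepping the iterator by hand. The sketch below shows the pattern with a stand-in generator, `fakeChatStream`, which is hypothetical and only mimics the shape described above (yielding `{ content }` chunks and returning an object with a `sessionId`). The real return type of `chatStream()` may differ, so check the SDK typings before relying on it.

```typescript
interface ChatChunk {
  content: string;
}

interface ChatStreamResult {
  sessionId: string;
}

// Stand-in for client.ai.chatStream(); hypothetical, for illustration only.
async function* fakeChatStream(): AsyncGenerator<ChatChunk, ChatStreamResult, void> {
  yield { content: 'Three decisions ' };
  yield { content: 'were made.' };
  return { sessionId: 'sess_xyz789' };
}

// Drain a stream manually so the generator's return value is not lost.
async function collect(
  chunks: AsyncGenerator<ChatChunk, ChatStreamResult, void>,
): Promise<{ text: string; sessionId: string }> {
  let text = '';
  while (true) {
    const step = await chunks.next();
    if (step.done) {
      // When done is true, step.value holds the generator's return value.
      return { text, sessionId: step.value.sessionId };
    }
    text += step.value.content;
  }
}

collect(fakeChatStream()).then((collected) => {
  console.log(collected.text);
  console.log('Session ID:', collected.sessionId);
});
```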

Manage Chat Sessions

Each conversation is persisted as a chat session containing the full message history. You can list all sessions, retrieve a specific session to inspect or replay history, or delete sessions you no longer need.

typescript
// List all chat sessions for the authenticated user
const sessions = await client.ai.listSessions();

console.log(`You have ${sessions.length} chat sessions:`);
for (const session of sessions) {
  console.log(`  [${session.id}] ${session.title ?? 'Untitled'} — created ${session.createdAt}`);
}

// Retrieve a specific session with its full message history
const SESSION_ID = 'sess_xyz789';
const { session, messages } = await client.ai.getSession(SESSION_ID);

console.log('Session title:', session.title);
console.log('Messages:');
for (const msg of messages) {
  console.log(`  [${msg.role}] ${msg.content.slice(0, 80)}...`);
}

// Continue an existing conversation (non-streaming)
const reply = await client.ai.chat({
  message: 'Which of those decisions has the most downstream risk?',
  vaultId: 'vlt_abc123',
  sessionId: SESSION_ID, // continue the previous session
});

console.log(reply.message.content);

// Or continue with streaming
const stream = client.ai.chatStream({
  message: 'What are the next steps for Q2?',
  vaultId: 'vlt_abc123',
  sessionId: SESSION_ID,
});

for await (const chunk of stream) {
  process.stdout.write(chunk.content);
}

// Delete a session when it is no longer needed
await client.ai.deleteSession(SESSION_ID);
console.log('\nSession deleted.');

Summarize and Find Similar Documents

Two AI endpoints operate directly on individual documents rather than on free-form chat.

Summarize sends a document's content to the model and returns a structured summary with key topics and a token usage count. This is useful for dashboards, changelogs, and search result previews.

Find Similar uses the document's stored pgvector embedding to query cosine similarity against all other documents in the vault. Results are ranked by similarity score (0–1, higher = more similar). Embeddings are generated asynchronously after each document write — allow a few seconds after uploading before querying.
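
To make the similarity score concrete, here is a minimal sketch of cosine similarity over two plain number arrays. In the product this comparison runs inside PostgreSQL via pgvector, not in application code; the function name `cosineSimilarity` and the toy three-dimensional vectors are illustrative assumptions. Mathematically, cosine similarity can be negative for opposed vectors; how the service maps its result onto the 0–1 range is not covered in this guide.

```typescript
// Cosine similarity: dot(a, b) / (|a| * |b|).
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length || a.length === 0) {
    throw new Error('Vectors must be non-empty and of equal length');
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  // The ratio is undefined for a zero vector; treat that case as unrelated.
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Toy embeddings; real embeddings have hundreds of dimensions or more.
const planDoc = [0.9, 0.1, 0.3];
const riskDoc = [0.8, 0.2, 0.4];
const recipeDoc = [0.0, 1.0, 0.0];

console.log('plan vs risk:  ', cosineSimilarity(planDoc, riskDoc).toFixed(4));
console.log('plan vs recipe:', cosineSimilarity(planDoc, recipeDoc).toFixed(4));
```

Vectors that point in nearly the same direction score close to 1 regardless of their length, which is why two documents can rank as similar without sharing any keywords.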

typescript
// Summarize a document — positional args: vaultId, documentPath
const summary = await client.ai.summarize('vlt_abc123', 'projects/q1-plan.md');

console.log('Summary:', summary.summary);
console.log('Key topics:', summary.keyTopics.join(', '));
console.log('Tokens used:', summary.tokensUsed);

// Find similar documents by vector similarity
const similar = await client.ai.similar({
  documentId: 'doc_def456', // the document whose embedding is the query
  vaultId: 'vlt_abc123',
  limit: 5,
});

console.log(`Found ${similar.similar.length} related documents:`);
for (const item of similar.similar) {
  console.log(`  ${item.path} — similarity: ${item.similarity.toFixed(4)}`);
}

Inline Assist and Writing Suggestions

Two endpoints power the editor's AI sidebar and can also be called directly from scripts or integrations.

Assist takes a block of text and a natural-language instruction (e.g. "make this more concise", "convert to bullet points", "translate to French") and returns a transformed version. An optional context string provides surrounding document text to help the model produce a coherent result.

Suggest runs a specific type of pass over a document and returns a targeted suggestion. The type parameter selects the pass; the response includes the suggestion and the type echoed back for routing on the client.

typescript
// Inline assist — transform a passage with a natural-language instruction
const assist = await client.ai.assist({
  vaultId: 'vlt_abc123',
  text: 'The system encountered an issue with the configuration management subsystem due to an unexpected state transition in the initialization sequence.',
  instruction: 'Rewrite this in plain English for a non-technical audience.',
  context: 'This is the executive summary section of our incident report.',
});

console.log('Rewritten:', assist.result);
console.log('Tokens used:', assist.tokensUsed);

// Writing suggestion — run a targeted pass on a document
const suggestion = await client.ai.suggest({
  vaultId: 'vlt_abc123',
  documentPath: 'blog/launch-post.md',
  type: 'style', // grammar | style | expand | shorten
});

console.log('Style suggestion:', suggestion.suggestion);
console.log('Suggestion type:', suggestion.type);
console.log('Tokens used:', suggestion.tokensUsed);

Suggestion types explained:

  • grammar — corrects spelling, punctuation, and grammatical errors without changing meaning
  • style — improves clarity, tone, and readability while preserving the original voice
  • expand — adds depth, examples, and supporting detail to thin content
  • shorten — trims redundancy and tightens prose to its essential meaning

All suggestions are non-destructive — the original document is unchanged. Review the returned suggestion and apply it manually or via a follow-up documents.put() call.

Semantic and Hybrid Search

The search endpoint accepts a mode parameter that selects the underlying retrieval strategy:

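| Mode     | Engine                              | Tier  | Best for                                                      |
|----------|-------------------------------------|-------|---------------------------------------------------------------|
| text     | PostgreSQL websearch_to_tsquery     | Free+ | Keyword matches, exact phrases, tag filtering                 |
| semantic | pgvector cosine similarity          | Pro+  | Conceptual queries, finding related ideas, no shared keywords |
| hybrid   | Combined text + vector (RRF fusion) | Pro+  | Best overall relevance; recommended default for AI workflows  |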

Endpoint: GET /api/v1/search?q=...&mode=text|semantic|hybrid&vault=&limit=&offset=
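
When calling the endpoint without the SDK, the query string can be assembled with the standard `URL` and `URLSearchParams` APIs. The sketch below only builds the URL. It assumes the API is served from the same host used for login earlier in this guide, and it leaves out authentication headers because this guide does not specify them. The `SearchQuery` type and `buildSearchUrl` helper are hypothetical names.

```typescript
type SearchMode = 'text' | 'semantic' | 'hybrid';

// Hypothetical parameter bag mirroring the query string shown above.
interface SearchQuery {
  q: string;
  mode?: SearchMode;
  vault?: string;
  limit?: number;
  offset?: number;
}

// Build the request URL; parameters left undefined are omitted.
function buildSearchUrl(baseUrl: string, query: SearchQuery): string {
  const url = new URL('/api/v1/search', baseUrl);
  url.searchParams.set('q', query.q);
  if (query.mode !== undefined) url.searchParams.set('mode', query.mode);
  if (query.vault !== undefined) url.searchParams.set('vault', query.vault);
  if (query.limit !== undefined) url.searchParams.set('limit', String(query.limit));
  if (query.offset !== undefined) url.searchParams.set('offset', String(query.offset));
  return url.toString();
}

const searchUrl = buildSearchUrl('https://vault.lifestreamdynamics.com', {
  q: 'project launch strategy',
  mode: 'hybrid',
  vault: 'vlt_abc123',
  limit: 20,
});

console.log(searchUrl);
// https://vault.lifestreamdynamics.com/api/v1/search?q=project+launch+strategy&mode=hybrid&vault=vlt_abc123&limit=20
```

Using `searchParams.set` handles percent-encoding of the query text, so free-form user input does not need to be escaped by hand.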

Semantic and hybrid modes require that documents have been indexed by the embedding.worker, which runs asynchronously after each write. New documents are typically indexed within a few seconds on a healthy deployment.

typescript
// Text search — available on all tiers
const textResults = await client.search.search({
  q: 'quarterly planning objectives',
  vault: 'vlt_abc123',
  mode: 'text',
  limit: 10,
});

console.log(`Text: ${textResults.total} results`);
for (const hit of textResults.results) {
  console.log(`  ${hit.path} — rank: ${hit.rank?.toFixed(4)}`);
}

// Semantic search — Pro tier, conceptual similarity via pgvector
const semanticResults = await client.search.search({
  q: 'documents about deadline pressure and milestone risk',
  vault: 'vlt_abc123',
  mode: 'semantic',
  limit: 10,
});

console.log(`Semantic: ${semanticResults.total} results`);
for (const hit of semanticResults.results) {
  console.log(`  ${hit.path} — similarity: ${hit.similarity?.toFixed(4)}`);
}

// Hybrid search — best relevance for most AI workflows
const hybridResults = await client.search.search({
  q: 'project launch strategy',
  vault: 'vlt_abc123',
  mode: 'hybrid',
  limit: 20,
  offset: 0,
});

console.log(`Hybrid: ${hybridResults.total} results`);
for (const hit of hybridResults.results) {
  console.log(`  ${hit.title} (${hit.path})`);
}

Tips & Best Practices

Scope Chat to a Vault for Relevant Answers

Always pass vaultId when calling /ai/chat for knowledge-base style questions. Without it, the model answers from general training data only. Scoping gives you vault-aware responses grounded in your actual documents.

Allow Time for Embeddings After Upload

Semantic and hybrid search rely on pgvector embeddings generated by the embedding.worker. The worker runs asynchronously — freshly uploaded documents may not be searchable semantically for a few seconds. Build a small delay or a retry loop into pipelines that upload then immediately query.
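
One way to build that retry loop is a small helper with exponential backoff. The sketch below is generic and does not call the SDK: `retryUntilFound` and `fakeSemanticSearch` are hypothetical names, and the attempt count and delays are arbitrary starting points, not recommended values from the service.

```typescript
// Retry an async lookup until it returns a non-empty result or attempts run out.
async function retryUntilFound<T>(
  lookup: () => Promise<T[]>,
  maxAttempts = 5,
  baseDelayMs = 500,
): Promise<T[]> {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const results = await lookup();
    if (results.length > 0) return results;
    if (attempt < maxAttempts) {
      // Exponential backoff: baseDelayMs, then 2x, then 4x, and so on.
      const delayMs = baseDelayMs * 2 ** (attempt - 1);
      await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
    }
  }
  return []; // still nothing after the final attempt
}

// Stand-in for a semantic search call; returns nothing until the third try.
let searchCalls = 0;
async function fakeSemanticSearch(): Promise<string[]> {
  searchCalls += 1;
  return searchCalls >= 3 ? ['projects/q1-plan.md'] : [];
}

retryUntilFound(fakeSemanticSearch, 5, 10).then((hits) => {
  console.log(`Found ${hits.length} hit(s) after ${searchCalls} calls`);
});
```

An empty result can also mean the query genuinely has no match, so cap the attempts and treat exhaustion as "not found yet" rather than as an error.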

Use Session IDs to Continue Conversations

Save the sessionId returned by the first chat call and pass it back on follow-up messages. The model will have full conversational context without you needing to resend previous messages. Sessions are retained for 180 days.

Encrypted Vaults Cannot Use AI Features

AI endpoints (chat, summarize, similar, assist, suggest) read raw document content to build context. Encrypted vaults cannot be used with AI — decrypt the vault first, or keep AI-enabled content in a separate non-encrypted vault.

Grammar and Style Suggestions Are Non-Destructive

The /ai/suggest endpoint returns a suggestion string — it does not modify the document. Always review the suggestion before deciding whether to apply it. This is intentional: AI suggestions are advisory, not authoritative.

Hybrid Mode Combines the Strengths of Both Engines

mode=hybrid fuses full-text and vector results using Reciprocal Rank Fusion (RRF). It surfaces documents that match both by keyword and by concept, making it the recommended default for any AI-driven retrieval workflow where you want broad, relevant coverage.
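
Reciprocal Rank Fusion itself is simple enough to sketch in a few lines. Each document scores 1 / (k + rank) in every list where it appears, and the scores are summed. The example below is an illustration of the general technique, not the server's implementation: the constant k = 60 is a commonly used default, and the value the service uses, along with the function name `reciprocalRankFusion` and the sample file names, are assumptions.

```typescript
// Reciprocal Rank Fusion: score(d) = sum over lists of 1 / (k + rank of d in that list).
function reciprocalRankFusion(
  rankings: string[][],
  k = 60,
): Array<{ id: string; score: number }> {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((id, index) => {
      const rank = index + 1; // ranks are 1-based
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank));
    });
  }
  // Highest fused score first.
  return Array.from(scores.entries())
    .map(([id, score]) => ({ id, score }))
    .sort((x, y) => y.score - x.score);
}

// Hypothetical result lists from the text engine and the vector engine.
const textRanking = ['launch-plan.md', 'budget.md', 'retro.md'];
const vectorRanking = ['go-to-market.md', 'launch-plan.md', 'retro.md'];

const fused = reciprocalRankFusion([textRanking, vectorRanking]);
console.log(fused.map((entry) => entry.id));
// [ 'launch-plan.md', 'retro.md', 'go-to-market.md', 'budget.md' ]
```

Documents that appear in both lists (`launch-plan.md`, `retro.md`) outrank those that appear in only one, even when the single-list document held the top position there. RRF uses only ranks, so the text and vector scores never need to be normalized against each other.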

What's Next

You now have a working foundation for all AI features in Lifestream Vault. Here are some natural next steps to build on this guide:

  • Manage Projects with Calendar — combine AI-powered summaries with due-date tracking and the activity heatmap for project dashboards
  • Build a Custom Integration — integrate AI chat and semantic search into your own application using the full SDK surface
  • Automate Your Vault — trigger AI summarization as part of a webhook-driven pipeline whenever documents are created or published
  • API Keys & Scoping — generate scoped API keys to give external services read access to AI endpoints without exposing your full account credentials