Basic Prototype

RAG Chat

Uses RAG (Retrieval-Augmented Generation) to enhance chat responses with relevant retrieved information.


Data Flow & Handling

What happens to user data

Processed

Message content, optional conversation memory, retrieval context, model output, and request metadata.
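As a sketch, the processed fields listed above could be grouped into a single request record. The field and class names here are illustrative assumptions, not the service's actual schema:

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class ChatRequest:
    # Hypothetical grouping of the data the service says it processes.
    message: str                                       # message content
    memory: Optional[list[str]] = None                 # optional conversation memory
    context: list[str] = field(default_factory=list)   # retrieval context
    metadata: dict = field(default_factory=dict)       # request metadata (IDs, timings)

req = ChatRequest(message="What is RAG?", metadata={"request_id": "r-1"})
```

Model output would be attached after generation; keeping memory and context optional mirrors the "optional conversation memory" wording above.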

Models Used

Generation model: qwen2.5:3b

Embedding model: nomic-embed-text

Rerank model: qwen2.5:3b (reranking can also be disabled)
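Both model names (qwen2.5:3b, nomic-embed-text) are distributed through Ollama, so the self-hosted deployment is plausibly driven by an Ollama-style local HTTP API. This is an assumption; the sketch below only constructs the request payloads and does not contact a server:

```python
import json

# Assumed local endpoint; adjust for the actual deployment.
OLLAMA_URL = "http://localhost:11434"

def embed_payload(text: str) -> dict:
    # Payload shape for Ollama's /api/embeddings endpoint.
    return {"model": "nomic-embed-text", "prompt": text}

def generate_payload(prompt: str, stream: bool = True) -> dict:
    # Payload shape for Ollama's /api/generate endpoint; stream=True
    # matches the pipeline's "generate and stream answer" step.
    return {"model": "qwen2.5:3b", "prompt": prompt, "stream": stream}

body = json.dumps(generate_payload("Hello"))
```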

System

Deployment: self-hosted

CPU: 6 vCPUs

GPU: none (CPU-only inference)

Technical pipeline details
  1. Validate input
  2. Create request tracking
  3. Detect tool usage
  4. Classify request type
  5. Choose knowledge collection
  6. Embed query
  7. Retrieve relevant chunks from Chroma
  8. Convert matches into sources
  9. Optionally rerank top sources
  10. Filter weak matches
  11. Trim final context
  12. Build grounded prompt with context and conversation memory
  13. Call LLM: Generate and stream answer
  14. Normalize and verify grounding
  15. Save conversation turn
  16. Log timings and token usage
  17. Finish
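The retrieval stages above (embed the query, retrieve chunks, keep the top matches, filter weak ones, trim the context, build a grounded prompt) can be sketched end to end. The embedding function, similarity threshold, and character budget are placeholders, and an in-memory scan stands in for the Chroma query so the sketch is self-contained:

```python
import math

def embed(text: str) -> list[float]:
    # Stub embedding for illustration; the real pipeline uses nomic-embed-text.
    vec = [0.0] * 16
    for i, ch in enumerate(text.lower()):
        vec[i % 16] += ord(ch) / 1000.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

def cosine(a: list[float], b: list[float]) -> float:
    return sum(x * y for x, y in zip(a, b))

def retrieve(query: str, chunks: list[str], k: int = 3,
             min_score: float = 0.5, budget_chars: int = 200) -> list[str]:
    q = embed(query)                                        # step 6: embed query
    scored = sorted(((cosine(q, embed(c)), c) for c in chunks),
                    reverse=True)                           # step 7: score (Chroma stand-in)
    kept = [(s, c) for s, c in scored[:k] if s >= min_score]  # steps 9-10: top-k, drop weak
    context, used = [], 0
    for _, c in kept:                                       # step 11: trim to budget
        if used + len(c) > budget_chars:
            break
        context.append(c)
        used += len(c)
    return context

def build_prompt(query: str, context: list[str]) -> str:
    # Step 12: grounded prompt combining retrieved context with the question.
    ctx = "\n".join(f"- {c}" for c in context)
    return f"Answer using only this context:\n{ctx}\n\nQuestion: {query}"
```

The resulting prompt would then go to the generation model (step 13), with the answer streamed back and logged as described above.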