Knowledge Base (RAG) Integration

Name: Mockarty
Author: Mockarty

Mockarty supports connecting external knowledge bases to enrich AI agent responses with your company-specific documentation, API specifications, coding standards, and other reference materials.

About URLs in examples: All examples use localhost:5770 as the default Mockarty address. If your instance runs on a remote server, replace localhost:5770 with its actual address (e.g. https://mockarty.company.com or http://192.168.1.50:5770). See Tips & Useful Features for details.

[Figure: MCP Marketplace page with a connected RAG knowledge base]

What is RAG?

Simple explanation: Imagine you hire a new developer. They are smart, but they do not know your company’s APIs, coding standards, or internal tools. RAG is like giving them a searchable handbook. Before answering a question, they check the handbook first, so their answers are accurate and specific to your company.

RAG (Retrieval-Augmented Generation) is a pattern where the AI searches a knowledge base for relevant information before generating a response. Instead of relying solely on the model’s training data, RAG retrieves your specific documents and injects them into the conversation context.
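
The retrieve-then-generate flow can be illustrated with a toy sketch. This is not how Mockarty or any of the providers below rank documents: real systems typically use vector embeddings, while this example swaps in simple word overlap so it runs without dependencies. The names `retrieve`, `build_prompt`, and `DOCS` are hypothetical.

```python
import re

# Toy knowledge base: document name -> text (illustrative content only).
DOCS = {
    "payments.md": "The payment service exposes POST /payments and GET /payments/{id}.",
    "style.md": "Test names use snake_case and every test asserts the status code.",
}


def tokenize(text: str) -> set[str]:
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query: str, documents: dict[str, str], top_k: int = 2) -> list[tuple[str, float]]:
    """Rank documents by the fraction of query words they contain."""
    query_words = tokenize(query)
    scored = []
    for name, text in documents.items():
        overlap = len(query_words & tokenize(text))
        if overlap:  # skip documents that share no words with the query
            scored.append((name, overlap / len(query_words)))
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]


def build_prompt(query: str, documents: dict[str, str], top_k: int = 2) -> str:
    """Inject the retrieved documents into the context ahead of the question."""
    hits = retrieve(query, documents, top_k)
    context = "\n".join(f"[{name}] {documents[name]}" for name, _ in hits)
    return f"Context:\n{context}\n\nQuestion: {query}"
```

Calling `build_prompt("payment service endpoints", DOCS)` produces a prompt whose context contains only `payments.md`, because the style guide shares no words with the query.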

This means your AI agents can:

Know your company’s API specifications and generate accurate mocks
Follow your team’s report format and coding standards
Reference internal documentation when answering questions
Provide context-aware suggestions based on your project’s history

How It Works in Mockarty

Knowledge bases integrate through the MCP Marketplace as a RAG source type. Once connected, the AI agent gets two tools:

Tool	Description
`search_knowledge`	Search the knowledge base for relevant documents and context
`list_knowledge_collections`	List available document collections/datasets

When a user asks the AI agent a question, the agent can automatically search the knowledge base for relevant context before responding.
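
To make the two tools concrete, here is a minimal in-memory sketch of what an agent-facing interface of this shape could look like. Only the tool names come from this page; the parameters (`query`, `collection`, `top_k`), the result fields, and the `dispatch_tool_call` helper are assumptions for illustration, not Mockarty's actual tool schema.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class InMemoryKnowledgeBase:
    # Collection name -> document texts; a toy stand-in for a real RAG server.
    collections: dict[str, list[str]] = field(default_factory=dict)

    def list_knowledge_collections(self) -> list[str]:
        """Return the available collection names in a stable order."""
        return sorted(self.collections)

    def search_knowledge(
        self, query: str, collection: Optional[str] = None, top_k: int = 5
    ) -> list[dict]:
        """Case-insensitive substring search, optionally limited to one collection."""
        names = [collection] if collection else self.list_knowledge_collections()
        needle = query.lower()
        hits = []
        for name in names:
            for text in self.collections.get(name, []):
                if needle in text.lower():
                    hits.append({"collection": name, "text": text})
        return hits[:top_k]


def dispatch_tool_call(kb: InMemoryKnowledgeBase, tool_name: str, arguments: dict):
    """Route a tool call issued by the agent to the matching method."""
    tools = {
        "search_knowledge": kb.search_knowledge,
        "list_knowledge_collections": kb.list_knowledge_collections,
    }
    if tool_name not in tools:
        raise ValueError(f"unknown tool: {tool_name}")
    return tools[tool_name](**arguments)
```

In this sketch an agent would typically call `list_knowledge_collections` first and then pass a specific collection to `search_knowledge` to narrow the search.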

Provider Comparison

Before diving into details, here is a quick comparison to help you choose:

Provider	Ease of Setup	Resource Needs	Best For	License
AnythingLLM	Very easy (single container, drag & drop)	Low (1 container, ~1 GB RAM)	Quick start, small teams, prototyping	MIT
RAGFlow	Moderate (multi-container, needs config)	Medium (3+ containers, ~4 GB RAM)	Complex documents (PDFs with tables, images)	Apache 2.0
Dify	Moderate (multi-container)	Medium-High (5+ containers, ~4 GB RAM)	Advanced workflows, visual pipeline builder	Apache 2.0
R2R	Advanced (multi-container with PostgreSQL)	High (4+ containers, ~8 GB RAM)	Production RAG with knowledge graphs, multimodal	MIT
Custom API	Depends on your implementation	Depends	Full control, existing RAG infrastructure	N/A

Recommendation for beginners: Start with AnythingLLM – it is a single Docker container with a web UI and drag-and-drop document upload. You can switch to a more advanced provider later: the AI agent keeps using the same search_knowledge and list_knowledge_collections tools, so only the integration's URL and credentials need to change.

Supported RAG Providers

RAGFlow

Best for: Deep document understanding (tables, images, complex layouts)

Apache 2.0 license
Excellent PDF/DOCX parsing (DeepDoc engine)
Built-in MCP server support
REST API on port 9380

Quick Start:

git clone https://github.com/infiniflow/ragflow.git
cd ragflow
docker compose -f docker/docker-compose.yml up -d

Default URL: http://localhost:9380

After starting:

Open the RAGFlow UI at http://localhost (the web UI is served on port 80 by default; port 9380 is the REST API that Mockarty connects to)
Create a dataset and upload documents
Get your API key from Settings

AnythingLLM

Best for: Simplest setup (single container, drag & drop)

MIT license
Built-in vector database (LanceDB)
Single Docker container
Web UI with drag & drop upload

Quick Start:

docker run -d -p 3001:3001 \
  --name anythingllm \
  -v anythingllm_data:/app/server/storage \
  mintplexlabs/anythingllm

Default URL: http://localhost:3001

After starting:

Open AnythingLLM at http://localhost:3001
Complete the setup wizard
Create a workspace and upload documents
Get your API key from Settings > Developer

Dify

Best for: Advanced RAG workflows with visual pipeline builder

Apache 2.0 license
Visual workflow editor
30+ vector database options
Bidirectional MCP support

Quick Start:

git clone https://github.com/langgenius/dify.git
cd dify/docker
cp .env.example .env
docker compose up -d

Default URL: http://localhost:3000

After starting:

Open Dify at http://localhost:3000
Create a Knowledge Base and upload documents
Create an API key under Settings

R2R (SciPhi)

Best for: Production-grade RAG with knowledge graphs

MIT license
REST-first API design
Knowledge graph support
Multimodal ingestion (text, images, audio)

Quick Start:

git clone https://github.com/SciPhi-AI/R2R.git
cd R2R
docker compose -f compose.full.yaml --profile postgres up -d

Default URL: http://localhost:7272

Custom RAG API

You can connect any RAG system that provides a REST API with:

POST /search endpoint accepting {"query": "...", "top_k": 5}
GET /collections endpoint listing available document collections
GET /health endpoint for connection checking
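
A minimal custom RAG API with these three endpoints might look like the sketch below, written with Python's standard library only. The request shape for /search follows the contract above; the response field names (`results`, `collections`, `status`) are assumptions, since this page does not specify the response format Mockarty expects. The substring search is a placeholder for a real retrieval backend.

```python
import json
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

# Toy corpus: collection name -> documents. Replace with your retrieval backend.
COLLECTIONS = {
    "api-specs": [
        "Payment service exposes POST /payments",
        "Orders service exposes GET /orders",
    ],
}


def search(query: str, top_k: int) -> list[dict]:
    """Naive substring search; a real system would use embeddings."""
    needle = query.lower()
    hits = [
        {"collection": name, "text": text}
        for name, docs in COLLECTIONS.items()
        for text in docs
        if needle in text.lower()
    ]
    return hits[:top_k]


class RagHandler(BaseHTTPRequestHandler):
    def _send_json(self, status: int, payload) -> None:
        body = json.dumps(payload).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def do_GET(self) -> None:
        if self.path == "/health":
            self._send_json(200, {"status": "ok"})
        elif self.path == "/collections":
            self._send_json(200, {"collections": sorted(COLLECTIONS)})
        else:
            self._send_json(404, {"error": "not found"})

    def do_POST(self) -> None:
        if self.path != "/search":
            self._send_json(404, {"error": "not found"})
            return
        length = int(self.headers.get("Content-Length", 0))
        try:
            request = json.loads(self.rfile.read(length) or b"{}")
            query = str(request["query"])
            top_k = int(request.get("top_k", 5))
        except (ValueError, KeyError, TypeError, AttributeError):
            self._send_json(400, {"error": 'expected a JSON object with a "query" field'})
            return
        self._send_json(200, {"results": search(query, top_k)})

    def log_message(self, format, *args) -> None:
        pass  # silence per-request logging in this sketch


def make_server(port: int = 0) -> ThreadingHTTPServer:
    """Bind to 127.0.0.1; port 0 lets the OS pick a free port."""
    return ThreadingHTTPServer(("127.0.0.1", port), RagHandler)
```

To try it locally, call `make_server(8080).serve_forever()` and point the integration at http://localhost:8080. Authentication is omitted here; add it before exposing such a server beyond your own machine.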

Setup Guide

Method 1: Using Presets (Recommended)

Go to Admin > AI Settings > MCP Marketplace
Find the Knowledge Base (RAG) Presets section at the top
Click on your preferred provider card (e.g., RAGFlow)
Fill in the URL and authentication details
Click Save
The system will auto-discover the search_knowledge and list_knowledge_collections tools

Method 2: Manual Setup

Go to Admin > AI Settings > MCP Marketplace
Click Add Integration
Select RAG Knowledge Base as the source type
Enter your RAG server URL
Configure authentication:
- RAGFlow: Bearer token (API key from RAGFlow settings)
- AnythingLLM: Bearer token (API key from developer settings)
- Dify: Bearer token (API key from Dify settings)
- R2R: Bearer token or API key
- Custom: Depends on your API
Set the Description to include the provider name (e.g., “ragflow”, “anythingllm”) for automatic provider detection
Click Save and then Check Connection
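
Before entering a token in Mockarty, it can help to confirm that the RAG server accepts it. The sketch below builds a Bearer-authenticated search request against the Custom API contract described earlier (POST /search with `query` and `top_k`). The function name `build_search_request` is hypothetical, and RAGFlow, AnythingLLM, Dify, and R2R each use their own endpoint paths, so treat this as a template rather than a provider-specific client.

```python
import json
import urllib.request


def build_search_request(
    base_url: str, api_key: str, query: str, top_k: int = 5
) -> urllib.request.Request:
    """Build (but do not send) a Bearer-authenticated POST /search request."""
    body = json.dumps({"query": query, "top_k": top_k}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/search",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )
```

Passing the returned object to `urllib.request.urlopen` sends the request; an HTTP 401 or 403 response usually points to a wrong or expired key rather than a wrong URL.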

Binding to AI Features

After adding a knowledge base:

Open any AI feature (e.g., Mock Builder, API Tester)
Click the gear icon to open AI Settings
In the MCP Tools section, find your knowledge base
Check the search_knowledge tool
The AI agent will now automatically search your knowledge base when relevant

For a detailed guide on AI settings, MCP tool selection, and custom prompts, see the AI Features documentation.

Use Cases

Mock Generation with Real API Specs

Upload your OpenAPI/Swagger specs to the knowledge base. When you ask the AI to create a mock:

User: "Create a mock for the payment service"
Agent: [searches knowledge base for "payment service API"]
Agent: [finds OpenAPI spec with endpoints, schemas, examples]
Agent: [creates accurate mock with real data structures]

Report Generation in Company Format

Upload report templates and past reports:

User: "Generate a performance test report"
Agent: [searches for "performance test report template"]
Agent: [finds company template with required sections]
Agent: [generates report in the correct format]

Onboarding and Documentation

Upload internal documentation, wikis, and guides:

User: "How do I configure stores in Mockarty?"
Agent: [searches knowledge base for "store configuration"]
Agent: [finds internal guide with team-specific conventions]
Agent: [responds with accurate, team-specific instructions]

Coding Standards Enforcement

Upload coding standards and style guides:

User: "Generate a test script for the orders API"
Agent: [searches for "test script standards"]
Agent: [applies naming conventions, required assertions, etc.]

Architecture

The RAG adapter in the MCP Marketplace:

Auto-detects the provider type from the server name/description
Uses provider-specific API endpoints for search and listing
Formats responses with source attribution (document name, similarity score)
Supports per-user authentication via header overrides
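
The first and third behaviours can be pictured with a short sketch. The adapter's real matching rules and output format are not documented on this page, so the substring matching, the `custom` fallback, and the result fields (`text`, `document`, `score`) below are all assumptions.

```python
# Provider keywords taken from the setup guide; the matching logic is assumed.
KNOWN_PROVIDERS = ("ragflow", "anythingllm", "dify", "r2r")


def detect_provider(name: str, description: str) -> str:
    """Guess the provider from the integration's name and description."""
    haystack = f"{name} {description}".lower()
    for provider in KNOWN_PROVIDERS:
        if provider in haystack:
            return provider
    return "custom"  # fall back to the generic REST contract


def format_results(results: list[dict]) -> str:
    """Render search hits with source attribution for the agent's context."""
    lines = []
    for index, hit in enumerate(results, start=1):
        lines.append(f"{index}. {hit['text']}")
        lines.append(f"   Source: {hit['document']} (similarity {hit['score']:.2f})")
    return "\n".join(lines)
```

Under these assumptions, an integration described as "Company RAGFlow instance" resolves to `ragflow`, which is why the setup guide asks you to include the provider name in the Description field.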

Tips

Start with AnythingLLM for the simplest setup experience; consider RAGFlow when your documents need deeper parsing (tables, images, complex layouts)
Use specific collection names in searches for better precision
Upload focused documents rather than entire knowledge dumps
Keep documents up to date – RAG results are only as good as the source data
Use header overrides if different users need different API keys for the RAG system
Monitor the Token Budget in AI settings – knowledge base results consume context tokens
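
To see why retrieved passages eat into the token budget, consider this rough sketch. It is not how Mockarty enforces the Token Budget setting; the four-characters-per-token estimate is a common approximation for English text and real tokenizers will differ, and both function names are hypothetical.

```python
def estimate_tokens(text: str) -> int:
    """Rough heuristic: about 4 characters per token (an assumption)."""
    return max(1, len(text) // 4)


def fit_to_budget(snippets: list[str], max_tokens: int) -> list[str]:
    """Keep snippets in ranked order until the next one would exceed the budget."""
    kept, used = [], 0
    for snippet in snippets:
        cost = estimate_tokens(snippet)
        if used + cost > max_tokens:
            break  # lower-ranked snippets are dropped
        kept.append(snippet)
        used += cost
    return kept
```

With three 40-character snippets (about 10 tokens each) and a budget of 25 tokens, only the first two fit. Shorter, more focused documents therefore let more distinct sources reach the model.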

Troubleshooting

Connection check fails

Verify the RAG server is running: curl http://your-rag-server:port/health
Check that the URL in Mockarty matches the RAG server’s actual address. Watch for Docker networking issues: when Mockarty itself runs in a container, localhost refers to that container, so use the RAG server’s Docker service name or the host’s address instead
Ensure the API key is correct and has not expired
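
The first two checks can be scripted. The sketch below assumes the server exposes GET /health as in the Custom API contract; the function name `check_health` and the wording of its messages are illustrative. It separates "nothing answered" (wrong address, server down, Docker networking) from "the server answered with an error" (often authentication or a wrong path).

```python
import urllib.error
import urllib.request


def check_health(base_url: str, timeout: float = 5.0) -> str:
    """Probe <base_url>/health and describe the outcome in one line."""
    url = base_url.rstrip("/") + "/health"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return f"ok (HTTP {response.status})"
    except urllib.error.HTTPError as error:
        # The server answered, so the address is right; check auth or the path.
        return f"reachable, but returned HTTP {error.code}"
    except OSError as error:
        # Covers URLError, refused connections, DNS failures, and timeouts.
        return f"unreachable: {error}"
```

Run it from the same machine or container that Mockarty runs in; a URL that works from your laptop may still be unreachable from inside a container.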

AI agent does not use the knowledge base

Make sure the search_knowledge tool is enabled in the AI settings (gear icon) for the specific feature you are using
Check that the knowledge base contains documents relevant to your query – try searching directly in the RAG provider’s UI first
The AI decides when to search based on relevance. If your question is generic, the AI may skip the search. Be specific in your prompts.

Poor search results

Upload smaller, focused documents rather than large monolithic files
Use meaningful file names – some providers index them
If using RAGFlow, experiment with different chunking strategies in the dataset settings
Ensure documents are in a format your provider supports (PDF, DOCX, TXT, and Markdown are the most widely supported; check the provider's documentation for others)
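
One way to turn a large monolithic file into smaller, focused documents is to split it at headings before uploading. The sketch below does this for Markdown; the function names are hypothetical, content before the first matching heading is dropped, and headings inside code fences are not treated specially.

```python
import re


def split_markdown_by_heading(text: str, level: int = 2) -> dict[str, str]:
    """Split a Markdown document into sections at headings of the given level."""
    pattern = re.compile(rf"^{'#' * level} +(.+)$", re.MULTILINE)
    matches = list(pattern.finditer(text))
    sections = {}
    for index, match in enumerate(matches):
        # A section runs until the next heading of the same level, or to the end.
        end = matches[index + 1].start() if index + 1 < len(matches) else len(text)
        sections[match.group(1).strip()] = text[match.start():end].strip()
    return sections


def slugify(title: str) -> str:
    """Derive a meaningful file name from a section title."""
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-") + ".md"
```

Each section can then be saved under `slugify(title)` and uploaded on its own, which also gives every document a meaningful file name.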