RAG Anything MCP Server
An MCP (Model Context Protocol) server that provides comprehensive RAG (Retrieval-Augmented Generation) capabilities for processing and querying directories of documents using the raganything library, with full multimodal support.
Features
- End-to-End Document Processing: Complete document parsing with multimodal content extraction
- Multimodal RAG: Support for images, tables, equations, and text processing
- Batch Processing: Process entire directories with multiple file types
- Advanced Querying: Both pure text and multimodal-enhanced queries
- Multiple Query Modes: hybrid, local, global, naive, mix, and bypass modes
- Vision Processing: Advanced image analysis using GPT-4o's vision capabilities
- Persistent Storage: RAG instances maintained per directory for efficient querying
Available Tools
process_directory
Process all files in a directory for comprehensive RAG indexing with multimodal support.
Required Parameters:
- directory_path: Path to the directory containing files to process
- api_key: OpenAI API key for LLM and embedding functions

Optional Parameters:
- working_dir: Custom working directory for RAG storage
- base_url: OpenAI API base URL (for custom endpoints)
- file_extensions: List of file extensions to process (default: ['.pdf', '.docx', '.pptx', '.txt', '.md'])
- recursive: Process subdirectories (default: True)
- enable_image_processing: Enable image analysis (default: True)
- enable_table_processing: Enable table extraction (default: True)
- enable_equation_processing: Enable equation processing (default: True)
- max_workers: Concurrent processing workers (default: 4)
process_single_document
Process a single document with full multimodal analysis.
Required Parameters:
- file_path: Path to the document to process
- api_key: OpenAI API key

Optional Parameters:
- working_dir: Custom working directory for RAG storage
- base_url: OpenAI API base URL
- output_dir: Output directory for parsed content
- parse_method: Document parsing method (default: "auto")
- enable_image_processing: Enable image analysis (default: True)
- enable_table_processing: Enable table extraction (default: True)
- enable_equation_processing: Enable equation processing (default: True)
query_directory
Pure text query against processed documents using LightRAG.
Parameters:
- directory_path: Path to the processed directory
- query: The question to ask about the documents
- mode: Query mode - "hybrid", "local", "global", "naive", "mix", or "bypass" (default: "hybrid")
query_with_multimodal_content
Enhanced query with additional multimodal content (tables, equations, etc.).
Parameters:
- directory_path: Path to the processed directory
- query: The question to ask
- multimodal_content: List of multimodal content dictionaries
- mode: Query mode (default: "hybrid")
Example multimodal_content:
[
  {
    "type": "table",
    "table_data": "Method,Accuracy\\nRAGAnything,95.2%\\nBaseline,87.3%",
    "table_caption": "Performance comparison"
  },
  {
    "type": "equation",
    "latex": "P(d|q) = \\frac{P(q|d) \\cdot P(d)}{P(q)}",
    "equation_caption": "Document relevance probability"
  }
]
list_processed_directories
List all directories that have been processed and are available for querying.
get_rag_info
Get detailed information about the RAG configuration and status for a directory.
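A typical inspection sequence might look like the sketch below; note that the directory_path parameter name for get_rag_info is assumed here to mirror the other tools:

list_processed_directories()

get_rag_info(
    directory_path="/path/to/documents"
)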
Usage Examples
1. Basic Directory Processing
process_directory(
directory_path="/path/to/documents",
api_key="your-openai-api-key"
)
2. Advanced Directory Processing
process_directory(
directory_path="/path/to/research_papers",
api_key="your-openai-api-key",
file_extensions=[".pdf", ".docx"],
enable_image_processing=True,
enable_table_processing=True,
max_workers=6
)
3. Pure Text Query
query_directory(
directory_path="/path/to/documents",
query="What are the main findings in these research papers?",
mode="hybrid"
)
4. Multimodal Query with Table Data
query_with_multimodal_content(
directory_path="/path/to/documents",
query="Compare these results with the document findings",
multimodal_content=[{
"type": "table",
"table_data": "Method,Accuracy,Speed\\nRAGAnything,95.2%,120ms\\nBaseline,87.3%,180ms",
"table_caption": "Performance comparison"
}],
mode="hybrid"
)
5. Single Document Processing
process_single_document(
file_path="/path/to/important_paper.pdf",
api_key="your-openai-api-key",
enable_image_processing=True
)
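6. End-to-End Workflow
A directory must be processed before it can be queried. The sketch below simply chains the two documented tools; the path and question are placeholders:

process_directory(
    directory_path="/path/to/documents",
    api_key="your-openai-api-key"
)

query_directory(
    directory_path="/path/to/documents",
    query="Which methods are evaluated, and on which datasets?",
    mode="hybrid"
)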
Setup Requirements
1. Environment Variables
export OPENAI_API_KEY="your-openai-api-key-here"
2. Install Dependencies
uv sync
3. Run the MCP Server
python main.py
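To exercise the server programmatically rather than through an MCP-enabled client application, a minimal test script could look like the sketch below. It assumes the official MCP Python SDK (the mcp package) and its stdio client API, and that main.py is the server entry point as shown above:

import asyncio
from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

async def main():
    # Launch the server as a stdio subprocess (entry point from step 3 above)
    params = StdioServerParameters(command="python", args=["main.py"])
    async with stdio_client(params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            # List the exposed tools (should include process_directory, query_directory, ...)
            tools = await session.list_tools()
            print([tool.name for tool in tools.tools])
            # Call a tool that needs no prior processing
            result = await session.call_tool("list_processed_directories", arguments={})
            print(result)

asyncio.run(main())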
Query Modes Explained
- hybrid: Combines local and global search (recommended for most use cases)
- local: Focuses on local context and entity relationships
- global: Provides broader, document-level insights and summaries
- naive: Simple keyword-based search without graph reasoning
- mix: Combines multiple approaches for comprehensive results
- bypass: Direct access without RAG processing
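If you are unsure which mode fits a question, run the same query in several modes and compare the answers, for example (path and question are placeholders):

for mode in ["hybrid", "local", "global", "naive"]:
    query_directory(
        directory_path="/path/to/documents",
        query="Summarize the main contributions of these papers.",
        mode=mode
    )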
Multimodal Content Types
The server supports processing and querying with:
- Images: Automatic caption generation and visual analysis
- Tables: Structure extraction and content analysis
- Equations: LaTeX parsing and mathematical reasoning
- Charts/Graphs: Visual data interpretation
- Mixed Content: Combined analysis of multiple content types
API Configuration
The server uses OpenAI's APIs by default:
- LLM: GPT-4o-mini for text processing
- Vision: GPT-4o for image analysis
- Embeddings: text-embedding-3-large (3072 dimensions)
You can customize the base_url parameter to use:
- Azure OpenAI
- OpenAI-compatible APIs
- Custom model endpoints
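For example, pointing the server at an OpenAI-compatible endpoint only requires passing base_url alongside the API key; the URL below is a placeholder:

process_directory(
    directory_path="/path/to/documents",
    api_key="your-api-key",
    base_url="https://your-endpoint.example.com/v1"
)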
File Support
Supported file formats include:
- PDF documents
- Microsoft Word (.docx)
- PowerPoint presentations (.pptx)
- Text files (.txt)
- Markdown files (.md)
- And more via the raganything library
Performance Notes
- Concurrent Processing: Use max_workers to control parallel document processing
- Memory Usage: Large documents with many images may require significant memory
- API Costs: Vision processing (GPT-4o) is more expensive than text processing
- Storage: Processed data is stored locally for efficient re-querying