MKhalusova
MCP Server

Unstructured Document Processor MCP

A Model Context Protocol server that enables LLMs to extract and use content from unstructured documents across a wide variety of file formats.

GitHub Stars: 6
Last Updated: 8/23/2025

README Documentation

A Model Context Protocol server that provides unstructured document processing capabilities. This server enables LLMs to extract and use content from an unstructured document.

This repo is a work in progress; proceed with caution :)

Supported file types:

{".abw", ".bmp", ".csv", ".cwk", ".dbf", ".dif", ".doc", ".docm", ".docx", ".dot",
 ".dotm", ".eml", ".epub", ".et", ".eth", ".fods", ".gif", ".heic", ".htm", ".html",
 ".hwp", ".jpeg", ".jpg", ".md", ".mcw", ".mw", ".odt", ".org", ".p7s", ".pages",
 ".pbd", ".pdf", ".png", ".pot", ".potm", ".ppt", ".pptm", ".pptx", ".prn", ".rst",
 ".rtf", ".sdp", ".sgl", ".svg", ".sxg", ".tiff", ".txt", ".tsv", ".uof", ".uos1",
 ".uos2", ".web", ".webp", ".wk2", ".xls", ".xlsb", ".xlsm", ".xlsx", ".xlw", ".xml",
 ".zabw"}

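Before sending a file to the server, it can be handy to check its extension against the set above. The following is a minimal sketch, not code from this repo: the helper name `is_supported` is hypothetical, and treating extensions case-insensitively is an assumption.

```python
from pathlib import Path

# Extensions copied verbatim from the supported file types listed above.
SUPPORTED_EXTENSIONS = {
    ".abw", ".bmp", ".csv", ".cwk", ".dbf", ".dif", ".doc", ".docm", ".docx", ".dot",
    ".dotm", ".eml", ".epub", ".et", ".eth", ".fods", ".gif", ".heic", ".htm", ".html",
    ".hwp", ".jpeg", ".jpg", ".md", ".mcw", ".mw", ".odt", ".org", ".p7s", ".pages",
    ".pbd", ".pdf", ".png", ".pot", ".potm", ".ppt", ".pptm", ".pptx", ".prn", ".rst",
    ".rtf", ".sdp", ".sgl", ".svg", ".sxg", ".tiff", ".txt", ".tsv", ".uof", ".uos1",
    ".uos2", ".web", ".webp", ".wk2", ".xls", ".xlsb", ".xlsm", ".xlsx", ".xlw", ".xml",
    ".zabw",
}


def is_supported(path: str) -> bool:
    """Hypothetical helper: True if the file's extension is in the set above.

    Assumption: matching is case-insensitive, so "REPORT.PDF" counts as ".pdf".
    """
    return Path(path).suffix.lower() in SUPPORTED_EXTENSIONS
```

For example, `is_supported("notes.docx")` is true, while a file with no extension or with an extension such as `.zip` is rejected.
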
Prerequisites: You'll need uv, an Unstructured API key, and Claude Desktop.

Quick TLDR on how to add this MCP to your Claude Desktop:

  1. Clone the repo and set up the UV environment.
  2. Create a .env file in the root directory and add the following env variable: UNSTRUCTURED_API_KEY.
  3. Run the MCP server: uv run doc_processor.py
  4. On macOS, go to ~/Library/Application Support/Claude/ and create claude_desktop_config.json if it does not already exist. In that file, add:
{
    "mcpServers": {
        "unstructured_doc_processor": {
            "command": "PATH/TO/YOUR/UV",
            "args": [
                "--directory",
                "ABSOLUTE/PATH/TO/YOUR/unstructured-mcp/",
                "run",
                "doc_processor.py"
            ],
            "disabled": false
        }
    }
}
  5. Restart Claude Desktop. You should now be able to use the MCP.
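
If claude_desktop_config.json already lists other MCP servers, pasting the JSON above over the file would drop them. The sketch below shows one way to merge the entry in with Python's standard library. It is not part of this repo: the function names `add_server` and `update_config_file` are hypothetical, and the uv path and repo directory stay as placeholders you must fill in yourself.

```python
import json
from pathlib import Path


def add_server(config: dict, uv_path: str, repo_dir: str) -> dict:
    """Return a copy of `config` with the unstructured_doc_processor entry added.

    Other entries under "mcpServers" are preserved. Hypothetical helper.
    """
    servers = dict(config.get("mcpServers", {}))
    servers["unstructured_doc_processor"] = {
        "command": uv_path,
        "args": ["--directory", repo_dir, "run", "doc_processor.py"],
        "disabled": False,
    }
    return {**config, "mcpServers": servers}


def update_config_file(config_path: Path, uv_path: str, repo_dir: str) -> None:
    """Read the config file if it exists, merge the entry, and write it back."""
    existing = json.loads(config_path.read_text()) if config_path.exists() else {}
    config_path.parent.mkdir(parents=True, exist_ok=True)
    merged = add_server(existing, uv_path, repo_dir)
    config_path.write_text(json.dumps(merged, indent=4))


# Example call (placeholders as in the JSON above; replace before running):
# update_config_file(
#     Path.home() / "Library/Application Support/Claude/claude_desktop_config.json",
#     "PATH/TO/YOUR/UV",
#     "ABSOLUTE/PATH/TO/YOUR/unstructured-mcp/",
# )
```

Building the merged dictionary separately from the file write makes it easy to inspect the result before touching the real config.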
