Tathagat017

FastMCP Document Analyzer

A comprehensive document analysis server that performs sentiment analysis, keyword extraction, readability scoring, and text statistics while providing document management capabilities including storage, search, and organization.

Last Updated: 11/17/2025

🔍 FastMCP Document Analyzer

A comprehensive document analysis server built with the modern FastMCP framework

📋 Table of Contents

  • 🌟 Features
  • 🚀 Quick Start
  • 📦 Installation
  • 🔧 Usage
  • 🛠️ Available Tools
  • 📊 Sample Data
  • 🏗️ Project Structure
  • 🔄 API Reference
  • 🧪 Testing
  • 📚 Documentation
  • 🤝 Contributing

🌟 Features

📖 Document Analysis

  • 🎭 Sentiment Analysis: VADER + TextBlob dual-engine sentiment classification
  • 🔑 Keyword Extraction: TF-IDF and frequency-based keyword identification
  • 📚 Readability Scoring: Multiple metrics (Flesch, Flesch-Kincaid, ARI)
  • 📊 Text Statistics: Word count, sentences, paragraphs, and more

🗂️ Document Management

  • 💾 Persistent Storage: JSON-based document collection with metadata
  • 🔍 Smart Search: TF-IDF semantic similarity search
  • 🏷️ Tag System: Category and tag-based organization
  • 📈 Collection Insights: Comprehensive statistics and analytics

🚀 FastMCP Advantages

  • ⚡ Simple Setup: 90% less boilerplate than standard MCP
  • 🔒 Type Safety: Full type validation with Pydantic
  • 🎯 Modern API: Decorator-based tool definitions
  • 🌐 Multi-Transport: STDIO, HTTP, and SSE support

🚀 Quick Start

1. Clone and Setup

git clone <repository-url>
cd document-analyzer
python -m venv venv
source venv/Scripts/activate  # Windows (Git Bash)
# source venv/bin/activate    # macOS/Linux

2. Install Dependencies

pip install -r requirements.txt

3. Initialize NLTK Data

python -c "import nltk; nltk.download('punkt'); nltk.download('vader_lexicon'); nltk.download('stopwords'); nltk.download('punkt_tab')"

4. Run the Server

python fastmcp_document_analyzer.py

5. Test Everything

python test_fastmcp_analyzer.py

📦 Installation

System Requirements

  • Python 3.10 or higher (FastMCP 2.x does not support older versions)
  • 500MB free disk space
  • Internet connection (for initial NLTK data download)

Dependencies

fastmcp>=2.3.0      # Modern MCP framework
textblob>=0.17.1    # Sentiment analysis
nltk>=3.8.1         # Natural language processing
textstat>=0.7.3     # Readability metrics
scikit-learn>=1.3.0 # Machine learning utilities
numpy>=1.24.0       # Numerical computing
pandas>=2.0.0       # Data manipulation
python-dateutil>=2.8.2  # Date handling

Optional: Virtual Environment

# Create virtual environment
python -m venv venv

# Activate (Windows)
venv\Scripts\activate

# Activate (macOS/Linux)
source venv/bin/activate

# Install dependencies
pip install -r requirements.txt

🔧 Usage

Starting the Server

Default (STDIO Transport)

python fastmcp_document_analyzer.py

HTTP Transport (for web services)

python fastmcp_document_analyzer.py --transport http --port 9000

With Custom Host

python fastmcp_document_analyzer.py --transport http --host 0.0.0.0 --port 8080
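
The flags above imply a small command-line parser in front of the server. The following is a minimal sketch of such a parser using only the standard library. The `build_parser` helper, the accepted transport names, and the default host and port are assumptions for illustration, not the project's actual code.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser for the --transport, --host and --port flags."""
    parser = argparse.ArgumentParser(description="FastMCP Document Analyzer")
    # Transport names are assumed from the "Multi-Transport" feature list.
    parser.add_argument("--transport", choices=["stdio", "http", "sse"], default="stdio")
    # Default host and port are placeholders chosen for this sketch.
    parser.add_argument("--host", default="127.0.0.1")
    parser.add_argument("--port", type=int, default=8000)
    return parser


# Parse an explicit argument list instead of sys.argv so the sketch is repeatable.
args = build_parser().parse_args(["--transport", "http", "--port", "9000"])
print(args.transport, args.host, args.port)
```

Binding to 0.0.0.0 exposes the server on every network interface, so a loopback default is the safer choice for local development.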

Basic Usage Examples

# Analyze a document
result = analyze_document("doc_001")
print(f"Sentiment: {result['sentiment_analysis']['overall_sentiment']}")

# Extract keywords
keywords = extract_keywords("Artificial intelligence is transforming healthcare", 5)
print([kw['keyword'] for kw in keywords])

# Search documents
results = search_documents("machine learning", 3)
print(f"Found {len(results)} relevant documents")

# Get collection statistics
stats = get_collection_stats()
print(f"Total documents: {stats['total_documents']}")

🛠️ Available Tools

Core Analysis Tools

| Tool | Description | Example |
|------|-------------|---------|
| analyze_document | 🔍 Complete document analysis | analyze_document("doc_001") |
| get_sentiment | 😊 Sentiment analysis | get_sentiment("I love this!") |
| extract_keywords | 🔑 Keyword extraction | extract_keywords(text, 10) |
| calculate_readability | 📖 Readability metrics | calculate_readability(text) |

Document Management Tools

| Tool | Description | Example |
|------|-------------|---------|
| add_document | 📝 Add new document | add_document("id", "title", "content") |
| get_document | 📄 Retrieve document | get_document("doc_001") |
| delete_document | 🗑️ Delete document | delete_document("old_doc") |
| list_documents | 📋 List all documents | list_documents("Technology") |

Search and Discovery Tools

| Tool | Description | Example |
|------|-------------|---------|
| search_documents | 🔍 Semantic search | search_documents("AI", 5) |
| search_by_tags | 🏷️ Tag-based search | search_by_tags(["AI", "tech"]) |
| get_collection_stats | 📊 Collection statistics | get_collection_stats() |
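
Tag-based search can be pictured as a set intersection between the requested tags and each document's tags. The sketch below is one possible reading; the case-insensitive, match-any behaviour and the extra `documents` parameter are assumptions, and the real `search_by_tags` tool may match differently.

```python
from typing import Any, Dict, List


def search_by_tags(documents: List[Dict[str, Any]], tags: List[str]) -> List[Dict[str, Any]]:
    """Return documents that share at least one tag with `tags` (case-insensitive)."""
    wanted = {tag.lower() for tag in tags}
    matches = []
    for doc in documents:
        doc_tags = {tag.lower() for tag in doc.get("tags", [])}
        if wanted & doc_tags:  # non-empty intersection means at least one shared tag
            matches.append(doc)
    return matches


sample = [
    {"id": "doc_001", "tags": ["AI", "technology", "future", "ethics"]},
    {"id": "doc_002", "tags": ["ocean", "conservation"]},
]
print([doc["id"] for doc in search_by_tags(sample, ["AI", "tech"])])
```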

📊 Sample Data

The server comes pre-loaded with 16 diverse documents covering:

| Category | Documents | Topics |
|----------|-----------|--------|
| Technology | 4 | AI, Quantum Computing, Privacy, Blockchain |
| Science | 3 | Space Exploration, Healthcare, Ocean Conservation |
| Environment | 2 | Climate Change, Sustainable Agriculture |
| Society | 3 | Remote Work, Mental Health, Transportation |
| Business | 2 | Economics, Digital Privacy |
| Culture | 2 | Art History, Wellness |

Sample Document Structure

{
  "id": "doc_001",
  "title": "The Future of Artificial Intelligence",
  "content": "Artificial intelligence is rapidly transforming...",
  "author": "Dr. Sarah Chen",
  "category": "Technology",
  "tags": ["AI", "technology", "future", "ethics"],
  "language": "en",
  "created_at": "2024-01-15T10:30:00"
}
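
Documents with this structure can be persisted as plain JSON. The sketch below shows one way to load and save a collection with the standard library. The on-disk layout (a single object keyed by document ID) and the helper names `load_documents` and `save_documents` are assumptions; the actual layout of documents.json is not shown in this README.

```python
import json
from pathlib import Path
from typing import Any, Dict

Collection = Dict[str, Dict[str, Any]]


def load_documents(path: Path) -> Collection:
    """Read the collection from disk, or return an empty one if the file is missing."""
    if not path.exists():
        return {}
    return json.loads(path.read_text(encoding="utf-8"))


def save_documents(path: Path, documents: Collection) -> None:
    """Write the collection as indented UTF-8 JSON."""
    path.write_text(json.dumps(documents, indent=2, ensure_ascii=False), encoding="utf-8")
```

A round trip then looks like `save_documents(Path("documents.json"), docs)` followed by `load_documents(Path("documents.json"))`.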

🏗️ Project Structure

document-analyzer/
├── 📁 analyzer/                    # Core analysis engine
│   ├── __init__.py
│   └── document_analyzer.py       # Sentiment, keywords, readability
├── 📁 storage/                     # Document storage system
│   ├── __init__.py
│   └── document_storage.py        # JSON storage, search, management
├── 📁 data/                        # Sample data
│   ├── __init__.py
│   └── sample_documents.py        # 16 sample documents
├── 📄 fastmcp_document_analyzer.py # 🌟 Main FastMCP server
├── 📄 test_fastmcp_analyzer.py    # Comprehensive test suite
├── 📄 requirements.txt            # Python dependencies
├── 📄 documents.json              # Persistent document storage
├── 📄 README.md                   # This documentation
├── 📄 FASTMCP_COMPARISON.md       # FastMCP vs Standard MCP
├── 📄 .gitignore                  # Git ignore patterns
└── 📁 venv/                       # Virtual environment (optional)

🔄 API Reference

Document Analysis

analyze_document(document_id: str) -> Dict[str, Any]

Performs comprehensive analysis of a document.

Parameters:

  • document_id (str): Unique document identifier

Returns:

{
  "document_id": "doc_001",
  "title": "Document Title",
  "sentiment_analysis": {
    "overall_sentiment": "positive",
    "confidence": 0.85,
    "vader_scores": {...},
    "textblob_scores": {...}
  },
  "keywords": [
    {"keyword": "artificial", "frequency": 5, "relevance_score": 2.3}
  ],
  "readability": {
    "flesch_reading_ease": 45.2,
    "reading_level": "Difficult",
    "grade_level": "Grade 12"
  },
  "basic_statistics": {
    "word_count": 119,
    "sentence_count": 8,
    "paragraph_count": 1
  }
}
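
The `basic_statistics` block in the response can be approximated with a few regular expressions. This is a rough sketch, assuming sentences end at `.`, `!` or `?` and paragraphs are separated by blank lines. The project lists NLTK among its dependencies, so its real counts may come from NLTK tokenizers and differ from these.

```python
import re
from typing import Dict


def basic_statistics(text: str) -> Dict[str, int]:
    """Count words, sentences and paragraphs with simple regex heuristics."""
    # A word is a run of word characters, optionally with one apostrophe part ("don't").
    words = re.findall(r"\w+(?:'\w+)?", text)
    # Split on runs of sentence-ending punctuation and drop empty fragments.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    # Paragraphs are blocks separated by one or more blank lines.
    paragraphs = [p for p in re.split(r"\n\s*\n", text) if p.strip()]
    return {
        "word_count": len(words),
        "sentence_count": len(sentences),
        "paragraph_count": len(paragraphs),
    }


print(basic_statistics("AI is changing healthcare. It helps doctors!\n\nWill it replace them?"))
```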

get_sentiment(text: str) -> Dict[str, Any]

Analyzes sentiment of any text.

Parameters:

  • text (str): Text to analyze

Returns:

{
  "overall_sentiment": "positive",
  "confidence": 0.85,
  "vader_scores": {
    "compound": 0.7269,
    "positive": 0.294,
    "negative": 0.0,
    "neutral": 0.706
  },
  "textblob_scores": {
    "polarity": 0.5,
    "subjectivity": 0.6
  }
}

Document Management

add_document(...) -> Dict[str, str]

Adds a new document to the collection.

Parameters:

  • id (str): Unique document ID
  • title (str): Document title
  • content (str): Document content
  • author (str, optional): Author name
  • category (str, optional): Document category
  • tags (List[str], optional): Tags list
  • language (str, optional): Language code

Returns:

{
  "status": "success",
  "message": "Document 'my_doc' added successfully",
  "document_count": 17
}
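
To make the parameters and the response shape concrete, here is a minimal in-memory sketch. The explicit `store` argument, the `doc_id` name (used instead of `id` to avoid shadowing the built-in), the default values, and the duplicate-ID error response are all assumptions; only the success response mirrors the example above.

```python
from datetime import datetime
from typing import Any, Dict, List, Optional


def add_document(
    store: Dict[str, Dict[str, Any]],
    doc_id: str,
    title: str,
    content: str,
    author: str = "Unknown",       # assumed default
    category: str = "General",     # assumed default
    tags: Optional[List[str]] = None,
    language: str = "en",
) -> Dict[str, Any]:
    """Insert a document into `store` and report the outcome."""
    if doc_id in store:
        # Rejecting duplicates is a design choice of this sketch.
        return {
            "status": "error",
            "message": f"Document '{doc_id}' already exists",
            "document_count": len(store),
        }
    store[doc_id] = {
        "id": doc_id,
        "title": title,
        "content": content,
        "author": author,
        "category": category,
        "tags": list(tags or []),
        "language": language,
        "created_at": datetime.now().isoformat(timespec="seconds"),
    }
    return {
        "status": "success",
        "message": f"Document '{doc_id}' added successfully",
        "document_count": len(store),
    }
```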

Search and Discovery

search_documents(query: str, limit: int = 10) -> List[Dict[str, Any]]

Performs semantic search across documents.

Parameters:

  • query (str): Search query
  • limit (int): Maximum results

Returns:

[
  {
    "id": "doc_001",
    "title": "AI Document",
    "similarity_score": 0.8542,
    "content_preview": "First 200 characters...",
    "tags": ["AI", "technology"]
  }
]
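
Given a list of (document, score) pairs, result records like the one above can be assembled in a few lines. The sketch assumes scores are rounded to four decimals and previews are cut at 200 characters, as the example suggests; the `format_results` helper and the rule of dropping zero-score documents are hypothetical.

```python
from typing import Any, Dict, List, Tuple


def format_results(
    scored: List[Tuple[Dict[str, Any], float]], limit: int = 10
) -> List[Dict[str, Any]]:
    """Rank (document, score) pairs and shape them like the search response."""
    # Keep only documents with some similarity, best match first.
    relevant = [(doc, score) for doc, score in scored if score > 0]
    relevant.sort(key=lambda pair: pair[1], reverse=True)
    return [
        {
            "id": doc["id"],
            "title": doc["title"],
            "similarity_score": round(score, 4),
            "content_preview": doc["content"][:200],
            "tags": doc.get("tags", []),
        }
        for doc, score in relevant[:limit]
    ]
```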

🧪 Testing

Run All Tests

python test_fastmcp_analyzer.py

Test Categories

  • ✅ Server Initialization: FastMCP server setup
  • ✅ Sentiment Analysis: VADER and TextBlob integration
  • ✅ Keyword Extraction: TF-IDF and frequency analysis
  • ✅ Readability Calculation: Multiple readability metrics
  • ✅ Document Analysis: Full document processing
  • ✅ Document Search: Semantic similarity search
  • ✅ Collection Statistics: Analytics and insights
  • ✅ Document Management: CRUD operations
  • ✅ Tag Search: Tag-based filtering

Expected Test Output

=== Testing FastMCP Document Analyzer ===

✓ FastMCP server module imported successfully
✓ Server initialized successfully
✓ Sentiment analysis working
✓ Keyword extraction working
✓ Readability calculation working
✓ Document analysis working
✓ Document search working
✓ Collection statistics working
✓ Document listing working
✓ Document addition and deletion working
✓ Tag search working

=== All FastMCP tests completed successfully! ===

📚 Documentation

Additional Resources

  • 📖 FastMCP Documentation
  • 📖 MCP Protocol Specification
  • 📖 FASTMCP_COMPARISON.md - FastMCP vs Standard MCP

Key Concepts

Sentiment Analysis

Uses dual-engine approach:

  • VADER: Rule-based, excellent for social media text
  • TextBlob: Lexicon-based by default (its pattern analyzer), reporting polarity and subjectivity; good for general text
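
One simple way to merge two engines into a single label is to average their signed scores and apply a threshold. The sketch below does exactly that with VADER's compound score and TextBlob's polarity, both of which range from -1 to 1. The averaging rule, the 0.05 threshold, and the confidence definition are assumptions for illustration and are not the formula this project uses.

```python
from typing import Dict, Union


def combine_sentiment(
    vader_compound: float, textblob_polarity: float, threshold: float = 0.05
) -> Dict[str, Union[str, float]]:
    """Average two polarity scores in [-1, 1] and map the result to a label."""
    score = (vader_compound + textblob_polarity) / 2
    if score >= threshold:
        label = "positive"
    elif score <= -threshold:
        label = "negative"
    else:
        label = "neutral"
    # Here "confidence" is just the magnitude of the averaged score.
    return {"overall_sentiment": label, "confidence": round(min(abs(score), 1.0), 2)}


print(combine_sentiment(0.7269, 0.5))
```

Averaging dampens cases where the engines disagree: a strongly positive VADER score paired with a mildly negative TextBlob polarity lands closer to neutral than either engine alone would suggest.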

Keyword Extraction

Combines multiple approaches:

  • TF-IDF: Term frequency-inverse document frequency
  • Frequency Analysis: Simple word frequency counting
  • Relevance Scoring: Weighted combination of both methods
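
The three ideas above can be combined in a short standard-library sketch. It counts word frequencies, computes a smoothed TF-IDF weight against a small background corpus, and blends the two normalized values with an equal weight. The stopword list, the `corpus` parameter, the 50/50 weighting, and the normalization are assumptions; the project's own scoring may differ.

```python
import math
import re
from collections import Counter
from typing import Any, Dict, List

# Tiny illustrative stopword list; the project lists NLTK's stopwords corpus instead.
STOPWORDS = {"a", "an", "and", "are", "for", "in", "is", "it", "of", "on", "the", "to", "with"}


def tokenize(text: str) -> List[str]:
    """Lowercase, keep alphabetic words longer than two letters, drop stopwords."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if len(w) > 2 and w not in STOPWORDS]


def extract_keywords(
    text: str, corpus: List[str], top_n: int = 5, weight: float = 0.5
) -> List[Dict[str, Any]]:
    """Rank words by a blend of raw frequency and TF-IDF against `corpus`."""
    tokens = tokenize(text)
    if not tokens:
        return []
    freq = Counter(tokens)
    doc_sets = [set(tokenize(doc)) for doc in corpus]
    scored = []
    for word, count in freq.items():
        df = sum(1 for terms in doc_sets if word in terms)
        idf = math.log((1 + len(corpus)) / (1 + df)) + 1  # smoothed, always >= 1
        scored.append((word, count, (count / len(tokens)) * idf))
    max_freq = max(freq.values())
    max_tfidf = max(tfidf for _, _, tfidf in scored)
    results = [
        {
            "keyword": word,
            "frequency": count,
            # Both parts are scaled to [0, 1] before blending.
            "relevance_score": round(
                weight * count / max_freq + (1 - weight) * tfidf / max_tfidf, 4
            ),
        }
        for word, count, tfidf in scored
    ]
    results.sort(key=lambda item: (-item["relevance_score"], item["keyword"]))
    return results[:top_n]
```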

Readability Metrics

Provides multiple readability scores:

  • Flesch Reading Ease: 0-100 scale (higher = easier)
  • Flesch-Kincaid Grade: US grade level
  • ARI: Automated Readability Index
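
All three scores are closed-form formulas over simple counts. The functions below implement the standard published formulas and take the counts as inputs; how words, sentences, and syllables are counted is left out, since the project delegates that to textstat. The function names are chosen for this sketch.

```python
def flesch_reading_ease(words: int, sentences: int, syllables: int) -> float:
    """Higher is easier; typical English prose falls roughly between 0 and 100."""
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)


def flesch_kincaid_grade(words: int, sentences: int, syllables: int) -> float:
    """Approximate US school grade needed to read the text."""
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


def automated_readability_index(characters: int, words: int, sentences: int) -> float:
    """Grade-level estimate based on characters per word instead of syllables."""
    return 4.71 * (characters / words) + 0.5 * (words / sentences) - 21.43


# Example: 100 words, 5 sentences, 150 syllables, 500 letters and digits.
print(round(flesch_reading_ease(100, 5, 150), 3))
print(round(flesch_kincaid_grade(100, 5, 150), 2))
print(round(automated_readability_index(500, 100, 5), 2))
```

Longer sentences and longer words lower the Flesch Reading Ease score and raise both grade-level estimates.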

Document Search

Uses TF-IDF vectorization with cosine similarity:

  • Converts documents to numerical vectors
  • Calculates similarity between query and documents
  • Returns ranked results with similarity scores
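
The three steps above can be sketched with the standard library alone. The project lists scikit-learn among its dependencies and most likely uses its vectorizer; this version spells out the arithmetic for clarity, using a smoothed IDF of the same form as scikit-learn's default. The function names and the tokenizer are assumptions made for this example.

```python
import math
import re
from collections import Counter
from typing import Dict, List, Tuple

Vector = Dict[str, float]


def tokenize(text: str) -> List[str]:
    return re.findall(r"[a-z0-9]+", text.lower())


def tfidf_vectors(texts: List[str]) -> Tuple[List[Vector], Dict[str, float]]:
    """Step 1: convert each text to a sparse TF-IDF vector."""
    tokenized = [tokenize(text) for text in texts]
    n_docs = len(tokenized)
    df = Counter(term for tokens in tokenized for term in set(tokens))
    idf = {term: math.log((1 + n_docs) / (1 + count)) + 1 for term, count in df.items()}
    vectors = [
        {term: count * idf[term] for term, count in Counter(tokens).items()}
        for tokens in tokenized
    ]
    return vectors, idf


def cosine(a: Vector, b: Vector) -> float:
    """Step 2: cosine similarity between two sparse vectors."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(query: str, documents: Dict[str, str], limit: int = 10) -> List[Tuple[str, float]]:
    """Step 3: rank document IDs by similarity to the query."""
    vectors, idf = tfidf_vectors(list(documents.values()))
    # Query terms unseen in the collection carry no weight.
    q_vec = {t: c * idf[t] for t, c in Counter(tokenize(query)).items() if t in idf}
    scores = [(doc_id, cosine(q_vec, vec)) for doc_id, vec in zip(documents, vectors)]
    ranked = sorted((p for p in scores if p[1] > 0), key=lambda p: p[1], reverse=True)
    return ranked[:limit]
```

Because the match is based on shared terms, a query only finds documents that contain at least one of its words; synonyms are not matched.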

🤝 Contributing

Development Setup

# Clone repository
git clone <repository-url>
cd document-analyzer

# Create development environment
python -m venv venv
source venv/Scripts/activate  # Windows (Git Bash)
# source venv/bin/activate    # macOS/Linux
pip install -r requirements.txt

# Run tests
python test_fastmcp_analyzer.py

Adding New Tools

FastMCP makes it easy to add new tools:

from typing import Any, Dict

# `mcp` is assumed to be the FastMCP instance defined in the main server file
@mcp.tool
def my_new_tool(param: str) -> Dict[str, Any]:
    """
    🔧 Description of what this tool does.

    Args:
        param: Parameter description

    Returns:
        Return value description
    """
    # Implementation here
    return {"result": "success"}

Code Style

  • Use type hints for all functions
  • Add comprehensive docstrings
  • Include error handling
  • Follow PEP 8 style guidelines
  • Add emoji icons for better readability

Testing New Features

  1. Add your tool to the main server file
  2. Create test cases in the test file
  3. Run the test suite to ensure everything works
  4. Update documentation as needed

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

🙏 Acknowledgments

  • FastMCP Team for the excellent framework
  • NLTK Team for natural language processing tools
  • TextBlob Team for sentiment analysis capabilities
  • Scikit-learn Team for machine learning utilities

Made with ❤️ using FastMCP

🚀 Ready to analyze documents? Start with python fastmcp_document_analyzer.py
