Gremlin Web Scraper MCP

A lightweight HTTP module that enables scraping visible text from any publicly accessible webpage, integrating directly with VS Code's MCP system.

GitHub Stars: 1
Last Updated: 8/18/2025

GremlinScraper is a lightweight HTTP MCP module designed to scrape visible text from any publicly accessible webpage. It runs locally, integrates directly with VS Code’s MCP system, and speaks plain JSON.

This is Part 1 of the GremlinOS Runtime Suite from StatikFinTech LLC.

The world’s first RS-RACS
Recursive, Self-Referential Autonomous Cognitive System

Support This Project

If you find this project valuable and want to support its continued development, consider sponsoring or contributing.

Funding and Infrastructure Goals

GremlinGPT is growing. It learns (as do I while finishing GremlinGPT). It launches (soon). The project is reaching the limits of what a solo builder can finish without external support. The next stage (deployment, agent orchestration, and memory stability) needs an infrastructure investment and a move into full-time development.

Funding Target is $500,000

Funds will be used to secure:

  • A trading account, to make the move into full-time development possible
  • A dedicated small GPU cluster (RTX 4090 × 2 minimum)
  • A persistent vector DB and hosted runtime servers for others who can't afford a system
  • A secure DevOps pipeline for offline + encrypted agents, for those who can't secure their own hardware

If You Are

  • A founder with cloud real estate and idle GPUs
  • A data center operator who understands sovereign AI
  • An investor looking for a stake in recursive autonomy

🧬 Paging:
@elonmusk
@openai
@deepmind
@mistralAI

If you get it, run the loop.


🧠 Features

  • MCP-Compatible: Shows up in VS Code’s MCP list with metadata.
  • Simple API: POST a URL, receive clean text in return.
  • CORS-Ready: Built-in CORS support for cross-origin requests.
  • Logging: Uses loguru to log all activity to rotating files.
  • Timeouts + Error Handling: Gracefully deals with slow or weird sites.
  • Human UA Header: Doesn’t look like a bot (unless you read the name).
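The features above (a browser-like User-Agent, a timeout, graceful error handling, and visible-text extraction) can be sketched with only the Python standard library. This is an illustration under stated assumptions, not the code in server.py: the names `USER_AGENT`, `VisibleTextParser`, `visible_text`, and `fetch_html`, the 10-second timeout, and the set of skipped tags are all hypothetical.

```python
import urllib.error
import urllib.request
from html.parser import HTMLParser

# Hypothetical browser-like User-Agent; the real server may send a different one.
USER_AGENT = (
    "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/124.0 Safari/537.36"
)


class VisibleTextParser(HTMLParser):
    """Collects text nodes, ignoring tags whose content is not shown in the page body."""

    SKIP = {"script", "style", "noscript", "template", "title"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0  # > 0 while inside a skipped tag
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and not self._skip_depth:
            self.chunks.append(text)


def visible_text(html):
    """Return the page's text nodes joined by blank lines."""
    parser = VisibleTextParser()
    parser.feed(html)
    parser.close()
    return "\n\n".join(parser.chunks)


def fetch_html(url, timeout=10.0):
    """Return (html, None) on success or (None, error_message) on failure."""
    try:
        req = urllib.request.Request(url, headers={"User-Agent": USER_AGENT})
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            charset = resp.headers.get_content_charset() or "utf-8"
            return resp.read().decode(charset, errors="replace"), None
    except (urllib.error.URLError, OSError, ValueError, LookupError) as exc:
        # Slow, unreachable, or malformed targets become an error message
        # instead of an unhandled exception.
        return None, str(exc)
```

Returning an error message rather than raising keeps a single bad URL from taking down a longer crawl; whether the actual server does the same is not stated in this README.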

🔧 Usage

  1. Clone or drop this repo into your .vscode/mcps/ or wherever your MCPs live.
  2. Add "gremlinScraper" to .mcp.json.
  3. Click “Start Server” in the VS Code MCP tab.
  4. Or run it manually:
    pip install -r requirements.txt
    python server.py
    

📦 Endpoints & Examples

1. POST /scrape

  • Fetch a single page’s visible text:
curl -X POST http://localhost:8742/scrape \
  -H 'Content-Type: application/json' \
  -d '{"url":"https://example.com"}'
  • Response:
{
  "text": "Example Domain\n\nThis domain is for use in illustrative examples in documents.\n..."
}
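
For callers who prefer Python over curl, the same request could look roughly like this. It is a minimal sketch that assumes the server is running on localhost:8742 as in the curl example and that the response always carries a "text" key; `build_scrape_request` and `scrape` are hypothetical helper names, not part of this repo.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8742"  # host and port taken from the curl example


def build_scrape_request(url, base=BASE_URL):
    """Build the POST /scrape request without sending it."""
    body = json.dumps({"url": url}).encode("utf-8")
    return urllib.request.Request(
        base + "/scrape",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def scrape(url, base=BASE_URL, timeout=30.0):
    """Send the request and return the scraped text (assumes a "text" key)."""
    req = build_scrape_request(url, base)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))["text"]
```

With the server running, `print(scrape("https://example.com"))` should print the page's visible text.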

2. POST /crawl

  • Recursively crawl same-domain links:
curl -X POST http://localhost:8742/crawl \
  -H 'Content-Type: application/json' \
  -d '{
    "url":"https://example.com",
    "max_pages":10,
    "max_depth":2,
    "concurrency":5
  }'
  • Response:
{
  "https://example.com": "Example Domain\n\nThis domain is for use…",
  "https://example.com/about": "About Us\n\n…",
  "...": "…"
}
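
To make the `max_pages` and `max_depth` parameters concrete, here is one plausible reading of "recursively crawl same-domain links" as a breadth-first traversal. This is a sketch, not the server's actual algorithm: `crawl_same_domain` and the injected `fetch` callable (which returns a page's text and its raw links) are hypothetical, and `concurrency` is left out so the example stays sequential.

```python
from collections import deque
from urllib.parse import urldefrag, urljoin, urlparse


def crawl_same_domain(start_url, fetch, max_pages=10, max_depth=2):
    """Breadth-first crawl restricted to start_url's host.

    fetch(url) is a hypothetical callable returning (text, links).
    Returns a dict mapping each visited URL to its text.
    """
    host = urlparse(start_url).netloc
    results = {}
    seen = {start_url}
    queue = deque([(start_url, 0)])
    while queue and len(results) < max_pages:
        url, depth = queue.popleft()
        text, links = fetch(url)
        results[url] = text
        if depth >= max_depth:
            continue  # deep enough: record the page but do not follow its links
        for href in links:
            # Resolve relative links and drop #fragments before comparing.
            absolute, _fragment = urldefrag(urljoin(url, href))
            if urlparse(absolute).netloc == host and absolute not in seen:
                seen.add(absolute)
                queue.append((absolute, depth + 1))
    return results
```

Under this reading, the start page is depth 0, so `max_depth: 2` reaches pages up to two links away, and `max_pages` caps the total number of entries in the response.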

3. POST /crawl-stream

  • Stream each page as soon as it’s fetched:
curl -N -X POST http://localhost:8742/crawl-stream \
  -H 'Content-Type: application/json' \
  -d '{"url":"https://example.com","max_pages":5}'
  • Response (NDJSON):
{"url":"https://example.com","text":"Example Domain\n…"}
{"url":"https://example.com/link1","text":"Link One\n…"}
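
Because the response is NDJSON (one JSON object per line), a client can process each page as it arrives instead of waiting for the whole crawl. A minimal Python sketch, assuming the server on localhost:8742 from the curl example; `iter_ndjson` and `crawl_stream` are hypothetical helper names.

```python
import json
import urllib.request


def iter_ndjson(lines):
    """Yield one parsed object per non-empty line (bytes or str)."""
    for raw in lines:
        line = raw.decode("utf-8") if isinstance(raw, bytes) else raw
        line = line.strip()
        if line:
            yield json.loads(line)


def crawl_stream(url, max_pages=5, base="http://localhost:8742"):
    """POST to /crawl-stream and yield each page dict as its line is read."""
    body = json.dumps({"url": url, "max_pages": max_pages}).encode("utf-8")
    req = urllib.request.Request(
        base + "/crawl-stream",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        # The response object is iterable line by line.
        yield from iter_ndjson(resp)
```

With the server running, `for page in crawl_stream("https://example.com"): print(page["url"])` prints each URL as soon as its line is received.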

4. GET /ping

  • Health check endpoint:

curl http://localhost:8742/ping

  • Response:

pong
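
One use for the health check is waiting for the server to come up after launching it manually. The sketch below polls /ping a few times before giving up; `ping` and `wait_until_ready`, the retry count, and the delay are hypothetical choices, and the check only assumes that the response body contains "pong".

```python
import time
import urllib.error
import urllib.request


def ping(base="http://localhost:8742", timeout=2.0):
    """Return True if GET /ping answers 200 with "pong" somewhere in the body."""
    try:
        with urllib.request.urlopen(base + "/ping", timeout=timeout) as resp:
            return resp.status == 200 and b"pong" in resp.read()
    except (urllib.error.URLError, OSError):
        return False  # server not up yet, or unreachable


def wait_until_ready(probe=ping, attempts=10, delay=0.5):
    """Call probe() up to `attempts` times, sleeping `delay` seconds after each failure."""
    for _ in range(attempts):
        if probe():
            return True
        time.sleep(delay)
    return False
```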

5. GET /mcp/metadata

  • MCP discovery metadata:

curl http://localhost:8742/mcp/metadata

  • Response:
{
  "name":"Gremlin Web Scraper MCP",
  "description":"Scrapes and crawls text from URLs via HTTP endpoints…",
  "version":"0.0.1",
  "author":"StatikFinTech LLC",
  "tags":["scraping","crawl","MCP","runtime"],
  "endpoints":[]
}
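
A client that registers the server could sanity-check this metadata before using it. The sketch below treats the keys from the sample response as expected, which is an assumption rather than a documented contract; `EXPECTED_KEYS`, `missing_metadata_keys`, and `fetch_metadata` are hypothetical names.

```python
import json
import urllib.request

# Keys seen in the sample response above; treating them as required is an assumption.
EXPECTED_KEYS = ("name", "description", "version", "author", "tags", "endpoints")


def missing_metadata_keys(metadata, expected=EXPECTED_KEYS):
    """Return the expected keys that are absent from the metadata dict."""
    return [key for key in expected if key not in metadata]


def fetch_metadata(base="http://localhost:8742", timeout=5.0):
    """GET /mcp/metadata and parse the JSON body."""
    with urllib.request.urlopen(base + "/mcp/metadata", timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```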

🗂 Metadata

Name: Gremlin Web Scraper MCP
Author: StatikFinTech LLC
License: MIT
Tags: #scraping, #crawl, #runtime, #gremlin


🐾 Future Add-ons

  • PDF / EPUB / Markdown parsing
  • Selective DOM element filtering
  • Scheduled/recurring crawl and scrape jobs
  • Direct memory injection into GremlinGPT core

“Split. Streamlined. Sovereign.” StatikFinTech Systems • 2025


“Your qualifications are impressive...”

  • Coder Hiring Team (2025 Rejection Letter)

🔱 "This isn't rejection. It's proof they don't know how to build what comes next.

Still building what they can’t classify." 🔱 -StatikFinTech, LLC
