OpenClaw Tech News Digest: Aggregate 109 Sources Daily
By the end of this tutorial, you'll have a daily pipeline that fetches from 109 sources, scores each item for relevance and novelty, deduplicates overlapping stories, and delivers a ranked digest every morning, while cutting per-call costs by routing through WisGate. You can validate your scoring logic in AI Studio before running the full pipeline.
AI Tech News Digest Automation: Why 109 Sources Require a Pipeline
Manually tracking tech news across 109 diverse sources is impossible. RSS feeds, GitHub trending repositories, Twitter/X accounts, and live web search queries each provide valuable signals, but aggregating them manually creates bottlenecks. You need a system that fetches all sources in parallel, evaluates each story for relevance, removes duplicates, and ranks the results—all automatically.
This is where OpenClaw shines. OpenClaw is a command-line tool designed to orchestrate multi-source data pipelines using large language models. Instead of writing custom fetch logic for each source type, you define your sources in a simple YAML file, configure your LLM provider, and let OpenClaw handle the rest. The tool fetches from all 109 sources, caches the raw results, and then passes them through quality scoring and deduplication stages powered by Claude models.
The scale problem becomes visible when you calculate the cost. Fetching and processing 109 sources daily means thousands of API calls per day. Direct API pricing from major providers can add up quickly. This is where WisGate's unified API routing becomes valuable. By routing your requests through WisGate, you access the same Claude models at lower per-call costs, making high-volume aggregation economically viable.
Without a pipeline, you're stuck refreshing news manually or paying premium rates for each API call. With OpenClaw and WisGate, you automate the entire workflow and reduce costs simultaneously. The investment in setup pays for itself within weeks at this scale.
OpenClaw API News Aggregation: Four Source Types, Two Endpoints
OpenClaw supports four primary source types, each requiring different fetch methods and API endpoints. Understanding these distinctions is critical for configuring your pipeline correctly.
RSS Feeds: Traditional RSS feeds from tech blogs, news sites, and industry publications. OpenClaw fetches these via HTTP GET requests and parses the XML. No API key required for most public feeds. Examples include Hacker News, TechCrunch, and The Verge.
GitHub Trending Repositories: GitHub's trending page surfaces the most active open-source projects. OpenClaw queries GitHub's public API to retrieve trending repos by language and time range. This requires a GitHub API token but provides real-time signals about what developers are building.
Twitter/X Accounts: Follow specific accounts or search terms on X (formerly Twitter). OpenClaw integrates with the X API v2 to stream tweets from your configured accounts. This requires X API credentials but captures breaking news and expert commentary in real time.
Web Search Queries: Live web search results for specific keywords or topics. This is where the distinction between WisGate endpoints becomes important. WisGate offers two endpoint types:
- OpenAI-compatible endpoint (https://api.wisgate.ai/v1): used for RSS, GitHub, and X data. These sources are already fetched and cached, so you're passing structured text to the LLM for scoring and summarization.
- Gemini-native endpoint (https://wisgate.ai/v1beta/models/): used for grounded web search queries. This endpoint includes access to Google Search tooling, allowing the LLM to perform live web searches and ground its responses in current information.
The endpoint distinction matters because grounded web search requires real-time internet access, which only the Gemini-native endpoint provides. If you try to run web search queries through the OpenAI-compatible endpoint, you'll get stale or incomplete results. Conversely, for cached sources like RSS and GitHub, the OpenAI-compatible endpoint is faster and more cost-effective.
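This routing rule is simple enough to capture in code. The sketch below is a hypothetical Python helper, not part of OpenClaw itself: the function name and the source-type labels ("rss", "github", "x", "web_search") are assumptions for illustration, while the two base URLs are the WisGate endpoints described above.

```python
# Hypothetical endpoint router: picks the WisGate base URL by source type.
# The source-type labels are illustrative assumptions, not OpenClaw names.

OPENAI_COMPAT = "https://api.wisgate.ai/v1"          # cached sources
GEMINI_NATIVE = "https://wisgate.ai/v1beta/models/"  # grounded web search

def endpoint_for(source_type: str) -> str:
    """Return the WisGate base URL appropriate for a source type."""
    if source_type == "web_search":
        return GEMINI_NATIVE          # needs live Google Search grounding
    if source_type in ("rss", "github", "x"):
        return OPENAI_COMPAT          # content is already fetched and cached
    raise ValueError(f"unknown source type: {source_type}")

print(endpoint_for("rss"))         # → https://api.wisgate.ai/v1
print(endpoint_for("web_search"))  # → https://wisgate.ai/v1beta/models/
```

Keeping the routing decision in one place makes it easy to audit that no cached source ever hits the slower, costlier grounded endpoint.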
Configuring OpenClaw with WisGate: Step-by-Step JSON Setup
OpenClaw stores its configuration in a JSON file in your home directory. Follow these steps to integrate WisGate as your LLM provider.
Step 1: Open the Configuration File
Open your terminal and edit the OpenClaw configuration:
nano ~/.openclaw/openclaw.json
Step 2: Add the WisGate Provider Configuration
Copy and paste the following configuration into your models section. This defines a custom provider called "moonshot" that points to WisGate's OpenAI-compatible endpoint:
"models": {
"mode": "merge",
"providers": {
"moonshot": {
"baseUrl": "https://api.wisgate.ai/v1",
"apiKey": "WISGATE-API-KEY",
"api": "openai-completions",
"models": [
{
"id": "[REDACTED]",
"name": "Claude Opus 4.6",
"reasoning": false,
"input": [
"text"
],
"cost": {
"input": 0,
"output": 0,
"cacheRead": 0,
"cacheWrite": 0
},
"contextWindow": 256000,
"maxTokens": 8192
}
]
}
}
}
Replace WISGATE-API-KEY with your actual WisGate API key. You can obtain your API key from https://wisgate.ai/hall/tokens. For model pricing and availability, check https://wisgate.ai/models.
Step 3: Save and Restart
Press Ctrl + O to save, then press Enter. Press Ctrl + X to exit the editor. Restart OpenClaw by pressing Ctrl + C to stop the current process, then run:
openclaw tui
Your OpenClaw instance is now configured to use WisGate's Claude models. The pipeline relies on two models for different stages: Claude Haiku for per-item quality scoring and Claude Sonnet for deduplication and ranking. Add an entry to the models array for each, following the same pattern as above. We'll cover both stages in the next section.
LLM Multi-Source Summarization: Quality Scoring and Deduplication Prompts
The heart of your aggregation pipeline is the two-pass LLM workflow. First, Claude Haiku scores each item individually for relevance and novelty. Second, Claude Sonnet deduplicates overlapping stories and ranks the final digest.
Quality Scoring with Claude Haiku
After fetching all 109 sources, OpenClaw passes each story to Claude Haiku with a quality scoring prompt. Haiku is fast and cost-effective for this repetitive task. Here's the [REDACTED]:
You are a tech news quality scorer. Evaluate each story on a scale of 1–10 based on:
- Relevance: Is this story relevant to software developers, AI researchers, or tech product builders?
- Novelty: Is this story new information, or a repeat of earlier coverage?
- Impact: Does this story signal a meaningful shift in technology or industry?
Respond with a JSON object:
{
  "score": <1-10>,
  "reason": "<brief explanation>",
  "category": "<AI, Infrastructure, Security, Developer Tools, etc.>"
}
Discard stories scoring below 5. These are typically press releases, minor updates, or off-topic content.
OpenClaw runs this prompt against every item fetched from the 109 sources. Stories scoring 5 or higher are cached for the deduplication stage. This filtering step reduces noise and ensures your final digest contains only high-signal stories.
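The filtering step can be sketched in a few lines of Python. This is an illustration, not OpenClaw internals: the function name and sample data are assumptions, while the response shape matches the JSON object the scoring prompt requests.

```python
import json

# Illustrative score-and-filter step. Each response_text is assumed to be
# the JSON object the scoring prompt above asks Haiku to return.

def filter_high_signal(raw_responses, threshold=5):
    """Parse scoring responses and keep items scoring >= threshold."""
    kept = []
    for item, response_text in raw_responses:
        result = json.loads(response_text)
        if result["score"] >= threshold:
            kept.append({**item,
                         "score": result["score"],
                         "category": result["category"]})
    return kept

sample = [
    ({"title": "New OSS inference engine"},
     '{"score": 8, "reason": "major release", "category": "AI"}'),
    ({"title": "Minor version bump"},
     '{"score": 3, "reason": "routine update", "category": "Developer Tools"}'),
]
print(filter_high_signal(sample))  # keeps only the story scoring 8
```

Parsing the model's JSON before caching also gives you a natural place to catch malformed responses and retry the scoring call.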
Deduplication and Ranking with Claude Sonnet
Once all items are scored, Claude Sonnet takes the high-scoring stories and deduplicates them. Multiple sources often cover the same story—for example, a major GitHub release might appear in RSS feeds, on Twitter, and in web search results. Sonnet identifies these duplicates and merges them into a single entry, preserving the most complete information.
Here's the deduplication [REDACTED]:
You are a tech news deduplication and ranking engine. You receive a list of scored tech stories from multiple sources. Your tasks:
1. Identify duplicate stories (same event covered by different sources).
2. Merge duplicates into single entries, preserving the most complete information.
3. Rank the deduplicated stories by impact and relevance.
4. Format the final digest as a numbered list with title, summary, sources, and score.
Output format:
1. [Title]
Summary: [2-3 sentences]
Sources: [list of original sources]
Score: [average score from duplicates]
2. [Next story]
...
Sonnet is more capable than Haiku and handles the complex reasoning required for deduplication. The output is your final ranked digest, ready to send to subscribers or display on a dashboard.
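The duplicate detection itself is Sonnet's job, but the merge rule implied by the output format above, combine sources and average the scores of duplicates, can be illustrated locally. In this sketch the grouping key is supplied by the caller purely to demonstrate the merge arithmetic; the function name and sample data are assumptions.

```python
from collections import defaultdict

# Illustrative merge rule from the digest output format: duplicates of one
# story are merged, their sources combined, and their scores averaged.
# Real duplicate detection is done by Claude Sonnet; 'key' here is a
# stand-in grouping label used only to show the arithmetic.

def merge_duplicates(stories):
    """stories: list of dicts with 'key', 'title', 'source', 'score'."""
    groups = defaultdict(list)
    for s in stories:
        groups[s["key"]].append(s)
    merged = []
    for entries in groups.values():
        merged.append({
            "title": entries[0]["title"],
            "sources": [e["source"] for e in entries],
            "score": sum(e["score"] for e in entries) / len(entries),
        })
    # Rank by averaged score, highest first.
    return sorted(merged, key=lambda m: m["score"], reverse=True)

sample = [
    {"key": "release-x", "title": "Project X 2.0 released", "source": "RSS", "score": 8},
    {"key": "release-x", "title": "Project X 2.0 released", "source": "X",   "score": 6},
    {"key": "cve-y",     "title": "Critical CVE in Y",      "source": "RSS", "score": 9},
]
print(merge_duplicates(sample))
```

Here the two copies of the Project X story merge into one entry with score 7.0 and both sources listed, which is exactly the shape the Sonnet prompt is asked to produce.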
Daily Cron Schedule and Grounded Web Search Configuration
Automation requires scheduling. Your pipeline runs on a staggered daily cron schedule to fetch, score, deduplicate, and deliver by a fixed time each morning.
Cron Schedule
Here's a typical schedule:
- 05:00: Fetch all 109 sources (RSS, GitHub, X, web search). Results are cached in ~/.openclaw/news-digest/raw/[DATE]/.
- 06:00: Score each fetched item using Claude Haiku. High-scoring items (≥5) are cached separately.
- 06:30: Deduplicate and rank using Claude Sonnet. Output is the final digest.
- 06:45: Deliver the digest via email, Slack, or API webhook.
This schedule ensures your digest is ready by 07:00 each morning. Adjust times based on your timezone and preferred delivery time.
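The staggered schedule above maps directly onto standard crontab entries. Note that the openclaw subcommand names below (fetch, score, dedupe, deliver) are assumptions for illustration; substitute whatever stage commands your OpenClaw version actually exposes.

```shell
# Illustrative crontab for the staggered schedule above (server local time).
# Subcommand names are hypothetical; adapt to your OpenClaw CLI.
0  5 * * * openclaw fetch   --config ~/.openclaw/openclaw.json
0  6 * * * openclaw score   --config ~/.openclaw/openclaw.json
30 6 * * * openclaw dedupe  --config ~/.openclaw/openclaw.json
45 6 * * * openclaw deliver --config ~/.openclaw/openclaw.json
```

Install with crontab -e, and leave generous gaps between stages so a slow fetch or scoring pass cannot overlap the next step.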
Grounded Web Search Configuration
For web search queries, you'll use the Gemini-native endpoint instead of the OpenAI-compatible endpoint. This endpoint includes Google Search tooling, allowing real-time web searches.
Here's an example API call for grounded web search:
curl -X POST https://wisgate.ai/v1beta/models/gemini-3.1-flash-image-preview:generateContent \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer WISGATE-API-KEY" \
  -d '{
    "contents": [{
      "parts": [{
        "text": "Search for the latest AI model releases this week and summarize the top 3."
      }]
    }],
    "tools": [{
      "googleSearch": {}
    }]
  }'
The googleSearch tool enables the model to perform live web searches. This is distinct from the OpenAI-compatible endpoint, which cannot access real-time search. Use this endpoint only for web search queries; use the OpenAI-compatible endpoint for cached sources.
Caching raw fetches in ~/.openclaw/news-digest/raw/[DATE]/ is required to avoid re-fetching all 109 sources on scoring reruns. If a scoring pass fails, you can re-run it without waiting for all sources to fetch again.
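A cache-aware rerun can be as simple as checking for the dated directory before fetching. The sketch below uses a temporary directory in place of ~/.openclaw/news-digest/raw/ so it is self-contained; the helper names are assumptions, not OpenClaw functions.

```python
import datetime
import pathlib
import tempfile

# Illustrative cache check: skip the expensive fetch stage when today's
# raw directory already exists and is non-empty. A temp dir stands in
# for ~/.openclaw/news-digest/raw/ here.

def raw_dir_for_today(root: pathlib.Path) -> pathlib.Path:
    """Dated cache directory, e.g. <root>/2025-06-01."""
    return root / datetime.date.today().isoformat()

def needs_fetch(root: pathlib.Path) -> bool:
    """True if today's raw cache directory is missing or empty."""
    today = raw_dir_for_today(root)
    return not today.is_dir() or not any(today.iterdir())

root = pathlib.Path(tempfile.mkdtemp())
print(needs_fetch(root))   # True: nothing cached yet

cache = raw_dir_for_today(root)
cache.mkdir(parents=True)
(cache / "hackernews.json").write_text("[]")
print(needs_fetch(root))   # False: today's fetch is already cached
```

With this guard in front of the fetch stage, a failed 06:00 scoring run can be retried against the 05:00 cache without touching the 109 sources again.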
OpenClaw Use Cases: 109 Sources, One Ranked Digest, Every Morning
Your fully configured pipeline is now ready to deploy. Here's how to bring it all together.
Deployment Steps
- Populate your sources.yaml file with your 109 sources, organized by type (RSS, GitHub, X, web search).
- Validate your quality scoring prompt by running a test batch through Claude Haiku in AI Studio at https://wisgate.ai/studio/image.
- Run a full pipeline test with a small subset of sources (e.g., 10 sources) to verify the fetch, score, and deduplication stages.
- Activate the daily cron schedule once testing is complete.
- Monitor the digest output for the first week to ensure quality and catch any configuration issues.
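For the first step, a minimal sources.yaml might look like the sketch below. The exact schema is an assumption based on the four source types described earlier, and the feed URL, language filter, and account handle are illustrative; check your OpenClaw version's documentation for the authoritative field names.

```yaml
# Hypothetical sources.yaml sketch; field names and values are assumptions.
sources:
  - type: rss
    url: https://hnrss.org/frontpage
  - type: github_trending
    language: python
    range: daily
  - type: x_account
    handle: "@example_ai_lab"        # illustrative handle
  - type: web_search
    query: "latest AI model releases"
```

Grouping entries by type keeps the file easy to scan once it grows to all 109 sources, and mirrors the per-type endpoint routing below.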
Endpoint Routing Rules
- Use https://api.wisgate.ai/v1 (OpenAI-compatible) for RSS, GitHub, and X sources.
- Use https://wisgate.ai/v1beta/models/ (Gemini-native) for web search queries only.
- Use Claude Haiku ([REDACTED]) for per-item quality scoring.
- Use Claude Sonnet ([REDACTED]) for deduplication and ranking.
Next Steps
Your source list format, scoring and deduplication system prompts, and endpoint routing rules are now in place; all you need is a WisGate API key to start. Populate sources.yaml with your initial sources, validate the scoring prompt on a test batch, run a full pipeline test, and activate the daily cron to automate your tech news digest. Visit https://wisgate.ai/ to create your account and obtain your API key, then check https://wisgate.ai/models for the latest model pricing and availability.