Research is the highest-leverage work most developers do least efficiently.
Reading earnings releases, maintaining a personal knowledge base, tracking competitor positioning, and retrieving prior research context each consume hours that compound into weeks per year. The output rarely lives in one place — it's scattered across tabs, note apps, PDFs, and memory. When you need it, finding it takes almost as long as re-reading the source.
OpenClaw (previously known as ClawdBot and MoltBot) solves this through two complementary patterns: scheduled ingestion agents that pull and structure new information on a defined cadence, and retrieval agents that surface relevant knowledge from an existing corpus on demand.
This page covers 4 community-verified Research & Learning configurations that span both patterns:
- AI Earnings Tracker — scheduled ingestion and structured summarization of company earnings data
- Personal Knowledge Base (RAG) — query your own notes and documents via natural language
- Market Research & Product Factory — autonomous multi-source market research with structured deliverables
- Semantic Memory Search — retrieve semantically relevant context from a continuously growing knowledge store
Three of the four cases benefit directly from large context windows. Nano Banana 2's 256K-token context window — available via WisGate — reduces or eliminates chunking overhead for most personal and small-team knowledge bases. Confirm the exact context window size from https://wisgate.ai/models before publication.
Run your first research agent before you finish reading. Open AI Studio and test the Opus generation prompt against a sample earnings document or a batch of your own notes — no infrastructure required. For long-document tasks, load your full corpus and confirm the 256K-context model returns grounded answers before writing a single integration. Get your API key at wisgate.ai/hall/tokens, trial credits included.
OpenClaw Configuration
Step 1 — Locate and Open the Configuration File
OpenClaw stores its configuration in a JSON file in your home directory. Open your terminal and edit the file at:
Using nano:
nano ~/.clawdbot/clawdbot.json
Step 2 — Add the WisGate Provider to Your Models Section
Copy and paste the following configuration into the models section of your clawdbot.json. This defines WisGate as a custom provider and registers Claude Opus with your preferred model settings.
"models": {
  "mode": "merge",
  "providers": {
    "wisgate": {
      "baseUrl": "https://api.wisgate.ai/v1",
      "apiKey": "YOUR-WISGATE-API-KEY",
      "api": "openai-completions",
      "models": [
        {
          "id": "claude-opus-4-5",
          "name": "Claude Opus 4.5",
          "reasoning": false,
          "input": ["text"],
          "cost": {
            "input": 0,
            "output": 0,
            "cacheRead": 0,
            "cacheWrite": 0
          },
          "contextWindow": 256000,
          "maxTokens": 8192
        }
      ]
    }
  }
}
Note: Replace `YOUR-WISGATE-API-KEY` with your key from wisgate.ai/hall/tokens. The `"mode": "merge"` setting adds WisGate's models alongside your existing providers without replacing them. To add additional models, duplicate the model entry block and update the `"id"` and `"name"` fields with the correct model IDs from wisgate.ai/models.
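A quick sanity check can catch the most common mistakes (placeholder key left in place, missing model `id`) before you restart. The helper below is a hypothetical sketch, not part of OpenClaw; it only assumes the config shape shown above.

```python
import json
from pathlib import Path

# Hypothetical sanity check -- not part of OpenClaw itself.
CONFIG_PATH = Path.home() / ".clawdbot" / "clawdbot.json"

def validate_models_section(config: dict) -> list[str]:
    """Return a list of problems found in the 'models' section; empty means OK."""
    problems = []
    models = config.get("models", {})
    if models.get("mode") != "merge":
        problems.append('expected "mode": "merge"')
    for name, provider in models.get("providers", {}).items():
        if not provider.get("baseUrl", "").startswith("https://"):
            problems.append(f"{name}: baseUrl should be an https URL")
        if provider.get("apiKey") in (None, "", "YOUR-WISGATE-API-KEY"):
            problems.append(f"{name}: apiKey is missing or still the placeholder")
        for model in provider.get("models", []):
            if not model.get("id"):
                problems.append(f"{name}: model entry missing 'id'")
    return problems

if __name__ == "__main__" and CONFIG_PATH.exists():
    for problem in validate_models_section(json.loads(CONFIG_PATH.read_text())):
        print("WARNING:", problem)
```

Run it once after editing; an empty output means the structure matches what OpenClaw expects from this snippet.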
Step 3 — Save, Exit, and Restart OpenClaw
If using nano:
- Press `Ctrl + O` to write the file, then press `Enter` to confirm
- Press `Ctrl + X` to exit the editor

Restart OpenClaw:

- Press `Ctrl + C` to stop the current session
- Relaunch with:
openclaw tui
Once restarted, the WisGate provider and your configured Claude models will appear in the model selector.
LLM RAG Knowledge Base: The Architecture Pattern Behind 3 of the 4 Research Cases
Before the case walkthroughs, establish the shared architectural pattern. Three of the four cases use retrieval-augmented generation in some form. Understanding the pattern once makes all three configurations easier to set up correctly.
What RAG means for research agents: Retrieval-Augmented Generation grounds the model's response in retrieved documents rather than solely in training data. For research automation, this has one practical implication: output quality is bounded by knowledge base quality. A well-maintained, current corpus produces reliable output. A stale or poorly structured one produces confidently wrong output that is harder to catch than an obvious error.
The two-step pattern:
| Step | Function | Tooling |
|---|---|---|
| Retrieval | Find the most relevant documents or entries from the knowledge store given the query | Vector store, semantic search index, or large-context window (no retrieval step needed if full corpus fits) |
| Generation | Generate a response grounded strictly in the retrieved documents | claude-opus-4-5 via WisGate |
The 256K context shortcut — when it applies:
Traditional RAG requires three infrastructure components: an embedding model, a vector database, and a chunking pipeline. Setup time ranges from hours to days. For knowledge bases under approximately 200,000 tokens (roughly 150,000 words or 400–500 typical documents), Nano Banana 2's 256K context window eliminates all three: pass the full corpus as context, query directly.
This is the recommended starting architecture for the Personal Knowledge Base and Semantic Memory Search cases. Add vector store infrastructure only when the corpus grows beyond the context window limit. Confirm Nano Banana 2's exact context window from https://wisgate.ai/models before stating the token limit in any production documentation.
When full-context is insufficient: As corpora grow — particularly for the Earnings Tracker and Market Research cases where new documents are added continuously — the corpus will eventually exceed any single context window. Plan for chunking and vector retrieval at scale. For most individual developers, this threshold is years away from a cold start.
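The full-context vs. vector-store decision above can be reduced to a one-function heuristic. The ~4 characters-per-token ratio and the safety margin are assumptions, not exact tokenizer counts; treat this as a rough planning tool only.

```python
# Rough heuristic for choosing a retrieval architecture.
FULL_CONTEXT_LIMIT = 256_000   # stated NB2 context window; verify at wisgate.ai/models
SAFETY_MARGIN = 0.8            # leave headroom for the query and the response

def estimate_tokens(text: str) -> int:
    # ~4 characters per token is a common English-text approximation.
    return len(text) // 4

def retrieval_strategy(corpus: str) -> str:
    if estimate_tokens(corpus) <= FULL_CONTEXT_LIMIT * SAFETY_MARGIN:
        return "full-context"   # pass the whole corpus, no chunking pipeline
    return "vector-store"       # chunk + embed + retrieve at scale
```

If the function returns "full-context", skip the embedding and vector-DB setup entirely and revisit only when your corpus outgrows the margin.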
Model selection rationale for this category: All 4 cases default to claude-opus-4-5. Research outputs are read and acted upon — by the developer, by stakeholders, or by downstream agents. A fabricated citation, a wrong earnings figure, or an inaccurate competitive claim has downstream consequences that outlast the API call that produced it. Reasoning quality directly determines output trustworthiness in this category. Confirm Opus pricing from https://wisgate.ai/models.
OpenClaw API Research Automation: Why the 256K Context Model Changes RAG Setup Complexity
This section makes the WisGate model recommendation concrete before the case walkthroughs.
Two model patterns for Research & Learning:
| Task Type | Recommended Model | Endpoint | Why |
|---|---|---|---|
| Reasoning, synthesis, citation | claude-opus-4-5 | https://api.wisgate.ai/v1 (OpenAI-compatible) | Highest reasoning reliability for multi-source synthesis and citation accuracy |
| Long-document retrieval (corpus ≤ 256K tokens) | gemini-3.1-flash-image-preview (Nano Banana 2) | https://api.wisgate.ai/v1beta/models/ (Gemini-native) | 256K context eliminates chunking for most personal knowledge bases |
Architecture note: Nano Banana 2 uses the Gemini-native endpoint and must be called programmatically — it is not accessible through OpenClaw's chat interface. Use AI Studio for manual validation, or integrate the endpoint call as a retrieval step that passes retrieved content to OpenClaw's Opus generation layer.
Setup complexity comparison:
| Approach | Infrastructure Required | Setup Time | When to Use |
|---|---|---|---|
| Traditional RAG (chunking + vector store) | Embedding model, vector DB, chunking pipeline | Hours to days | Corpus > 256K tokens |
| Full-context with Nano Banana 2 (≤ 256K tokens) | None — pass corpus directly as context | Minutes | Starting point for most personal knowledge bases |
| Hybrid (Opus for reasoning + NB2 for retrieval) | NB2 API call as retrieval step before Opus generation | ~1 hour | When retrieval and synthesis require different models |
For the Personal Knowledge Base and Semantic Memory Search cases, start with the full-context approach. The hybrid approach is appropriate when the corpus grows beyond the context limit or when retrieval precision matters more than simplicity.
Pricing note: Confirm all per-token pricing for both Opus and Nano Banana 2 from https://wisgate.ai/models before calculating production costs. For image generation tasks (not applicable to text-based retrieval), WisGate's rate is $0.058/image versus the $0.068 Google official rate.
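To make the two endpoint styles concrete, the sketch below builds the two request bodies, assuming WisGate mirrors the standard OpenAI chat-completions format and the Gemini generateContent format. The exact paths and field names should be confirmed against WisGate's own documentation before integration.

```python
# Request-body sketches for the two model patterns described above.
def opus_payload(system_prompt: str, user_message: str) -> dict:
    """OpenAI-compatible body for the reasoning/synthesis layer."""
    return {
        "model": "claude-opus-4-5",
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_message},
        ],
        "max_tokens": 8192,
    }

def nb2_payload(corpus: str, query: str) -> dict:
    """Gemini-native body for the full-context retrieval layer."""
    return {
        "contents": [
            {"role": "user", "parts": [{"text": f"{corpus}\n\nQUERY: {query}"}]}
        ]
    }
```

In the hybrid pattern, the `nb2_payload` call runs first as the retrieval step, and its extracted passages become the `user_message` of the subsequent `opus_payload` call.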
AI Research Agent API — Case 1: AI Earnings Tracker
What it does: Monitors a configured company watchlist, ingests earnings documents and financial disclosures on a scheduled basis, and generates structured summaries with consistent output fields: revenue versus estimate, EPS versus estimate, full-year guidance status, key management commentary, and material changes from prior quarter. Output is delivered to Slack, email, or a local file within hours of each earnings release.
Why it matters: Developers building fintech products, investor tools, or competitive intelligence systems need earnings data processed consistently and promptly. Manual reading of quarterly releases is time-intensive and produces inconsistent note quality. This automation processes every covered company's earnings in a uniform format, on schedule, without manual review at each cycle.
Recommended model: claude-opus-4-5
Earnings summaries are read and acted upon. Misrepresented guidance language, incorrectly stated revenue figures, or missed management commentary have direct downstream consequences. Opus is the minimum reliable tier for financial text synthesis — not because the task is conceptually complex, but because accuracy requirements are high and error consequences are material.
System Prompt Structure
Role block:
You are a financial analyst assistant. Your outputs are informational summaries only
and do not constitute investment advice. Every numerical claim must reference the
source document section or page. If a figure cannot be verified from the provided
document, state "Not stated in source" — do not estimate or infer.
Input schema: company name, ticker symbol, current earnings document text, prior quarter summary (for comparison), analyst consensus estimates where available.
Output schema — fixed sections:
- Revenue: actual vs. estimate (state estimate source), YoY change
- EPS: actual vs. estimate, diluted
- Full-year guidance: raised / lowered / maintained / withdrawn — with verbatim language from management
- Key management commentary: 3 bullet points maximum, direct quotes preferred
- Material changes from prior quarter: any new disclosures, segment changes, or accounting adjustments
- Risk flags: any language indicating uncertainty, revision likelihood, or regulatory exposure
- Disclaimer (required in every output): "This summary is for informational purposes only and does not constitute investment advice."
Citation rule: every numerical claim references the document section. No exceptions.
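Because the output schema is fixed, it can be checked mechanically before delivery. The validator below is a hypothetical post-processing step (not part of OpenClaw); its section names mirror the schema above, and it guarantees the disclaimer is never silently dropped.

```python
# Hypothetical post-processing check for a generated earnings summary.
REQUIRED_SECTIONS = [
    "Revenue", "EPS", "Full-year guidance", "Key management commentary",
    "Material changes from prior quarter", "Risk flags",
]
DISCLAIMER = ("This summary is for informational purposes only and does not "
              "constitute investment advice.")

def validate_summary(summary: str) -> list[str]:
    """Return the list of missing required elements; empty means the summary passes."""
    missing = [s for s in REQUIRED_SECTIONS if s not in summary]
    if DISCLAIMER not in summary:
        missing.append("disclaimer")
    return missing
```

Wire this between generation and delivery: a non-empty return value should block the Slack/email send and flag the report for manual review.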
Configuration Steps in OpenClaw
- Complete universal setup with `claude-opus-4-5`
- Define your company watchlist and data source (SEC EDGAR RSS feed, earnings API, or manual document upload)
- Paste the system prompt with output schema populated
- Build a pre-processing step that fetches the earnings document and formats it as plain text before passing to OpenClaw
- Test with one complete earnings document in AI Studio; verify every output schema field populates correctly and citations are present
- Confirm the disclaimer appears in every output before activating the schedule
Cost: One Opus call per earnings report per company. A 20-company watchlist at quarterly reporting (80 reports/year) has a predictable and calculable annual cost. Confirm Opus per-token pricing from https://wisgate.ai/models and state the arithmetic before publication.
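The annual cost arithmetic is simple enough to script. The per-token rates below are PLACEHOLDERS chosen for illustration; substitute the confirmed Opus rates from wisgate.ai/models before relying on the result. The token counts per call are also assumptions.

```python
# Worked cost arithmetic with PLACEHOLDER rates -- replace with confirmed figures.
INPUT_RATE = 15.00 / 1_000_000    # assumed $ per input token (placeholder)
OUTPUT_RATE = 75.00 / 1_000_000   # assumed $ per output token (placeholder)

def annual_cost(calls_per_year: int, input_tokens: int, output_tokens: int) -> float:
    """Total yearly spend for a fixed per-call token profile."""
    per_call = input_tokens * INPUT_RATE + output_tokens * OUTPUT_RATE
    return calls_per_year * per_call

# 20-company watchlist at quarterly reporting = 80 calls/year.
# Assuming ~30K input tokens (full release) and ~1.5K output tokens per call.
yearly = annual_cost(80, 30_000, 1_500)
```

With these placeholder numbers the watchlist costs tens of dollars per year, which is the point of the argument: the call volume is bounded and the arithmetic is trivial once real rates are inserted.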
[Link to: Full AI Earnings Tracker configuration →]
LLM RAG Knowledge Base — Case 2: Personal Knowledge Base
What it does: Indexes the developer's notes, documents, bookmarks, and research highlights into a queryable knowledge base. Accepts natural language queries and returns answers grounded strictly in the developer's own corpus — not the model's training data. Every response distinguishes between what is present in the corpus and what is absent.
Why it matters: Most developers accumulate research in at least four separate locations — a note app, a browser bookmark folder, a PDF archive, and some form of plain text. The information exists. Retrieval is the problem. This automation turns that scattered corpus into a single queryable knowledge base that answers questions about what the developer has previously read, saved, or noted.
Recommended Model Architecture
| Layer | Model | When |
|---|---|---|
| Retrieval | gemini-3.1-flash-image-preview (NB2, 256K context) | Corpus ≤ 256K tokens — pass full corpus as context, no chunking |
| Retrieval | Vector store + embedding model | Corpus > 256K tokens |
| Generation | claude-opus-4-5 | Always — synthesis and citation require highest reasoning reliability |
System Prompt Structure — Generation Agent
Role and grounding rule:
You are a personal knowledge assistant. All answers must be grounded in the provided
corpus. If the answer to a query is not present in the corpus, respond:
"Not found in your knowledge base." — do not supplement with training data.
Never fabricate citations. A fabricated source reference is worse than no answer.
Query handling: identify the most relevant corpus sections, cite the source document and section, synthesize across multiple relevant sections where they exist.
Uncertainty handling: if corpus coverage is partial, state what is known and flag the gap explicitly. Suggest what additional source types would fill it.
Output format per query:
- Answer: 2–4 sentences, grounded in corpus
- Source citations: document name + section or page
- Coverage confidence: High (direct answer found) / Partial (related content found, incomplete) / Not found
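For the full-context approach, "format your corpus as a single structured document" can be sketched directly. The separator format below is a suggestion, not an OpenClaw requirement; the grounding instruction mirrors the system prompt above.

```python
# Sketch: assemble the full corpus plus query for a single full-context call.
def build_kb_context(documents: dict[str, str], query: str) -> str:
    """documents maps document name -> text; returns one prompt string."""
    parts = [f"=== DOCUMENT: {name} ===\n{text}" for name, text in documents.items()]
    corpus = "\n\n".join(parts)
    return (f"{corpus}\n\n"
            f"QUERY: {query}\n"
            f'Answer only from the documents above. If the answer is not '
            f'present, reply: "Not found in your knowledge base."')
```

The named separators matter: they are what the generation layer cites back as "document name + section" in the output format above.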
Configuration Steps in OpenClaw
- Complete universal setup with `claude-opus-4-5` for the generation layer
- Decide retrieval approach based on current corpus size: full-context (NB2) for corpora ≤ 256K tokens, chunked vector retrieval for larger corpora
- For the full-context approach: format your corpus as a single structured document with clear document separators; test in AI Studio with the full corpus loaded
- Paste the generation agent system prompt
- Test 10 representative queries: 5 that should return corpus-grounded answers and 5 that should return "Not found in your knowledge base" — verify both response types are handled correctly
- Activate as an on-demand OpenClaw conversation; no cron trigger required
The system prompt rule that determines reliability: "Never fabricate citations" must be stated as an explicit hard rule — not implied by the role definition. A knowledge base that returns plausible but invented source references destroys its own utility and is harder to detect than an obvious error.
[Link to: Full Personal Knowledge Base configuration →]
OpenClaw Use Cases — Case 3: Market Research & Product Factory
What it does: Runs an autonomous three-agent market research pipeline: a Scout agent identifies and gathers raw competitive intelligence from defined sources, an Analyst agent synthesizes findings into a structured research report, and a Product Strategist agent generates positioning recommendations and feature suggestions grounded in the research output. The pipeline produces reusable artifacts — not one-off reports — that can be re-run quarterly for updated competitive intelligence.
Why it matters: Market research is a prerequisite for product decisions that is consistently under-resourced in developer-led teams. This pipeline replaces hours of manual competitive reading with a structured, repeatable process that produces consistent output formats — suitable for stakeholder sharing or direct input into a product roadmap.
Recommended model: claude-opus-4-5 for all three agents
Multi-source competitive analysis requires cross-source consistency, accurate representation of competitor positioning, and nuanced product recommendations that reflect actual market context rather than generic frameworks. This is the category where the quality gap between Opus and Sonnet is most visible in delivered output.
Three-Agent Architecture
| Agent | Role | Input | Output |
|---|---|---|---|
| Scout | Identify sources; gather raw competitive data | Market definition, competitor list | Structured raw data organized by source |
| Analyst | Synthesize gathered data into structured findings | Scout output | Research report: market overview, competitor matrix, identified gaps |
| Product Strategist | Generate positioning and feature recommendations | Analyst output | Positioning brief, ranked feature recommendations, go-to-market signal |
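The three-agent flow in the table reduces to a linear pipeline where each agent's output becomes the next agent's input. The skeleton below is a sketch: `llm` stands in for the claude-opus-4-5 call, and the one-line role prompts are abbreviated placeholders for the full system prompts described in this section.

```python
from typing import Callable

# Scout -> Analyst -> Product Strategist pipeline skeleton.
def run_pipeline(llm: Callable[[str, str], str],
                 market: str, competitors: list[str]) -> dict:
    scout_out = llm("You are a market scout.",
                    f"Market: {market}\nCompetitors: {', '.join(competitors)}")
    analyst_out = llm("You are a market research analyst.", scout_out)
    strategy_out = llm("You are a product strategist.", analyst_out)
    # Keep every intermediate artifact so each stage can be reviewed
    # before the next run, as the configuration steps below require.
    return {"scout": scout_out, "analyst": analyst_out, "strategy": strategy_out}
```

Returning all three artifacts (rather than only the final brief) is what makes the quarterly re-run comparable across cycles.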
System Prompt Structure — Analyst Agent
Role block:
You are a market research analyst. Outputs are structured for stakeholder consumption.
Every competitor claim must cite its source. Flag any source that appears outdated
(>12 months old) or whose reliability is unclear.
Output schema:
- Market Overview: 2–3 sentences on market size, growth direction, and primary buyer segment
- Competitor Matrix: table with columns — competitor name, primary positioning, key features, pricing tier, apparent gaps or weaknesses
- Market Gaps: ranked list of underserved needs with supporting evidence from the gathered data
- Data Quality Notes: sources flagged as outdated, paywalled, or low-reliability
Configuration Steps in OpenClaw
- Complete universal setup with `claude-opus-4-5`
- Define your target market segment and initial competitor list as configuration variables in the Scout system prompt
- Run Scout first: input market definition; review the source list and gathered data before passing to Analyst
- Run Analyst with Scout output as input; validate Competitor Matrix entries manually against known facts before proceeding to the Product Strategist
- Run Product Strategist with Analyst output; review recommendations against your actual product context before storing
- Archive the final report as a dated artifact; re-run the pipeline quarterly; compare outputs across runs to track market movement
Cost: 3 Opus calls per full research cycle. Confirm per-token pricing from https://wisgate.ai/models and calculate quarterly and annual cost at your planned cadence. At quarterly frequency, that is 12 Opus calls per year.
[Link to: Full Market Research & Product Factory configuration →]
OpenClaw Use Cases — Case 4: Semantic Memory Search
What it does: Maintains a growing knowledge store where every entry is retrievable by semantic meaning, not keyword matching. When queried, the agent finds entries that are conceptually relevant — including entries where the query wording differs from the entry wording — and returns them ranked by semantic relevance with context. Designed for developers who accumulate large volumes of loosely structured research, observations, and insights over time.
How it differs from the Personal Knowledge Base case: The Knowledge Base case is document-centric — you query documents you have explicitly saved in structured form. Semantic Memory Search is entry-centric — you log discrete knowledge items continuously (ideas, observations, one-line insights, quoted passages), and retrieve by concept when needed. The store grows with every logged item; retrieval quality compounds as the store grows.
Recommended Model Architecture
| Layer | Model | When |
|---|---|---|
| Semantic retrieval | gemini-3.1-flash-image-preview (NB2, 256K context) | Store ≤ 256K tokens (~2,500 entries at 100 tokens/entry) |
| Semantic retrieval | Embedding-based vector store | Store > 256K tokens |
| Ranking and answer generation | claude-opus-4-5 | Always |
System Prompt Structure
Role block:
You operate in two modes: logging mode and retrieval mode.
Logging mode (input is a new entry): confirm the entry is stored with its assigned id,
date, and auto-generated tags. Return: id, tags assigned, and any related existing
entries this entry connects to.
Retrieval mode (input is a query): return the top 5–10 entries ranked by conceptual
relevance. Never fabricate entries. If no sufficiently relevant entries exist,
state this explicitly and suggest the closest partial matches.
Entry schema: id (auto-incremented), content, date logged, auto-generated tags (3–5 per entry), source (optional).
Retrieval output format: ranked list with — entry id, content, date logged, relevance score (1–5), one-sentence explanation of why this entry is relevant to the query.
No-match handling: state explicitly when no entries are sufficiently relevant; suggest partial matches and tag queries to surface related areas.
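The entry schema and the two modes map onto a small data structure. The sketch below is a local stand-in: real semantic ranking is done by the model, so the tag-overlap scoring here only illustrates the store's shape and the logging/retrieval split, not the retrieval quality.

```python
from dataclasses import dataclass, field
from datetime import date

# Minimal in-memory sketch of the entry store described above.
@dataclass
class Entry:
    id: int
    content: str
    logged: str
    tags: list = field(default_factory=list)

class MemoryStore:
    def __init__(self):
        self.entries: list[Entry] = []

    def log(self, content: str, tags: list[str]) -> Entry:
        """Logging mode: store the entry with an auto-incremented id and tags."""
        entry = Entry(id=len(self.entries) + 1, content=content,
                      logged=date.today().isoformat(), tags=tags)
        self.entries.append(entry)
        return entry

    def retrieve(self, query_tags: list[str], top_k: int = 5) -> list[Entry]:
        """Retrieval mode stand-in: rank by tag overlap, drop zero-score entries."""
        scored = [(len(set(e.tags) & set(query_tags)), e) for e in self.entries]
        scored = [(s, e) for s, e in scored if s > 0]
        scored.sort(key=lambda pair: -pair[0])
        return [e for _, e in scored[:top_k]]
```

In the full-context deployment, the serialized entry list replaces this class: the whole store is passed as context and the model performs both ranking and the relevance explanations.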
Configuration Steps in OpenClaw
- Complete universal setup with `claude-opus-4-5` for the generation and ranking layer
- Choose retrieval approach: full-context (NB2 via Gemini-native endpoint) for stores ≤ 256K tokens; vector retrieval for larger stores
- Define your entry schema and initial tag taxonomy in the system prompt
- Seed the store with 20–50 existing notes or knowledge fragments — paste as a batch user message with one entry per line
- Test retrieval with 10 queries in AI Studio: verify that semantically related entries surface even when query wording differs significantly from entry wording (e.g., query "context window tradeoffs" retrieves an entry tagged "token limits" with related content)
- Establish a consistent daily logging habit — the store's value compounds with entry volume
The 256K context in practice: At an average of 100 tokens per knowledge entry, the 256K context window accommodates approximately 2,500 entries before vector retrieval is required. Confirm the exact context window size from https://wisgate.ai/models. For most individual developers logging daily observations, this covers multiple years of continuous use without infrastructure.
[Link to: Full Semantic Memory Search configuration →]
OpenClaw Use Cases: Research & Learning Category — Model and Architecture Reference
| # | Case | Primary Model | Retrieval Approach | RAG Pattern |
|---|---|---|---|---|
| 1 | AI Earnings Tracker | claude-opus-4-5 | Document passed directly — no retrieval step | Scheduled ingestion + structured output |
| 2 | Personal Knowledge Base | claude-opus-4-5 | NB2 full-context (≤ 256K) or vector store (larger) | Query-driven retrieval + grounded generation |
| 3 | Market Research & Product Factory | claude-opus-4-5 | Agent-based Scout retrieval — no vector store | Multi-agent research pipeline |
| 4 | Semantic Memory Search | claude-opus-4-5 | NB2 full-context (≤ 256K) or vector store (larger) | Continuous logging + semantic retrieval |
Why Opus for all 4 cases — stated as a cost argument:
Research outputs are acted upon. The error cost for this category — a wrong earnings figure, a fabricated knowledge base citation, an inaccurate competitive claim — exceeds the per-call cost of Opus at any reasonable research cadence. At 80 earnings reports per year, one quarterly research run, and daily knowledge base queries, the total annual Opus call volume is calculable and bounded. Confirm per-token pricing from https://wisgate.ai/models and run the arithmetic. For most developers, the annual cost of Opus for all 4 Research & Learning cases is less than one hour of debugging time spent tracing a bad research output.
The 256K context threshold:
Nano Banana 2's 256K context window eliminates chunking infrastructure for knowledge stores under approximately 2,500 entries at 100 tokens per entry, or approximately 400 typical documents at 500 tokens each. Confirm the exact context window size from https://wisgate.ai/models before publishing any token count claims. Use this as the decision threshold: if your current corpus fits, start with full-context retrieval. Add vector infrastructure when you outgrow it.
OpenClaw Use Cases: Research & Learning — Where to Start
Four complete research automation configurations for OpenClaw via WisGate. All four use claude-opus-4-5 for reasoning and synthesis. Two use Nano Banana 2's 256K context window to eliminate RAG infrastructure for corpora within the context limit.
Match the case to your current data:
- You have a corpus of existing notes and documents → Personal Knowledge Base — the simplest starting case; if your corpus fits in 256K tokens, the setup is one system prompt and your existing files
- You track public companies for competitive or investment context → AI Earnings Tracker — one Opus call per report, consistent output format, schedulable on earnings calendar
- You need structured competitive intelligence → Market Research & Product Factory — three Opus calls per research run, quarterly cadence, reusable artifact output
- You log ideas and observations continuously and need to retrieve them later → Semantic Memory Search — grows with use, no infrastructure until you exceed the context window
The lowest-friction starting point for most developers is the Personal Knowledge Base. If your corpus fits within the 256K token limit, there is no chunking pipeline to build, no vector database to configure, and no embedding model to deploy. The setup is a system prompt, a model selection, and your existing notes.
For the full 36-case OpenClaw library across all 6 categories, return to the [OpenClaw Use Cases pillar page →].
Your first research agent starts with your existing notes. Get your WisGate API key at wisgate.ai/hall/tokens (trial credits included, no commitment before the first query). Load your corpus into AI Studio, select `claude-opus-4-5`, paste the Personal Knowledge Base system prompt, and run 5 test queries against your own material. For the Earnings Tracker, paste one complete earnings document and verify every output schema field populates with citations before activating the schedule. Both setups require no infrastructure beyond the API key. Start with the case that matches your current highest-value data source; the architecture scales to the others from there.
All context window sizes and per-token pricing require confirmation from wisgate.ai/models before publication. Insert confirmed figures for the Nano Banana 2 context window (stated as 256K tokens; verify) and all Opus per-token rates into the cost tables before this article goes live. The "not investment advice" disclaimer is required in the Earnings Tracker system prompt description and must be preserved in all derivative content. The "never fabricate citations" rule must appear in the Personal Knowledge Base and Semantic Memory Search system prompt descriptions; do not remove it in editing.