JUHE API Marketplace

OpenClaw Infrastructure & DevOps Use Cases: 2 Always-On Agent Configurations for Self-Healing Systems

18 min read
By Liam Walker

Every other category in the OpenClaw library produces output. Infrastructure agents produce actions — and actions have system consequences.

A misconfigured digest bot delivers a bad summary. You delete the file and move on. A misconfigured infrastructure agent with SSH access can restart the wrong service, modify a config file, cascade a minor failure into an outage, or — in the worst case — leave a system in an indeterminate state that's harder to recover than the original problem. The failure mode is not a bad file. It is a degraded system.

This is the category where configuration discipline matters most. Two use cases are covered in this guide:

  1. n8n Workflow Orchestration — the agent delegates API calls to n8n via named webhooks; it never touches credentials directly
  2. Self-Healing Home Server — an always-on agent with SSH access, cron-based health monitoring, autonomous remediation within a defined permission boundary, and structured escalation when it hits that boundary

Both cases require the same upfront investment: a dedicated WisGate API key, an explicit permission boundary block in the system prompt, and validated escalation logic before the agent touches any live system. This guide covers all three.


Before connecting any agent to live infrastructure: Open AI Studio and test your system prompt against synthetic incident scenarios — service crashes, disk threshold breaches, unexpected log patterns. Confirm the agent's permission boundary classification is reliable before granting SSH access or webhook execution rights. Get your dedicated infrastructure API key at wisgate.ai/hall/tokens. A developer who follows this guide will have a scoped key and a validated system prompt before their next deployment window.


OpenClaw Configuration


Step 1 — Locate and Open the Configuration File

OpenClaw stores its configuration in a JSON file in your home directory. Open your terminal and edit the file with nano:

bash
nano ~/.clawdbot/clawdbot.json

Step 2 — Add the WisGate Provider to Your Models Section

Copy and paste the following configuration into the models section of your clawdbot.json. This defines WisGate as a custom provider and registers Claude Opus with your preferred model settings.

json
"models": {
  "mode": "merge",
  "providers": {
    "moonshot": {
      "baseUrl": "https://api.wisgate.ai/v1",
      "apiKey": "YOUR-WISGATE-API-KEY",
      "api": "openai-completions",
      "models": [
        {
          "id": "claude-opus-4-6",
          "name": "Claude Opus 4.6",
          "reasoning": false,
          "input": ["text"],
          "cost": {
            "input": 0,
            "output": 0,
            "cacheRead": 0,
            "cacheWrite": 0
          },
          "contextWindow": 256000,
          "maxTokens": 8192
        }
      ]
    }
  }
}

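Note: Replace YOUR-WISGATE-API-KEY with your key from wisgate.ai/hall/tokens. The "mode": "merge" setting adds WisGate's models alongside your existing providers without replacing them. The sample registers the ID claude-opus-4-6, while the rest of this guide refers to claude-opus-4-5; confirm the current Opus ID at wisgate.ai/models and use that one ID everywhere. To add additional models, duplicate the model entry block and update the "id" and "name" fields with the correct model IDs from wisgate.ai/models.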
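A quick sanity check on the edited file can catch the most common slip, which is leaving the placeholder key in place. The sketch below is a minimal example under stated assumptions: it takes the already-parsed config as a dict, it identifies the WisGate provider by its baseUrl, and the rules it applies are illustrative, not an official OpenClaw schema. The function name check_models_section is hypothetical.

```python
# Assumed checks for the "models" section shown in Step 2.
# These rules are illustrative; they are not an official OpenClaw schema.
PLACEHOLDER_KEY = "YOUR-WISGATE-API-KEY"
WISGATE_BASE_URL = "https://api.wisgate.ai/v1"


def check_models_section(config: dict) -> list:
    """Return human-readable problems found in the parsed config dict."""
    problems = []
    providers = config.get("models", {}).get("providers", {})
    # Find providers by base URL so the provider key itself does not matter.
    wisgate = [p for p in providers.values() if p.get("baseUrl") == WISGATE_BASE_URL]
    if not wisgate:
        problems.append("no provider points at " + WISGATE_BASE_URL)
    for provider in wisgate:
        if provider.get("apiKey") in (None, "", PLACEHOLDER_KEY):
            problems.append("apiKey is missing or still the placeholder")
        if not provider.get("models"):
            problems.append("provider has no models registered")
        for model in provider.get("models", []):
            if model.get("maxTokens", 0) > model.get("contextWindow", 0):
                problems.append(f"{model.get('id')}: maxTokens exceeds contextWindow")
    return problems
```

To use it, load the file with json.loads(Path.home().joinpath(".clawdbot", "clawdbot.json").read_text()) and pass the result in. That assumes the file is strict JSON; if your copy contains comments or trailing commas, the standard json module will reject it.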


Step 3 — Save, Exit, and Restart OpenClaw

If using nano:

  1. Press Ctrl + O to write the file → press Enter to confirm
  2. Press Ctrl + X to exit the editor

Restart OpenClaw:

  1. Press Ctrl + C to stop the current session
  2. Relaunch with:
bash
openclaw tui

Once restarted, the WisGate provider and your configured Claude models will appear in the model selector.

Both infrastructure cases use Opus. This differs from most other OpenClaw categories, where Sonnet or Haiku is the correct default. Infrastructure diagnosis is the exception: a wrong diagnostic conclusion from a lower-tier model can leave the system in a worse state than the original issue did. Confirm Opus pricing at https://wisgate.ai/models, which also carries the full model list, and treat it as an operational cost, not an experimentation cost.

Step 4 — Validate Before Connecting to Live Systems

Open https://wisgate.ai/studio/image, select claude-opus-4-5, and run your system prompt against synthetic incident scenarios before granting any system access. Validation criteria are defined per case in the two case sections below.

Naming note: OpenClaw was previously known as ClawdBot and MoltBot. References in older documentation may use either name — the configuration steps above apply to all versions.


Infrastructure-Specific Addition: Permission Boundary Block

Every infrastructure agent system prompt must open with a three-zone permission boundary block. This is not optional and is not implied by general instructions — it must be explicit, named, and tested before deployment.

| Zone | Definition | Required Agent Behavior |
| --- | --- | --- |
| Autonomous | Pre-approved, low-risk, reversible actions | Execute immediately; log action and outcome to STATE |
| Confirm | Medium-risk or irreversible changes | Pause execution; send structured alert with proposed action; wait for explicit human approval before proceeding |
| Prohibited | High-risk, data-destructive, or auth-modifying operations | Refuse unconditionally; escalate immediately with full incident context |

Populate each zone with specific named actions — not categories. "Restart named services" belongs in Autonomous. "Modify service config files" belongs in Confirm. "Delete user data" belongs in Prohibited. Vague permission language produces vague compliance. The model follows the instructions it is given.

Test boundary classification at https://wisgate.ai/studio/image using edge-case scenarios that deliberately sit on the Autonomous/Confirm line. If the agent misclassifies any scenario, revise the system prompt before proceeding. Do not grant live system access until classification is reliable across at least five distinct edge cases.
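The edge-case check described above can also be kept as a small repeatable harness, so the same scenarios are rerun after every prompt revision. This is a sketch under assumptions: classify is a hypothetical callable that sends one scenario to the agent (with your system prompt) and returns the zone name it chose, and the five scenarios are illustrative placeholders to replace with cases from your own environment.

```python
from typing import Callable

# Hypothetical edge cases: (scenario text, zone you expect the agent to choose).
EDGE_CASES = [
    ("plex is inactive and has not been restarted this cycle", "AUTONOMOUS"),
    ("/tmp is full and can be cleared", "AUTONOMOUS"),
    ("nginx has been restarting repeatedly for 20 minutes", "CONFIRM"),
    ("postgresql will not start until its config file is edited", "CONFIRM"),
    ("the fix appears to require deleting user data", "PROHIBITED"),
]

MIN_CASES = 5  # minimum number of distinct edge cases named in this guide


def run_boundary_check(classify: Callable[[str], str], cases=EDGE_CASES) -> list:
    """Return (scenario, expected, actual) for every misclassified case."""
    if len(cases) < MIN_CASES:
        raise ValueError(f"need at least {MIN_CASES} edge cases, got {len(cases)}")
    failures = []
    for scenario, expected in cases:
        # Normalize so "confirm" and " CONFIRM " compare equal to "CONFIRM".
        actual = classify(scenario).strip().upper()
        if actual != expected:
            failures.append((scenario, expected, actual))
    return failures
```

An empty result means every scenario landed in the expected zone. Any non-empty result points at the specific zone definition to revise before rerunning.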


AI Infrastructure Automation: Why Always-On Agents Have Different Configuration Requirements

Scheduled digest automations run, produce output, and stop. Infrastructure agents run continuously, hold system access, and act. That difference generates three configuration requirements that do not apply to any other OpenClaw category.

1. Dedicated API key with no shared usage

Sharing an infrastructure agent's key with other automations creates a failure surface: key exhaustion from a noisy automation can take an infrastructure agent offline mid-incident. Key compromise from any shared automation exposes the infrastructure agent's access context. Create one key per agent at https://wisgate.ai/hall/tokens and label each clearly.

2. System prompt hardening — explicit rules, not guidelines

An infrastructure agent's system prompt must specify what the agent is authorized to do, what requires human confirmation, and what it must never attempt — by name, not by category. A prompt that says "be careful with destructive operations" does not produce reliable boundary enforcement. A prompt that says "you are prohibited from modifying SSH configuration under any circumstances; if a scenario appears to require it, escalate immediately" does. The difference in specificity is the difference in reliability.

3. Escalation path defined before deployment

Define the complete escalation path before the agent goes live: the conditions that trigger an alert, the notification channel (Slack webhook, email, SMS), and the agent's behavior while awaiting human response — does it hold? Does it attempt the safest fallback? Does it continue monitoring? An agent without a defined escalation path will either under-react to novel situations (continuing without authorization) or over-react (halting all operations). Neither outcome is acceptable for always-on infrastructure.

Model selection for this category:

Both cases use claude-opus-4-5. This is the correct tier for infrastructure diagnosis — not the default for most OpenClaw automations. The reasoning requirement is specific: the agent must interpret novel log patterns, identify root causes that were not anticipated at setup time, and classify proposed actions against a permission boundary correctly under ambiguous conditions. Confirm Opus pricing from https://wisgate.ai/models and calculate your daily cost at your expected call volume before deployment.

LLM DevOps Agent — Case 1: n8n Workflow Orchestration

What it does: The agent receives natural language task requests, selects the appropriate n8n workflow from a named list, calls the corresponding webhook with the correct parameters, and returns the n8n execution result. The agent never sees, stores, or transmits credentials. All authentication lives inside n8n workflows.

Why this architecture matters: Every direct credential integration with an LLM agent creates an exposure surface. The surface grows with every new integration. The n8n delegation pattern eliminates this: the agent knows webhook URLs, not secrets. Revoking an integration means disabling the n8n workflow — the agent's configuration is unchanged, no keys need rotation, and the audit trail is entirely in n8n's execution log.

Architecture Overview

| Component | Role | Access Level |
| --- | --- | --- |
| OpenClaw + claude-opus-4-5 | Reasoning and workflow selection | Named webhook URLs only |
| n8n workflow engine | API execution and credential management | Full credential access (isolated from agent) |
| Webhook endpoints | Interface between agent and n8n | Parameterized calls with structured inputs |

This is not a workaround — it is a deliberate security boundary at the architecture level. The agent's reasoning capability and the system's execution capability are separated by design.

System Prompt Structure

Block 1 — Identity and constraint:

You are a workflow orchestration agent. You have no direct API access.
All external actions are performed exclusively via the named n8n webhooks listed below.
You may not construct new URLs. You may not request, store, or transmit credentials.
You may not call any endpoint not listed in your webhook inventory.

Block 2 — Webhook inventory:

Available workflows:
- WEBHOOK_SEND_EMAIL: sends an email via configured SMTP; parameters: to, subject, body
- WEBHOOK_CREATE_JIRA_ISSUE: creates a Jira ticket; parameters: project, type, summary, description
- WEBHOOK_POST_SLACK: posts to a Slack channel; parameters: channel, message
- WEBHOOK_RESTART_SERVICE: triggers a named service restart via n8n SSH action; parameters: service_name, server
[Add your specific workflows — one per integration]

Block 3 — Decision logic:

When given a task:
1. Identify which webhook(s) are required
2. Confirm all required parameters are available from the request context
3. Call the webhook with structured parameters
4. Return: which webhook was called, with what parameters, and the n8n response
5. If no listed webhook covers the task, respond: "No available workflow for this task. Escalating."

Block 4 — Permission boundary:

AUTONOMOUS: Call any listed webhook with complete, validated parameters
CONFIRM: Call a webhook that would affect production systems or send external communications to customers
PROHIBITED: Construct new URLs, handle credentials, call unlisted endpoints
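The inventory and decision logic in Blocks 2 and 3 can be mirrored by a thin guard in whatever code actually issues the webhook call, so an unlisted name or an incomplete parameter set is rejected before any request leaves the machine. The sketch below is an assumption-laden illustration, not part of OpenClaw or n8n: the URLs are placeholders on a reserved domain, build_webhook_request is a hypothetical helper, and it assumes each n8n Webhook node is configured to accept POST with a JSON body.

```python
import json
import urllib.request

# Names and parameters mirror Block 2; the URLs are placeholders.
WEBHOOKS = {
    "WEBHOOK_SEND_EMAIL": {
        "url": "https://n8n.example.invalid/webhook/send-email",
        "params": ("to", "subject", "body"),
    },
    "WEBHOOK_CREATE_JIRA_ISSUE": {
        "url": "https://n8n.example.invalid/webhook/create-jira-issue",
        "params": ("project", "type", "summary", "description"),
    },
    "WEBHOOK_POST_SLACK": {
        "url": "https://n8n.example.invalid/webhook/post-slack",
        "params": ("channel", "message"),
    },
    "WEBHOOK_RESTART_SERVICE": {
        "url": "https://n8n.example.invalid/webhook/restart-service",
        "params": ("service_name", "server"),
    },
}


def build_webhook_request(name: str, params: dict) -> urllib.request.Request:
    """Build (but do not send) a POST for a listed webhook with complete parameters."""
    entry = WEBHOOKS.get(name)
    if entry is None:
        # Same wording as step 5 of the decision logic.
        raise LookupError("No available workflow for this task. Escalating.")
    missing = [p for p in entry["params"] if p not in params]
    unexpected = [p for p in params if p not in entry["params"]]
    if missing or unexpected:
        raise ValueError(f"{name}: missing={missing} unexpected={unexpected}")
    return urllib.request.Request(
        entry["url"],
        data=json.dumps(params).encode("utf-8"),
        method="POST",
        headers={"Content-Type": "application/json"},
    )
```

Sending is then a separate, deliberate step with urllib.request.urlopen(request). The guard only enforces the inventory; the CONFIRM rule in Block 4 for production-affecting or customer-facing calls still needs its own approval step.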

Configuration Steps in OpenClaw

  1. Complete the WisGate setup from the OpenClaw Configuration section above with model claude-opus-4-5
  2. In n8n, create one workflow per integration with a Webhook trigger node; record each webhook URL
  3. Populate Block 2 of the system prompt with your specific webhook inventory
  4. Paste the complete four-block system prompt into OpenClaw's system prompt field
  5. Test with a task requiring one workflow; verify n8n logs the execution before adding more
  6. Expand the webhook inventory one workflow at a time — test each before adding the next

The key security property to verify at each step: disabling all n8n workflows immediately revokes the agent's ability to affect any integrated system, without touching the OpenClaw configuration or the WisGate API key. If you can't verify this is true for your configuration, the credential isolation is incomplete.

Cost: Each orchestration task is one Opus call. Confirm per-token pricing from https://wisgate.ai/models. At 10–50 decisions per day, calculate monthly token volume at your average task input/output length and confirm the arithmetic before deployment.


OpenClaw Use Cases — Case 2: Self-Healing Home Server

What it does: An always-on agent monitors a home server network on a cron schedule via SSH, collects service status, resource utilization, and log data, diagnoses failures against known and novel patterns, executes remediation actions within its Autonomous zone, and sends structured alerts when a situation exceeds authorized scope.

Why developers build this: Home server operators — running self-hosted services, media servers, local network infrastructure, or development environments — spend significant time on routine remediation: restarting crashed services, clearing full temporary directories, renewing certificates before they expire. An always-on agent handles the routine majority autonomously and escalates only the novel issues that require human judgment. Manual intervention frequency is the metric this configuration targets.

Agent Architecture — Four Sequential Roles in One Conversation Loop

| Role | Function | Model |
| --- | --- | --- |
| Monitor | Collects service status, disk usage, memory, active connections, and targeted log lines via SSH | claude-opus-4-5 |
| Diagnostician | Interprets collected data; identifies root cause; classifies situation by permission zone | claude-opus-4-5 |
| Remediator | Executes Autonomous-zone actions; logs action, parameters, and outcome to STATE | claude-opus-4-5 |
| Escalator | Formats structured alert when Confirm or Prohibited zone is reached; routes to notification channel | claude-opus-4-5 |

Each role runs as a sequential step in one OpenClaw conversation — the output of each step feeds as input context to the next. This is not four separate conversations; it is one multi-turn session with defined hand-off points.
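The hand-off between roles can be pictured as one function that threads a shared state object through four steps. The sketch below illustrates only that control flow, under assumptions: the four callables are hypothetical stand-ins for the model turns, and CycleState is an invented container for what this guide calls STATE.

```python
from dataclasses import dataclass, field
from typing import Callable, Tuple


@dataclass
class CycleState:
    """Context handed from one role to the next within a single cycle."""
    observations: dict = field(default_factory=dict)
    diagnosis: str = ""
    zone: str = ""
    log: list = field(default_factory=list)


def run_cycle(
    monitor: Callable[[], dict],
    diagnose: Callable[[dict], Tuple[str, str]],
    remediate: Callable[[CycleState], str],
    escalate: Callable[[CycleState], str],
) -> CycleState:
    """Run Monitor, then Diagnostician, then exactly one of Remediator or Escalator."""
    state = CycleState()
    state.observations = monitor()
    state.diagnosis, state.zone = diagnose(state.observations)
    if state.zone == "AUTONOMOUS":
        state.log.append(("remediate", remediate(state)))
    elif state.zone in ("CONFIRM", "PROHIBITED"):
        state.log.append(("escalate", escalate(state)))
    # Any other zone value (for example, a healthy cycle) takes no action.
    return state
```

The point of the structure is that the Remediator never runs unless the Diagnostician has already placed the situation in the Autonomous zone.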

System Prompt Structure — Four Labeled Sections

Section A — Identity and scope: Define the agent's role explicitly: which servers it monitors, which services it is responsible for, and the geographic or network scope of its authority. The more specific this definition, the more reliable the boundary enforcement.

You are the infrastructure monitoring agent for [home network name].
You are responsible for the following servers and services:
- [server-1]: nginx, postgresql, certbot
- [server-2]: plex, sonarr, radarr
- [NAS]: samba, rsync, disk health
You have SSH access to the above. You have no authority over any other system.

Section B — Monitoring checklist: List exactly what the agent checks each cycle, with explicit thresholds. Every item without a defined threshold will be interpreted at the model's discretion — which is not consistent behavior for a production monitoring agent.

Each monitoring cycle, collect and evaluate:
- Service status: [list named services]; flag any with status != active
- Disk usage: flag any volume above 85% used
- Memory: flag if available < 512MB on any monitored server
- Certificate expiry: flag any cert expiring within 14 days
- Log patterns: scan /var/log/syslog and service-specific logs for:
  [OOM killer invoked | connection refused | segfault | failed to start]
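To make the thresholds concrete, the same checklist can be expressed as a deterministic pre-check that runs on the collected data before the model sees it. This is a sketch with assumed names: the snapshot dictionary layout and evaluate_snapshot are hypothetical, and the log patterns are copied verbatim from the checklist, so they may need adjusting to match the exact wording your kernel and services emit.

```python
import re
from datetime import datetime, timedelta, timezone

DISK_MAX_PERCENT = 85   # flag volumes above this
MEM_MIN_MB = 512        # flag when available memory drops below this
CERT_MIN_DAYS = 14      # flag certificates expiring within this many days
LOG_PATTERN = re.compile(
    r"OOM killer invoked|connection refused|segfault|failed to start",
    re.IGNORECASE,
)


def evaluate_snapshot(snapshot: dict, now: datetime) -> list:
    """Return one flag string per checklist item that breaches its threshold."""
    flags = []
    for name, status in snapshot.get("services", {}).items():
        if status != "active":
            flags.append(f"service {name} is {status}")
    for volume, used in snapshot.get("disk_percent", {}).items():
        if used > DISK_MAX_PERCENT:
            flags.append(f"disk {volume} at {used}%")
    mem = snapshot.get("mem_available_mb")
    if mem is not None and mem < MEM_MIN_MB:
        flags.append(f"available memory {mem}MB")
    for cert, expires in snapshot.get("cert_expiry", {}).items():
        if expires - now <= timedelta(days=CERT_MIN_DAYS):
            flags.append(f"certificate {cert} expires {expires.date()}")
    for line in snapshot.get("log_lines", []):
        if LOG_PATTERN.search(line):
            flags.append(f"log match: {line}")
    return flags
```

Pass a timezone-aware now, such as datetime.now(timezone.utc), and expiry values in the same timezone so the subtraction is valid.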

Section C — Permission boundary block (the three-zone model): This section must be the most specific in the entire prompt. Name every authorized action individually.

AUTONOMOUS — execute immediately; log action and outcome:
- Restart: nginx, postgresql, plex, sonarr, radarr, samba
- Clear: /tmp directories, application cache directories [list by path]
- Renew: Let's Encrypt certificates via certbot renew

CONFIRM — pause; send alert; await explicit human approval:
- Modify any configuration file
- Stop any service (as opposed to restart)
- Change file permissions
- Kill any user process

PROHIBITED — refuse unconditionally; escalate with full context:
- Delete any file outside defined cache paths
- Modify SSH configuration or authorized_keys
- Change firewall rules or iptables
- Create or remove user accounts

Section D — Escalation format: Every alert must contain a fixed set of fields so the receiving developer has full context without needing to interrogate the agent.

When escalating, include exactly:
- Timestamp (ISO 8601)
- Affected server and service
- Root cause diagnosis
- Actions already attempted and outcomes
- Proposed next action (if any within Confirm zone)
- Confidence score: 1 (uncertain) to 5 (high confidence)
- If confidence < 3: request specific additional context from developer before proceeding

Route all alerts to: [SLACK_WEBHOOK_URL or EMAIL_ENDPOINT]
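A fixed alert shape is easier to enforce when the fields live in a small data structure that refuses incomplete or out-of-range values. The sketch below is illustrative: the Alert class, its field names, and the needs_more_context flag are assumptions layered on the field list above, and the JSON payload shape is not a Slack or email API format.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class Alert:
    server: str
    service: str
    root_cause: str
    actions_attempted: list
    proposed_action: Optional[str]
    confidence: int          # 1 (uncertain) to 5 (high confidence)
    timestamp: str = ""      # ISO 8601; filled in with the current UTC time if empty

    def __post_init__(self):
        if not 1 <= self.confidence <= 5:
            raise ValueError("confidence must be between 1 and 5")
        if not self.timestamp:
            self.timestamp = datetime.now(timezone.utc).isoformat(timespec="seconds")

    def to_payload(self) -> str:
        """Serialize every field, plus a flag for the low-confidence rule."""
        payload = asdict(self)
        payload["needs_more_context"] = self.confidence < 3
        return json.dumps(payload)
```

Posting the payload to the configured channel is left out on purpose, since the endpoint and its expected body depend on the notification service you choose.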

Configuration Steps in OpenClaw

  1. Complete the WisGate setup from the OpenClaw Configuration section above with model claude-opus-4-5
  2. Populate all four system prompt sections with your specific servers, services, thresholds, and escalation channel
  3. Test Diagnostician section first: input a synthetic log snippet (e.g., OOM killer invocation during peak memory) and verify correct zone classification
  4. Test Remediator section: verify the agent executes only Autonomous-zone actions on its own and pauses for approval on every Confirm-zone scenario
  5. Test Escalator section: verify every alert contains all required fields and routes to the correct channel
  6. Grant SSH access only after all three test categories pass across at least five distinct scenarios each
  7. Set cron schedule: 5-minute cycles for service status, 15-minute cycles for resource metrics, 6-hour cycles for certificate checks
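The three intervals in step 7 map onto standard five-field cron expressions, and it is worth working out how many runs each produces per day before committing to them. The sketch below pairs each interval with its expression and computes the daily count; the job names are hypothetical, and the command each cron line would invoke is deliberately omitted because it depends on how you launch OpenClaw.

```python
MINUTES_PER_DAY = 24 * 60

# Standard five-field cron expressions for the three intervals in step 7.
SCHEDULE = {
    "service_status": {"cron": "*/5 * * * *", "interval_min": 5},
    "resource_metrics": {"cron": "*/15 * * * *", "interval_min": 15},
    "certificate_checks": {"cron": "0 */6 * * *", "interval_min": 6 * 60},
}


def runs_per_day(interval_min: int) -> int:
    """Number of times a job with this interval fires in 24 hours."""
    return MINUTES_PER_DAY // interval_min


for name, job in SCHEDULE.items():
    print(f"{job['cron']:<14} {name}: {runs_per_day(job['interval_min'])} runs/day")
```

This gives 288, 96, and 4 runs per day respectively. If the slower checks are folded into the 5-minute cycle that coincides with them, the total number of conversations stays at 288; if each schedule starts its own conversation, budget for all three.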

The specific capability that justifies Opus for this workload: A scripted monitoring system checks predefined conditions against predefined thresholds and runs predefined commands. It fails silently for anything outside its defined rule set. The LLM agent diagnoses novel failure patterns — unexpected service interactions, emergent resource contention, log signatures from conditions that weren't anticipated when the monitoring script was written. This adaptive diagnostic capability is what claude-opus-4-5 provides that lower tiers do not provide reliably. It is the reason this use case is in the Infrastructure category and not the Productivity category.

Cost: Each monitoring cycle is one multi-turn Opus conversation (typically 3–5 turns covering Monitor → Diagnose → Remediate or Escalate). Confirm Opus per-token pricing from https://wisgate.ai/models. At 5-minute intervals: 288 monitoring cycles/day. Calculate your daily token volume at your average monitoring data size and confirm the arithmetic before setting the cron schedule.


OpenClaw Use Cases: Infrastructure Category — Model and Cost Reference

| Case | Model | Est. Calls/Day | Call Type | Rationale |
| --- | --- | --- | --- | --- |
| n8n Workflow Orchestration | claude-opus-4-5 | 10–50 decisions | Single-turn per task | Wrong workflow selection triggers irreversible n8n actions; incorrect parameters can corrupt external state |
| Self-Healing Home Server | claude-opus-4-5 | 288 cycles (5-min) | Multi-turn per cycle (3–5 turns) | Novel failure diagnosis; wrong root cause classification worsens incident; boundary misclassification causes unauthorized action |

Annual cost projection (confirm all pricing from https://wisgate.ai/models before finalizing):

| Case | Daily Calls | Monthly Calls (30 days) | Annual Calls (365 days) | Annual Cost |
| --- | --- | --- | --- | --- |
| n8n Orchestration | 10–50 | 300–1,500 | 3,650–18,250 | Confirmed rate × volume |
| Self-Healing Server | 288 cycles × 4 avg turns = ~1,152 turns | ~34,560 turns | ~420,480 turns | Confirmed rate × volume |

Once pricing is verified, multiply the confirmed rate by these volumes and record the arithmetic. Treat both figures as infrastructure operational costs, the same category as a monitoring SaaS subscription, not as API experiment costs. The break-even calculation is the agent's cost versus the hourly rate of manual incident response at your actual incident frequency.
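The rate-times-volume arithmetic is simple enough to keep in a short function, so it can be rerun whenever pricing or call volume changes. The sketch below assumes pricing is quoted per million tokens, split into input and output rates; that pricing model, the function name, and every parameter are assumptions to check against wisgate.ai/models, and no real rate is filled in.

```python
def annual_cost(
    calls_per_day: float,
    input_tokens_per_call: float,
    output_tokens_per_call: float,
    input_rate_per_mtok: float,
    output_rate_per_mtok: float,
    days: int = 365,
) -> float:
    """Annual cost in the currency of the rates, assuming per-million-token pricing."""
    calls = calls_per_day * days
    input_cost = calls * input_tokens_per_call / 1_000_000 * input_rate_per_mtok
    output_cost = calls * output_tokens_per_call / 1_000_000 * output_rate_per_mtok
    return input_cost + output_cost
```

For the self-healing server, calls_per_day is the number of turns per day, not the number of cycles. In a multi-turn conversation each turn normally resends the earlier context, so measure input_tokens_per_call from real cycles instead of estimating it from the size of the monitoring data alone.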

Dedicated key reminder: Create a separate WisGate key for each agent at https://wisgate.ai/hall/tokens. Label each descriptively. Review key usage in the WisGate dashboard monthly — an unexpected spike in call volume from an infrastructure agent is itself a diagnostic signal.


OpenClaw Use Cases: Infrastructure & DevOps — What to Configure First

Two complete always-on agent configurations for OpenClaw via WisGate. Both use claude-opus-4-5. Both require dedicated API keys, explicit permission boundary blocks with named actions per zone, and validated escalation paths before connecting to live systems.

The single most important pre-deployment step: Test permission boundary classification in AI Studio before granting any system access. Input at least five edge-case scenarios that deliberately sit on the Autonomous/Confirm boundary. For example, "the nginx service has been restarting repeatedly for 20 minutes" should classify as Confirm, not Autonomous: a single restart is pre-approved, but repeated failed restarts show that the pre-approved action is not working. If your Autonomous zone does not state a retry limit, add one so that this outcome is written down and not left to inference. If the agent classifies all five correctly, the permission block is working. If it misclassifies any, revise the specific zone definition that failed before proceeding.

Which case to start with: Start with n8n Workflow Orchestration if you already have n8n infrastructure — the credential isolation pattern makes it the lower-risk initial deployment, and the webhook model is easy to audit and extend. Start with Self-Healing Home Server if manual server maintenance is your highest recurring time cost and you want the fastest return on configuration investment.

For the complete 36-case OpenClaw library across all 6 categories, return to the OpenClaw Use Cases pillar page.


Your dedicated infrastructure key is one step away. Go to wisgate.ai/hall/tokens, create a labeled key for your first infrastructure agent, and test your system prompt boundary classification at wisgate.ai/studio/image before connecting to any live system. Trial credits are included — no commitment before the first validated agent call. Both configurations in this guide are complete and ready to implement: pick the case that addresses your most immediate operational pain, populate the system prompt sections with your specific services and thresholds, and run the boundary classification validation. The agent connects to live infrastructure only after that validation passes.


Confirm all per-token costs at wisgate.ai/models before relying on any projection in this article; model pricing is subject to change. Security configurations in this article represent recommended practice for the described use cases. Developers are responsible for validating all permission boundaries against their specific system environments before deployment.
