Why a Mini Agent and What You’ll Build
Developers often need agents that reason through steps and call a few tools, but don’t want heavyweight infrastructure. In this guide, you’ll build a lightweight, production-ready mini agent using the Nano Banana Pro API. It:
- Reasons through multi-step tasks
- Calls external tools (HTTP, calculators, simple data lookups)
- Runs in a tight loop with clear stop rules
- Stays low-cost and fast with minimal dependencies
By the end, you’ll have a ready-to-adapt template in JavaScript and Python using the model gemini-3-pro-image-preview via the base URL https://wisdom-gate.juheapi.com/v1.
Who This Is For
- Engineers shipping agents on limited budgets
- Teams needing predictable latency and cost
- Builders who prefer simple APIs over heavy orchestration
What You’ll Need
- An API key for Nano Banana Pro (used to call the Chat Completions endpoint)
- Familiarity with HTTP requests
- Optional: a simple HTTP tool (weather, calendar, fetch)
Architecture Overview: Reasoner + Tool Use
A minimal agent pattern divides the work into four small parts:
- Reasoner (Nano Banana model): Produces the next step, requests tool calls, and assembles the final answer.
- Tool Runner: Executes approved tools (HTTP GET/POST, simple math, local data).
- Memory: Short context window (e.g., recent turns and tool results). Keep memory tiny.
- Policy/Loop Controller: Controls iteration count, validates tool requests, and decides when to stop.
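In code, the whole pattern reduces to one short loop. Here is a dependency-injected sketch (the llm, run_tool, and parse callables are placeholders; concrete versions appear in the full templates later in this guide):

from typing import Callable, Optional

def agent_loop(
    task: str,
    llm: Callable[[list], str],              # Reasoner: returns the next assistant message
    run_tool: Callable[[dict], str],         # Tool Runner: executes an approved request
    parse: Callable[[str], Optional[dict]],  # Parsed JSON line, or None for prose
    max_steps: int = 4,                      # Policy: hard iteration cap
) -> str:
    history = [{'role': 'user', 'content': task}]  # Memory: deliberately tiny
    for _ in range(max_steps):
        out = llm(history)
        obj = parse(out)
        if obj and 'final_answer' in obj:          # Stop rule: final answer produced
            return obj['final_answer']
        if obj and 'tool' in obj:
            history.append({'role': 'assistant', 'content': f'TOOL_RESULT: {run_tool(obj)}'})
            continue
        history.append({'role': 'assistant', 'content': out})  # Prose: keep and iterate
    return 'step limit reached'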
Message Protocol (Chat Completions)
We’ll use the chat/completions endpoint with a compact message list. The agent loop sends:
- A system prompt that defines the role and tool-call convention
- User input
- Assistant thoughts and tool requests
- Tool results (appended into the history)
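Concretely, after one tool round-trip the history might look like this (an illustrative sketch in Python; the convention itself is defined in a later section, and the URL and weather payload are placeholders):

messages = [
    {'role': 'system', 'content': '...system prompt with the tool-call convention...'},
    {'role': 'user', 'content': 'Is it good running weather in Boston?'},
    # Turn 1: the assistant requested a tool
    {'role': 'assistant', 'content': '{"tool":"http_get","url":"https://api.example.com/weather?city=Boston"}'},
    # The controller ran the tool and appended the truncated result
    {'role': 'assistant', 'content': 'TOOL_RESULT: {"today":{"temp":48,"wind":7}}'},
]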
Cost and Latency Considerations
- Keep the message history small (recent 5–10 turns).
- Prefer concise tool results (truncate or summarize large responses).
- Cap the loop to 2–4 iterations for most tasks.
- Cache repeated tool calls and reuse IDs.
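For the caching point, a per-session dictionary keyed by tool name and input is usually enough (a minimal sketch; eviction and TTLs are left to your use case):

TOOL_CACHE: dict = {}

def cached_tool(tool: str, arg: str, run) -> str:
    """Memoize a tool call for the session; run is the tool function, e.g. http_get."""
    key = (tool, arg)
    if key not in TOOL_CACHE:
        TOOL_CACHE[key] = run(arg)
    return TOOL_CACHE[key]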
Setup and First Call
- Base URL: https://wisdom-gate.juheapi.com/v1
- Model: gemini-3-pro-image-preview
- Endpoint: /chat/completions
Quick cURL Smoke Test
This mirrors the official style and validates your key:
curl --location --request POST 'https://wisdom-gate.juheapi.com/v1/chat/completions' \
  --header 'Authorization: YOUR_API_KEY' \
  --header 'Content-Type: application/json' \
  --header 'Accept: */*' \
  --header 'Host: wisdom-gate.juheapi.com' \
  --header 'Connection: keep-alive' \
  --data-raw '{
    "model": "gemini-3-pro-image-preview",
    "messages": [
      { "role": "user", "content": "Draw a stunning sea world." }
    ]
  }'
Minimal Text Reasoning Call (cURL)
curl --location --request POST 'https://wisdom-gate.juheapi.com/v1/chat/completions' \
  --header 'Authorization: YOUR_API_KEY' \
  --header 'Content-Type: application/json' \
  --data-raw '{
    "model": "gemini-3-pro-image-preview",
    "messages": [
      {"role": "system", "content": "You are a concise assistant. Answer in 2–4 sentences."},
      {"role": "user", "content": "Summarize why small agents are useful."}
    ]
  }'
Agent Pattern: Reasoning + Tool Use
To stay lightweight, we’ll avoid complex function-calling frameworks. Instead, we’ll use a simple convention: the assistant proposes tool calls as JSON in a single line. The controller detects it, runs the tool, and appends the result back to the conversation.
Tool-Call Convention
- When the assistant wants a tool, it outputs a compact JSON object on a single line, e.g.:
  - {"tool":"http_get","url":"https://api.example.com/weather?city=Boston"}
- When ready to answer, it outputs a final_answer on a single line:
  - {"final_answer":"..."}
System Prompt Template
Keep it explicit and strict:
You are a reasoning assistant. If a tool is needed, output a single JSON line with a "tool" key plus its input fields. Examples:
- {"tool":"http_get","url":"https://api.example.com?q=..."}
- {"tool":"calc","expression":"2+2"}
When you have enough information, output a single JSON line: {"final_answer":"..."}
Never include extra commentary when emitting JSON lines.
Control Flow (High-Level)
- Send the user task and system prompt to the model.
- If the assistant returns a tool JSON object:
  - Validate it against the allowlist.
  - Execute the tool.
  - Append the tool result to the message history so the model sees it on the next turn.
- Repeat for up to N steps.
- Return final_answer once produced.
Full Template: JavaScript (Node)
This minimal agent fits in a single file. It uses fetch and a tiny tool registry.
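// Node 18+ ships a global fetch; the node-fetch import below is only needed on older runtimes.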
import fetch from 'node-fetch';
const BASE_URL = 'https://wisdom-gate.juheapi.com/v1/chat/completions';
const MODEL = 'gemini-3-pro-image-preview';
const API_KEY = process.env.NANO_BANANA_API_KEY;
const SYSTEM_PROMPT = `You are a reasoning assistant.\nIf a tool is needed, output one JSON line:\n{"tool":"http_get","url":"..."}\n{"tool":"calc","expression":"..."}\nWhen done, output one JSON line: {"final_answer":"..."}\nOnly emit JSON lines for tool calls or final answers.`;
// Simple tools
async function http_get(url) {
  const res = await fetch(url, { method: 'GET' });
  const text = await res.text();
  // Keep response compact
  return text.slice(0, 2000);
}

function calc(expr) {
  try {
    // Extremely simple evaluator; replace with a safe parser in production.
    // Allows digits, operators, parentheses, and spaces only.
    if (!/^[-+*/(). 0-9]+$/.test(expr)) throw new Error('Invalid');
    // eslint-disable-next-line no-eval
    const out = eval(expr);
    return String(out);
  } catch (e) {
    return 'calc_error';
  }
}
const TOOL_ALLOWLIST = new Set(['http_get', 'calc']);
async function callLLM(messages) {
  const body = { model: MODEL, messages };
  const res = await fetch(BASE_URL, {
    method: 'POST',
    headers: {
      'Authorization': API_KEY,
      'Content-Type': 'application/json'
    },
    body: JSON.stringify(body)
  });
  const json = await res.json();
  // Adjust parsing to match the API response structure
  const assistantMsg = json?.choices?.[0]?.message?.content || json?.choices?.[0]?.text || '';
  return assistantMsg;
}
function isToolJSON(s) {
  try {
    const obj = JSON.parse(s);
    return obj && obj.tool && TOOL_ALLOWLIST.has(obj.tool);
  } catch { return false; }
}

function isFinalJSON(s) {
  try {
    const obj = JSON.parse(s);
    return obj && typeof obj.final_answer === 'string';
  } catch { return false; }
}

async function runTool(obj) {
  if (obj.tool === 'http_get') return await http_get(obj.url);
  if (obj.tool === 'calc') return calc(obj.expression);
  return 'unsupported_tool';
}
export async function miniAgent(userTask) {
  const messages = [
    { role: 'system', content: SYSTEM_PROMPT },
    { role: 'user', content: userTask }
  ];
  for (let step = 0; step < 4; step++) {
    const out = await callLLM(messages);
    if (isFinalJSON(out)) {
      const { final_answer } = JSON.parse(out);
      return final_answer;
    }
    if (isToolJSON(out)) {
      const obj = JSON.parse(out);
      // Guard against network failures so one bad call does not crash the loop
      let toolResult;
      try {
        toolResult = await runTool(obj);
      } catch (e) {
        toolResult = `tool_error: ${e.message}`;
      }
      // Append tool result succinctly
      messages.push({ role: 'assistant', content: `TOOL_RESULT: ${toolResult.slice(0, 1500)}` });
      continue;
    }
    // If the model gave prose, append it and continue for one more step
    messages.push({ role: 'assistant', content: out.slice(0, 1000) });
  }
  return 'Sorry, I could not complete the task within the step limit.';
}
// Example usage
(async () => {
  const answer = await miniAgent("Find today's temp in Boston and say if it's good for a run.");
  console.log(answer);
})();
Full Template: Python
A Python variant with requests and the same tool convention.
import os
import json
import requests
BASE_URL = 'https://wisdom-gate.juheapi.com/v1/chat/completions'
MODEL = 'gemini-3-pro-image-preview'
API_KEY = os.getenv('NANO_BANANA_API_KEY')
SYSTEM_PROMPT = (
    'You are a reasoning assistant.\n'
    'If a tool is needed, output one JSON line:\n'
    '{"tool":"http_get","url":"..."}\n'
    '{"tool":"calc","expression":"..."}\n'
    'When done, output one JSON line: {"final_answer":"..."}\n'
    'Only emit JSON lines for tool calls or final answers.'
)
ALLOWED_TOOLS = {'http_get', 'calc'}
def http_get(url: str) -> str:
    r = requests.get(url, timeout=8)
    return r.text[:2000]

def calc(expr: str) -> str:
    try:
        # Extremely simple evaluator; replace with a safe parser in production.
        if not all(c in '0123456789+-*/(). ' for c in expr):
            raise ValueError('Invalid')
        return str(eval(expr))
    except Exception:
        return 'calc_error'
def call_llm(messages):
    payload = {'model': MODEL, 'messages': messages}
    r = requests.post(
        BASE_URL,
        headers={'Authorization': API_KEY, 'Content-Type': 'application/json'},
        json=payload,
        timeout=15,
    )
    data = r.json()
    return (
        data.get('choices', [{}])[0].get('message', {}).get('content') or
        data.get('choices', [{}])[0].get('text', '')
    )
def is_tool_json(s: str) -> bool:
    try:
        obj = json.loads(s)
        return 'tool' in obj and obj['tool'] in ALLOWED_TOOLS
    except Exception:
        return False

def is_final_json(s: str) -> bool:
    try:
        obj = json.loads(s)
        return 'final_answer' in obj and isinstance(obj['final_answer'], str)
    except Exception:
        return False

def run_tool(obj: dict) -> str:
    if obj['tool'] == 'http_get':
        return http_get(obj.get('url', ''))
    if obj['tool'] == 'calc':
        return calc(obj.get('expression', ''))
    return 'unsupported_tool'
def mini_agent(user_task: str) -> str:
    messages = [
        {'role': 'system', 'content': SYSTEM_PROMPT},
        {'role': 'user', 'content': user_task},
    ]
    for _ in range(4):
        out = call_llm(messages)
        if is_final_json(out):
            return json.loads(out)['final_answer']
        if is_tool_json(out):
            obj = json.loads(out)
            # Guard against network failures so one bad call does not crash the loop
            try:
                result = run_tool(obj)
            except Exception as e:
                result = f'tool_error: {e}'
            messages.append({'role': 'assistant', 'content': f'TOOL_RESULT: {result[:1500]}'})
            continue
        messages.append({'role': 'assistant', 'content': out[:1000]})
    return 'Sorry, I could not complete the task within the step limit.'

if __name__ == '__main__':
    print(mini_agent("Check the forecast for Boston and advise a run plan."))
Example: Weather + Calendar Agent
Goal: Ask the agent whether to run today based on weather and a simple calendar. The agent will call http_get on a weather API and then produce the plan.
Prompting Tips
- Be explicit about tool access: “You may use http_get for weather.”
- Include a small calendar snippet in the user prompt.
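Putting both tips together, the task string can be assembled like this (a sketch reusing mini_agent from the Python template; the calendar text is illustrative):

calendar = 'Free slots: Mon 7-8am, Wed 6-7pm, Fri 7-8am'
task = (
    f'My calendar: {calendar}. '
    'You may use http_get for weather. '
    'Check Boston weather and suggest a 30-min run.'
)
print(mini_agent(task))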
Sample Interaction
User: “Here’s my week: M/W/F are free. Use http_get to check Boston weather and suggest a 30-min run.”
Possible assistant turn 1:
- {"tool":"http_get","url":"https://api.example.com/weather?city=Boston&units=imperial"}
Controller runs tool and appends:
- assistant: TOOL_RESULT: {"today":{"temp":48,"precip":0.1,"wind":7}}
Assistant turn 2:
- {"final_answer":"Temp ~48F, light wind. Run on Wed afternoon; dress warm and hydrate. 30 min easy pace."}
Image-Aware Requests with gemini-3-pro-image-preview
The model supports image-oriented prompts and can reason about visual briefs. For example, you can request a sea-themed concept image or a visual plan for a route.
Simple Visual Brief Request
curl --location --request POST 'https://wisdom-gate.juheapi.com/v1/chat/completions' \
  --header 'Authorization: YOUR_API_KEY' \
  --header 'Content-Type: application/json' \
  --data-raw '{
    "model": "gemini-3-pro-image-preview",
    "messages": [
      {"role": "system", "content": "You create concise visual briefs and scene descriptions."},
      {"role": "user", "content": "Draw a stunning sea world with coral gardens, rays, and bioluminescent light."}
    ]
  }'
Safety, Performance, and Cost Tips
Safety
- Allowlist tools and validate URLs (scheme and domain); see the sketch after this list.
- Limit calc to numeric expressions; avoid arbitrary eval.
- Hard-cap iterations and response sizes.
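For URL validation, a small check on scheme and host keeps http_get from reaching arbitrary endpoints (a sketch, assuming you maintain a domain allowlist; the example domain is a placeholder):

from urllib.parse import urlparse

ALLOWED_DOMAINS = {'api.example.com'}  # Replace with your trusted hosts

def is_safe_url(url: str) -> bool:
    parsed = urlparse(url)
    return parsed.scheme == 'https' and parsed.hostname in ALLOWED_DOMAINS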
Performance
- Trim tool results aggressively.
- Cache common HTTP responses for the session.
- Keep prompts short; the system prompt should be compact.
Cost Control
- Use small histories (5–10 turns max).
- Prefer one tool call followed by final answer for most tasks.
- Log token counts and prune long chains.
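The last point is easy to wire in: OpenAI-compatible APIs typically return a usage object next to choices. If this API does too, a small helper can log counts per call (a sketch; the field names assume the OpenAI-style schema, and the template's call_llm would need to expose the raw r.json() dict):

def log_usage(data: dict) -> None:
    usage = data.get('usage', {})  # Empty if the API omits usage reporting
    print(
        f"prompt={usage.get('prompt_tokens', '?')} "
        f"completion={usage.get('completion_tokens', '?')} "
        f"total={usage.get('total_tokens', '?')}"
    )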
Testing, Deployment, and Ops
Local Tests
- Unit test the tool parsers (isToolJSON, isFinalJSON); a sketch follows this list.
- Mock tool responses to ensure loop stability.
- Verify that invalid tool JSON yields no execution.
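A handful of table-driven assertions covers the parser edge cases (a sketch; it assumes is_tool_json and is_final_json from the Python template are in scope):

def test_parsers():
    assert is_tool_json('{"tool":"calc","expression":"2+2"}')
    assert not is_tool_json('{"tool":"rm_rf"}')       # Not on the allowlist
    assert not is_tool_json('plain prose, no JSON')   # Must not trigger execution
    assert is_final_json('{"final_answer":"done"}')
    assert not is_final_json('{"final_answer":42}')   # Must be a string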
Observability
- Record each loop iteration: model output, tool used, duration.
- Add guardrail logs (why a tool call was rejected).
Deployment
- Serverless function or small container works well.
- Set timeouts (8–12s per external HTTP call).
- Use environment variables for secrets and simple rotation.
Troubleshooting
- Model returns prose instead of JSON: append an extra assistant message reminding the model of the JSON-only convention (see the snippet after this list).
- Tool results are too long: summarize or truncate to 1–2 KB.
- Agent loops endlessly: enforce a 3–4 step limit and return a friendly fallback.
- Headers/auth errors: verify Authorization header and JSON body.
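For the first item above, a one-line nudge appended to the history usually pulls the model back to the convention (a minimal sketch matching the Python template):

JSON_REMINDER = {
    'role': 'assistant',
    'content': 'Reminder: reply with a single JSON line only (a tool call or {"final_answer":"..."}).'
}
# In the loop, when a step returns prose instead of JSON:
# messages.append(JSON_REMINDER)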
FAQs
Does the API support function-calling natively?
This template avoids relying on specialized function-calling. You can still achieve tool use by instructing the model to emit minimal JSON commands, then execute them in your controller.
Can I stream outputs?
If streaming is available, you can adapt callLLM to consume event streams. If not, the loop still works with standard responses.
Is gemini-3-pro-image-preview suitable for text reasoning?
Yes, you can use it for concise text reasoning and visual briefs. For text-heavy workflows, keep prompts compact and loop count small.
What about memory persistence?
For a mini agent, stick to per-session memory only. For multi-session memory, store short summaries of past interactions and inject them as a single user-provided context message.
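In code, that amounts to one compact context message ahead of the user task (an illustrative sketch; the summary text is made up):

past_summary = 'Prior sessions: user prefers morning runs and lives in Boston.'
user_task = 'Plan tomorrow morning.'
messages = [
    {'role': 'system', 'content': SYSTEM_PROMPT},             # From the Python template
    {'role': 'user', 'content': f'CONTEXT: {past_summary}'},  # One compact memory message
    {'role': 'user', 'content': user_task},
]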
Copy-Paste Checklist
- Define a strict system prompt with the tool-call JSON convention.
- Implement a small tool registry (http_get, calc, etc.).
- Keep message history short and truncate tool results.
- Limit to 3–4 loop iterations.
- Validate tool requests against an allowlist.
- Return final_answer as the only output to the calling app.
Final Notes
You now have a compact agent pattern using Nano Banana Pro API: concise reasoning, simple tool use, and predictable cost. The controller ensures the model’s outputs remain structured, while the tools are minimal and safe. This foundation is easy to expand with more tools, better validation, or small RAG snippets without adding heavy orchestration or compute costs.