JUHE API Marketplace

Nano Banana 2 Text-in-Image Quality: Testing AI Text in Image Generation Across 6 Languages

By Chloe Anderson

Introduction

Generating images with embedded text is a critical capability for global teams building multilingual marketing platforms, packaging mockup tools, and UI generation pipelines. Misspelled product names, malformed CJK characters, or broken Arabic right-to-left layouts are commercially costly and erode brand trust. Google's official release notes for the gemini-3.1-flash-image-preview model, branded as Nano Banana 2, highlight improved i18n text rendering as a flagship feature. This article tests that claim across six scripts: Latin (English), Simplified Chinese, Japanese, Arabic, Korean, and Devanagari (Hindi). Every test uses the same cosmetic packaging prompt at 2K resolution.

Each language's rendering is evaluated on four dimensions: character correctness, legibility, layout fidelity, and prompt adherence. By the end, developers building multilingual text-in-image pipelines will know which scripts are ready for production and which require manual validation.

To compare results for your own target language as you read, run the same tests in AI Studio at https://wisdom-gate.juheapi.com/studio/image.


Test Setup — Nano Banana 2 i18n Text Rendering Configuration

Nano Banana 2 (model ID: gemini-3.1-flash-image-preview) employs a unified transformer architecture that natively processes text and image tokens, distinct from diffusion models that approximate text as visual patterns. This structural difference enables improved fidelity in rendering discrete characters, especially in complex scripts.

| Parameter | Value |
| --- | --- |
| Model | gemini-3.1-flash-image-preview |
| Platform | Wisdom Gate |
| Price per image | $0.058 |
| Resolution | 2K (suitable for text legibility check) |
| Aspect ratio | 1:1 |
| Grounding | Disabled (prompt-only text) |
| Endpoint | Gemini-native (/v1beta/models/...) |
| Runs per language | 2 (to confirm consistency) |
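
The price and run count in the table imply a small, fixed budget for the whole benchmark. The sketch below works through that arithmetic; the function name `estimate_cost` and the constants are illustrative and not part of any API.

```python
from decimal import Decimal

# Figures taken from the setup table; Decimal avoids float rounding in currency math.
PRICE_PER_IMAGE = Decimal("0.058")  # USD per generated image
LANGUAGES = 6
RUNS_PER_LANGUAGE = 2


def estimate_cost(languages: int, runs: int, price: Decimal = PRICE_PER_IMAGE) -> Decimal:
    """Total cost in USD for languages * runs generated images."""
    return languages * runs * price


print(f"Single pass:    ${estimate_cost(LANGUAGES, 1)}")
print(f"Full benchmark: ${estimate_cost(LANGUAGES, RUNS_PER_LANGUAGE)}")
```

A single pass over six languages comes to $0.348, and the two-run setup to $0.696.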

The standardized prompt describes a photorealistic packaging mockup of a 30ml serum bottle with three lines of label text: brand name, product description, and volume. Text styles and layout instructions are kept consistent across languages, with the exact character strings supplied for each.

The evaluation dimensions for every test:

  • Character correctness: Are all glyphs faithful with no corruptions or substitutions?
  • Legibility: Is text readable at a product listing size?
  • Layout fidelity: Are text direction (LTR/RTL), line breaks, and font styling accurate?
  • Prompt adherence: Does the output match the exact prompt text without paraphrase or omission?
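
One way to keep these four dimensions consistent across reviewers is to record them in a small structure. The following is a minimal sketch, not the scoring tool used for this article: the class name, the three score values, and the roll-up rule in `verdict` are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LabelEvaluation:
    """One reviewer's scores for a single generated image (hypothetical schema)."""
    language_id: str
    character_correctness: str  # "pass", "partial", or "fail"
    legibility: str
    layout_fidelity: str
    prompt_adherence: str

    def scores(self) -> tuple[str, ...]:
        """The four dimension scores in a fixed order."""
        return (self.character_correctness, self.legibility,
                self.layout_fidelity, self.prompt_adherence)

    def verdict(self) -> str:
        """Assumed roll-up rule: any fail -> fail, any partial -> partial, else pass."""
        scores = self.scores()
        if "fail" in scores:
            return "fail"
        if "partial" in scores:
            return "partial"
        return "pass"


# Example values mirror the Devanagari row of this article's summary table.
devanagari = LabelEvaluation("devanagari_hindi", "partial", "pass", "pass", "partial")
print(devanagari.verdict())
```

Freezing the dataclass keeps a recorded evaluation from being edited after the fact, which matters if scores from two runs are compared later.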

Nano Banana 2 Core Features: The i18n Text Rendering Capability in Context

Among Nano Banana 2's core features, improved i18n text rendering matters most to teams producing multilingual media. Unlike traditional diffusion models (Stable Diffusion, Flux, DALL-E 2), which generate text as visual approximations, the Gemini 3.1 unified transformer underpinning Nano Banana 2 processes the exact character tokens as semantic units before image synthesis.

In practice, when a prompt requests "render 精华液 in gold serif," the model works from those exact characters rather than approximating strokes from training data distributions. Google's official Gemini 3.1 Flash release notes credit this architecture for significantly improved fidelity, especially for CJK, Arabic, and Devanagari scripts.

⚠️ Improved does not mean flawless. The tests in this article show where the improvement is production-ready and where manual validation remains essential.

Critical prompt engineering rule: Always embed exact target language strings in the prompt. Never expect the model to perform translation or transcription implicitly.
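
A small helper can enforce this rule before a prompt is ever sent. The sketch below is illustrative only: `build_label_prompt` and `missing_strings` are hypothetical names, and the NFC normalization step is a precaution assumed here (so that a character such as É is sent as one code point), not something the tests in this article measured.

```python
import unicodedata


def build_label_prompt(lines: list[tuple[str, str]]) -> str:
    """Build the label section of a prompt from (exact_text, style) pairs.

    Each string is NFC-normalized so composed characters reach the model as a
    single code point rather than a base letter plus a combining mark.
    """
    rendered = []
    for number, (text, style) in enumerate(lines, start=1):
        exact = unicodedata.normalize("NFC", text)
        rendered.append(f"  Line {number}: {exact} ({style})")
    return "Front label text, render exactly as specified:\n" + "\n".join(rendered)


def missing_strings(prompt: str, required: list[str]) -> list[str]:
    """Return the required strings that do not appear verbatim in the prompt."""
    return [s for s in required if unicodedata.normalize("NFC", s) not in prompt]


prompt = build_label_prompt([
    ("精华液", "large, gold serif, centered"),
    ("维生素C亮白精华", "medium, white sans-serif"),
])
print(prompt)
# Any string listed here that is absent from the prompt would have to be
# translated or guessed by the model, which is what the rule above forbids.
print(missing_strings(prompt, ["精华液", "セーラ"]))
```

Running `missing_strings` as a pre-flight check catches prompts that describe the text ("serum in Chinese") instead of supplying it.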


The 6-Language Test — AI Text in Image Generation Results

Below is the complete test script logic followed by detailed per-language evaluation.

python
import requests, base64, os, time
from pathlib import Path

ENDPOINT = "https://wisdom-gate.juheapi.com/v1beta/models/gemini-3.1-flash-image-preview:generateContent"
HEADERS = {
    "x-goog-api-key": os.environ["WISDOM_GATE_KEY"],
    "Content-Type": "application/json"
}

def generate_text_test(language_id, prompt, output_path, resolution="2K"):
    """Generate one label image for a language and save it to output_path."""
    payload = {
        "contents": [{"parts": [{"text": prompt}]}],
        "generationConfig": {
            "responseModalities": ["IMAGE"],  # request image output only
            "imageConfig": {"aspectRatio": "1:1", "imageSize": resolution}
        }
    }
    response = requests.post(ENDPOINT, headers=HEADERS, json=payload, timeout=35)
    response.raise_for_status()  # surface HTTP errors before parsing the body
    # The image arrives base64-encoded in an inlineData part; save the first one found.
    for part in response.json()["candidates"][0]["content"]["parts"]:
        if "inlineData" in part:
            Path(output_path).write_bytes(base64.b64decode(part["inlineData"]["data"]))
            print(f"Generated: {language_id} → {output_path}")
            return
    raise ValueError(f"No image returned for {language_id}")

output_dir = Path("i18n_text_tests")
output_dir.mkdir(exist_ok=True)

language_tests = [
    {
        "id": "english_latin",
        "prompt": "Photorealistic cosmetic packaging mockup. A 30ml frosted glass serum bottle.\nFront label text — render exactly as specified:\n  Line 1: SÉRA (large, gold serif, centered — note the accent on É)\n  Line 2: Vitamin C Brightening Serum (medium, white sans-serif)\n  Line 3: 30ml / 1 fl oz (small, white, centered)\nBottle on white background. Three-quarter angle. Label text legible."
    },
    {
        "id": "simplified_chinese",
        "prompt": "Photorealistic cosmetic packaging mockup. A 30ml frosted glass serum bottle.\nFront label text — render exactly as specified:\n  Line 1: 精华液 (large, gold serif, centered — traditional stroke weight)\n  Line 2: 维生素C亮白精华 (medium, white sans-serif)\n  Line 3: 30ml / 1 fl oz (small, white, centered)\nBottle on white background. Three-quarter angle. All Chinese characters correctly formed."
    },
    {
        "id": "japanese",
        "prompt": "Photorealistic cosmetic packaging mockup. A 30ml frosted glass serum bottle.\nFront label text — render exactly as specified:\n  Line 1: セーラ (katakana, large, gold serif, centered)\n  Line 2: ビタミンC美白セラム (medium, white sans-serif, mix of katakana and kanji)\n  Line 3: 30ml / 1 fl oz (small, white, centered)\nBottle on white background. Three-quarter angle. Japanese characters correctly formed."
    },
    {
        "id": "arabic",
        "prompt": "Photorealistic cosmetic packaging mockup. A 30ml frosted glass serum bottle.\nFront label text — render exactly as specified, right-to-left layout:\n  Line 1: سيرا (Arabic script, large, gold serif, centered, right-to-left)\n  Line 2: سيروم مضيء بفيتامين سي (Arabic script, medium, white sans-serif, right-to-left)\n  Line 3: 30ml / 1 fl oz (small, white, centered — Latin numerals acceptable)\nBottle on white background. Three-quarter angle. Arabic text direction must be right-to-left."
    },
    {
        "id": "korean",
        "prompt": "Photorealistic cosmetic packaging mockup. A 30ml frosted glass serum bottle.\nFront label text — render exactly as specified:\n  Line 1: 세라 (Hangul, large, gold serif, centered)\n  Line 2: 비타민C 브라이트닝 세럼 (Hangul, medium, white sans-serif)\n  Line 3: 30ml / 1 fl oz (small, white, centered)\nBottle on white background. Three-quarter angle. Korean Hangul characters correctly formed."
    },
    {
        "id": "devanagari_hindi",
        "prompt": "Photorealistic cosmetic packaging mockup. A 30ml frosted glass serum bottle.\nFront label text — render exactly as specified:\n  Line 1: सीरा (Devanagari script, large, gold serif, centered)\n  Line 2: विटामिन सी ब्राइटनिंग सीरम (Devanagari, medium, white sans-serif)\n  Line 3: 30ml / 1 fl oz (small, white, centered)\nBottle on white background. Three-quarter angle. Devanagari characters and diacritics correctly formed."
    },
]

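for test in language_tests:
    generate_text_test(
        test["id"],
        test["prompt"],
        output_dir / f"{test['id']}.png"
    )
    time.sleep(2)  # brief pause between requests

# This loop makes one pass (one image per language). The setup table lists
# 2 runs per language, so the full benchmark costs twice the figure printed here.
total = len(language_tests)
print(f"\nAll {total} language tests complete. Cost: ${total * 0.058:.3f}")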
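
The script above indexes straight into `candidates[0]`, so a response without candidates or without an image part fails with a bare KeyError or IndexError. A more defensive parse could look like the sketch below. `extract_image_bytes` is a hypothetical helper, and the `promptFeedback` and `finishReason` fields are assumed to be passed through by the endpoint in the same form as Gemini-native responses.

```python
import base64


def extract_image_bytes(response_json: dict) -> bytes:
    """Return the first inline image in a generateContent-style response.

    Raises ValueError with a readable message when the response carries
    no candidates or no image part.
    """
    candidates = response_json.get("candidates") or []
    if not candidates:
        raise ValueError(f"No candidates returned: {response_json.get('promptFeedback')}")
    parts = (candidates[0].get("content") or {}).get("parts") or []
    for part in parts:
        inline = part.get("inlineData")
        if inline and "data" in inline:
            return base64.b64decode(inline["data"])
    raise ValueError(f"No image part; finishReason={candidates[0].get('finishReason')}")


# Offline check with a hand-built response, so no network call is needed.
fake = {"candidates": [{"content": {"parts": [
    {"text": "note"},
    {"inlineData": {"mimeType": "image/png",
                    "data": base64.b64encode(b"PNG").decode()}},
]}}]}
print(extract_image_bytes(fake))
```

Keeping the parse separate from the HTTP call also makes it testable without spending API credits.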

English / Latin Script

📸 Image Placeholder — english_latin.png (crop label area) Caption: "English Latin — 'SÉRA' with accent mark. Nano Banana 2, 2K, Wisdom Gate, $0.058."

  • The accent on É tests Unicode character rendering distinct from ASCII.
  • All three label lines were present and legible.
  • "Vitamin C Brightening Serum" rendered properly without character substitutions.
  • Verdict: Pass — accented characters and Latin script handled accurately.

Simplified Chinese (CJK)

📸 Image Placeholder — simplified_chinese.png (crop label area) Caption: "Simplified Chinese — '精华液' and '维生素C亮白精华'. Nano Banana 2, 2K, Wisdom Gate, $0.058."

  • All strokes and stroke counts in characters were accurate with no hallucination.
  • The embedded Latin "C" in 维生素C was rendered cleanly.
  • No substituted or incorrect glyphs detected.
  • Verdict: Pass — strong rendering fidelity.

Japanese (Mixed Katakana and Kanji)

📸 Image Placeholder — japanese.png (crop label area) Caption: "Japanese — 'セーラ' (katakana) and 'ビタミンC美白セラム' (mixed). Nano Banana 2, 2K, Wisdom Gate, $0.058."

  • Katakana correctly formed and distinguishable from hiragana.
  • Kanji 美白 rendered with appropriate stroke density.
  • Mixed Latin "C" embedded correctly.
  • Mixed script line coherent and visually consistent.
  • Verdict: Pass — successful complex script rendering.

Arabic (RTL Layout)

📸 Image Placeholder — arabic.png (crop label area) Caption: "Arabic — 'سيرا' and 'سيروم مضيء بفيتامين سي'. RTL layout test. Nano Banana 2, 2K, Wisdom Gate, $0.058."

  • Character shaping correct with cursive joining.
  • Right-to-left text direction respected fully.
  • Layout within label composition correct.
  • Some minor inconsistencies observed in diacritic placement on certain characters.
  • Verdict: Pass with caution — native review advised to catch diacritic issues.

Korean (Hangul)

📸 Image Placeholder — korean.png (crop label area) Caption: "Korean — '세라' and '비타민C 브라이트닝 세럼'. Nano Banana 2, 2K, Wisdom Gate, $0.058."

  • Hangul syllables correctly assembled from consonants and vowels.
  • Transliterated loanwords remain recognizable to native readers.
  • Latin "C" correctly integrated.
  • Verdict: Pass — robust Hangul rendering.

Devanagari / Hindi

📸 Image Placeholder — devanagari_hindi.png (crop label area) Caption: "Devanagari — 'सीरा' and 'विटामिन सी ब्राइटनिंग सीरम'. Diacritic rendering test. Nano Banana 2, 2K, Wisdom Gate, $0.058."

  • Base characters generally correct.
  • Some vowel diacritics misaligned or clipped on close inspection.
  • Conjunct ब्र ligature formed but with minor shape inconsistencies.
  • Horizontal head stroke (शिरोरेखा, shirorekha) mostly continuous but with breaks.
  • Verdict: Partial pass — better than previous models but human review mandatory.
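
The flaws listed above cluster around combining marks and conjuncts, so it can help to tell reviewers in advance how many of each a string contains. The sketch below uses only Python's standard `unicodedata` module; `review_hints` is a hypothetical helper, and it inspects the prompt string, not the generated image.

```python
import unicodedata

VIRAMA = "\u094d"  # DEVANAGARI SIGN VIRAMA, which joins consonants into conjuncts


def review_hints(text: str) -> dict:
    """Count the features a reviewer should inspect in a Devanagari string."""
    # Mn and Mc are the Unicode categories for non-spacing and spacing
    # combining marks: vowel signs, anusvara, and the virama itself.
    marks = [ch for ch in text if unicodedata.category(ch) in ("Mn", "Mc")]
    # A virama followed directly by a letter indicates a conjunct such as ब्र.
    conjuncts = sum(
        1 for i, ch in enumerate(text[:-1])
        if ch == VIRAMA and unicodedata.category(text[i + 1]) == "Lo"
    )
    return {"combining_marks": len(marks), "conjuncts": conjuncts}


print(review_hints("ब्राइटनिंग"))
```

The counts give a native-script reviewer a concrete checklist: every mark and conjunct in the source string should be findable on the rendered label.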

Results Summary and AI Text in Image Generation Production Guidance

Based on this six-language benchmark, here is each script's production readiness and the recommended integration approach.

| Language | Script Type | Character Correctness | Legibility | RTL/Layout | Prompt Adherence | Production Ready? |
| --- | --- | --- | --- | --- | --- | --- |
| English (É) | Latin + accent | Pass | Pass | LTR ✅ | Pass | Yes |
| Simplified Chinese | CJK | Pass | Pass | LTR ✅ | Pass | Yes |
| Japanese (mixed) | CJK + Latin | Pass | Pass | LTR ✅ | Pass | Yes |
| Arabic | Arabic RTL | Pass (minor diacritics) | Pass | RTL (mostly) ✅ | Pass | Yes, with native review |
| Korean | Hangul | Pass | Pass | LTR ✅ | Pass | Yes |
| Devanagari | Indic | Partial | Pass | LTR ✅ | Partial | No, human review required |

Prompt Engineering Rules:

| Rule | Weak Prompt | Strong Prompt |
| --- | --- | --- |
| Always use exact characters | "Write 'serum' in Chinese" | "Render exactly: 精华液" |
| Specify script explicitly | "Japanese text" | "Katakana: セーラ" |
| For Arabic: specify RTL | "Arabic label" | "Arabic script, right-to-left: سيرا" |
| Specify diacritics for Latin | "SERA" | "SÉRA (acute accent on É required)" |
| For Devanagari: conjuncts | "Hindi text" | "Devanagari: ब्राइटनिंग (conjunct ब्र)" |
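
The script and direction annotations in the strong prompts can be derived from the string itself instead of being typed by hand. The following is a rough sketch built on Python's `unicodedata` module; the function names are hypothetical and the script guess is a heuristic based on Unicode character names. It reports CJK ideographs as "Cjk" and cannot tell Chinese hanzi from Japanese kanji.

```python
import unicodedata
from collections import Counter


def dominant_script(text: str) -> str:
    """Guess the script from Unicode character names (a rough heuristic)."""
    counts = Counter()
    for ch in text:
        if ch.isalpha():
            name = unicodedata.name(ch, "")
            if name:
                # "ARABIC LETTER SEEN" -> "ARABIC";
                # "KATAKANA-HIRAGANA PROLONGED SOUND MARK" -> "KATAKANA"
                counts[name.split()[0].split("-")[0]] += 1
    return counts.most_common(1)[0][0] if counts else "UNKNOWN"


def is_rtl(text: str) -> bool:
    """True if any character has a right-to-left bidi class (R or AL)."""
    return any(unicodedata.bidirectional(ch) in ("R", "AL") for ch in text)


def annotate(text: str) -> str:
    """Prefix an exact string with its script and, for RTL text, its direction."""
    script = dominant_script(text).capitalize()
    direction = ", right-to-left" if is_rtl(text) else ""
    return f"{script} script{direction}: {text}"


for sample in ["سيرا", "セーラ", "सीरा", "세라"]:
    print(annotate(sample))
```

For the Arabic sample this produces the same annotation as the strong prompt in the table, so the rule is applied consistently even when nobody on the team reads the script.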

Where human review remains necessary:

Arabic requires native review for nuanced diacritics despite good core rendering. Devanagari also demands native script reviewers due to diacritic and ligature irregularities. No script should be deemed review-free without specific testing in the deployment language.
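
In a pipeline, this guidance can be encoded as a simple gate in front of publishing. The sketch below transcribes the "Production Ready?" column of the summary table; the dictionary name, status strings, and `review_gate` function are illustrative, and any language not tested in this article is treated as untested by default.

```python
# Statuses transcribed from the results summary table in this article.
REVIEW_POLICY = {
    "english_latin": "ready",
    "simplified_chinese": "ready",
    "japanese": "ready",
    "korean": "ready",
    "arabic": "ready_with_native_review",
    "devanagari_hindi": "human_review_required",
}


def review_gate(language_id: str) -> str:
    """Return the review status for a language; unknown languages are untested."""
    return REVIEW_POLICY.get(language_id, "untested")


def can_auto_publish(language_id: str) -> bool:
    """Only languages marked plainly 'ready' skip the human review step."""
    return review_gate(language_id) == "ready"


for lang in ["korean", "arabic", "devanagari_hindi", "thai"]:
    print(lang, review_gate(lang), can_auto_publish(lang))
```

Defaulting to "untested" means adding a new market never silently bypasses review.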


Conclusion — Nano Banana 2

The tests support Google's official claim of improved i18n text rendering in Gemini 3.1 Flash (Nano Banana 2), with strong performance across English, Simplified Chinese, Japanese, Korean, and Arabic scripts. Arabic rendering is high quality but imperfect, with minor inconsistencies in diacritic placement. Devanagari improves noticeably on prior models but still requires manual validation to meet professional typographic standards.

For global product teams using Nano Banana 2 to build multilingual text-in-image pipelines, the integration pattern is clear: always use exact characters in prompts, specify the script and layout direction explicitly for non-Latin languages, and add native-script human review wherever tests uncovered character or diacritic flaws. Testing at 2K resolution helps surface subtle issues that are invisible at lower resolutions.

All test prompts and the full evaluation script are provided above; developers can replicate or extend testing to their specific languages immediately.

Get API keys at https://wisdom-gate.juheapi.com/hall/tokens and run your own model tests in AI Studio at https://wisdom-gate.juheapi.com/studio/image to check your target languages before shipping multilingual image assets.
