Dataset asset: Open Source Community, Chatbot, Customer Support

audichandra/bitext_customer_support_llm_dataset_indonesian

This dataset is the Bitext dataset translated into Indonesian using the Helsinki‑NLP/opus‑mt‑en‑id model. The original Bitext dataset is primarily used for training customer‑support LLM chatbots.

Source: Hugging Face
Created: Nov 28, 2025
Updated: Mar 3, 2024

Dataset Overview

Base Dataset

  • Name: Bitext Customer Support LLM Chatbot Training Dataset
  • Source: Bitext

Translation Information

  • Target Language: Indonesian
  • Translation Model: Helsinki‑NLP/opus‑mt‑en‑id
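
The translation step described above could be sketched roughly as follows. This is a minimal sketch and not the dataset author's actual script: it assumes the Hugging Face `transformers` translation pipeline, and it assumes the base dataset's text columns are named `instruction` and `response`. The helper names `make_opus_mt_translator` and `translate_rows` are hypothetical.

```python
from typing import Callable, Dict, Iterable, List, Sequence

# Assumption: these are the text columns to translate; adjust to the real schema.
TEXT_FIELDS = ("instruction", "response")

BatchTranslator = Callable[[List[str]], List[str]]


def make_opus_mt_translator(
    model_name: str = "Helsinki-NLP/opus-mt-en-id",
) -> BatchTranslator:
    """Build a batch translator backed by a transformers translation pipeline."""
    # Imported lazily so translate_rows can be used without transformers installed.
    from transformers import pipeline

    translator = pipeline("translation", model=model_name)

    def translate(texts: List[str]) -> List[str]:
        # The pipeline returns one dict per input with a "translation_text" key.
        return [out["translation_text"] for out in translator(texts)]

    return translate


def translate_rows(
    rows: Iterable[Dict[str, str]],
    translate: BatchTranslator,
    fields: Sequence[str] = TEXT_FIELDS,
) -> List[Dict[str, str]]:
    """Return copies of the rows with the selected text fields translated."""
    originals = list(rows)
    translated = [dict(row) for row in originals]  # leave the input untouched
    for field in fields:
        outputs = translate([row[field] for row in originals])
        for new_row, text in zip(translated, outputs):
            new_row[field] = text
    return translated
```

Calling `translate_rows(rows, make_opus_mt_translator())` would download the model on first use. Non-text columns are copied unchanged, and any callable that maps a list of strings to a list of strings can stand in for the translator, which makes it straightforward to swap models or to run the function offline with a stub.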

Citation Information

  • OPUS‑MT Model:

    @InProceedings{TiedemannThottingal:EAMT2020,
      author = {J{\"o}rg Tiedemann and Santhosh Thottingal},
      title = {{OPUS-MT} -- {B}uilding open translation services for the {W}orld},
      booktitle = {Proceedings of the 22nd Annual Conference of the European Association for Machine Translation (EAMT)},
      year = {2020},
      address = {Lisbon, Portugal}
    }

  • Bitext Dataset:

    @misc{bitext_chatbot_dataset,
      title = {Bitext Customer Support LLM Chatbot Training Dataset},
      author = {{Bitext}},
      year = {2023},
      howpublished = {\url{https://huggingface.co/datasets/bitext/Bitext-customer-support-llm-chatbot-training-dataset}}
    }
