MC-EIU

The MC‑EIU dataset, created by Inner Mongolia University and partner institutions, is a comprehensive multimodal dialogue dataset for joint emotion and intent understanding. It contains 4,970 dialogue video clips (56,012 utterances) covering 7 emotions and 9 intents, with text, acoustic, and visual modalities in both English and Mandarin. The dataset was built through data collection, preprocessing, and multi‑round annotation to ensure quality and diversity. MC‑EIU targets human‑computer interaction research, where it is meant to help conversational systems better understand human needs and respond with empathy.

Updated 7/4/2024
arXiv

Description

MC‑EIU Dataset Analysis

Dataset Download

  • Baidu Cloud Link: Link
  • Extraction Code: Available by emailing the authors once the paper has been accepted.

Dataset Analysis

Data Visualization

  • Figure 1: Visualization of the correlation between emotions and intents in the MC‑EIU dataset. Each circle represents the sample count for a specific "emotion‑intent" pair; a larger circle means more samples and therefore a stronger correlation between that emotion and that intent.

Correlation Analysis

  • Datasets: MC‑EIU‑English and MC‑EIU‑Mandarin
  • Matrix Representation: Two 7 × 9 matrices where each element indicates the sample count for an "emotion‑intent" pair.
  • Visualization Method: Circle radius proportional to sample count, plotted at the corresponding matrix position.
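
The matrix and circle sizing described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function names, the toy label lists, and the toy annotation pairs are all assumptions made for the example, and the real matrices would use the full 7 emotions and 9 intents of each subset.

```python
from collections import Counter

def cooccurrence_matrix(pairs, emotions, intents):
    """Count samples per (emotion, intent) pair; rows are emotions, columns are intents."""
    counts = Counter(pairs)
    return [[counts[(e, i)] for i in intents] for e in emotions]

def circle_radii(matrix, max_radius=1.0):
    """Scale each count to a radius proportional to it; the largest count gets max_radius."""
    peak = max(max(row) for row in matrix)
    if peak == 0:
        return [[0.0 for _ in row] for row in matrix]
    return [[max_radius * count / peak for count in row] for row in matrix]

# Toy labels and annotations (hypothetical; not taken from the dataset).
emotions = ["Hap", "Sur"]
intents = ["Agr", "Sym", "Que"]
pairs = [("Hap", "Agr")] * 3 + [("Hap", "Sym")] + [("Sur", "Que")] * 2

matrix = cooccurrence_matrix(pairs, emotions, intents)
radii = circle_radii(matrix)

for emotion, row in zip(emotions, matrix):
    print(emotion, row)
```

Each radius can then be drawn at its (row, column) position with any plotting library. Note that some plotting APIs size markers by area rather than radius, so the values may need squaring to keep the radius proportional to the count.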

Observations

  • Emotion‑Intent Relationship: Not strictly one‑to‑one. Different intents affect specific emotions to varying degrees and vice versa.
    • For example, "Hap‑Sym" appears less frequently than "Hap‑Agr", suggesting that the "Agreeing" intent more often drives the expression of happiness.
  • Dataset Differences: The English subset shows more complex emotion‑intent correlations than the Mandarin subset.
    • For instance, the "Sur" emotion is linked to all intent categories in the English data, while in Mandarin it is associated with only six intents ("Que", "Agr", "Con", "Sug", "Wis", and "Neu").
  • Model Performance: Because of this complexity, models perform slightly worse on the English subset than on the Mandarin subset.
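
One way to check an observation such as the "Sur" comparison is to list, for each emotion, the intents with a non‑zero count in its matrix row. The sketch below assumes the matrix layout described earlier (rows are emotions, columns are intents). The function name is hypothetical, the intent list contains only the abbreviations that appear in this description (so it is incomplete), and the counts are placeholders that mark presence or absence rather than real dataset values.

```python
def linked_intents(matrix, emotions, intents):
    """Map each emotion to the intents that co-occur with it at least once."""
    return {
        emotion: [intent for intent, count in zip(intents, row) if count > 0]
        for emotion, row in zip(emotions, matrix)
    }

# Partial intent list: only abbreviations mentioned in this description.
intents = ["Que", "Agr", "Con", "Sug", "Wis", "Sym", "Neu"]
emotions = ["Sur"]

# Placeholder row for "Sur": 1 marks an intent reported as linked in the
# Mandarin subset, 0 marks one that is not. These are not real sample counts.
mandarin_toy = [[1, 1, 1, 1, 1, 0, 1]]

links = linked_intents(mandarin_toy, emotions, intents)
print(links["Sur"], len(links["Sur"]))
```

Running the same function on both subsets' matrices and comparing the list lengths per emotion gives a rough measure of how widely each emotion spreads across intents.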

Topics

Human‑Computer Interaction
Emotion and Intent Analysis

Source

Organization: arXiv

Created: 7/3/2024
