MC-EIU
The MC‑EIU dataset, created by Inner Mongolia University and partner institutions, is a comprehensive multimodal dialogue dataset for joint emotion and intent understanding. It contains 4,970 dialogue video clips (56,012 utterances) covering 7 emotions and 9 intents, supporting text, acoustic, and visual modalities in both English and Mandarin. The dataset was built through data collection, preprocessing, and multi‑round annotation to ensure quality and diversity. MC‑EIU is aimed at human‑computer interaction research, enhancing machine understanding of human needs and empathy in conversational systems.
Description
MC‑EIU Dataset Analysis
Dataset Download
- Baidu Cloud Link: Link
- Extraction Code: Available from the authors by email after the paper is accepted.
Dataset Analysis
Data Visualization
- Figure 1: Visualization of the correlation between emotions and intents in the MC‑EIU dataset. Each circle represents the sample count for a specific "emotion‑intent" pair; a larger circle means more samples and therefore a stronger correlation.
Correlation Analysis
- Datasets: MC‑EIU‑English and MC‑EIU‑Mandarin
- Matrix Representation: Two 7 × 9 matrices where each element indicates the sample count for an "emotion‑intent" pair.
- Visualization Method: Circle radius proportional to sample count, plotted at the corresponding matrix position.
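The matrix construction and circle sizing described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function names `cooccurrence_matrix` and `circle_radii` are hypothetical, and the label lists and annotation pairs are toy values, not MC‑EIU counts. With the full label sets (7 emotions, 9 intents) the same function would yield a 7 × 9 matrix.

```python
from collections import Counter


def cooccurrence_matrix(pairs, emotions, intents):
    """Count samples per (emotion, intent) pair.

    Returns a len(emotions) x len(intents) grid of counts,
    with rows ordered as `emotions` and columns as `intents`.
    """
    counts = Counter(pairs)
    return [[counts[(e, i)] for i in intents] for e in emotions]


def circle_radii(matrix, max_radius=1.0):
    """Scale each count to a radius proportional to it.

    The largest count in the matrix maps to `max_radius`.
    """
    peak = max(max(row) for row in matrix)
    if peak == 0:
        return [[0.0 for _ in row] for row in matrix]
    return [[max_radius * n / peak for n in row] for row in matrix]


# Toy labels and annotations for illustration only (NOT real MC-EIU data).
emotions = ["Hap", "Sur"]
intents = ["Agr", "Sym", "Que"]
pairs = [
    ("Hap", "Agr"), ("Hap", "Agr"), ("Hap", "Agr"), ("Hap", "Agr"),
    ("Hap", "Sym"),
    ("Sur", "Que"), ("Sur", "Que"),
]

matrix = cooccurrence_matrix(pairs, emotions, intents)
radii = circle_radii(matrix)

for emotion, row in zip(emotions, radii):
    print(emotion, row)
```

If the radii are passed to matplotlib's `scatter`, note that its `s` argument is a marker area in points squared, so a radius-proportional plot needs `s` to scale with the square of the radius.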
Observations
- Emotion‑Intent Relationship: The relationship is not strictly one‑to‑one. Different intents affect specific emotions to varying degrees, and vice versa.
  - For example, "Hap‑Sym" appears less frequently than "Hap‑Agr", suggesting that the "Agreeing" intent more often drives the expression of happiness.
- Dataset Differences: The English subset shows more complex emotion‑intent correlations than the Mandarin subset.
  - For instance, the "Sur" emotion is linked to all intent categories in the English data, while in the Mandarin data it is associated with only six intents ("Que", "Agr", "Con", "Sug", "Wis", and "Neu").
- Model Performance: Because of this complexity, models perform slightly worse on the English subset than on the Mandarin subset.
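One way to check an observation such as the "Sur" comparison is to list, for a given emotion, the intents whose count in its matrix row is nonzero. The sketch below assumes the count matrices are stored as nested lists; the helper name `linked_intents` is hypothetical, and the two small matrices are toy values, not the published MC‑EIU counts.

```python
def linked_intents(matrix, emotions, intents, emotion):
    """Return the intents that co-occur at least once with `emotion`."""
    row = matrix[emotions.index(emotion)]
    return [intent for intent, n in zip(intents, row) if n > 0]


# Toy label subsets and counts for illustration only (NOT real MC-EIU data).
emotions = ["Hap", "Sur"]
intents = ["Que", "Agr", "Sym"]
english_toy = [[0, 5, 1], [3, 2, 1]]
mandarin_toy = [[0, 6, 2], [4, 1, 0]]

en = linked_intents(english_toy, emotions, intents, "Sur")
zh = linked_intents(mandarin_toy, emotions, intents, "Sur")

# Intents that never co-occur with "Sur" in the second toy matrix.
missing = [i for i in intents if i not in zh]

print("English toy:", en)
print("Mandarin toy:", zh)
print("Missing in Mandarin toy:", missing)
```

Comparing the lengths of the two lists gives a rough measure of how broadly an emotion is spread across intents in each subset.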
Source
Organization: arXiv
Created: 7/3/2024