CBT-Bench
CBT-Bench is a benchmark dataset for evaluating how well large language models (LLMs) can assist cognitive behavioral therapy (CBT). The dataset has three levels, each targeting a key aspect of CBT: basic knowledge recall, cognitive model understanding, and therapeutic response generation. Its goal is to assess how well LLMs can support the various stages of professional mental health care, particularly CBT. It includes multiple-choice questions, cognitive distortion classification tasks, and therapeutic response generation exercises.
Description
CBT-Bench Dataset
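Overview
CBT-Bench is a benchmark dataset for evaluating the proficiency of large language models (LLMs) in assisting cognitive behavioral therapy (CBT). The dataset is organized into three levels, each focusing on a different key aspect of CBT: basic knowledge recall, cognitive model understanding, and therapeutic response generation. The goal is to assess how well LLMs can support the various stages of professional mental health care, particularly CBT.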
Dataset Structure
The dataset is organized into three main levels, each containing specific tasks:
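Level 1: Basic CBT Knowledge Acquisition
- Dataset: CBT-QA (qa_test.json)
- Description: Contains 220 multiple-choice questions on CBT concepts, practical knowledge, case studies, and related topics. qa_seed.json contains held-out examples for training or in-context learning. Because these are CBT exam questions, the answers cannot be disclosed at present. In the future, we may consider turning this into a leaderboard.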
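Level 2: Cognitive Model Understanding
- Datasets:
  - CBT-CD (distortions_test.json) (Cognitive Distortion Classification): 146 examples of cognitive distortions, classified into ten categories such as all-or-nothing thinking, personalization, and mind reading.
  - CBT-PC (core_major_test.json) (Primary Core Belief Classification): 184 examples classified into three core beliefs (helpless, unlovable, and worthless).
  - CBT-FC (core_fine_test.json) (Fine-Grained Core Belief Classification): 112 examples further divided into 19 fine-grained core belief categories.
  - distortions_seed.json, core_major_seed.json, and core_fine_seed.json contain held-out examples for training or in-context learning.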
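Level 3: Therapeutic Response Generation
- Dataset: CBT-DP (CBT-DP/)
- Description: Contains 156 exercises organized into ten key aspects of CBT sessions, covering a range of therapeutic scenarios with increasing difficulty. In addition to the human references, we also release the model-generated responses.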
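As a quick sanity check after downloading, the example counts listed above can be compared against the files on disk. The following is a minimal sketch, not part of the official release: it assumes each test file is a top-level JSON array with one element per example, which the description does not confirm, and the helper names `load_examples` and `check_counts` are hypothetical.

```python
import json
from pathlib import Path

# Example counts as stated in the dataset description above.
EXPECTED_COUNTS = {
    "qa_test.json": 220,
    "distortions_test.json": 146,
    "core_major_test.json": 184,
    "core_fine_test.json": 112,
}


def load_examples(path):
    """Load one file, assuming (not confirmed) a top-level JSON array."""
    with open(path, encoding="utf-8") as f:
        data = json.load(f)
    if not isinstance(data, list):
        raise ValueError(f"{path}: expected a top-level JSON array")
    return data


def check_counts(data_dir):
    """Return {file name: (found, expected)} for each test file present in data_dir."""
    report = {}
    for name, expected in EXPECTED_COUNTS.items():
        path = Path(data_dir) / name
        if path.exists():
            report[name] = (len(load_examples(path)), expected)
    return report
```

Calling `check_counts` on the local download directory returns a mapping from file name to a `(found, expected)` pair, so any mismatch is visible at a glance. If the files turn out to use a different layout (for example, a JSON object keyed by ID), `load_examples` would need to be adapted.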
Citation
@misc{zhang2024cbtbenchevaluatinglargelanguage,
  title={CBT-Bench: Evaluating Large Language Models on Assisting Cognitive Behavior Therapy},
  author={Mian Zhang and Xianjun Yang and Xinlu Zhang and Travis Labrum and Jamie C. Chiu and Shaun M. Eack and Fei Fang and William Yang Wang and Zhiyu Zoey Chen},
  year={2024},
  eprint={2410.13218},
  archivePrefix={arXiv},
  primaryClass={cs.CL},
  url={https://arxiv.org/abs/2410.13218},
}
Source
Organization: huggingface
Created: 10/19/2024