InternVL-Chat-V1-2-SFT-Data
This dataset supports visual question answering and general question answering tasks in both Chinese and English. It is organized into multiple configurations, such as ai2d_train_12k and chartqa_train_18k, each of which maps to its own training data file.
Description
Dataset Overview
License
- Apache 2.0
Task Categories
- Visual Question Answering
- Question Answering
Languages
- English
- Chinese
Configurations
- ai2d_train_12k
  - Data files:
    - Split: train
    - Path: opensource/ai2d_train_12k.jsonl
- chartqa_train_18k
  - Data files:
    - Split: train
    - Path: opensource/chartqa_train_18k.jsonl
- docvqa_train_10k
  - Data files:
    - Split: train
    - Path: opensource/docvqa_train_10k.jsonl
- dvqa_train_200k.jsonl
  - Data files:
    - Split: train
    - Path: opensource/dvqa_train_200k.jsonl
- geoqa+.jsonl
  - Data files:
    - Split: train
    - Path: opensource/geoqa+.jsonl
- llava_instruct_150k_zh.jsonl
  - Data files:
    - Split: train
    - Path: opensource/llava_instruct_150k_zh.jsonl
- sharegpt4v_instruct_gpt4-vision_cap100k.jsonl
  - Data files:
    - Split: train
    - Path: opensource/sharegpt4v_instruct_gpt4-vision_cap100k.jsonl
- sharegpt4v_mix665k_cap23k_coco-ap9k_lcs3k_sam9k_div2k.jsonl
  - Data files:
    - Split: train
    - Path: opensource/sharegpt4v_mix665k_cap23k_coco-ap9k_lcs3k_sam9k_div2k.jsonl
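Each configuration above points to a single JSONL train file, so reading one comes down to resolving its path and parsing one JSON object per line. The following is a minimal sketch, assuming the files have already been downloaded to a local directory that preserves the opensource/ layout. The names CONFIG_PATHS and iter_records are hypothetical helpers, and because this card does not describe the record schema, each line is treated as an opaque JSON object.

```python
import json
from pathlib import Path

# Config name -> relative path of its train split, copied from the
# configuration list on this card.
CONFIG_PATHS = {
    "ai2d_train_12k": "opensource/ai2d_train_12k.jsonl",
    "chartqa_train_18k": "opensource/chartqa_train_18k.jsonl",
    "docvqa_train_10k": "opensource/docvqa_train_10k.jsonl",
    "dvqa_train_200k.jsonl": "opensource/dvqa_train_200k.jsonl",
    "geoqa+.jsonl": "opensource/geoqa+.jsonl",
    "llava_instruct_150k_zh.jsonl": "opensource/llava_instruct_150k_zh.jsonl",
    "sharegpt4v_instruct_gpt4-vision_cap100k.jsonl":
        "opensource/sharegpt4v_instruct_gpt4-vision_cap100k.jsonl",
    "sharegpt4v_mix665k_cap23k_coco-ap9k_lcs3k_sam9k_div2k.jsonl":
        "opensource/sharegpt4v_mix665k_cap23k_coco-ap9k_lcs3k_sam9k_div2k.jsonl",
}


def iter_records(root, config_name):
    """Yield one parsed JSON object per non-empty line of a config's train file.

    `root` is an assumed local download directory; a KeyError is raised
    for a config name that is not listed on the card.
    """
    path = Path(root) / CONFIG_PATHS[config_name]
    with path.open(encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:  # skip blank lines
                yield json.loads(line)
```

With the files in place, a call such as iter_records("path/to/download", "chartqa_train_18k") would stream that configuration's records lazily, which avoids loading the larger files fully into memory.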
Source
Organization: huggingface
Created: 8/8/2024