The Ego-QA-19k dataset contains 19k video-question-answer pairs for first-person (egocentric) video scenarios. The dataset was created in two stages. In the first stage, each video's subtitles were concatenated in chronological order to form a video description, and GPT-4 generated 20 questions per video from that description. In the second stage, questions containing specific cue words were filtered out, and graduate-level native English speakers verified each question's validity and the length of video that must be watched to answer it.
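The two automated steps described above, chronological subtitle concatenation and cue-word filtering, can be sketched roughly as follows. This is an illustrative sketch only, not the dataset authors' code: the `Subtitle` type, the function names, and the contents of `CUE_WORDS` are all hypothetical, since the description does not list the actual cue words. The GPT-4 question-generation step and the human verification step are omitted.

```python
from dataclasses import dataclass


@dataclass
class Subtitle:
    start: float  # start time in seconds (assumed field)
    text: str


def build_description(subtitles):
    """Concatenate subtitle texts in chronological order."""
    ordered = sorted(subtitles, key=lambda s: s.start)
    return " ".join(s.text.strip() for s in ordered if s.text.strip())


# Hypothetical cue words; the real list is not given in the description.
CUE_WORDS = {"subtitle", "transcript", "description"}


def filter_questions(questions, cue_words=CUE_WORDS):
    """Keep only questions with no cue word (case-insensitive, whole-word match)."""
    kept = []
    for q in questions:
        # Split on whitespace and strip surrounding punctuation from each token.
        tokens = {t.strip(".,?!;:\"'()").lower() for t in q.split()}
        if tokens.isdisjoint(cue_words):
            kept.append(q)
    return kept
```

Under these assumptions, `build_description` would produce the text handed to the question generator, and `filter_questions` would be applied to the generated questions before human review.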