NAVCON is a large-scale Vision-Language Navigation (VLN) corpus created by the University of Pennsylvania and built on top of the R2R and RxR datasets. It contains 30,815 instructions carrying 236,316 concept annotations, aligned with 2.7 million paired images that show the visual context an agent encounters while following each instruction. The concept annotations were generated with cognitively inspired, linguistically grounded heuristics, yielding silver-standard labels that were subsequently validated by humans for quality. NAVCON is primarily intended for language-guided navigation, where it aims to improve a model's ability to understand and execute natural-language commands, particularly through better cross-modal alignment and concept recognition.