This dataset was captured with a Velodyne VLP-16 mounted on a UGV (Clearpath Jackal) on the Stevens Institute of Technology campus, with the VLP-16 rotating at 10 Hz. The dataset contains more than 20,000 scans and multiple loop closures. TF transforms are provided by LeGO-LOAM, and the entire mapping process was also recorded on video.
This is a synthetic dataset designed to facilitate the development and evaluation of 3D lane detection methods. It extends the [Apollo Synthetic Dataset](http://apollo.auto/synthetic.html), with construction strategy and evaluation methods based on the ECCV 2020 paper: Gen‑LaneNet: A Generalized and Scalable 3D Lane Detection Approach.
The Acti dataset, created by Beihang University, focuses on mining cybersecurity threat intelligence entities and their relations for autonomous driving vehicles. It contains 908 real automotive cybersecurity reports, comprising 3,678 sentences, 8,195 security entities, and 4,852 semantic relations. Data were collected from the National Vulnerability Database and specific automotive threat intelligence platforms, and annotated using a BIOES joint labeling scheme. The dataset is primarily used for modeling automotive cybersecurity threat intelligence, aiming to extract valuable information from large volumes of cybersecurity data for proactive defense.
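The BIOES scheme mentioned above marks each token as the Beginning, Inside, or End of an entity, a Single-token entity, or Outside any entity. A minimal sketch of how such tagging works, using hypothetical entity types (the actual Acti tag set and annotation guidelines are not reproduced here):

```python
def bioes_tag(tokens, entities):
    """Assign BIOES tags given entity spans as (start, end, type), end exclusive."""
    tags = ["O"] * len(tokens)  # "O" = outside any entity
    for start, end, etype in entities:
        if end - start == 1:
            tags[start] = f"S-{etype}"          # single-token entity
        else:
            tags[start] = f"B-{etype}"          # beginning of a multi-token entity
            for i in range(start + 1, end - 1):
                tags[i] = f"I-{etype}"          # inside
            tags[end - 1] = f"E-{etype}"        # end
    return tags

# Hypothetical example sentence; "VULN" and "COMP" are illustrative types.
tokens = ["CVE-2021-0001", "affects", "the", "telematics", "control", "unit"]
entities = [(0, 1, "VULN"), (3, 6, "COMP")]
print(bioes_tag(tokens, entities))
# → ['S-VULN', 'O', 'O', 'B-COMP', 'I-COMP', 'E-COMP']
```

In a joint labeling setup, relation information is typically encoded alongside these entity tags so that entities and their semantic relations can be extracted in one pass.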
The DriveLM dataset supports perception, prediction, planning, behavior and motion tasks through graph‑structured question‑answer pairs. It consists of two parts: DriveLM‑nuScenes and DriveLM‑CARLA. DriveLM‑nuScenes is built on the nuScenes dataset, while DriveLM‑CARLA is collected from the CARLA simulator. Currently, only the training split of DriveLM‑nuScenes is publicly available. The dataset includes a series of questions and answers together with the associated images.
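The graph structure links question-answer pairs across driving stages, so that, for example, a perception question feeds into prediction and then planning. A minimal sketch of traversing such a QA graph; the field names and node ids here are assumptions for illustration, not the published DriveLM schema:

```python
# Hypothetical graph of QA nodes; "children" edges point to downstream stages.
nodes = {
    "p1":  {"stage": "perception", "question": "What is ahead of the ego vehicle?", "children": ["pr1"]},
    "pr1": {"stage": "prediction", "question": "Will the pedestrian cross?",        "children": ["pl1"]},
    "pl1": {"stage": "planning",   "question": "What should the ego vehicle do?",   "children": []},
}

def reasoning_chain(start):
    """Follow child edges from a root node, returning the stages visited in order."""
    chain, node_id = [], start
    while node_id is not None:
        node = nodes[node_id]
        chain.append(node["stage"])
        node_id = node["children"][0] if node["children"] else None
    return chain

print(reasoning_chain("p1"))
# → ['perception', 'prediction', 'planning']
```

Chaining stages this way is what lets the dataset evaluate multi-step reasoning rather than isolated single-turn QA.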
The DriveMLLM dataset, created by the Institute of Automation, Chinese Academy of Sciences and other institutions, focuses on spatial understanding tasks in autonomous driving scenarios. It contains 880 forward‑camera images covering absolute and relative spatial reasoning tasks, accompanied by rich natural‑language questions. Built upon the nuScenes dataset, the images were strictly selected and annotated to ensure clear visibility of objects and explicit spatial relationships. DriveMLLM aims to evaluate and improve multimodal large language models' spatial reasoning abilities in autonomous driving, addressing complex spatial relation understanding.
The FB‑SSEM dataset is a synthetic dataset comprising surround‑view fisheye camera images and BEV (bird’s‑eye‑view) maps generated from simulated ego‑vehicle motion sequences.
LoT‑nuScenes is a virtual long‑tail scenario dataset for parallel vision and parallel vehicles. Built in the CARLA simulator, it contains accident scenarios under various conditions, including six types of motor vehicle accidents and one pedestrian accident, combined with three extreme weather conditions, three time periods, and five location categories. The dataset follows the nuScenes format and provides multi‑sensor, 360° views, filling the gap in accident‑scenario data with a standardized long‑tail distribution.
TSEC‑Dataset was developed for training and testing video captioning methods in driving scenarios, aiming to describe key events involving the ego vehicle, road environment, and other traffic participants. The dataset aggregates videos from various sources, including on‑board cameras, public datasets, and traffic‑accident videos downloaded from BiliBili and YouTube, to capture diverse traffic scenes. Videos are segmented into independent clips containing 1‑3 key events, totaling 8,000 video clips with a cumulative duration of 11.5 hours.