Explore high-quality datasets for your AI and machine learning projects.
The nfcorpus dataset is a medical information retrieval collection of 5,371 documents, each with a document ID, URL, title, and abstract. It was introduced by Vera Boteva et al. at ECIR 2016 and is the base collection for several related splits (e.g., `nfcorpus_dev`, `nfcorpus_test`).
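As a minimal sketch of the four-field document record described above, the Python snippet below models one entry and joins its title and abstract into a single string for indexing. The class name `NFCorpusDoc`, the field names, the helper `search_text`, and the placeholder record are all illustrative assumptions; they are not taken from the dataset's own tooling.

```python
from typing import NamedTuple


class NFCorpusDoc(NamedTuple):
    """One document record; field names are assumed for illustration."""
    doc_id: str
    url: str
    title: str
    abstract: str


def search_text(doc: NFCorpusDoc) -> str:
    """Concatenate title and abstract into one string for a retrieval index."""
    return f"{doc.title} {doc.abstract}".strip()


# Hypothetical placeholder record, not real dataset content.
example = NFCorpusDoc(
    doc_id="DOC-0",
    url="https://example.org/doc-0",
    title="Placeholder title",
    abstract="Placeholder abstract.",
)

print(search_text(example))
```

Keeping the record immutable and typed makes it easy to pass the same structure to an indexer or an evaluation script without copying fields around.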
This dataset pairs aerial imagery with Sentinel‑2 imagery. Each aerial patch is 512×512 pixels at a spatial resolution of 0.2 m and has 5 channels (red, green, blue, near-infrared, and elevation). The Sentinel‑2 imagery has a spatial resolution of 10–20 m and 10 spectral bands, and is stored as a time series of shape T×10×W×H, where T is the number of time steps; its snow and cloud masks hold probabilities in the range 0–100. Each label mask is 512×512 pixels and covers 13 classes, ranging from buildings to other, unclassified land cover.
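To make the stated dimensions concrete, the sketch below records the shapes as plain Python constants and derives the ground footprint of one aerial patch from its pixel size and resolution. The constant and function names are hypothetical, and the height-width-channel ordering of the aerial shape is an assumption based on the 512×512×5 notation above.

```python
# Shapes as described in the text; names and axis order are assumptions.
AERIAL_SHAPE = (512, 512, 5)   # height, width, channels (R, G, B, NIR, elevation)
AERIAL_RESOLUTION_M = 0.2      # metres per aerial pixel
MASK_SHAPE = (512, 512)        # one class label per aerial pixel
NUM_CLASSES = 13
SENTINEL2_BANDS = 10


def patch_footprint_m(size_px: int = 512, resolution_m: float = AERIAL_RESOLUTION_M) -> float:
    """Side length in metres of the ground area covered by one aerial patch."""
    return size_px * resolution_m


def sentinel2_series_shape(time_steps: int, width: int, height: int) -> tuple:
    """Shape of a Sentinel-2 time series in the T x 10 x W x H layout."""
    return (time_steps, SENTINEL2_BANDS, width, height)


side = patch_footprint_m()
print(f"Aerial patch covers about {side:.1f} m x {side:.1f} m")
print("Mask matches aerial patch:", MASK_SHAPE == AERIAL_SHAPE[:2])
print("Example series shape:", sentinel2_series_shape(time_steps=4, width=40, height=40))
```

At 0.2 m per pixel a 512-pixel patch spans roughly 102 m on each side, so at 10 m resolution the same footprint corresponds to only about ten Sentinel‑2 pixels across. The `time_steps`, `width`, and `height` values in the usage line are arbitrary examples.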