allenai/s2-naip

AI2‑S2‑NAIP is a remote‑sensing dataset of aligned NAIP, Sentinel‑2, Sentinel‑1 and Landsat imagery covering the entire continental United States. The data are tiled into 512 × 512‑pixel patches at 1.25 m/pixel, each in one of ten UTM projections. Each tile combines several modalities: NAIP images (2019–2021, 1.25 m/pixel), Sentinel‑2 and Sentinel‑1 images (10 m/pixel), Landsat‑8/9 images (10 m/pixel), OpenStreetMap GeoJSON (buildings, roads, etc.), and the 2021 WorldCover land‑cover map (10 m/pixel). These data support a range of supervised and unsupervised remote‑sensing tasks, including super‑resolution, segmentation, detection, and multimodal masked‑autoencoder pre‑training.

Updated 5/31/2024
hugging_face

Description

AI2‑S2‑NAIP Dataset Overview

Dataset Introduction

AI2‑S2‑NAIP is a remote‑sensing dataset that contains aligned NAIP, Sentinel‑2, Sentinel‑1 and Landsat imagery covering the entire continental United States.

Data Structure

The data are divided into tiles. Each tile spans 512 × 512 pixels at 1.25 m/pixel (640 m on a side) and lies in one of the ten UTM projections that cover the continental United States.

Data Types Included in Each Tile:

  • National Agriculture Imagery Program (NAIP): An image from 2019‑2021 at 1.25 m/pixel (512 × 512).
  • Sentinel‑2 (L1C): 16‑32 images captured within a few months of the NAIP image, at 10 m/pixel (64 × 64).
  • Sentinel‑1: 2‑8 images captured within a few months of the NAIP image, at 10 m/pixel (64 × 64).
  • Landsat‑8/9: Four images captured in the same year as the NAIP image, at 10 m/pixel (64 × 64).
  • OpenStreetMap: GeoJSON containing buildings, roads and 30 other categories, using pixel coordinates relative to the 512 × 512 NAIP image.
  • WorldCover: 2021 land‑cover map at 10 m/pixel (64 × 64).
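
The pixel dimensions in the list above follow from the tile footprint: 512 pixels at 1.25 m/pixel cover 640 m, which is 64 pixels at 10 m/pixel. A minimal Python sketch of that arithmetic (the constant and function names are illustrative and not part of the dataset):

```python
# Values taken from the dataset description above.
NAIP_TILE_PIXELS = 512      # tile width and height in NAIP pixels
NAIP_RESOLUTION_M = 1.25    # metres per NAIP pixel


def tile_footprint_m() -> float:
    """Side length of one tile on the ground, in metres."""
    return NAIP_TILE_PIXELS * NAIP_RESOLUTION_M


def pixels_at_resolution(resolution_m: float) -> int:
    """Number of pixels along one tile side at the given resolution."""
    return round(tile_footprint_m() / resolution_m)


if __name__ == "__main__":
    print(tile_footprint_m())          # side length in metres
    print(pixels_at_resolution(10.0))  # grid size for the 10 m/pixel layers
```

The same helper gives the grid size for any other resolution at which a layer is stored.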

Dataset Applications

AI2‑S2‑NAIP is suitable for a variety of supervised and unsupervised remote‑sensing tasks, including super‑resolution (e.g., Sentinel‑2 → NAIP), segmentation and detection (e.g., NAIP or Sentinel‑2 → OpenStreetMap or WorldCover), and multimodal masked‑autoencoder pre‑training.

Data File Structure

After extraction, each data type is stored in its own folder. Files within a folder are named by tile ID, which consists of the tile's UTM zone (written as an EPSG code, e.g. 32612 for UTM zone 12N), column and row.

Example

  • NAIP Data:

    naip/
      32612_960_-6049.png
      32612_960_-6050.png
      32612_960_-6051.png
      ...

  • Sentinel‑2 Data:

    sentinel2/
      32612_960_-6049_16.tif
      32612_960_-6049_32.tif
      32612_960_-6049_8.tif
      32612_960_-6050_16.tif
      ...
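
To pair files from different folders, the tile ID can be recovered from each filename. The sketch below assumes every name has the form zone_column_row[_suffix].ext exactly as in the examples above. `TileFile`, `parse_tile_filename` and `group_by_tile` are hypothetical helpers, not part of any released tooling.

```python
from collections import defaultdict
from pathlib import PurePosixPath
from typing import Dict, Iterable, List, NamedTuple, Optional


class TileFile(NamedTuple):
    zone: str               # UTM zone as written in the filename, e.g. "32612"
    column: int
    row: int                # may be negative, as in the examples
    suffix: Optional[str]   # e.g. "8", "16", "32"; None for NAIP files

    @property
    def tile_id(self) -> str:
        return f"{self.zone}_{self.column}_{self.row}"


def parse_tile_filename(path: str) -> TileFile:
    """Split 'folder/zone_column_row[_suffix].ext' into its parts."""
    parts = PurePosixPath(path).stem.split("_")
    if len(parts) not in (3, 4):
        raise ValueError(f"unexpected tile filename: {path!r}")
    suffix = parts[3] if len(parts) == 4 else None
    return TileFile(parts[0], int(parts[1]), int(parts[2]), suffix)


def group_by_tile(paths: Iterable[str]) -> Dict[str, List[str]]:
    """Map each tile ID to all file paths that belong to it."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for path in paths:
        groups[parse_tile_filename(path).tile_id].append(path)
    return dict(groups)
```

Grouping by tile ID in this way collects the NAIP image and the lower‑resolution files that cover the same tile.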

Image Details

  • Sentinel‑2: Bands are stored in separate files by resolution: 10 m/pixel (_8.tif), 20 m/pixel (_16.tif) and 60 m/pixel (_32.tif). Pixel values are the 16‑bit L1C values.
  • Sentinel‑1: 10 m/pixel, with bands ordered VV then VH. Pixel values are 32‑bit floating point and represent decibels, 10 × log10(x).
  • NAIP: 512 × 512 PNG images with four 8‑bit bands (R, G, B, IR). The IR band occupies the PNG's alpha channel, so standard image viewers display the image incorrectly unless that channel is dropped.
  • Landsat: OLI‑TIRS images from Landsat‑8 and Landsat‑9. Bands are stored in separate files at 10 m/pixel (_8.tif) and 20 m/pixel (_16.tif). Pixel values are 16‑bit.
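
Two of the encodings above usually need handling before the data can be used: the Sentinel‑1 decibel values and the NAIP fourth band. The following is a minimal, standard‑library sketch that operates on plain Python numbers and tuples. It assumes the pixels have already been decoded by an image library, and all names in it are illustrative.

```python
import math
from typing import List, Sequence, Tuple

# Filename suffix -> resolution in m/pixel, as listed in the text above.
SENTINEL2_SUFFIX_TO_M = {"8": 10, "16": 20, "32": 60}
LANDSAT_SUFFIX_TO_M = {"8": 10, "16": 20}


def db_to_linear(db: float) -> float:
    """Invert dB = 10 * log10(x) to recover the linear value x."""
    return 10.0 ** (db / 10.0)


def linear_to_db(x: float) -> float:
    """Convert a positive linear value to decibels."""
    return 10.0 * math.log10(x)


def split_naip_pixel(
    pixel: Sequence[int],
) -> Tuple[Tuple[int, int, int], int]:
    """Split a 4-band NAIP pixel (R, G, B, IR) into RGB and IR."""
    r, g, b, ir = pixel
    return (r, g, b), ir


def rgb_rows(rows: List[List[Sequence[int]]]) -> List[List[Tuple[int, int, int]]]:
    """Drop the fourth (IR) band from every pixel so the image displays as RGB."""
    return [[split_naip_pixel(p)[0] for p in row] for row in rows]
```

With an array library the same operations become a single expression each. The explicit version is shown only to make the band order and the decibel formula concrete.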

Topics

Remote Sensing
Geographic Information System (GIS)

Source

Organization: hugging_face

Created: Unknown
