Dataset asset | Open Source Community | Traffic Data Analysis | Taxi Services

New York City Yellow Taxi Trip data

The NYC Yellow Taxi trip records capture pickup and drop‑off dates and times, pickup and drop‑off locations, trip distance, itemized fare details, rate type, payment type, and the number of passengers reported by the driver.

Source: GitHub
Created: Nov 14, 2024
Updated: Nov 15, 2024
Signals: 543 views
Availability: Linked source ready
Overview

Dataset description and usage context

NYC Yellow Taxi Tripdata Analytics | Microsoft Azure Data Engineering Project

Dataset Overview

Dataset Description

NYC Yellow Taxi trip records contain the following fields:

  • Pickup and drop‑off dates and times
  • Pickup and drop‑off locations
  • Trip distance
  • Fare details
  • Rate type
  • Payment type
  • Driver‑reported passenger count
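As a rough illustration of how the fields above might map onto a typed record, here is a minimal Python sketch. The column names (`tpep_pickup_datetime`, `PULocationID`, `RatecodeID`, and so on) are assumed to follow the public TLC yellow taxi data dictionary and are not stated on this page; the `TripRecord` class, its attribute names, and the sample row are hypothetical and synthetic.

```python
import csv
import io
from dataclasses import dataclass
from datetime import datetime

# Assumed timestamp layout for CSV exports; Parquet files carry native timestamps.
TIME_FORMAT = "%Y-%m-%d %H:%M:%S"


@dataclass
class TripRecord:
    """Hypothetical typed view of one yellow taxi trip."""
    pickup_time: datetime
    dropoff_time: datetime
    pickup_location_id: int
    dropoff_location_id: int
    trip_distance: float      # miles, as reported by the meter
    fare_amount: float        # one component of the itemized fare
    rate_code: int
    payment_type: int
    passenger_count: int      # driver-reported


def parse_trip(row: dict) -> TripRecord:
    """Convert one CSV row (all strings) into a TripRecord."""
    return TripRecord(
        pickup_time=datetime.strptime(row["tpep_pickup_datetime"], TIME_FORMAT),
        dropoff_time=datetime.strptime(row["tpep_dropoff_datetime"], TIME_FORMAT),
        pickup_location_id=int(row["PULocationID"]),
        dropoff_location_id=int(row["DOLocationID"]),
        trip_distance=float(row["trip_distance"]),
        fare_amount=float(row["fare_amount"]),
        # Some exports store these counts as floats such as "1.0".
        rate_code=int(float(row["RatecodeID"])),
        payment_type=int(float(row["payment_type"])),
        passenger_count=int(float(row["passenger_count"])),
    )


def load_trips(text: str) -> list:
    """Parse CSV text into a list of TripRecord objects."""
    return [parse_trip(row) for row in csv.DictReader(io.StringIO(text))]


# Synthetic sample row for illustration only; not taken from the dataset.
SAMPLE_CSV = (
    "tpep_pickup_datetime,tpep_dropoff_datetime,PULocationID,DOLocationID,"
    "trip_distance,fare_amount,RatecodeID,payment_type,passenger_count\n"
    "2024-01-01 00:10:00,2024-01-01 00:25:00,161,237,2.5,14.2,1,1,2\n"
)

trips = load_trips(SAMPLE_CSV)
```

A typed record like this makes it easier to catch malformed rows early, before they reach downstream transformation steps.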

Dataset Source

Dataset Usage

This dataset underpins an end-to-end Azure data engineering project that ingests, transforms, analyzes, and visualizes New York City taxi trip data.
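The transform and analyze stages can be sketched locally in plain Python. This is only a stand-in under stated assumptions: the project itself runs on Azure services, and the cleaning rule (drop trips with non-positive distance or duration), the average-fare metric, the function names, and the sample rows below are all hypothetical and synthetic rather than taken from the project.

```python
from collections import defaultdict
from datetime import datetime

TIME_FORMAT = "%Y-%m-%d %H:%M:%S"

# Synthetic rows standing in for ingested raw data (not real trips).
RAW_TRIPS = [
    {"pickup": "2024-01-01 00:10:00", "dropoff": "2024-01-01 00:25:00",
     "distance": 2.5, "fare": 14.0, "payment_type": 1},
    {"pickup": "2024-01-01 01:00:00", "dropoff": "2024-01-01 01:30:00",
     "distance": 6.0, "fare": 30.0, "payment_type": 2},
    {"pickup": "2024-01-01 02:00:00", "dropoff": "2024-01-01 02:05:00",
     "distance": 0.0, "fare": 3.0, "payment_type": 1},
    {"pickup": "2024-01-01 03:00:00", "dropoff": "2024-01-01 03:20:00",
     "distance": 4.0, "fare": 20.0, "payment_type": 1},
]


def transform(rows):
    """Drop implausible trips and add a derived duration column (minutes)."""
    cleaned = []
    for row in rows:
        start = datetime.strptime(row["pickup"], TIME_FORMAT)
        end = datetime.strptime(row["dropoff"], TIME_FORMAT)
        minutes = (end - start).total_seconds() / 60
        # Assumed cleaning rule: keep only positive distance and duration.
        if row["distance"] <= 0 or minutes <= 0:
            continue
        cleaned.append({**row, "duration_min": minutes})
    return cleaned


def analyze(rows):
    """Compute the average fare per payment type code."""
    totals = defaultdict(lambda: [0.0, 0])  # payment_type -> [fare sum, count]
    for row in rows:
        bucket = totals[row["payment_type"]]
        bucket[0] += row["fare"]
        bucket[1] += 1
    return {ptype: total / count for ptype, (total, count) in totals.items()}


cleaned = transform(RAW_TRIPS)
avg_fare = analyze(cleaned)
```

In a Spark-based pipeline the same two steps would typically be a filter plus a derived column, followed by a group-by aggregation; the visualization stage would then read the aggregated output.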

Project Architecture

[Figure: architecture diagram]

Azure Services Used

  • Azure Data Factory (ADF)
  • Azure Data Lake Storage Gen2 (ADLS Gen2)
  • Azure Databricks
  • Azure Synapse Analytics
  • Azure Key Vault
  • Azure Active Directory (now Microsoft Entra ID)
  • Power BI

Languages Used

  • Programming language: Python (with PySpark, the Python API for Apache Spark)
  • Query language: SQL

Data Model

[Figure: data model]

Power BI Dashboard

[Figure: Power BI analysis dashboard]
