Dataset asset | Open Source Community | Retail Data | Data Modeling

online_retail.csv

This file is the original retail dataset downloaded from Kaggle, intended for building an end-to-end data pipeline. It contains retail transaction records that can be modeled into fact and dimension tables and used to exercise data quality checks.

Source: GitHub
Created: Jul 8, 2024
Updated: Jul 19, 2024
Signals: 349 views
Availability: Linked source ready
Overview

Dataset description and usage context

Retail Data Pipeline Dataset

Dataset Description

  • Data File Location: The folder dags/include/datasets/ contains the following files:
    • online_retail.csv: Original dataset downloaded from Kaggle.
    • country.csv: Dataset generated using a BigQuery table.
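
The fact and dimension modeling mentioned in the description could look roughly like the sketch below. It is only an illustration: the column names (InvoiceNo, StockCode, Description, Quantity, InvoiceDate, UnitPrice, CustomerID, Country) are assumed from the public Online Retail dataset and are not confirmed by this page, the sample rows are made up, and the function name build_star_schema is hypothetical.

```python
import csv
import io

# Made-up sample rows for illustration only. The header is an assumption
# about online_retail.csv and should be checked against the real file.
SAMPLE_CSV = """InvoiceNo,StockCode,Description,Quantity,InvoiceDate,UnitPrice,CustomerID,Country
A001,P10,Sample mug,2,2024-01-01 10:00,3.50,C1,France
A001,P11,Sample plate,1,2024-01-01 10:00,5.00,C1,France
A002,P10,Sample mug,4,2024-01-02 09:30,3.50,C2,Germany
"""


def build_star_schema(rows):
    """Split flat transaction rows into two dimension tables and one fact table."""
    dim_product = {}    # keyed by StockCode
    dim_customer = {}   # keyed by CustomerID
    fact_invoices = []  # one entry per transaction line

    for row in rows:
        unit_price = float(row["UnitPrice"])
        quantity = int(row["Quantity"])

        # Keep the first occurrence of each product and customer.
        dim_product.setdefault(
            row["StockCode"],
            {"description": row["Description"], "unit_price": unit_price},
        )
        dim_customer.setdefault(row["CustomerID"], {"country": row["Country"]})

        # The fact row stores keys into the dimensions plus the measures.
        fact_invoices.append(
            {
                "invoice_no": row["InvoiceNo"],
                "stock_code": row["StockCode"],
                "customer_id": row["CustomerID"],
                "invoice_date": row["InvoiceDate"],
                "quantity": quantity,
                "total": quantity * unit_price,
            }
        )
    return dim_product, dim_customer, fact_invoices


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
dim_product, dim_customer, fact_invoices = build_star_schema(rows)
print(len(dim_product), len(dim_customer), len(fact_invoices))
```

To try this against the real file, the io.StringIO source could be swapped for a file handle opened on dags/include/datasets/online_retail.csv; the file's encoding and exact header would need to be verified first.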

Technology Stack

  • Data Processing Tools:
    • Python
    • Docker and Docker Compose
    • Soda.io
    • Metabase
    • Google Cloud Storage
    • Google BigQuery
    • Airflow (Astronomer edition)
    • dbt
    • GitHub
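
The data quality checks mentioned in the description can be prototyped in plain Python before they are expressed in a dedicated tool. The sketch below is a stand-in, not Soda's API: the column names, the rules (required fields present, numeric Quantity and UnitPrice, non-negative UnitPrice), the sample rows, and the function name find_issues are all assumptions chosen for illustration.

```python
# Assumed column names; verify against the real online_retail.csv header.
REQUIRED_COLUMNS = ["InvoiceNo", "StockCode", "Quantity", "UnitPrice"]


def find_issues(rows):
    """Return (row_index, message) pairs for rows that fail a basic check."""
    issues = []
    for index, row in enumerate(rows):
        # Check 1: required fields must be present and non-blank.
        missing = [col for col in REQUIRED_COLUMNS if not (row.get(col) or "").strip()]
        if missing:
            issues.append((index, "missing: " + ", ".join(missing)))
            continue

        # Check 2: numeric fields must parse.
        try:
            int(row["Quantity"])
            unit_price = float(row["UnitPrice"])
        except ValueError:
            issues.append((index, "non-numeric Quantity or UnitPrice"))
            continue

        # Check 3: prices should not be negative (an illustrative rule).
        if unit_price < 0:
            issues.append((index, "negative UnitPrice"))
    return issues


# Made-up rows for illustration only.
sample_rows = [
    {"InvoiceNo": "A001", "StockCode": "P10", "Quantity": "2", "UnitPrice": "3.50"},
    {"InvoiceNo": "A002", "StockCode": "", "Quantity": "1", "UnitPrice": "5.00"},
    {"InvoiceNo": "A003", "StockCode": "P11", "Quantity": "two", "UnitPrice": "5.00"},
    {"InvoiceNo": "A004", "StockCode": "P12", "Quantity": "1", "UnitPrice": "-1.00"},
]
issues = find_issues(sample_rows)
for index, message in issues:
    print(f"row {index}: {message}")
```

Each row reports at most one issue, which keeps the output short; a fuller check would likely collect every failure per row.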