Dataset asset: Open Source Community, Retail Data, Data Modeling
online_retail.csv
This dataset consists of the original retail data downloaded from Kaggle, intended for building an end‑to‑end data pipeline. The data includes retail transaction information, which can be modeled into fact and dimension tables and used for data quality checks.
Source
github
Created
Jul 8, 2024
Updated
Jul 19, 2024
Overview
Dataset description and usage context
Retail Data Pipeline Dataset
Dataset Description
- Data File Location: the folder dags/include/datasets/, which contains the following files:
  - online_retail.csv: Original dataset downloaded from Kaggle.
  - country.csv: Dataset generated using a BigQuery table.
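As a rough illustration of how the transaction rows could be split into fact and dimension tables and given basic quality checks, here is a minimal Python sketch. It is not the pipeline's actual implementation (the stack listed for this project uses dbt and Soda for those steps). The column names in ASSUMED_COLUMNS are an assumption based on the commonly distributed Kaggle file, and the function names read_rows, check_quality, and build_tables are hypothetical.

```python
import csv
import io

# ASSUMPTION: header names of the commonly distributed Kaggle file;
# verify them against your own copy of online_retail.csv.
ASSUMED_COLUMNS = ["InvoiceNo", "StockCode", "Description", "Quantity",
                   "InvoiceDate", "UnitPrice", "CustomerID", "Country"]


def read_rows(text_stream):
    """Parse CSV text into a list of dicts keyed by header name."""
    return list(csv.DictReader(text_stream))


def check_quality(rows):
    """Return (row_index, problem) pairs for a few basic sanity checks."""
    problems = []
    for i, row in enumerate(rows):
        missing = [c for c in ASSUMED_COLUMNS if row.get(c) is None]
        if missing:
            problems.append((i, f"missing columns: {missing}"))
            continue
        if not row["CustomerID"].strip():
            problems.append((i, "empty CustomerID"))
        try:
            if float(row["UnitPrice"]) < 0:
                problems.append((i, "negative UnitPrice"))
        except ValueError:
            problems.append((i, "non-numeric UnitPrice"))
    return problems


def build_tables(rows):
    """Split rows into a customer dimension and an invoice-line fact table."""
    dim_customer = {}
    fact_invoice_lines = []
    for row in rows:
        customer_id = (row.get("CustomerID") or "").strip()
        if not customer_id:
            continue  # rows without a customer cannot join to the dimension
        # Keep the first country seen for each customer.
        dim_customer.setdefault(
            customer_id,
            {"customer_id": customer_id, "country": row["Country"]},
        )
        quantity = int(row["Quantity"])
        unit_price = float(row["UnitPrice"])
        fact_invoice_lines.append({
            "invoice_no": row["InvoiceNo"],
            "stock_code": row["StockCode"],
            "customer_id": customer_id,
            "quantity": quantity,
            "total": round(quantity * unit_price, 2),
        })
    return list(dim_customer.values()), fact_invoice_lines
```

To try it on the real file, open dags/include/datasets/online_retail.csv with `open(path, newline="")` and pass the handle to read_rows. Depending on how the file was exported, an explicit `encoding` argument may be needed.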
Technology Stack
- Data Processing Tools:
  - Python
  - Docker and Docker Compose
  - Soda.io
  - Metabase
  - Google Cloud Storage
  - Google BigQuery
  - Airflow (Astronomer edition)
  - dbt
  - GitHub