Dataset asset · Open Source Community · Recommender Systems · Movie Recommendation

MovieLens 32M and MovieLens 1B Synthetic Dataset

This project uses the MovieLens 32M and MovieLens 1B Synthetic Datasets to demonstrate an advanced recommendation system developed for a media streaming platform (inspired by Netflix). The system employs a hybrid approach that combines collaborative filtering, content‑based filtering, and graph‑based recommendation to provide personalized movie suggestions.

Source: GitHub
Created: Aug 5, 2024
Updated: Aug 12, 2024
Signals: 205 views
Availability: Linked source ready
Overview

Dataset description and usage context

MovieLens Recommendation System

This project showcases a complex recommendation system developed for a media streaming platform (inspired by Netflix) using the MovieLens 1B synthetic dataset. The system adopts a hybrid approach that combines collaborative filtering, content‑based filtering, and graph‑based recommendation to deliver personalized movie suggestions.
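One common way to combine the three recommenders named above is a weighted blend of their per-movie scores. The sketch below is illustrative only and is not taken from the project's source: the function names (`min_max_normalize`, `hybrid_scores`, `top_n`), the weights, and the toy scores are all assumptions. It rescales each recommender's scores to [0, 1] so they are comparable, then ranks movies by the weighted sum.

```python
from typing import Dict, List, Tuple

Scores = Dict[int, float]  # movie id -> raw score from one recommender


def min_max_normalize(scores: Scores) -> Scores:
    """Rescale scores to [0, 1]; a constant or empty input maps to zeros."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {movie: 0.0 for movie in scores}
    return {movie: (value - lo) / (hi - lo) for movie, value in scores.items()}


def hybrid_scores(
    collab: Scores,
    content: Scores,
    graph: Scores,
    weights: Tuple[float, float, float] = (0.5, 0.3, 0.2),  # assumed weights
) -> Scores:
    """Weighted sum of normalized scores; a movie missing from a source counts as 0."""
    sources = [min_max_normalize(s) for s in (collab, content, graph)]
    movie_ids = set().union(*sources)
    return {
        movie: sum(w * s.get(movie, 0.0) for w, s in zip(weights, sources))
        for movie in movie_ids
    }


def top_n(scores: Scores, n: int = 3) -> List[int]:
    """Return the n best movie ids, breaking ties by ascending id."""
    return sorted(scores, key=lambda movie: (-scores[movie], movie))[:n]


# Toy scores for one user (hypothetical values, different scales on purpose).
collab = {1: 4.5, 2: 3.0, 3: 1.5}      # e.g. predicted ratings
content = {1: 0.2, 2: 0.9, 4: 0.5}     # e.g. genre similarity
graph = {2: 10.0, 3: 30.0, 4: 20.0}    # e.g. graph proximity counts

blended = hybrid_scores(collab, content, graph)
print(top_n(blended))  # [2, 1, 4]
```

Normalizing before blending keeps a recommender with a large raw scale (here the graph counts) from dominating the ranking; in practice the weights would be tuned on held-out ratings rather than fixed by hand.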

Project Structure

├── LICENSE            <- Open-source license (if selected)
├── Makefile           <- Makefile with convenient commands such as make data or make train
├── README.md          <- Top-level README for developers using this project.
├── data
│   ├── external       <- Data from third-party sources.
│   ├── interim        <- Converted intermediate data.
│   ├── processed      <- Final curated dataset for modeling.
│   └── raw            <- Immutable raw data dump.
├── environment.yml    <- Requirements file for reproducing the analysis environment, e.g., generated via pip freeze > requirements.txt
├── models             <- Trained serialized models, model predictions, or model summaries
├── notebooks          <- Jupyter notebooks. Naming convention: number (for ordering), author initials, and a brief hyphen-separated description, e.g., 1.0-jqp-initial-data-exploration
└── src                <- Source code for this project
    ├── __init__.py    <- Makes src a Python module
    ├── dataset.py     <- Scripts for downloading or generating data
    ├── features.py    <- Code for creating modeling features
    ├── modeling
    │   ├── __init__.py
    │   ├── predict.py <- Code for running model inference with a trained model
    │   └── train.py   <- Code for training a model
    └── plots.py       <- Code for creating visualizations
