Movie Knowledge Graph Dataset

This is a movie knowledge‑graph dataset prepared for NebulaGraph, sourced from OMDB and MovieLens, intended for movie recommendation systems.

Source: GitHub
Created: Nov 6, 2022
Updated: Apr 29, 2024

Dataset Overview

Data Sources

  • Actor and movie genre data: sourced from OMDB.
  • User‑movie interaction records: sourced from MovieLens.

Dataset Structure

  • Vertex Types:

    • User (user_id)
    • Movie (name)
    • Person (name, birthdate)
    • Genre (name)
  • Edge Types:

    • watched (rate(double))
    • belongs_to_genre
    • directed_by
    • acted_in
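
The structure above can be sketched as nGQL schema statements. The Python below only builds the statement strings and does not connect to NebulaGraph. Lowercase tag names and every property type other than rate (double) are assumptions, since the listing does not state them.

```python
# Vertex tags and edge types as listed above.
# ASSUMPTION: lowercase tag names; all types except `rate` (double) are guesses.
TAGS = {
    "user": {"user_id": "string"},
    "movie": {"name": "string"},
    "person": {"name": "string", "birthdate": "string"},
    "genre": {"name": "string"},
}
EDGES = {
    "watched": {"rate": "double"},
    "belongs_to_genre": {},
    "directed_by": {},
    "acted_in": {},
}


def ddl(kind, name, props):
    """Build one CREATE TAG / CREATE EDGE statement with backtick-quoted names."""
    cols = ", ".join(f"`{prop}` {typ}" for prop, typ in props.items())
    return f"CREATE {kind} IF NOT EXISTS `{name}`({cols});"


def schema_statements():
    """Return the statements for all tags first, then all edge types."""
    stmts = [ddl("TAG", name, props) for name, props in TAGS.items()]
    stmts += [ddl("EDGE", name, props) for name, props in EDGES.items()]
    return stmts


if __name__ == "__main__":
    print("\n".join(schema_statements()))
```

Identifiers are backtick-quoted so that a name such as user cannot clash with an nGQL keyword.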

Data Processing Workflow

  1. Raw data organization
  2. Load data into data warehouse (Postgres)
  3. Transform data into a format suitable for property‑graph models (dbt) and export as CSV
  4. Load CSV files into NebulaGraph (Nebula‑Importer)
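
Step 3 can be illustrated with a small stand-in. The source performs this transformation with dbt on Postgres; the Python below is not that pipeline, only a sketch of the same idea for the watched edge. It assumes the MovieLens ratings layout (userId, movieId, rating, timestamp). The u_ and m_ vertex ID prefixes are hypothetical, and the sample rows are made up for illustration.

```python
import csv
import io


def ratings_to_watched_edges(ratings_csv: str) -> str:
    """Turn MovieLens-style rating rows into header-less edge rows:
    source vertex ID, destination vertex ID, rate."""
    reader = csv.DictReader(io.StringIO(ratings_csv))
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for row in reader:
        writer.writerow([
            f"u_{row['userId']}",    # hypothetical user vertex ID scheme
            f"m_{row['movieId']}",   # hypothetical movie vertex ID scheme
            float(row["rating"]),    # becomes the `rate` property (double)
        ])
    return out.getvalue()


# Made-up sample rows, not taken from the dataset.
sample = (
    "userId,movieId,rating,timestamp\n"
    "1,31,2.5,1260759144\n"
    "1,1029,3.0,1260759179\n"
)

if __name__ == "__main__":
    print(ratings_to_watched_edges(sample), end="")
```

Prefixing IDs by vertex type keeps user and movie IDs from colliding when both sources use small integers.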

Dataset Usage

  • The dataset is used to build a movie knowledge graph in the NebulaGraph graph database.
  • For detailed usage, refer to this link.

Dataset Schema Mapping

  • The schema mapping details how the two tabular data sources are mapped to NebulaGraph's property‑graph model.

Dataset Validation

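  • After importing into NebulaGraph, run SUBMIT JOB STATS; and, once the job finishes, SHOW STATS; to check the per-tag and per-edge-type counts and confirm that the data loaded correctly. SHOW STATS reports the results of the most recent statistics job.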
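
One way to act on the SHOW STATS output is to compare its rows against the counts you expect from the CSV files. The sketch below assumes the rows have already been fetched as (Type, Name, Count) tuples. The helper name check_stats is hypothetical, and the numbers are illustrative placeholders, not the dataset's real counts.

```python
def check_stats(stats_rows, expected):
    """Compare SHOW STATS rows with expected counts.

    stats_rows: iterable of (type, name, count) tuples.
    expected:   dict mapping (type, name) to the expected count.
    Returns a list of human-readable mismatch messages (empty if all match).
    """
    actual = {(typ, name): count for typ, name, count in stats_rows}
    problems = []
    for (typ, name), want in expected.items():
        got = actual.get((typ, name))  # None if the entry is missing
        if got != want:
            problems.append(f"{typ} {name}: expected {want}, got {got}")
    return problems


# Placeholder numbers for illustration only.
rows = [("Tag", "genre", 10), ("Edge", "watched", 95), ("Space", "vertices", 10)]
expected = {("Tag", "genre"): 10, ("Edge", "watched"): 100}

if __name__ == "__main__":
    for line in check_stats(rows, expected):
        print(line)
```

An empty result means every expected tag and edge type is present with the expected count; a missing entry is reported with a count of None.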