
Scale-free Graphs

This dataset is designed for one‑to‑many graph translation tasks. The graphs have no node features; the goal is to learn a mapping from input graph topology to target graph topology. Each input graph is a directed scale‑free network, that is, one whose degree distribution follows a power law. To generate a target graph, with probability 0.41 an existing node is selected as the target, with probability proportional to its in‑degree, and connected to a new source node. Similarly, with probability 0.54 an existing node is selected as the source, with probability proportional to its out‑degree, and connected to a new target node. Then, m edges (where m equals the number of input nodes) are added between the two nodes to form the target graph. Hence both input and target graphs are directed scale‑free graphs.
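The growth steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the generator used to build the dataset: the function name `grow_scale_free`, the three‑node seed cycle, and the handling of the remaining 0.05 probability (an edge between two existing nodes) are assumptions that do not come from the description.

```python
import random


def grow_scale_free(n_nodes, p_new_source=0.41, p_new_target=0.54, seed=None):
    """Grow a directed edge list by preferential attachment.

    Returns (node_count, edges), where edges is a list of (source, target).
    """
    if n_nodes < 3:
        raise ValueError("n_nodes must be at least 3 (size of the seed graph)")
    rng = random.Random(seed)
    # Assumption: start from a directed 3-cycle so every node has
    # non-zero in-degree and out-degree.
    edges = [(0, 1), (1, 2), (2, 0)]
    n = 3
    while n < n_nodes:
        r = rng.random()
        if r < p_new_source:
            # Picking a uniformly random edge and taking its head selects
            # a node with probability proportional to its in-degree.
            target = rng.choice(edges)[1]
            edges.append((n, target))  # new source node -> existing target
            n += 1
        elif r < p_new_source + p_new_target:
            # The tail of a random edge is proportional to out-degree.
            source = rng.choice(edges)[0]
            edges.append((source, n))  # existing source -> new target node
            n += 1
        else:
            # Assumption: the remaining probability links two existing nodes.
            source = rng.choice(edges)[0]
            target = rng.choice(edges)[1]
            edges.append((source, target))
    return n, edges
```

For example, `grow_scale_free(20, seed=0)` returns a node count of 20 and an edge list in which every node appears at least once. Sampling a random edge endpoint avoids keeping explicit degree counters, at the cost of allowing repeated edges and self‑loops.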

Updated 3/27/2021

Description

Dataset Overview

This dataset focuses on deep graph translation problems, providing various synthetic and real‑world graph datasets for studying and exploring mapping rules between graphs. The dataset includes the following sections:

1. Scale‑free Graphs

  • Task Type: Suitable for one‑to‑many graph translation.
  • Content: Five subsets of different sizes (10, 20, 50, 100, 150 nodes).
  • File Format: Input and output graphs are stored in .csv files named scale-(graph_size)-input-index.csv and scale-(graph_size)-target-index.csv respectively.
  • Data Links: Scale_free_150 etc.

2. Erdős‑Rényi Graphs

  • Task Type: Suitable for one‑to‑one graph translation.
  • Content: Three subsets of sizes 20, 40, 60 nodes.
  • File Format: Files named ER-(graph_size)-input-index.csv and ER-(graph_size)-target-index.csv.
  • Data Links: ER_20 etc.

3. Barabási‑Albert Graphs

  • Task Type: Suitable for one‑to‑one graph translation.
  • Content: Three subsets of sizes 20, 40, 60 nodes.
  • File Format: Files named BA-(graph_size)-input-index.csv and BA-(graph_size)-target-index.csv.
  • Data Links: BA_20 etc.
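The three synthetic families above share one naming pattern, so an input file and its target file can be paired by name. The helper below is a minimal sketch: `pair_filenames` is a hypothetical name, and treating the `index` part of the pattern as a plain integer is an assumption that should be checked against the downloaded files.

```python
def pair_filenames(family, graph_size, index):
    """Return (input_name, target_name) for one graph pair.

    family is the file prefix from the listing: "scale", "ER" or "BA".
    Assumption: `index` is written into the name as a plain integer.
    """
    if family not in ("scale", "ER", "BA"):
        raise ValueError(f"unknown family: {family!r}")
    stem = f"{family}-{graph_size}"
    return f"{stem}-input-{index}.csv", f"{stem}-target-{index}.csv"
```

Under these assumptions, `pair_filenames("ER", 20, 3)` gives `("ER-20-input-3.csv", "ER-20-target-3.csv")`.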

4. IoT

  • Context: Malware isolation in the Internet of Things.
  • Content: Three subsets of sizes 20, 40, 60 nodes.
  • File Format: Files named IoT-[graph_size]-[input/output]-[infection rate]-[recovery rate]-[decay rate]-[index].csv.
  • Data Links: IoT_20 etc.
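Because the IoT file names carry the simulation parameters, they can be split back into fields when grouping files by setting. The sketch below assumes the rates are written as plain decimals (no minus signs or exponents, which would collide with the `-` separator); `IoTFile` and `parse_iot_filename` are hypothetical names.

```python
from typing import NamedTuple


class IoTFile(NamedTuple):
    graph_size: int
    role: str  # "input" or "output"
    infection_rate: float
    recovery_rate: float
    decay_rate: float
    index: int


def parse_iot_filename(name):
    """Split an IoT file name into its six fields."""
    prefix, suffix = "IoT-", ".csv"
    if not (name.startswith(prefix) and name.endswith(suffix)):
        raise ValueError(f"not an IoT file name: {name!r}")
    fields = name[len(prefix):-len(suffix)].split("-")
    if len(fields) != 6:
        raise ValueError(f"expected 6 fields, got {len(fields)}: {name!r}")
    size, role, infection, recovery, decay, index = fields
    if role not in ("input", "output"):
        raise ValueError(f"unexpected role: {role!r}")
    return IoTFile(int(size), role, float(infection),
                   float(recovery), float(decay), int(index))
```

A made‑up name such as `IoT-20-input-0.3-0.1-0.05-7.csv` (not an actual file from the dataset) would parse to graph size 20, role `input`, rates 0.3, 0.1 and 0.05, and index 7.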

5. User Authentication

  • Context: Predicting malicious behavior in enterprise network user authentication.
  • Content: Two subsets of sizes 50 and 300 nodes.
  • File Format: Files named Auth-[graph_size]-[input/output]-[index].csv.
  • Data Links: Auth_50 etc.

6. Chemistry Reaction

  • Context: Chemical reaction prediction.
  • Content: 7,180 pairs of reactant and product molecular graphs.
  • File Format: Data stored in multiple folders, e.g., mol_edge and mol_nodes.
  • Data Links: Mol_edge etc.

7. Molecule Optimization

  • Context: Molecular optimization via matched molecular pair analysis (MMPA) to improve chemical properties.
  • Content: Multiple tasks such as improving penalized logP and drug‑likeness (QED).
  • Data Links: Penalized logP etc.

These datasets provide rich resources for research on deep graph translation, supporting various mapping rules and application scenarios.


Topics

Graph Theory
Network Science

Source

Organization: github

Created: 3/7/2020
