Dataset asset: Open Source Community, Software Development, Issue Tracking

gtxygyzb/github-issues

This dataset contains detailed records of GitHub issues, including URLs, state, creation and update timestamps, user information, labels, and comments. It is intended for model training and for analysis of issue tracking and collaboration on the GitHub platform.

Source: hugging_face
Created: Nov 28, 2025
Updated: Jul 31, 2023
Overview

Dataset description and usage context

Dataset Overview

Dataset Name

github-issues

Dataset Features

Basic Features

  • url: string type
  • repository_url: string type
  • labels_url: string type
  • comments_url: string type
  • events_url: string type
  • html_url: string type
  • id: integer type
  • node_id: string type
  • number: integer type
  • title: string type

User Information

  • user: struct type, includes:
    • login: string type
    • id: integer type
    • node_id: string type
    • avatar_url: string type
    • gravatar_id: string type
    • url: string type
    • html_url: string type
    • followers_url: string type
    • following_url: string type
    • gists_url: string type
    • starred_url: string type
    • subscriptions_url: string type
    • organizations_url: string type
    • repos_url: string type
    • events_url: string type
    • received_events_url: string type
    • type: string type
    • site_admin: boolean type

Label Information

  • labels: list type, includes:
    • id: integer type
    • node_id: string type
    • url: string type
    • name: string type
    • color: string type
    • default: boolean type
    • description: string type
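
Because labels is a list of structs, a common first step is to flatten it into label names. The following is a minimal sketch, assuming each row is a plain Python dict shaped like the fields above; the sample rows and the helper name count_label_names are hypothetical and are not taken from the dataset.

```python
from collections import Counter

# Hypothetical rows that mirror the `labels` field described above.
# Real rows would come from the dataset itself.
sample_issues = [
    {"number": 1, "labels": [
        {"id": 10, "name": "bug", "color": "d73a4a", "default": True,
         "description": "Something isn't working"},
    ]},
    {"number": 2, "labels": [
        {"id": 10, "name": "bug", "color": "d73a4a", "default": True,
         "description": "Something isn't working"},
        {"id": 11, "name": "enhancement", "color": "a2eeef", "default": True,
         "description": "New feature or request"},
    ]},
    {"number": 3, "labels": []},  # an issue with no labels
]


def count_label_names(issues):
    """Count how often each label name appears across all issues."""
    counts = Counter()
    for issue in issues:
        # Treat a missing or None `labels` value as an empty list.
        for label in issue.get("labels") or []:
            counts[label["name"]] += 1
    return counts


label_counts = count_label_names(sample_issues)
print(label_counts.most_common())
```

Counting by name rather than by id keeps the output readable; switch to the id field if label names may have been renamed over time.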

State Information

  • state: string type
  • locked: boolean type

Assignee Information

  • assignee: struct type, same fields as user
  • assignees: list type, each element has the same fields as user

Milestone Information

  • milestone: struct type, includes:
    • url: string type
    • html_url: string type
    • labels_url: string type
    • id: integer type
    • node_id: string type
    • number: integer type
    • title: string type
    • description: string type
    • creator: struct type, same as user
    • open_issues: integer type
    • closed_issues: integer type
    • state: string type
    • created_at: timestamp type
    • updated_at: timestamp type
    • due_on: null type
    • closed_at: null type

Other Information

  • comments: string sequence
  • created_at: timestamp type
  • updated_at: timestamp type
  • closed_at: timestamp type
  • author_association: string type
  • active_lock_reason: null type
  • body: string type
  • reactions: struct type, includes:
    • url: string type
    • total_count: integer type
    • +1: integer type
    • -1: integer type
    • laugh: integer type
    • hooray: integer type
    • confused: integer type
    • heart: integer type
    • rocket: integer type
    • eyes: integer type
  • timeline_url: string type
  • performed_via_github_app: null type
  • state_reason: string type
  • draft: boolean type
  • pull_request: struct type, includes:
    • url: string type
    • html_url: string type
    • diff_url: string type
    • patch_url: string type
    • merged_at: timestamp type
  • is_pull_request: boolean type
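
The is_pull_request flag and the created_at / closed_at timestamps listed above are enough to separate plain issues from pull requests and to measure how long each item stayed open. The sketch below assumes rows are Python dicts and that timestamp fields arrive as datetime objects, with None for items that are still open; the sample rows and function names are hypothetical.

```python
from datetime import datetime, timedelta

# Hypothetical rows using a subset of the fields described above.
sample_rows = [
    {"number": 101, "state": "closed", "is_pull_request": False,
     "created_at": datetime(2023, 7, 1, 9, 0),
     "closed_at": datetime(2023, 7, 3, 9, 0)},
    {"number": 102, "state": "open", "is_pull_request": False,
     "created_at": datetime(2023, 7, 10, 12, 0),
     "closed_at": None},
    {"number": 103, "state": "closed", "is_pull_request": True,
     "created_at": datetime(2023, 7, 5, 8, 0),
     "closed_at": datetime(2023, 7, 5, 20, 0)},
]


def split_issues_and_pulls(rows):
    """Return (issues, pull_requests) based on the is_pull_request flag."""
    issues = [row for row in rows if not row["is_pull_request"]]
    pulls = [row for row in rows if row["is_pull_request"]]
    return issues, pulls


def time_to_close(row):
    """Return closed_at - created_at, or None if the item is still open."""
    if row["closed_at"] is None:
        return None
    return row["closed_at"] - row["created_at"]


issues, pulls = split_issues_and_pulls(sample_rows)
durations = {row["number"]: time_to_close(row) for row in sample_rows}

for number, duration in durations.items():
    print(number, duration if duration is not None else "still open")
```

Returning None for open items, rather than a zero duration, keeps them out of any later average of closing times.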

Dataset Splits

  • train: contains 1,000 examples, total size 10,109,253 bytes

Dataset Size

  • Download size: 3,048,310 bytes
  • Dataset size: 10,109,253 bytes
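
The figures above imply an average record size and a download-to-dataset size ratio. The arithmetic can be sketched as follows, using only the numbers listed in this section; the constant names are illustrative.

```python
# Figures copied from the split and size listings above.
NUM_EXAMPLES = 1_000
DATASET_SIZE_BYTES = 10_109_253
DOWNLOAD_SIZE_BYTES = 3_048_310

# Average size of one example in the train split.
avg_bytes_per_example = DATASET_SIZE_BYTES / NUM_EXAMPLES

# Fraction of the listed dataset size that must be downloaded.
download_ratio = DOWNLOAD_SIZE_BYTES / DATASET_SIZE_BYTES

print(f"{avg_bytes_per_example:.0f} bytes per example on average")
print(f"download is {download_ratio:.1%} of the listed dataset size")
```

This works out to roughly 10 KB per example, with the download being about 30% of the listed dataset size.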