JUHE API Marketplace

Dataset Catalog

Browse trusted datasets for evaluation, enrichment, and production use.

Category index
Showing 1 of 1 datasets
Category: Software Defect Prediction

GHPR Dataset

Software Defect Prediction | GitHub Data Analysis

The GHPR dataset supports empirical research on, and evaluation of, software defect prediction. It is built from GitHub pull requests (PRs), from which 3,026 defect-fix records were identified. Each fix contributes one defective and one non-defective instance, yielding 6,052 learning instances (3,026 defective and 3,026 non-defective). The dataset is provided in CSV and SQL formats and includes 16 features, among them project name, project owner, project description, tags, programming language, pre- and post-fix version IDs, defective code, commit description, commit time, pre- and post-fix file contents, file-path changes, and PR title and description.
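The step from 3,026 fix records to 6,052 balanced instances can be illustrated with a minimal Python sketch. It rests on assumptions not confirmed by this listing: that the pre-fix file content serves as the defective sample and the post-fix content as the non-defective one, and that the CSV exposes them under the hypothetical column names `old_file_content` and `new_file_content`. The sample rows are made up for illustration and are not taken from the dataset.

```python
import csv
import io

# Hypothetical column names; the real GHPR CSV header may differ.
OLD_CONTENT = "old_file_content"
NEW_CONTENT = "new_file_content"


def load_fix_records(csv_text):
    """Parse CSV text into a list of dict rows keyed by the header."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def expand_fix_records(rows):
    """Turn each defect-fix record into two labeled learning instances.

    Assumption: the pre-fix file content is the defective sample (label 1)
    and the post-fix file content is the non-defective sample (label 0).
    """
    instances = []
    for row in rows:
        instances.append({"code": row[OLD_CONTENT], "label": 1})
        instances.append({"code": row[NEW_CONTENT], "label": 0})
    return instances


# Tiny made-up sample standing in for the real file.
SAMPLE_CSV = (
    "project_name,old_file_content,new_file_content\n"
    "demo,return a - b;,return a + b;\n"
    "demo,i <= n,i < n\n"
)

records = load_fix_records(SAMPLE_CSV)
instances = expand_fix_records(records)
print(len(records), "fix records ->", len(instances), "instances")
```

Applied to the full file, the same expansion would turn 3,026 records into 6,052 instances with equal class counts, provided the labeling assumption above matches how the dataset was actually built.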

Source: GitHub | Updated: Apr 29, 2024 | 246 views | Linked

Copyright © 2026 JUHEDATA HK LIMITED - All rights reserved