complete_ufc_data.csv

This dataset combines 30 years of UFC match history (from 1994 onward), individual fighter statistics, and nine years of historical betting odds (from November 2014 onward). Each record includes the event date and name, weight class, fighter information, betting data, the match outcome, and the method of victory.

Updated 12/28/2023

Description

Dataset Overview

Dataset Contents

  • File name: /data/complete_ufc_data.csv
  • Description: This dataset aggregates 30 years of UFC match history (since 1994), fighter statistics, and nine years of historical betting odds (since November 2014).

Data Dictionary

| Column | Example | Description | Source |
|---|---|---|---|
| event_date | 2023-09-16 | UFC event date | Extracted from UFC match history |
| event_name | UFC Fight Night: Grasso vs. Shevchenko 2 | UFC event name | Extracted from UFC match history |
| weight_class | Womens Flyweight | UFC weight class | Extracted from UFC match history |
| fighter1, fighter2 | Alexa Grasso, Valentina Shevchenko | Fighter names | Extracted from UFC match history |
| favourite, underdog | Valentina Shevchenko, Alexa Grasso, NaN | Favourite and underdog fighters | Historical odds from betmma.tips |
| favourite_odds, underdog_odds | 1.67, 2.88, NaN | Betting odds (decimal) | Historical odds from betmma.tips |
| betting_outcome | favourite, underdog, NaN | Betting outcome | Historical odds from betmma.tips |
| outcome | fighter1, fighter2, Draw | Match result | Extracted from UFC match history |
| method | S-DEC, U-DEC, KO/TKO Punches | Victory method | Extracted from UFC match history |
| round | 5 | Winning round | Extracted from UFC match history |
| fighter1_*, fighter2_* | | Fighter attributes | Extracted from UFC fighter statistics |
| events_extract_ts, odds_extract_ts, fighter_extract_ts | 2023-09-21 02:02:55.178363 | Data extraction timestamp | |
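
As a rough illustration of how the columns above might be read, the following sketch parses the CSV with Python's standard library. The function name `read_fights` and the `MISSING` set are hypothetical helpers, not part of the dataset's own code. It assumes that `event_date` starts with a `YYYY-MM-DD` date, as in the example value, and that missing odds appear either as an empty field or as the text `NaN`.

```python
import csv
from datetime import date

# Field values treated as missing. This is an assumption: the data dictionary
# shows "NaN", and an empty field is another common encoding in CSV exports.
MISSING = {"", "NaN", "nan"}


def _optional_float(text):
    """Return the field as a float, or None when it is missing."""
    if text is None or text.strip() in MISSING:
        return None
    return float(text)


def read_fights(lines):
    """Yield one dict per fight from an iterable of CSV lines (e.g. an open file)."""
    for row in csv.DictReader(lines):
        # Keep only the first 10 characters so a trailing time part is ignored.
        row["event_date"] = date.fromisoformat(row["event_date"][:10])
        for column in ("favourite_odds", "underdog_odds"):
            row[column] = _optional_float(row.get(column))
        yield row
```

With the file available locally, this could be called as `fights = list(read_fights(open("complete_ufc_data.csv", newline="", encoding="utf-8")))`. Fights before November 2014 would then be expected to carry `None` in both odds columns, since the betting history starts there.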

Data Extraction

  • Code: Python scripts handle the web scraping and data preprocessing.
  • Functionality: The scripts scrape UFC fighter statistics and match results, scrape historical betting odds, and clean the combined data.

Exploratory Data Analysis (EDA) / Data Visualization

  • Insight: Historical win rates are strongly associated with age and with average strikes per minute. The younger fighter, or the fighter with the higher strike output, wins about 60% of matches.
  • Insight: The historical probability that the favourite wins rises from slightly above 50% to over 75% once the difference between the decimal odds exceeds 2.0. It keeps rising as the gap widens, reaching about 90% when the difference exceeds 4.5.
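
The odds-gap insight can be checked with a small calculation along the following lines. This is a minimal sketch, not the analysis code behind the figures above: the function name `favourite_win_rate` is hypothetical, and "odds difference" is assumed to mean `underdog_odds` minus `favourite_odds`. Rows without odds or without a decided betting outcome are skipped.

```python
import math


def _to_float(value):
    """Convert a CSV field to float; return None for missing or NaN values."""
    try:
        number = float(value)
    except (TypeError, ValueError):
        return None
    return None if math.isnan(number) else number


def favourite_win_rate(fights, min_gap):
    """Share of fights won by the favourite among fights whose odds gap exceeds min_gap.

    `fights` is an iterable of dicts keyed by the dataset's column names.
    Returns None when no fight qualifies.
    """
    wins = total = 0
    for fight in fights:
        favourite = _to_float(fight.get("favourite_odds"))
        underdog = _to_float(fight.get("underdog_odds"))
        outcome = fight.get("betting_outcome")
        if favourite is None or underdog is None:
            continue  # no odds recorded for this fight
        if outcome not in ("favourite", "underdog"):
            continue  # no decided betting outcome
        # Assumed definition of the odds difference: underdog minus favourite.
        if underdog - favourite > min_gap:
            total += 1
            if outcome == "favourite":
                wins += 1
    return wins / total if total else None
```

Calling the function with thresholds such as `2.0` and `4.5` would give figures comparable to the percentages quoted above, provided the assumed definition of the odds difference matches the one used in the original analysis.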

Predictive Modeling

  • Development status: Ongoing; machine‑learning models are being tested for predicting match outcomes based on fighter statistics.
  • Preliminary test: Initial models (gradient boosting machine and logistic regression) reach roughly 65% accuracy without using any betting-odds features.
  • Future iterations: Planned testing of additional features such as win streaks, finish rates, derived attributes (endurance, wrestler/striker/slugger tags) and whether a fighter is a betting favourite.
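
One of the planned features, the win streak, could be derived from the `fighter1`, `fighter2`, and `outcome` columns roughly as follows. This is only a sketch under stated assumptions, not the project's implementation: the function name and the output keys `fighter1_win_streak` and `fighter2_win_streak` are hypothetical, fighters are identified by name alone, and any result other than a win (for example a draw) is assumed to reset both streaks.

```python
from collections import defaultdict


def add_win_streaks(fights):
    """Return copies of the fights annotated with each fighter's prior win streak.

    `fights` must be ordered oldest first. The streak recorded for a fight
    counts only earlier bouts, so the feature never includes the result it
    would be used to predict.
    """
    streak = defaultdict(int)  # fighter name -> current consecutive wins
    annotated = []
    for fight in fights:
        first, second = fight["fighter1"], fight["fighter2"]
        annotated.append({
            **fight,
            "fighter1_win_streak": streak[first],   # hypothetical column name
            "fighter2_win_streak": streak[second],  # hypothetical column name
        })
        # Update the running streaks only after the features are recorded.
        if fight["outcome"] == "fighter1":
            streak[first] += 1
            streak[second] = 0
        elif fight["outcome"] == "fighter2":
            streak[second] += 1
            streak[first] = 0
        else:
            # Assumption: a draw or other non-win result resets both streaks.
            streak[first] = 0
            streak[second] = 0
    return annotated
```

Sorting by `event_date` alone may not be enough when a fighter appears more than once on the same date, so the order of bouts within an event would need to be preserved in that case.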

Setup

  • Dependency management: Managed with Poetry or pip.

Topics

Sports Data Analysis
UFC

Source

Organization: github

Created: 9/19/2023
