DATASET
Open Source Community
complete_ufc_data.csv
This dataset integrates 30 years of UFC match history (starting from 1994), individual fighter statistics, and nine years of historical betting odds (starting from November 2014). It includes detailed information such as match date, name, weight class, fighter information, betting data, match outcomes, and methods of victory.
Updated 12/28/2023
github
Description
Dataset Overview
Dataset Contents
- File name: /data/complete_ufc_data.csv
- Description: This dataset aggregates 30 years of UFC match history (since 1994), fighter statistics, and nine years of historical betting odds (since November 2014).
Data Dictionary
| Column | Example | Description | Source |
|---|---|---|---|
| event_date | 2023-09-16 | UFC event date | Extracted from UFC match history |
| event_name | UFC Fight Night: Grasso vs. Shevchenko 2 | UFC event name | Extracted from UFC match history |
| weight_class | Womens Flyweight | UFC weight class | Extracted from UFC match history |
| fighter1, fighter2 | Alexa Grasso, Valentina Shevchenko | Fighter names | Extracted from UFC match history |
| favourite, underdog | Valentina Shevchenko, Alexa Grasso, NaN | Favourite and underdog fighters | Historical odds from betmma.tips |
| favourite_odds, underdog_odds | 1.67, 2.88, NaN | Betting odds (decimal) | Historical odds from betmma.tips |
| betting_outcome | favourite, underdog, NaN | Betting outcome | Historical odds from betmma.tips |
| outcome | fighter1, fighter2, Draw | Match result | Extracted from UFC match history |
| method | S-DEC, U-DEC, KO/TKO Punches | Victory method | Extracted from UFC match history |
| round | 5 | Winning round | Extracted from UFC match history |
| fighter1_*, fighter2_* | | Fighter attributes | Extracted from UFC fighter statistics |
| events_extract_ts, odds_extract_ts, fighter_extract_ts | 2023-09-21 02:02:55.178363 | Data extraction timestamp | |
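As a rough illustration of how the odds columns in the data dictionary might be read, the sketch below parses the file with Python's standard `csv` module and converts decimal odds to a raw implied probability (1 divided by the odds). The helper names (`parse_odds`, `implied_probability`, `read_fights`) and the two derived keys are hypothetical. It also assumes missing odds appear in the file as empty cells or the text `NaN`. The second sample row is a made-up placeholder, not real data.

```python
import csv
import io
import math


def parse_odds(value):
    """Return decimal odds as a float, or None when the cell is empty or NaN."""
    text = (value or "").strip()
    if not text:
        return None
    try:
        odds = float(text)
    except ValueError:
        return None
    return None if math.isnan(odds) else odds


def implied_probability(decimal_odds):
    """Raw implied probability of decimal odds, with no bookmaker-margin adjustment."""
    # Assumption: decimal odds of 1.0 or less are treated as invalid.
    if decimal_odds is None or decimal_odds <= 1.0:
        return None
    return 1.0 / decimal_odds


def read_fights(handle):
    """Yield one dict per fight with parsed odds and implied probabilities."""
    for row in csv.DictReader(handle):
        fav = parse_odds(row.get("favourite_odds"))
        dog = parse_odds(row.get("underdog_odds"))
        row["favourite_odds"] = fav
        row["underdog_odds"] = dog
        row["favourite_implied_prob"] = implied_probability(fav)  # hypothetical key
        row["underdog_implied_prob"] = implied_probability(dog)   # hypothetical key
        yield row


# In-memory sample: row 1 uses the example values from the data dictionary,
# row 2 is a placeholder showing a fight without odds.
SAMPLE_CSV = """event_date,fighter1,fighter2,favourite_odds,underdog_odds
2023-09-16,Alexa Grasso,Valentina Shevchenko,1.67,2.88
2000-01-01,Fighter A,Fighter B,NaN,NaN
"""

fights = list(read_fights(io.StringIO(SAMPLE_CSV)))
```

To run the same function on the real file, pass it an open handle, for example `open("data/complete_ufc_data.csv", newline="", encoding="utf-8")`. The UTF-8 encoding is an assumption.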
Data Extraction
- Code: Python scripts were used for web scraping and data preprocessing.
- Functionality: Completed UFC data scraping (fighter stats and match results), historical betting odds scraping, and data cleaning.
Exploratory Data Analysis (EDA) / Data Visualization
- Insight: Historically, match success correlates strongly with age and with average strikes per minute. The younger fighter, or the fighter with the higher strike output, wins about 60% of matches.
- Insight: The historical probability that the favourite wins rises from slightly above 50% to over 75% once the decimal odds difference exceeds 2.0, and reaches about 90% once the difference exceeds 4.5.
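The odds-gap insight could be checked with something like the sketch below. It assumes "odds difference" means `underdog_odds` minus `favourite_odds`, and that each fight is a dict with the odds already converted to floats (or `None` when missing). The function name `favourite_win_rate` and the toy rows are hypothetical and only demonstrate the calculation; they are not taken from the dataset.

```python
def favourite_win_rate(fights, min_gap=0.0):
    """Share of fights won by the favourite among fights whose odds gap exceeds min_gap.

    Returns None when no fight qualifies.
    """
    wins = 0
    total = 0
    for fight in fights:
        fav = fight.get("favourite_odds")
        dog = fight.get("underdog_odds")
        outcome = fight.get("betting_outcome")
        # Skip fights without odds or without a decided betting outcome.
        if fav is None or dog is None or outcome not in ("favourite", "underdog"):
            continue
        # Assumption: the odds difference is underdog odds minus favourite odds.
        if dog - fav > min_gap:
            total += 1
            if outcome == "favourite":
                wins += 1
    return wins / total if total else None


# Made-up rows for illustration only.
toy_fights = [
    {"favourite_odds": 1.8, "underdog_odds": 2.0, "betting_outcome": "underdog"},
    {"favourite_odds": 1.5, "underdog_odds": 2.6, "betting_outcome": "favourite"},
    {"favourite_odds": 1.2, "underdog_odds": 4.5, "betting_outcome": "favourite"},
    {"favourite_odds": 1.25, "underdog_odds": 4.0, "betting_outcome": "underdog"},
    {"favourite_odds": 1.1, "underdog_odds": 7.0, "betting_outcome": "favourite"},
    {"favourite_odds": None, "underdog_odds": None, "betting_outcome": None},
]

rate_all = favourite_win_rate(toy_fights)             # all fights with odds
rate_gap_2 = favourite_win_rate(toy_fights, 2.0)      # gap above 2.0
rate_gap_45 = favourite_win_rate(toy_fights, 4.5)     # gap above 4.5
```

Calling the function with a range of `min_gap` values on the full dataset would give the win-rate curve that the insight describes.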
Predictive Modeling
- Development status: Ongoing; machine‑learning models are being tested for predicting match outcomes based on fighter statistics.
- Preliminary test: Initial models (GBM, logistic regression) achieve roughly 65% accuracy without betting odds features.
- Future iterations: Planned testing of additional features such as win streaks, finish rates, derived attributes (endurance, wrestler/striker/slugger tags) and whether a fighter is a betting favourite.
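One way the planned win-streak and finish-rate features might be derived is sketched below. It is not the project's implementation. It assumes fights are dicts keyed by the data dictionary's column names, that `event_date` is an ISO `YYYY-MM-DD` string, and that any `method` value without `DEC` counts as a finish. The function name and the output keys (`fighter1_win_streak`, `fighter1_finish_rate`, and so on) are hypothetical.

```python
from collections import defaultdict


def add_pre_fight_features(fights):
    """Annotate fights with each fighter's win streak and finish rate before the bout.

    Features are computed from earlier fights only, so the result of a bout
    never leaks into its own features.
    """
    streak = defaultdict(int)    # current consecutive wins per fighter
    wins = defaultdict(int)      # total wins per fighter so far
    finishes = defaultdict(int)  # wins that did not go to a decision

    annotated = []
    # ISO dates sort correctly as strings; sorted() is stable, so bouts on
    # the same date keep their original order.
    for fight in sorted(fights, key=lambda f: f["event_date"]):
        row = dict(fight)
        for slot in ("fighter1", "fighter2"):
            name = fight[slot]
            row[f"{slot}_win_streak"] = streak[name]
            row[f"{slot}_finish_rate"] = (
                finishes[name] / wins[name] if wins[name] else None
            )
        annotated.append(row)

        # Update the running tallies only after the features are recorded.
        for slot in ("fighter1", "fighter2"):
            name = fight[slot]
            if fight["outcome"] == slot:
                streak[name] += 1
                wins[name] += 1
                # Assumption: any method without "DEC" counts as a finish.
                if "DEC" not in fight["method"]:
                    finishes[name] += 1
            else:
                # Assumption: a loss, a draw, or any other result ends the streak.
                streak[name] = 0
    return annotated


# Made-up bouts, deliberately out of date order, for illustration only.
toy = [
    {"event_date": "2021-06-01", "fighter1": "B", "fighter2": "A",
     "outcome": "fighter1", "method": "U-DEC"},
    {"event_date": "2020-01-01", "fighter1": "A", "fighter2": "B",
     "outcome": "fighter1", "method": "KO/TKO Punches"},
    {"event_date": "2021-01-01", "fighter1": "A", "fighter2": "B",
     "outcome": "Draw", "method": "S-DEC"},
    {"event_date": "2020-06-01", "fighter1": "C", "fighter2": "A",
     "outcome": "fighter2", "method": "U-DEC"},
]

features = add_pre_fight_features(toy)
```

Recording the features before updating the tallies is the key design choice here: a model trained on these columns only sees information that was available before each fight.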
Setup
- Dependency management: Managed with Poetry or pip.
Topics
Sports Data Analysis
UFC
Source
Organization: github
Created: 9/19/2023