TannerGladson/lichess-frames

The dataset consists of chess‑board states and associated move sequences extracted from games downloaded from lichess.org. Each game is parsed into multiple records; each record starts with a FEN string followed by 1‑10 SAN moves. The data are intended for training the ChessRoberta model and have not been filtered, so they may not be optimal for high‑performance chess modelling.

Updated 5/12/2024
hugging_face

Description

Dataset Overview

Dataset Name

TannerGladson/lichess-frames

Source

The dataset’s game sequences are sourced from https://database.lichess.org/.

Content

The dataset contains chess board states and their associated move sequences. PGN files were downloaded from the Lichess database; each game is parsed into multiple records, each beginning with a FEN string followed by 1‑10 SAN moves.

Intended Use

The data are intended for training the ChessRoberta model. Because the games are unfiltered, they may not be suitable for building a high‑performance chess engine.

Structure

Each record includes the following fields:

  • text (Str): String containing the FEN and multiple SAN moves.
  • pgn_start (Int): Zero‑based index of the first SAN within the text string, i.e. the character immediately after the "~" marker.
  • num_sans (Int): Number of SAN half‑moves that follow the FEN in the text string.
  • num_prior_moves (Int): Number of half‑moves played before the position given by the FEN (a full move, one by each side, counts as two half‑moves).
  • game_id (Str): Lichess identifier for the source game, stored as the game URL.

Two special markers are used as delimiters in the text field:

  • PGN_START: "~" (separates the FEN from the first SAN)
  • MOVE_SEP: ">" (separates consecutive SAN moves)
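
As a minimal sketch of how these delimiters can be used, the snippet below splits a text field back into its FEN and its list of SAN moves. The helper name split_record and the constant names are illustrative assumptions, not part of any tooling shipped with the dataset. The approach relies on neither "~" nor ">" occurring in valid FEN or SAN notation.

```python
PGN_START = "~"  # marker between the FEN and the first SAN move
MOVE_SEP = ">"   # marker between consecutive SAN moves


def split_record(text: str) -> tuple[str, list[str]]:
    """Split a `text` field into (fen, sans).

    Hypothetical helper for illustration; not part of the dataset itself.
    """
    # Everything before the first "~" is the FEN; the rest is the move list.
    fen, _, moves = text.partition(PGN_START)
    sans = moves.split(MOVE_SEP) if moves else []
    return fen, sans


fen, sans = split_record(
    "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1~e4>e6>d3>d5>Nd2"
)
print(fen)   # the FEN portion of the record
print(sans)  # the SAN moves as a list of strings
```

Using str.partition rather than str.split keeps the sketch tolerant of a record that has no moves after the marker.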

Example Record

{
  text: "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1~e4>e6>d3>d5>Nd2",
  pgn_start: 57,
  num_sans: 5,
  num_prior_moves: 0,
  game_id: "https://lichess.org/PwE2cWn3"
}
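
The fields of a record can be cross‑checked against each other. The sketch below does this for the example record: it confirms that pgn_start points just past the "~" marker, that num_sans matches the number of moves, and that num_prior_moves agrees with the FEN's side‑to‑move and full‑move counter. The function name check_record is hypothetical, and the last check assumes the game began from the standard starting position with a conventional FEN move counter, which the dataset description does not state explicitly.

```python
record = {
    "text": "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1~e4>e6>d3>d5>Nd2",
    "pgn_start": 57,
    "num_sans": 5,
    "num_prior_moves": 0,
    "game_id": "https://lichess.org/PwE2cWn3",
}


def check_record(rec: dict) -> bool:
    """Return True if the record's fields are mutually consistent.

    Hypothetical helper for illustration; not part of the dataset itself.
    """
    text, start = rec["text"], rec["pgn_start"]

    # The character just before the first SAN should be the "~" marker.
    if start < 1 or start > len(text) or text[start - 1] != "~":
        return False

    # The move list after pgn_start should contain num_sans entries.
    if len(text[start:].split(">")) != rec["num_sans"]:
        return False

    # A FEN has six space-separated fields; the second is the side to move
    # and the sixth is the full-move number.
    fields = text[: start - 1].split(" ")
    if len(fields) != 6:
        return False
    side, fullmove = fields[1], int(fields[5])

    # Assumption: the game started from the standard position, so the number
    # of prior half-moves follows from the full-move counter and side to move.
    prior = 2 * (fullmove - 1) + (1 if side == "b" else 0)
    return prior == rec["num_prior_moves"]


print(check_record(record))
```

A check along these lines can serve as a sanity filter when loading records, though any mismatch may simply mean the assumption about the starting position does not hold for that game.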

Topics

Chess
Machine Learning

Source

Organization: hugging_face

Created: Unknown
