Dataset asset: Open Source Community · Large Language Models · Process Mining

ProcessTBench

ProcessTBench is a synthetic dataset for evaluating the planning capabilities of large language models (LLMs) within a process mining framework. Built upon TaskBench, it contains 532 base queries, each paraphrased 5–6 times, with an average of 4.08 solution plans per query. The dataset involves action sequences using 40 distinct tools and provides corresponding ground‑truth plans in Petri‑net format. Creation involved selecting the most challenging subset from TaskBench, generating plans with LLMs, and processing them using an event‑log parser and a plan‑conformance checker. ProcessTBench aims to support research on LLM plan generation in complex and dynamic environments, especially regarding multilingual and paraphrased queries.
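The per-query structure described above (a base query, several paraphrases, and multiple solution plans over a fixed tool set) can be sketched as follows. The field names here are illustrative only, not the dataset's actual schema:

```python
# Hypothetical record shape for one ProcessTBench query; the field names
# are assumptions for illustration, not the dataset's real schema.
query = {
    "process_id": "q001",
    "query": "Extract the audio from the video and transcribe it.",
    "paraphrases": [
        "Pull the audio track out of the video, then transcribe it.",
        "Transcribe the audio contained in this video file.",
    ],
    "plans": [  # each plan is a tool-call sequence drawn from the tool set
        ["video_to_audio", "speech_to_text"],
        ["extract_audio", "transcribe_audio"],
    ],
}

def avg_plans_per_query(queries):
    """Mean number of solution plans per query (4.08 over the full dataset)."""
    return sum(len(q["plans"]) for q in queries) / len(queries)

print(avg_plans_per_query([query]))  # 2.0 for this single toy record
```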

Source
arXiv
Created
Sep 14, 2024
Updated
Sep 20, 2024
ProcessTBench Dataset Overview

Dataset Content

Generated Plans and Variants

  • Description: Multiple candidate plans addressing the objective of each process ID.
  • Generation Method: Produced using generate_plans_and_variants.py.

Paraphrased Queries

Process Models

  • Description: Process models of the generated plans.
  • Generation Method: Created with the Inductive Miner at thresholds 0, 0.1, and 0.2.
  • Additional Information: The reference DAG of each query is converted to a Petri net; example generation code is in dag_to_petri_net_results.py.
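The DAG-to-Petri-net step can be illustrated with a minimal sketch (the dataset's own conversion lives in dag_to_petri_net_results.py; this toy version only shows the standard idea of mapping tasks to transitions and precedence edges to places):

```python
# Toy DAG -> Petri net conversion. Each task becomes a transition, each
# precedence edge becomes a place between two transitions, and DAG
# sources/sinks receive dedicated start/end places. Illustrative only.
def dag_to_petri_net(nodes, edges):
    transitions = set(nodes)
    places, arcs = set(), set()
    for u, v in edges:
        p = f"p_{u}_{v}"          # one place per precedence edge
        places.add(p)
        arcs.add((u, p))          # transition u produces a token in p
        arcs.add((p, v))          # transition v consumes the token from p
    targets = {v for _, v in edges}
    sources = {u for u, _ in edges}
    for n in nodes:
        if n not in targets:      # DAG source: give it an input place
            places.add(f"start_{n}")
            arcs.add((f"start_{n}", n))
        if n not in sources:      # DAG sink: give it an output place
            places.add(f"end_{n}")
            arcs.add((n, f"end_{n}"))
    return transitions, places, arcs

# "A" fans out to "B" and "C" (A must happen before both).
t, p, a = dag_to_petri_net(["A", "B", "C"], [("A", "B"), ("A", "C")])
```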

Conformance Quality

  • Description: Results of conformance checks evaluating the quality of paraphrased queries.
  • Generation Method: Produced using generate_plans_conformance_quality_rephrased.py and generate_plans_conformance_quality_original.py.
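The conformance-quality idea can be sketched with a simplified check: a generated plan "fits" the reference DAG if every precedence edge is respected in the trace. Real conformance checking (e.g. alignment- or token-replay-based, as in pm4py) is considerably richer; this is only an illustrative approximation:

```python
# Simplified conformance check against a reference DAG: a trace conforms
# if, for every edge (u, v), step u occurs before step v. This is a toy
# stand-in for full conformance checking, not the dataset's actual script.
def plan_conforms(trace, edges):
    pos = {step: i for i, step in enumerate(trace)}
    return all(u in pos and v in pos and pos[u] < pos[v] for u, v in edges)

edges = [("A", "B"), ("A", "C")]
print(plan_conforms(["A", "B", "C"], edges))  # True: both edges respected
print(plan_conforms(["B", "A", "C"], edges))  # False: B runs before A
```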

TaskBench Data

  • Files:
    • taskbench_multimedia.json
    • taskbench_multimedia_dag.json
    • taskbench_multimedia_dag_partitioned.json (multi‑process partitioning)
    • tool_desc_multimedia.json

Additional Files

  • Tools and Models:
    • utils.py
    • my_model.py (embedding and LLM model configuration)
    • readme.md
    • requirements.txt