MyFixit
MyFixit is a repair‑manual dataset collected from the iFixit website, containing 31,601 manuals covering 15 device categories. Each step in the Mac Laptop category is annotated with required tools, disassembled parts, and removal verbs (totaling 1,497 manuals with 36,659 steps). Other categories are not yet manually annotated.
Example Entry
Below is an example of an annotated step from the dataset:
{
  "Title": "MacBook Unibody Model A1278 Hard Drive Replacement",
  "Ancestors": ["MacBook", "Mac Laptop", "Mac", "Root"],
  "Guidid": 816,
  "Category": "MacBook Unibody Model A1278",
  "Subject": "Hard Drive",
  "Toolbox": [
    {
      "Name": ["phillips 00 screwdriver"],
      "Url": "https://www.ifixit.com/Store/Parts/Phillips-00-Screwdriver/IF145-006",
      "Thumbnail": "https://da2lh5cs8ikqj.cloudfront.net/cart-products/rLfPqcRxAVqNxfwc.mini"
    },
    {
      "Name": ["spudger"],
      "Url": "http://www.ifixit.com/Tools/Spudger/IF145-002",
      "Thumbnail": "https://da2lh5cs8ikqj.cloudfront.net/cart-products/fIQ3oZSjd1yLgqpX.mini"
    },
    {
      "Name": ["t6 torx screwdriver"],
      "Url": "https://www.ifixit.com/Store/Tools/TR6-Torx-Security-Screwdriver/IF145-225",
      "Thumbnail": ""
    }
  ],
  "Url": "https://www.ifixit.com/Guide/MacBook+Unibody+Model+A1278+Hard+Drive+Replacement/816",
  "Steps": [
    {
      "Order": 1,
      "Tools_annotated": ["NA"],
      "Tools_extracted": ["NA"],
      "Word_level_parts_raw": [{"name": "battery", "span": [19, 19]}],
      "Word_level_parts_clean": ["battery"],
      "Removal_verbs": [{"name": "pull out", "span": [17, 17], "part_index": [0]}],
      "Lines": [
        {"Text": "be sure the access door release latch is vertical before proceeding."},
        {"Text": "grab the white plastic tab and pull the battery up and out of the unibody."}
      ],
      "Text_raw": "Be sure the access door release latch is vertical before proceeding. Grab the white plastic tab and pull the battery up and out of the Unibody.",
      "Images": ["https://d3nevzfk7ii3be.cloudfront.net/igi/WkwQip2DfR1iJLMX.standard"],
      "StepId": 4122
    },
    ...
  ]
}
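The annotation fields in the example entry suggest that each removal verb points at the parts it acts on through "part_index". The sketch below pairs verbs with parts under that reading. It assumes "part_index" holds positions in the step's "Word_level_parts_raw" list, which the example is consistent with but which is not stated explicitly here. The function name verb_part_pairs is hypothetical and not part of the dataset tooling.

```python
def verb_part_pairs(step):
    """Pair each removal verb with the parts it refers to.

    Assumption: "part_index" entries are positions in the step's
    "Word_level_parts_raw" list.
    """
    parts = step.get("Word_level_parts_raw", [])
    pairs = []
    for verb in step.get("Removal_verbs", []):
        for idx in verb.get("part_index", []):
            # Skip indices that do not point at an annotated part.
            if 0 <= idx < len(parts):
                pairs.append((verb["name"], parts[idx]["name"]))
    return pairs


# The annotated fields of the step from the example entry, trimmed.
step = {
    "Order": 1,
    "Word_level_parts_raw": [{"name": "battery", "span": [19, 19]}],
    "Removal_verbs": [{"name": "pull out", "span": [17, 17], "part_index": [0]}],
}

pairs = verb_part_pairs(step)
print(pairs)
```

For the example step this yields a single pair linking the verb "pull out" to the part "battery".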
Statistics
The dataset reports, for each category (e.g., Mac, Car and Truck, Household), the number of manuals and the number of steps with unique text. Each category has its own JSON file containing all manuals collected for it, each with its steps and tools. Manual disassembly instructions are not included.
Data Format
JSON files store one JSON object per line.
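Because each line is a separate JSON object, a file can be read line by line instead of being parsed as one document. The following is a minimal sketch, assuming each object is one manual with a "Title" field as in the example entry. The helper names parse_json_lines and load_manuals are hypothetical, and a path such as jsons/Mac.json is only a guess at the layout.

```python
import io
import json


def parse_json_lines(lines):
    """Yield one parsed object per non-empty line."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


def load_manuals(path):
    """Read every manual from a category file (path is an assumption)."""
    with open(path, encoding="utf-8") as handle:
        return list(parse_json_lines(handle))


# In-memory stand-in for a category file, so the sketch runs without the data.
sample = io.StringIO(
    '{"Title": "Guide A", "Steps": []}\n'
    "\n"
    '{"Title": "Guide B", "Steps": []}\n'
)
titles = [manual["Title"] for manual in parse_json_lines(sample)]
print(titles)
```

With the real data, load_manuals would be called with the path of a category file in place of the in-memory sample.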
Search Script
A simple search.py script is provided to locate relevant manuals and export them as XML or JSON. Parameters include:
-device: optional device name
-input: required filename in the jsons/ directory
-part: optional device part
-format: output format (XML or JSON, default JSON)
-output: required output filename
-mintools: minimum number of tools per manual
-minsteps: minimum number of steps per manual
-verbose: print selected manual titles
-annotatedtool: select manuals with tool annotations only
-annotatedpart: select manuals with part annotations only
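To illustrate what the -mintools and -minsteps thresholds express, the sketch below applies equivalent checks to manuals already loaded as Python dictionaries. It is not the code of search.py. It assumes the tool count is the length of "Toolbox" and the step count is the length of "Steps", and the function name meets_thresholds is hypothetical.

```python
def meets_thresholds(manual, min_tools=0, min_steps=0):
    """Return True if the manual lists enough tools and steps.

    Assumption: tools are counted from "Toolbox" and steps from "Steps".
    """
    n_tools = len(manual.get("Toolbox", []))
    n_steps = len(manual.get("Steps", []))
    return n_tools >= min_tools and n_steps >= min_steps


# Two made-up manuals in the shape of the example entry.
manuals = [
    {
        "Title": "Manual A",
        "Toolbox": [{"Name": ["spudger"]}],
        "Steps": [{"Order": 1}],
    },
    {
        "Title": "Manual B",
        "Toolbox": [{"Name": ["spudger"]}, {"Name": ["t6 torx screwdriver"]}],
        "Steps": [{"Order": 1}, {"Order": 2}],
    },
]

selected = [m["Title"] for m in manuals if meets_thresholds(m, min_tools=2, min_steps=2)]
print(selected)
```

Only the second manual passes here, since the first lists a single tool and a single step.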
Example Command
python search.py -input Mac.json -output tmp -device macbook pro -part battery -mintools 2 -minsteps 15 -format xml -verbose -annotatedtool -annotatedpart
Sample Output
Total number of matched manuals :29
Title of manuals:
MacBook Pro 17" Models A1151 A1212 A1229 and A1261 Battery Connector Replacement
... (list truncated) ...
Selected manuals are saved in tmp.xml
Citation
If you find this dataset useful, please cite:
@InProceedings{nabizadeh-kolossa-heckmann:2020:LREC,
  author    = {Nabizadeh, Nima and Kolossa, Dorothea and Heckmann, Martin},
  title     = {MyFixit: An Annotated Dataset, Annotation Tool, and Baseline Methods for Information Extraction from Repair Manuals},
  booktitle = {Proceedings of The 12th Language Resources and Evaluation Conference},
  month     = {May},
  year      = {2020},
  address   = {Marseille, France},
  publisher = {European Language Resources Association},
  pages     = {2120--2128}
}
Source
Organization: github
Created: 12/2/2019