DateLogicQA, created by the University of Aberdeen, is a benchmark dataset for evaluating the temporal reasoning abilities of large language models (LLMs). It contains 190 questions that vary in date format, temporal context, and reasoning type. The questions test whether a model can interpret and reason about dates in past, present, and future contexts, with particular attention to handling diverse date formats while preserving their semantic meaning. Researchers can use the dataset to analyze LLM performance on temporal reasoning tasks and to identify biases in how models handle temporal data. Potential applications include event planning, historical question answering, and other scenarios that require precise temporal inference.
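To make the idea of "diverse date formats with the same semantic meaning" concrete, here is a minimal sketch that renders one calendar date in several textual formats and checks that each rendering parses back to the same date. The format list and the helper names `render_variants` and `all_round_trip` are assumptions chosen for illustration; they are not taken from DateLogicQA and do not reflect its actual schema or format set.

```python
from datetime import date, datetime

# Hypothetical format list for illustration only; not DateLogicQA's own formats.
FORMATS = ["%Y-%m-%d", "%d/%m/%Y", "%m/%d/%Y", "%d %B %Y", "%B %d, %Y"]


def render_variants(d: date) -> dict[str, str]:
    """Render a single date in every format listed in FORMATS."""
    return {fmt: d.strftime(fmt) for fmt in FORMATS}


def all_round_trip(d: date) -> bool:
    """Return True if every rendering parses back to the original date
    when read with the same format string that produced it."""
    return all(
        datetime.strptime(text, fmt).date() == d
        for fmt, text in render_variants(d).items()
    )


# One date, several surface forms.
variants = render_variants(date(2024, 3, 5))
for fmt, text in variants.items():
    print(f"{fmt:>10}  {text}")
```

In this sketch the day-first and month-first renderings of the same date look like two different dates unless the format is known, which is the kind of ambiguity a temporal reasoning benchmark might probe.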