netop/TeleQnA

TeleQnA is a comprehensive dataset designed to evaluate large language models' knowledge in the telecommunications domain. It comprises 10,000 multiple‑choice questions divided into five categories: Lexicon (500), Research Overview (2,000), Research Publications (4,500), Standards Overview (1,000) and Standards Specifications (2,000). Each question is represented in JSON format with five fields: question, options, answer, explanation, and category. Experimental code is provided to assess the performance of OpenAI models such as GPT‑3.5.

Updated 1/30/2024

Description

TeleQnA Dataset Overview

Dataset Introduction

TeleQnA is a comprehensive dataset intended to evaluate large language models (LLMs) in the telecommunications domain. The dataset contains 10,000 multiple‑choice questions distributed across five distinct categories:

Lexicon (Vocabulary): Contains 500 questions covering general telecom terminology and definitions.
Research Overview: Contains 2,000 questions giving a broad overview of telecom research across a wide range of topics.
Research Publications: Contains 4,500 questions concerning multidisciplinary research in telecom, referencing various sources such as journals and conference proceedings.
Standards Overview: Contains 1,000 questions covering summaries of standards from multiple standardization bodies (e.g., 3GPP, IEEE).
Standards Specifications: Contains 2,000 questions exploring technical specifications and implementations of telecom systems, referencing information from standardization bodies (e.g., 3GPP, IEEE).
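
As a quick sanity check on the breakdown above, the following sketch adds up the per-category counts and reports each category's share of the dataset. The names CATEGORY_COUNTS and category_shares are illustrative and are not part of the TeleQnA code.

```python
# Question counts per category, copied from the list above.
CATEGORY_COUNTS = {
    "Lexicon": 500,
    "Research Overview": 2000,
    "Research Publications": 4500,
    "Standards Overview": 1000,
    "Standards Specifications": 2000,
}


def category_shares(counts):
    """Return each category's fraction of the total question count."""
    total = sum(counts.values())
    return {name: n / total for name, n in counts.items()}


TOTAL = sum(CATEGORY_COUNTS.values())
SHARES = category_shares(CATEGORY_COUNTS)

print(f"Total questions: {TOTAL}")
for name, share in SHARES.items():
    print(f"{name}: {share:.1%}")
```

The counts sum to the stated 10,000, with Research Publications making up 45% of the questions, so an overall accuracy figure is weighted heavily toward that category.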

Dataset Format

Each question is represented in JSON format with five distinct fields:

Question: A string presenting a question related to a specific concept in telecom.
Options: A set of strings representing the answer choices, each stored under its own numbered key ("option 1", "option 2", and so on).
Answer: A string in the format 'option ID: Answer' indicating the correct answer. Exactly one option is correct, although that option may itself be a combined choice such as 'All of the above' or 'Option 1 and 2'.
Explanation: A string explaining the rationale for the correct answer.
Category: A label identifying the source category (e.g., lexicon or research overview).
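
The 'option ID: Answer' format of the Answer field can be split into its two parts with a short helper. This is a minimal sketch, assuming the ID is always an integer following the word "option"; the function name parse_answer and the regular expression are illustrative and not taken from the TeleQnA code.

```python
import re

# Assumed pattern: the word "option", an integer ID, a colon, then the answer text.
_ANSWER_RE = re.compile(r"^option\s+(\d+)\s*:\s*(.*)$", re.IGNORECASE | re.DOTALL)


def parse_answer(answer):
    """Split an 'option ID: Answer' string into (option_id, answer_text)."""
    match = _ANSWER_RE.match(answer.strip())
    if match is None:
        raise ValueError(f"Unexpected answer format: {answer!r}")
    return int(match.group(1)), match.group(2).strip()


option_id, text = parse_answer("option 3: min(nt, nr)")
print(option_id, text)
```

Raising an error on an unexpected format, rather than returning a default, makes malformed records visible early when iterating over the full dataset.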

Dataset Example

{
    "question": "What is the maximum number of eigenmodes that the MIMO channel can support? (nt is the number of transmit antennas, nr is the number of receive antennas)",
    "option 1": "nt",
    "option 2": "nr",
    "option 3": "min(nt, nr)",
    "option 4": "max(nt, nr)",
    "answer": "option 3: min(nt, nr)",
    "explanation": "The maximum number of eigenmodes that the MIMO channel can support is min(nt, nr).",
    "category": "Research publications"
}
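
A record like the one above can be loaded with the standard json module, its options collected, and a model's chosen option checked against the answer. This is a sketch under the assumption that every option key has the form "option N"; the helper names get_options and is_correct are hypothetical and do not come from the experimental code shipped with the dataset.

```python
import json

# The example record shown above, as a JSON string.
RECORD_JSON = """
{
    "question": "What is the maximum number of eigenmodes that the MIMO channel can support? (nt is the number of transmit antennas, nr is the number of receive antennas)",
    "option 1": "nt",
    "option 2": "nr",
    "option 3": "min(nt, nr)",
    "option 4": "max(nt, nr)",
    "answer": "option 3: min(nt, nr)",
    "explanation": "The maximum number of eigenmodes that the MIMO channel can support is min(nt, nr).",
    "category": "Research publications"
}
"""


def get_options(record):
    """Collect the 'option N' keys into a dict {N: text}, ordered by N."""
    options = {}
    for key, value in record.items():
        if key.startswith("option "):
            options[int(key.split()[1])] = value
    return dict(sorted(options.items()))


def is_correct(record, predicted_id):
    """Compare a predicted option ID with the ID at the start of the answer field."""
    correct_id = int(record["answer"].split(":", 1)[0].split()[1])
    return predicted_id == correct_id


RECORD = json.loads(RECORD_JSON)
OPTIONS = get_options(RECORD)

for option_id, text in OPTIONS.items():
    print(f"option {option_id}: {text}")
print("Prediction 3 correct?", is_correct(RECORD, 3))
```

Scoring by option ID rather than by answer text avoids false mismatches when a model paraphrases the wording of the chosen option.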

Topics

Telecommunications

Knowledge Assessment

Source

Organization: hugging_face

Created: Unknown
