NETEMVocabulary

The 2024 National Master's Graduate Entrance Exam English (I) Syllabus Vocabulary List contains 5,530 required words. This dataset ranks those words by how often they occur in approximately 200 test papers from CET‑4/6, graduate English, and specialized English exams, counting occurrences by lemma. The top 2,444 words each appear more than 40 times, i.e., at least once every five test papers on average, and are considered true high‑frequency words. Definitions were manually cross‑checked and alternative spellings were listed to ensure data accuracy.

Updated 5/20/2024

Description

Graduate Entrance Exam Vocabulary Frequency Ranking Dataset Overview

Dataset Description

  • Vocabulary Source: The 2024 National Master's Graduate Entrance Exam English (I) Syllabus Vocabulary List, containing 5,530 entries.
  • Frequency Statistics: Words are ranked by how often they occur in the texts of approximately 200 test papers from CET‑4/6, graduate English exams, and specialized English exams.
  • Ranking Method: Uses a lemmatization strategy, so inflected forms are counted under their base form. Counts may therefore differ slightly from the word forms as they actually appear in the exams.
  • High‑Frequency Vocabulary: The top 2,444 words each appear more than 40 times, which averages to at least one occurrence every five test papers.
  • Data Accuracy: Definitions have undergone preliminary manual verification to ensure correctness. Alternate spellings are included for each word.
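
The ranking method described above can be illustrated with a minimal Python sketch. It is not the dataset author's actual pipeline: the lemma table `LEMMA_MAP` is a tiny hand-written stand-in for a real lemmatizer, and the function names `rank_by_frequency` and `high_frequency` are hypothetical, chosen only for this example.

```python
import re
from collections import Counter

# Hypothetical, hand-written lemma table (assumption for illustration only);
# a real pipeline would rely on a full lemmatizer instead.
LEMMA_MAP = {
    "studies": "study",
    "studied": "study",
    "analyses": "analysis",
    "went": "go",
    "goes": "go",
}


def lemmatize(token, lemma_map=LEMMA_MAP):
    """Map an inflected token to its base form, or return it unchanged."""
    return lemma_map.get(token, token)


def rank_by_frequency(papers, vocabulary, lemma_map=LEMMA_MAP):
    """Count lemma occurrences across all papers, restricted to the vocabulary.

    Returns (word, count) pairs sorted by descending count, then alphabetically.
    Words that never occur are kept with a count of 0.
    """
    counts = Counter()
    for text in papers:
        for token in re.findall(r"[a-z]+", text.lower()):
            lemma = lemmatize(token, lemma_map)
            if lemma in vocabulary:
                counts[lemma] += 1
    return sorted(((w, counts[w]) for w in vocabulary), key=lambda p: (-p[1], p[0]))


def high_frequency(ranked, threshold):
    """Keep words whose count is strictly greater than the threshold."""
    return [word for word, count in ranked if count > threshold]
```

With the real corpus, `threshold=40` would correspond to the cutoff quoted above: 40 occurrences across roughly 200 papers is one occurrence per five papers.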

Data Storage

  • Data Files: netem_full_list.json stores all data; the same data has also been converted into netem_full_list.sql.
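
As a rough sketch of how the JSON file might be consumed, the Python snippet below loads it and keeps the words above a frequency cutoff. The file's schema is not documented here, so this assumes a top-level array of objects; the key names `word` and `count` are hypothetical placeholders and should be replaced with whatever the actual file uses.

```python
import json


def load_entries(path):
    """Load the vocabulary file, assuming a top-level JSON array (assumption)."""
    with open(path, encoding="utf-8") as fh:
        data = json.load(fh)
    if not isinstance(data, list):
        raise ValueError("expected a top-level JSON array")
    return data


def filter_high_frequency(entries, min_count=40, word_key="word", count_key="count"):
    """Return words whose count is strictly greater than min_count.

    word_key and count_key are hypothetical field names; adjust them to
    match the real file. Entries without a count are treated as 0.
    """
    return [e[word_key] for e in entries if e.get(count_key, 0) > min_count]
```

For example, `filter_high_frequency(load_entries("netem_full_list.json"))` would apply the 40-occurrence cutoff, provided the key names match.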

Topics

English Exam
Vocabulary Analysis

Source

Organization: github

Created: 10/3/2022
