Dataset asset · Open Source Community · Social Media Analysis · AIGC

Xiaohongshu AIGC Comments and Posts Dataset

The dataset is collected from the Xiaohongshu platform and focuses on user-generated content about AI-generated content (AIGC). It spans eight themes: advertising, automotive, fashion, food, literature, printing, sports, and technology. The data consist of user comments and posts, with fields such as user ID, content, timestamp, like count, and a sentiment label, which supports analysis of public opinion toward AIGC.

Source: GitHub
Created: Nov 1, 2024
Updated: Nov 1, 2024
Signals: 455 views
Availability: Linked source ready

Dataset Overview

This dataset was collected from the Xiaohongshu platform and centers on user-generated content about AI-generated content (AIGC). It covers advertising, automotive, fashion, food, literature, printing, sports, and technology. It contains user comments and posts, with fields such as user ID, content, timestamp, like count, and a sentiment label, so that public attitudes toward AIGC can be analyzed per theme.

Data Structure

The dataset is organized as follows:

  • Data Directory: The dataset is divided into multiple theme folders (e.g., ai-Advertisement, ai-technology), each containing comments and posts for that theme.
  • File Structure:
    • Comments-<Theme>.csv: User comments for the specific theme.
    • Post-<Theme>.csv: Posts for the specific theme.
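The folder layout described above can be walked with a few lines of standard-library Python. This is a minimal sketch, not part of the dataset's own tooling: the function name `find_theme_files` and the root path passed to it are hypothetical, and it assumes only the `Comments-<Theme>.csv` and `Post-<Theme>.csv` naming shown in the list.

```python
from pathlib import Path


def find_theme_files(root):
    """Map each theme folder name to its comment and post CSV files.

    Assumes the layout described in this section: one folder per theme,
    each holding Comments-<Theme>.csv and Post-<Theme>.csv files.
    """
    root = Path(root)
    themes = {}
    # Only directories are treated as theme folders; loose files are skipped.
    for folder in sorted(p for p in root.iterdir() if p.is_dir()):
        themes[folder.name] = {
            "comments": sorted(folder.glob("Comments-*.csv")),
            "posts": sorted(folder.glob("Post-*.csv")),
        }
    return themes
```

Calling `find_theme_files("path/to/dataset")` (with the path replaced by wherever the repository was cloned) returns a dictionary keyed by folder name, such as `ai-technology`, whose values list the matching files.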

Example Data Structure

In the ai-technology folder, the file Comments-technological development.csv includes the following fields:

| Field Name | Description |
| --- | --- |
| comment_id | Unique identifier for the comment |
| create_time | Creation timestamp |
| ip_location | User IP location |
| note_id | Post ID associated with the comment |
| content | Comment text |
| user_id | Unique identifier for the user |
| nickname | User nickname |
| avatar | Link to user avatar |
| sub_comment_count | Number of sub-comments |
| parent_comment_id | Parent comment ID |
| last_modify_ts | Last modification timestamp |
| like_count | Number of likes |
| sentiment | Sentiment of the comment (e.g., positive, negative) |
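The `parent_comment_id` and `sub_comment_count` fields suggest that comments form reply threads. The sketch below shows one way to split rows into top-level comments and replies. It rests on an assumption the field list does not state: that a `parent_comment_id` of `0` (or an empty value) marks a top-level comment. The helper name `group_replies` is hypothetical.

```python
from collections import defaultdict


def group_replies(rows):
    """Split comment rows into top-level comments and replies keyed by parent.

    `rows` is an iterable of dicts with a `parent_comment_id` key,
    for example the rows produced by csv.DictReader.
    """
    top_level = []
    replies = defaultdict(list)
    for row in rows:
        parent = str(row.get("parent_comment_id", "")).strip()
        # Assumption: "0" or an empty value means the comment has no parent.
        if parent in ("", "0"):
            top_level.append(row)
        else:
            replies[parent].append(row)
    return top_level, dict(replies)
```

Replies can then be looked up by the `comment_id` of their parent, which makes it possible to compare the sentiment of a comment with that of the replies it attracted.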

Example Data

```csv
comment_id,create_time,ip_location,note_id,content,user_id,nickname,avatar,sub_comment_count,parent_comment_id,last_modify_ts,like_count,sentiment
658e7ddd000000001a00e241,1703837149000,,658e7d1d0000000012004a26,"Six fingers aren’t obvious enough?",608af36300000000010063ee,momo,"https://sns-avatar-qc.xhscdn.com/avatar/1040g2...",303,0,1728458720283,28k,positive
658ef186000000001702da48,1703866758000,,658e7d1d0000000012004a26,"With this body type, there would be no collarbones sitting like that",58de279582ec3932ec4c73b5,"Momo in Renovation","https://sns-avatar-qc.xhscdn.com/avatar/58de27...",1059,0,1728458720285,15k,positive
```
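The example rows hint at two cleanup steps worth doing before analysis. The sketch below parses the first sample row with the standard `csv` module and normalizes it, under two assumptions the dataset description does not confirm: that `create_time` is a Unix timestamp in milliseconds, and that a `like_count` such as `28k` abbreviates thousands. The helper names are hypothetical, and the truncated avatar URL is kept exactly as it appears in the sample.

```python
import csv
import io
from datetime import datetime, timezone

# First sample row from this page, copied as-is (avatar URL is truncated in the source).
SAMPLE = '''comment_id,create_time,ip_location,note_id,content,user_id,nickname,avatar,sub_comment_count,parent_comment_id,last_modify_ts,like_count,sentiment
658e7ddd000000001a00e241,1703837149000,,658e7d1d0000000012004a26,"Six fingers aren’t obvious enough?",608af36300000000010063ee,momo,"https://sns-avatar-qc.xhscdn.com/avatar/1040g2...",303,0,1728458720283,28k,positive
'''


def parse_count(text):
    """Convert '303' or '28k' to an int (assumes 'k' means thousands)."""
    text = text.strip().lower()
    if text.endswith("k"):
        return int(round(float(text[:-1]) * 1000))
    return int(text)


def ms_to_utc(ms):
    """Convert epoch milliseconds to a UTC datetime (assumed unit)."""
    return datetime.fromtimestamp(int(ms) / 1000, tz=timezone.utc)


def load_comments(handle):
    """Read comment rows from a file-like object and normalize key fields."""
    rows = []
    for row in csv.DictReader(handle):
        row["create_time"] = ms_to_utc(row["create_time"])
        row["like_count"] = parse_count(row["like_count"])
        row["sub_comment_count"] = int(row["sub_comment_count"])
        rows.append(row)
    return rows


rows = load_comments(io.StringIO(SAMPLE))
print(rows[0]["create_time"], rows[0]["like_count"], rows[0]["sentiment"])
```

Under these assumptions the first row resolves to 2023-12-29 08:05:49 UTC with 28000 likes. For a real file, pass `open(path, newline="", encoding="utf-8")` to `load_comments` instead of the in-memory sample; the encoding is a guess and may need adjusting.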
