Xiaohongshu AIGC Comments and Posts Dataset

The dataset is collected from the Xiaohongshu platform and focuses on user-generated content about AI-generated content (AIGC). It spans eight categories: advertising, automotive, fashion, food, literature, printing, sports, and technology. The data include user comments and posts with fields for user ID, content, timestamp, like count, and a sentiment label, enabling analysis of public opinion toward AIGC.

Updated 11/1/2024
Description

Dataset Overview

This dataset is collected from the Xiaohongshu platform and centers on user-generated content about AI-generated content (AIGC). It covers advertising, automotive, fashion, food, literature, printing, sports, and technology. The dataset contains user comments and posts, with fields for user ID, content, timestamp, like count, and a sentiment label, which supports analysis of public attitudes toward AIGC.

Data Structure

The dataset is organized as follows:

  • Data Directory: The dataset is divided into multiple theme folders (e.g., ai-Advertisement, ai-technology), each containing comments and posts for that theme.
  • File Structure:
    • Comments-<Theme>.csv: User comments for the specific theme.
    • Post-<Theme>.csv: Posts for the specific theme.
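
The folder layout above can be walked programmatically. The following is a minimal sketch, assuming the dataset has been downloaded to a local directory whose subfolders are the theme folders. The function name `index_theme_files` is hypothetical, and the `Post-technological development.csv` file name in the demo is an assumption that follows the `Post-<Theme>.csv` pattern (the source names only the comments file).

```python
from pathlib import Path
import tempfile


def index_theme_files(root):
    """Map each theme folder name to its Comments-*.csv and Post-*.csv files."""
    index = {}
    for folder in sorted(p for p in Path(root).iterdir() if p.is_dir()):
        index[folder.name] = {
            "comments": sorted(f.name for f in folder.glob("Comments-*.csv")),
            "posts": sorted(f.name for f in folder.glob("Post-*.csv")),
        }
    return index


# Demo on a throwaway directory that mimics the described layout.
with tempfile.TemporaryDirectory() as tmp:
    theme = Path(tmp) / "ai-technology"
    theme.mkdir()
    (theme / "Comments-technological development.csv").touch()
    (theme / "Post-technological development.csv").touch()  # assumed file name
    demo_index = index_theme_files(tmp)

print(demo_index)
```

Against a real copy of the dataset, `index_theme_files` would be called with the path to the directory that holds the theme folders.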

Example Data Structure

In the ai-technology folder, the file Comments-technological development.csv includes the following fields:

Field Name          Description
comment_id          Unique identifier for the comment
create_time         Creation timestamp
ip_location         User IP location
note_id             Post ID associated with the comment
content             Comment text
user_id             Unique identifier for the user
nickname            User nickname
avatar              Link to user avatar
sub_comment_count   Number of sub-comments
parent_comment_id   Parent comment ID
last_modify_ts      Last modification timestamp
like_count          Number of likes
sentiment           Sentiment of the comment (e.g., positive, negative)
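
Because each comment carries both a `note_id` and a `sentiment` label, one simple analysis is to tally sentiment per post. The sketch below is illustrative only: the function name `sentiment_by_note` is hypothetical, and the rows are made-up placeholders rather than records from the dataset.

```python
from collections import Counter, defaultdict


def sentiment_by_note(rows):
    """Count sentiment labels per note_id."""
    tallies = defaultdict(Counter)
    for row in rows:
        tallies[row["note_id"]][row["sentiment"]] += 1
    return {note: dict(counts) for note, counts in tallies.items()}


# Hypothetical rows (not taken from the dataset), carrying only the fields used here.
rows = [
    {"note_id": "note-a", "sentiment": "positive"},
    {"note_id": "note-a", "sentiment": "negative"},
    {"note_id": "note-a", "sentiment": "positive"},
    {"note_id": "note-b", "sentiment": "negative"},
]

summary = sentiment_by_note(rows)
print(summary)
```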

Example Data

comment_id,create_time,ip_location,note_id,content,user_id,nickname,avatar,sub_comment_count,parent_comment_id,last_modify_ts,like_count,sentiment
658e7ddd000000001a00e241,1703837149000,,658e7d1d0000000012004a26,"Six fingers aren’t obvious enough?",608af36300000000010063ee,momo,"https://sns-avatar-qc.xhscdn.com/avatar/1040g2...",303,0,1728458720283,28k,positive
658ef186000000001702da48,1703866758000,,658e7d1d0000000012004a26,"With this body type, there would be no collarbones sitting like that",58de279582ec3932ec4c73b5,"Momo in Renovation","https://sns-avatar-qc.xhscdn.com/avatar/58de27...",1059,0,1728458720285,15k,positive
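
The sample rows hint at two details worth handling when loading the files: `create_time` looks like a Unix timestamp in milliseconds, and `like_count` can be an abbreviated string such as `28k`. The following is a minimal sketch using only the Python standard library. Both interpretations are assumptions drawn from the sample rather than documented behavior, and the helpers `parse_count` and `parse_rows` are hypothetical names.

```python
import csv
import io
from datetime import datetime, timezone

# The two sample rows from this page, copied as-is (avatar URLs are truncated in the source).
SAMPLE = """\
comment_id,create_time,ip_location,note_id,content,user_id,nickname,avatar,sub_comment_count,parent_comment_id,last_modify_ts,like_count,sentiment
658e7ddd000000001a00e241,1703837149000,,658e7d1d0000000012004a26,"Six fingers aren’t obvious enough?",608af36300000000010063ee,momo,"https://sns-avatar-qc.xhscdn.com/avatar/1040g2...",303,0,1728458720283,28k,positive
658ef186000000001702da48,1703866758000,,658e7d1d0000000012004a26,"With this body type, there would be no collarbones sitting like that",58de279582ec3932ec4c73b5,"Momo in Renovation","https://sns-avatar-qc.xhscdn.com/avatar/58de27...",1059,0,1728458720285,15k,positive
"""


def parse_count(text):
    """Convert '303' or '28k' to an int (assumes a trailing 'k' means thousands)."""
    text = text.strip().lower()
    if text.endswith("k"):
        return int(round(float(text[:-1]) * 1000))
    return int(text)


def parse_rows(text):
    """Read CSV text and normalize the timestamp and count columns."""
    parsed = []
    for raw in csv.DictReader(io.StringIO(text)):
        parsed.append({
            "comment_id": raw["comment_id"],
            "note_id": raw["note_id"],
            "content": raw["content"],
            # Assumption: create_time is milliseconds since the Unix epoch.
            "created": datetime.fromtimestamp(int(raw["create_time"]) / 1000, tz=timezone.utc),
            "sub_comment_count": parse_count(raw["sub_comment_count"]),
            "like_count": parse_count(raw["like_count"]),
            "sentiment": raw["sentiment"],
        })
    return parsed


rows = parse_rows(SAMPLE)
for row in rows:
    print(row["comment_id"], row["created"].isoformat(), row["like_count"], row["sentiment"])
```

Using `csv.DictReader` rather than splitting on commas matters here because the quoted `content` field can itself contain commas, as the second sample row shows.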

Topics

AIGC
Social Media Analysis

Source

Organization: github

Created: 11/1/2024