Dataset asset, Open Source Community, Sentiment Analysis, Hotel Review Analysis
jniimi/tripadvisor-review-rating
This dataset contains hotel reviews and ratings collected from TripAdvisor. After processing, only the review text and multiple aspect scores are retained. Originally released by Jiwei Li et al., the processed data is provided as a single pandas DataFrame. It is primarily intended for aspect‑based sentiment analysis (ABSA). The dataset includes columns such as hotel ID, user ID, review title, review text, overall rating, cleanliness rating, and others.
Source
hugging_face
Created
Nov 28, 2025
Updated
Apr 24, 2024
Overview
Dataset description and usage context
Dataset Overview
Dataset Name
- Name: TripAdvisor Easy Dataset
- Alias: sentiment
Dataset Features
- Feature List:
  - hotel_id: Unique hotel identifier, type int64
  - user_id: Unique user identifier, type string
  - title: Review title submitted by the user, type string
  - text: Full review body, type string
  - overall: Overall rating given by the user, type float64
  - cleanliness: Cleanliness rating, type float64
  - value: Value rating, type float64
  - location: Location rating, type float64
  - rooms: Room rating, type float64
  - sleep_quality: Sleep quality rating, type float64
  - stay_year: Year of stay, type int64
  - post_date: Review publication date, type timestamp[ns]
  - freq: Frequency, type int64
  - review: Full review, type string
  - char: Number of characters, type int64
  - lang: Language, type string
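The feature list can be turned into a simple row check. The following is a minimal sketch in plain Python, not part of the dataset's own tooling: the `SCHEMA` mapping and `validate_row` helper are hypothetical names, the mapping of `timestamp[ns]` to `datetime` is an assumption, and the example record is invented for illustration.

```python
from datetime import datetime

# Column name -> Python type, mirroring the dtypes in the feature list.
# Assumption: timestamp[ns] is represented here as datetime.
SCHEMA = {
    "hotel_id": int,
    "user_id": str,
    "title": str,
    "text": str,
    "overall": float,
    "cleanliness": float,
    "value": float,
    "location": float,
    "rooms": float,
    "sleep_quality": float,
    "stay_year": int,
    "post_date": datetime,
    "freq": int,
    "review": str,
    "char": int,
    "lang": str,
}


def validate_row(row):
    """Return a list of problems; an empty list means the row fits the schema."""
    problems = []
    for name, expected in SCHEMA.items():
        if name not in row:
            problems.append(f"missing column: {name}")
        elif not isinstance(row[name], expected):
            got = type(row[name]).__name__
            problems.append(f"{name}: expected {expected.__name__}, got {got}")
    return problems


# Hypothetical record, invented for illustration only.
title = "Great stay"
text = "Clean rooms and friendly staff."
review = title + "\n" + text
example = {
    "hotel_id": 12345,
    "user_id": "user-001",
    "title": title,
    "text": text,
    "overall": 5.0,
    "cleanliness": 5.0,
    "value": 4.0,
    "location": 4.0,
    "rooms": 5.0,
    "sleep_quality": 4.0,
    "stay_year": 2012,
    "post_date": datetime(2012, 6, 1),
    "freq": 1,
    "review": review,
    "char": len(review),
    "lang": "en",
}

print(validate_row(example))
```

A check like this is mainly useful after exporting rows to plain Python objects; missing aspect ratings, if the data contains any, would need an extra rule.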
Dataset Splits
- Training Set:
  - num_examples: 201295
  - num_bytes: 368237342
Dataset Size
- Download Size: 220909380
- Dataset Size: 368237342
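The byte counts above are easier to reason about after a little arithmetic. This short sketch uses only the figures listed on this page; the variable names are illustrative, and the sizes are assumed to be in bytes.

```python
# Figures copied from the split and size listings on this page.
NUM_EXAMPLES = 201_295
DATASET_BYTES = 368_237_342
DOWNLOAD_BYTES = 220_909_380

# Average in-memory size of one example.
avg_bytes_per_example = DATASET_BYTES / NUM_EXAMPLES

# How much smaller the download is than the loaded dataset.
download_ratio = DOWNLOAD_BYTES / DATASET_BYTES

# Sizes in mebibytes (1 MiB = 2**20 bytes).
dataset_mib = DATASET_BYTES / 2**20
download_mib = DOWNLOAD_BYTES / 2**20

print(f"average example size: {avg_bytes_per_example:.0f} bytes")
print(f"download / dataset ratio: {download_ratio:.2f}")
print(f"dataset: {dataset_mib:.1f} MiB, download: {download_mib:.1f} MiB")
```

The result is roughly 1.8 KB per example, with a download that is about 60% of the loaded size.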
Dataset Configuration
- Default Configuration:
  config_name: default
  data_files:
    split: train
    path: data/train-*
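The `path` entry is a glob pattern, so the train split is made up of every file under `data/` whose name starts with `train-`. The sketch below approximates that matching with Python's `fnmatch` module. The file names are hypothetical, and the hosting platform's own pattern resolution may differ in its details.

```python
from fnmatch import fnmatchcase

# Pattern taken from the default configuration.
PATTERN = "data/train-*"

# Hypothetical repository file names, invented for illustration.
files = [
    "README.md",
    "data/train-00000-of-00002.parquet",
    "data/train-00001-of-00002.parquet",
    "data/other-00000-of-00001.parquet",
]

# fnmatchcase does a case-sensitive match without path normalization.
train_files = [name for name in files if fnmatchcase(name, PATTERN)]

for name in train_files:
    print(name)
```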
Task Categories
- Task: text-classification
Language
- Language: en
Size Category
- Size: 10K<n<100K (as tagged; the training split above lists 201,295 examples, which corresponds to 100K<n<1M)
License
- License: Apache-2.0
Intended Uses
- Direct Use: Suitable for Aspect‑based Sentiment Analysis (ABSA)
- Out‑of‑Scope Use: Follow the original data source policy
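For ABSA, the per-aspect ratings can serve as supervision for the review text. The following is a minimal sketch of one way to derive coarse labels, not the dataset author's method. It assumes ratings lie on a 1 to 5 scale, uses illustrative thresholds (4 and above is positive, 2 and below is negative), and treats `None` or NaN as an unrated aspect. The helper names and the sample row are hypothetical.

```python
import math

# Rating columns named in the dataset's feature list.
RATING_COLUMNS = [
    "overall", "cleanliness", "value", "location", "rooms", "sleep_quality",
]


def rating_to_label(rating):
    """Map a numeric rating to a coarse label; thresholds are assumptions."""
    if rating is None or (isinstance(rating, float) and math.isnan(rating)):
        return None  # aspect was not rated
    if rating >= 4:
        return "positive"
    if rating <= 2:
        return "negative"
    return "neutral"


def aspect_labels(row):
    """Return {aspect: label} for every rated aspect in the row."""
    labels = {}
    for aspect in RATING_COLUMNS:
        label = rating_to_label(row.get(aspect))
        if label is not None:
            labels[aspect] = label
    return labels


# Hypothetical row, invented for illustration only.
row = {
    "text": "Spotless room, but far from the city centre.",
    "overall": 4.0,
    "cleanliness": 5.0,
    "value": 3.0,
    "location": 2.0,
    "rooms": float("nan"),
}

print(aspect_labels(row))
```

Pairing `row["text"]` with the resulting dictionary gives one multi-label training example per review; aspects without a rating are simply left out.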
Dataset Structure
- Column Information:
  - hotel_id: Unique hotel identifier
  - user_id: Unique user identifier
  - title: Review title
  - text: Review body
  - review: Combined title + body
  - overall: Overall rating
  - cleanliness: Cleanliness rating
  - value: Value rating
  - location: Location rating
  - rooms: Room rating
  - sleep_quality: Sleep quality rating
  - date_stayed: Stay date
  - date: Review publication date
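The `review` column is described as the title combined with the body. The page does not say how the two are joined, so the sketch below assumes a newline separator; `combine_review` is a hypothetical helper, and the strings are invented for illustration.

```python
def combine_review(title, text, sep="\n"):
    """Join title and body, skipping whichever part is empty.

    Assumption: the separator is a newline; the dataset may use another one.
    """
    parts = [part.strip() for part in (title, text) if part and part.strip()]
    return sep.join(parts)


# Hypothetical title and body.
review = combine_review("Great stay", "Clean rooms and friendly staff.")

# A character count like the `char` feature can be derived from the result.
char_count = len(review)

print(review)
print(char_count)
```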