BullyDataset

A dataset of Sina Weibo comments collected for cyberbullying detection. A comment is labeled as bullying if it contains gender discrimination, racial or regional insults, profanity or humiliation, factual distortion, expressions of violence, attacks on appearance or family members, repetitive negative remarks, calls for others to join the attack, or an unwanted or insulting nickname imposed on someone.

Updated 1/16/2024

github

BullyDataset Overview

Dataset Description

Source: Sina Weibo comments
Purpose: Cyberbullying detection

Label Definition

Bullying Comment: A Weibo comment that satisfies any of the following conditions:
1. Uses gender-discriminatory, racial, or regional slurs.
2. Uses abusive or insulting language to criticize others without reasonable justification.
3. Clearly distorts facts or attempts to bias views on minority groups, making unfounded accusations.
4. Expresses violent tendencies or curses toward minority groups.
5. Contains attacks on a person’s appearance, body, or family members.
6. Repeats negative comments, or calls on others to join the attack.
7. Imposes an unwanted or insulting nickname on others.
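
The "any of the following conditions" rule above can be sketched in Python as a simple disjunction over the seven criteria. This is only an illustration of the labeling logic, not the authors' annotation tooling: the names Criterion, AnnotatedComment, and is_bullying are hypothetical, and the sketch assumes an annotator has already decided which criteria a comment meets.

```python
from dataclasses import dataclass, field
from enum import Enum


class Criterion(Enum):
    """The seven conditions listed above; member names are illustrative."""
    DISCRIMINATORY_SLUR = 1       # gender-discriminatory, racial, or regional slurs
    UNJUSTIFIED_ABUSE = 2         # abusive or insulting language without justification
    FACT_DISTORTION = 3           # distorted facts or unfounded accusations
    VIOLENCE_OR_CURSE = 4         # violent tendencies or curses
    PERSONAL_ATTACK = 5           # attacks on appearance, body, or family members
    REPETITION_OR_INCITEMENT = 6  # repeated negativity or calls to join the attack
    INSULTING_NICKNAME = 7        # unwanted or insulting nickname


@dataclass
class AnnotatedComment:
    """A comment plus the criteria an annotator judged it to meet (hypothetical type)."""
    text: str
    criteria: set = field(default_factory=set)

    @property
    def is_bullying(self) -> bool:
        # Meeting any single criterion is enough for the bullying label.
        return len(self.criteria) > 0


flagged = AnnotatedComment("example comment", {Criterion.INSULTING_NICKNAME})
neutral = AnnotatedComment("another example comment")
print(flagged.is_bullying, neutral.is_bullying)
```

Storing the matched criteria rather than only a binary label keeps the reason for each decision available, which can help when auditing annotations.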

Citation Information

Authors: Nijia Lu, Guohua Wu, Zhen Zhang, Yitao Zheng, Yizhi Ren, Kim‑Kwang Raymond Choo
Year: 2019
Paper Title: Cyberbullying Detection in Social Media Text Based on Character‑level Convolutional Neural Networks with Shortcuts
Contact: lunijia@hdu.edu.cn

