HateBertBN: a hybrid transformer based model for Bangla hate speech detection across various social contexts
- Azhar, Tanvir;
- Mahmud, Tahsin;
- Hasan, Muhammad Asif;
- Uddin, Mohammed Nazim;
- Park, Seung-Bo
Web of Science: 0
Scopus: 0
Abstract
The widespread use of online social media platforms has amplified the importance of efficient hate speech detection, especially in low-resource languages like Bengali. While traditional machine learning approaches show promise, deep learning is more effective in capturing the nuanced context of hate speech. Current challenges include a lack of diverse datasets and of models capable of context-sensitive detection. To address these gaps, we introduce HateCorpBN-XL, the largest labeled Bengali hate speech dataset to date, containing 65,251 comments across five categories: political (PoHS), religious (ReHS), misogynistic (MisoHS), slander (SlaHS), and xenophobic (XenHS). We also propose HateBertBN, a hybrid transformer-based model that combines BanglaBERT embeddings with three neural network fusion strategies using CNN, LSTM, and MLP. We evaluate our approach on two tasks: Task-1, classifying Bengali text as hateful or non-hateful, and Task-2, categorizing hateful content into the five classes. For Task-1, all HateBertBN variants outperformed current transformer models, achieving an accuracy of 0.92 and a weighted F1-score of 0.92. In Task-2, the HateBertBN-MLP and HateBertBN-CNN variants achieved an accuracy of 0.90 and a weighted F1-score of 0.90, surpassing M-BERT, Distil-M-BERT, BanglaBERT, and XLM-R-Base. Although HateBertBN-LSTM performed slightly lower overall, it achieved strong F1-scores in the ReHS (0.93) and XenHS (1.00) categories. Overall, our hybrid model outperforms state-of-the-art approaches in both tasks, demonstrating its effectiveness and robustness.
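The abstract reports weighted F1-scores. As a reading aid, the following is a minimal sketch of how a support-weighted F1 is commonly computed: each class's F1 is weighted by its share of the true labels. It is not the authors' evaluation code; the function name `weighted_f1` and the toy labels are hypothetical and chosen only for illustration.

```python
from collections import Counter


def weighted_f1(y_true, y_pred):
    """Support-weighted mean of per-class F1 scores (illustrative sketch)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    support = Counter(y_true)  # number of true examples per class
    total = len(y_true)
    score = 0.0
    for label, n in support.items():
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = n - tp
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        score += (n / total) * f1  # weight by class share of true labels
    return score


# Hypothetical toy labels using the category abbreviations from the abstract.
y_true = ["PoHS", "PoHS", "ReHS", "XenHS"]
y_pred = ["PoHS", "ReHS", "ReHS", "XenHS"]
print(round(weighted_f1(y_true, y_pred), 4))  # 0.75
```

Because the weights follow class frequency, a large class dominates the score, which is why per-class F1 values such as those quoted for ReHS and XenHS are reported separately.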
Keywords
- Title: HateBertBN: a hybrid transformer based model for Bangla hate speech detection across various social contexts
- Authors: Azhar, Tanvir; Mahmud, Tahsin; Hasan, Muhammad Asif; Uddin, Mohammed Nazim; Park, Seung-Bo
- Publication date: 2026-01-08
- Type: Article
- Journal: DISCOVER COMPUTING
- Volume: 29
- Issue: 1