HateBertBN: a hybrid transformer based model for Bangla hate speech detection across various social contexts

  • Azhar, Tanvir
  • Mahmud, Tahsin
  • Hasan, Muhammad Asif
  • Uddin, Mohammed Nazim
  • Park, Seung-Bo

Abstract

The widespread use of online social media platforms has made efficient hate speech detection increasingly important, especially in low-resource languages such as Bengali. While traditional machine learning approaches show promise, deep learning is more effective at capturing the nuanced context of hate speech. Current challenges include a lack of diverse datasets and of models capable of context-sensitive detection. To address these, we introduce HateCorpBN-XL, the largest labeled Bengali hate speech dataset to date, containing 65,251 comments across five categories: political (PoHS), religious (ReHS), misogynistic (MisoHS), slander (SlaHS), and xenophobic (XenHS). We also propose HateBertBN, a hybrid transformer-based model that combines BanglaBERT embeddings with three neural network fusion strategies using CNN, LSTM, and MLP. We evaluate our approach on two tasks: Task-1, classifying Bengali text as hateful or non-hateful, and Task-2, categorizing hateful content into the five classes above. For Task-1, all HateBertBN variants outperformed current transformer models, achieving an accuracy of 0.92 and a weighted F1-score of 0.92. In Task-2, the HateBertBN-MLP and HateBertBN-CNN variants achieved an accuracy of 0.90 and a weighted F1-score of 0.90, surpassing M-BERT, Distil-M-BERT, BanglaBERT, and XLM-R-Base. Although HateBertBN-LSTM performed slightly lower overall, it achieved strong F1-scores in the ReHS (0.93) and XenHS (1.00) categories. Overall, our hybrid model outperforms state-of-the-art approaches in both tasks, demonstrating its effectiveness and robustness.
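
The abstract reports accuracy and weighted F1-score for both tasks. As a reading aid, the sketch below shows how these two metrics are conventionally computed: the weighted F1-score is the mean of per-class F1 scores, each weighted by the number of true examples of that class. The function names `weighted_f1` and `accuracy` are illustrative, and this is the standard definition rather than the authors' evaluation code, which the record does not include.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and of equal length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def weighted_f1(y_true, y_pred):
    """Support-weighted mean of per-class F1 scores."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and of equal length")
    support = Counter(y_true)  # number of true examples per class
    total = 0.0
    for label, n_true in support.items():
        # True positives: predicted this class and it was correct.
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        n_pred = sum(1 for p in y_pred if p == label)
        # F1 = 2*TP / (true count + predicted count); taken as 0 when TP is 0.
        f1 = 2 * tp / (n_true + n_pred) if tp else 0.0
        total += f1 * n_true
    return total / len(y_true)
```

Because each class contributes in proportion to its support, a perfect F1 on a small category such as XenHS moves the weighted score less than the same result on a large category would.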

Keywords

Hate speech; Social media; Large language models; BERT; XLM-R; CNN; LSTM; MLP
DOI
10.1007/s10791-025-09804-x
Publication date
2026-01-08
Type
Article
Journal
DISCOVER COMPUTING
Volume 29, Issue 1