TA-SBERT: Token Attention Sentence-BERT for Improving Sentence Representation

Citations

Web of Science: 27
Scopus: 45

Abstract

A sentence embedding vector can be obtained by attaching a global average pooling (GAP) layer to a pre-trained language model. The problem with a GAP-based sentence embedding is that every word in the sentence contributes with the same weight. To address this problem, we propose Token Attention-SentenceBERT (TA-SBERT), a novel sentence embedding model. TA-SBERT improves sentence embedding performance through three strategies. First, we convert words to their base form while preprocessing the input sentence to reduce misunderstanding. Second, we propose a novel Token Attention (TA) technique that distinguishes important words to produce more informative sentence vectors. Third, we increase the stability of fine-tuning and avoid catastrophic forgetting by adding a reconstruction loss on the word embedding vector. Extensive ablation studies demonstrate that TA-SBERT outperforms the original SentenceBERT (SBERT) in sentence vector evaluation on semantic textual similarity (STS) tasks and with the SentEval toolkit.
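
The contrast the abstract draws between equal-weight pooling and importance-weighted pooling can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how Token Attention derives its weights, so the per-token `scores` below, the softmax normalization, and the function names `mean_pool` and `weighted_pool` are all assumptions chosen for illustration.

```python
import math

def mean_pool(token_vectors, mask):
    """Global average pooling: every non-padding token gets the same weight."""
    kept = [v for v, m in zip(token_vectors, mask) if m]
    dim = len(kept[0])
    return [sum(v[i] for v in kept) / len(kept) for i in range(dim)]

def weighted_pool(token_vectors, mask, scores):
    """Softmax-weighted pooling: higher-scoring tokens contribute more.

    `scores` is a hypothetical per-token importance score (an assumption);
    the abstract does not state how TA-SBERT computes its token weights.
    """
    kept = [(v, s) for v, m, s in zip(token_vectors, mask, scores) if m]
    top = max(s for _, s in kept)              # subtract max for stability
    exps = [math.exp(s - top) for _, s in kept]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(kept[0][0])
    return [sum(w * v[i] for w, (v, _) in zip(weights, kept))
            for i in range(dim)]

# Toy 2-dimensional token embeddings; the last position is padding.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [9.0, 9.0]]
mask = [1, 1, 1, 0]

gap = mean_pool(tokens, mask)
# Equal scores over the real tokens reproduce plain average pooling.
uniform = weighted_pool(tokens, mask, [0.0, 0.0, 0.0, 5.0])
# Raising the first token's score pulls the sentence vector toward it.
focused = weighted_pool(tokens, mask, [2.0, 0.0, 0.0, 0.0])
```

With equal scores the weighted version reduces to GAP, which makes it easy to see that the only difference between the two is how much each token is allowed to contribute.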

Keywords

Bit error rate; Vocabulary; Task analysis; Context modeling; Tokenization; Data models; Transformers; Natural language processing; sentence representation; semantic textual similarity; BERT; RoBERTa
Title
TA-SBERT: Token Attention Sentence-BERT for Improving Sentence Representation
Authors
Seo, Jaejin; Lee, Sangwon; Liu, Ling; Choi, Wonik
DOI
10.1109/ACCESS.2022.3164769
Publication date
2022
Type
Article
Journal
IEEE Access
Volume
10
Pages
39119-39128