Automated Korean-English Bilingual Highlighted Text Extraction Using HSV Segmentation
- Lee, Seungjun;
- Kim, Yeongjin;
- Baek, Minhyuk;
- Jang, Jaehyeok;
- Kim, Ajin;
- and 5 others
Citations
SCOPUS: 0
Abstract
This paper presents an automated system for extracting highlighted text from Korean-English documents using HSV-based color segmentation and OCR. By integrating HSV segmentation with linguistic correction methods, evaluation on a 600-image dataset achieved mIoU of 0.8222 for highlight detection and 95.30% character-level accuracy (4.70% CER) for OCR. These results demonstrate that the combination of HSV segmentation and language-specific post-processing enables accurate and robust recovery of color-highlighted text for document digitization and analysis. © 2026 IEEE.
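As a rough illustration of the quantities named in the abstract, the following is a minimal sketch of HSV thresholding for a highlight mask, intersection over union (IoU) against a ground-truth mask, and character error rate (CER). It is not the authors' implementation. The function names, the hue/saturation/value thresholds (chosen here for a yellow highlighter), and the use of Levenshtein distance for CER are all assumptions; the abstract does not state them. Averaging IoU over images is one common way to obtain mIoU, and character-level accuracy is taken as 1 − CER, which matches the reported 95.30% and 4.70%.

```python
import colorsys


def highlight_mask(pixels, hue_range=(0.10, 0.20), min_sat=0.35, min_val=0.60):
    """Return a 0/1 mask of pixels inside an assumed yellow-highlighter HSV range.

    pixels: 2-D list of (r, g, b) tuples with channels in 0..255.
    The threshold values are illustrative assumptions, not taken from the paper.
    """
    mask = []
    for row in pixels:
        mask_row = []
        for r, g, b in row:
            # colorsys works on floats in 0..1 and returns h, s, v in 0..1.
            h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
            in_range = (
                hue_range[0] <= h <= hue_range[1]
                and s >= min_sat
                and v >= min_val
            )
            mask_row.append(1 if in_range else 0)
        mask.append(mask_row)
    return mask


def iou(pred, truth):
    """Intersection over union of two equally sized 0/1 masks."""
    inter = union = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            inter += p & t
            union += p | t
    # Two empty masks agree perfectly by convention.
    return inter / union if union else 1.0


def cer(reference, hypothesis):
    """Character error rate: Levenshtein distance divided by reference length."""
    prev = list(range(len(hypothesis) + 1))
    for i, ref_ch in enumerate(reference, 1):
        cur = [i]
        for j, hyp_ch in enumerate(hypothesis, 1):
            cur.append(min(
                prev[j] + 1,                       # deletion
                cur[j - 1] + 1,                    # insertion
                prev[j - 1] + (ref_ch != hyp_ch),  # substitution or match
            ))
        prev = cur
    return prev[-1] / len(reference) if reference else 0.0


# Tiny synthetic example: three yellow pixels and one white pixel.
YELLOW, WHITE = (255, 255, 0), (255, 255, 255)
image = [[YELLOW, WHITE], [YELLOW, YELLOW]]
predicted = highlight_mask(image)
ground_truth = [[1, 0], [1, 0]]
score = iou(predicted, ground_truth)
error = cer("abcd", "abed")
accuracy = 1.0 - error
```

In a real pipeline the mask would typically be computed with an image library on full-resolution scans and the masked regions passed to an OCR engine; the pure-Python loops here only keep the sketch self-contained.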
Keywords
Bilingual; Highlighted Text; HSV; Multi-PSM; OCR; Tesseract
- Title
- Automated Korean-English Bilingual Highlighted Text Extraction Using HSV Segmentation
- Authors
- Lee, Seungjun; Kim, Yeongjin; Baek, Minhyuk; Jang, Jaehyeok; Kim, Ajin; Kim, Yerin; Park, Seonghun; Choi, Seungyun; Kim, Namjoon; Lee, Hyukjae
- Publication date
- 2026
- Type
- Conference paper
- Journal
- 2026 International Conference on Electronics, Information, and Communication, ICEIC 2026