Optimizing HBM-PIM Throughput through DRAM ACT and PRE Hiding
- 김현우
- 이어진
Abstract
Modern applications such as Large Language Models (LLMs) increasingly demand memory bandwidth that conventional memory devices alone cannot supply. Insufficient memory bandwidth creates a significant performance bottleneck, because excessive time is spent transferring data between the host processor and memory. Processing-in-memory (PIM) architectures address this challenge by placing processing units (PUs) near the memory banks, which offloads tasks from the host and exploits the internal bandwidth of DRAM. HBM-PIM, one of the PIM devices that has been realized in hardware, provides one PU per two banks and supports parallel operation across all PUs. In this paper, we conduct an in-depth analysis of HBM-PIM operations that takes the DRAM microarchitecture into account. The analysis reveals that the characteristics of HBM-PIM's microarchitecture are not fully exploited. Based on this insight, we propose optimization techniques that leverage the structural features of HBM-PIM and can be implemented without hardware modification. By reordering instructions, adjusting the data mapping, and loosening memory barriers, we minimize the latency caused by DRAM row conflicts and improve the performance of HBM-PIM. Our optimizations yield average performance improvements of 1.15×, 1.43×, and 1.29× for GEMV, ADD/MUL, and ReLU operations, respectively.
Keywords
- Title
- DRAM ACT 및 PRE 지연 숨김을 통한 HBM-PIM의 처리량 최적화
- Title (other language)
- Optimizing HBM-PIM Throughput through DRAM ACT and PRE Hiding
- Authors
- 김현우; 이어진
- Publication date
- 2025-07
- Type
- Y
- Journal
- 정보과학회논문지
- Volume
- 52
- Issue
- 7
- Pages
- 557 ~ 571