Improving scalability of parallel CNN training by adjusting mini-batch size at run-time

Sunwoo Lee

Abstract

Training a Convolutional Neural Network (CNN) is a computationally intensive task that requires efficient parallelization to shorten the execution time. As the amount of available training data keeps growing, parallelizing CNN training becomes increasingly important. Data parallelism, a popular parallelization strategy that distributes the input data among compute processes, requires a sufficiently large mini-batch size to achieve a high degree of parallelism. However, training with a large batch size is known to yield lower accuracy at convergence. In image restoration problems, for example, the batch size is typically tuned to a small value between 16 and 64, which makes it challenging to scale up the training. In this paper, we propose a parallel CNN training strategy that gradually increases the mini-batch size and learning rate at run-time. While improving scalability, this strategy keeps the accuracy close to that of training with a fixed small batch size. We evaluate the performance of the proposed parallel CNN training algorithm on image regression and classification applications using various models and datasets.
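The idea of growing the mini-batch size and learning rate during training can be illustrated with a minimal sketch. The abstract does not give the actual growth rule, so everything specific here is an assumption made for illustration: the function name `growing_batch_schedule`, the doubling factor, the fixed growth interval, and the choice to scale the learning rate linearly with the batch size. It is not the paper's algorithm.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Phase:
    epoch: int
    batch_size: int
    learning_rate: float


def growing_batch_schedule(
    base_batch: int,
    base_lr: float,
    max_batch: int,
    total_epochs: int,
    grow_every: int,
    factor: int = 2,
) -> List[Phase]:
    """Return a per-epoch (batch size, learning rate) schedule.

    Assumptions (not taken from the paper): the batch size is multiplied
    by `factor` every `grow_every` epochs, capped at `max_batch`, and the
    learning rate is scaled in proportion to the batch size.
    """
    schedule = []
    batch = base_batch
    for epoch in range(total_epochs):
        # Grow the batch size at fixed intervals, never past the cap.
        if epoch > 0 and epoch % grow_every == 0:
            batch = min(batch * factor, max_batch)
        # Hypothetical linear scaling of the learning rate with batch size.
        lr = base_lr * batch / base_batch
        schedule.append(Phase(epoch, batch, lr))
    return schedule
```

Starting from a small batch such as 16 keeps the early epochs in the regime the abstract describes as accurate, while the later, larger batches leave more samples per step to distribute across data-parallel processes.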

Title: Improving scalability of parallel CNN training by adjusting mini-batch size at run-time

Author: Sunwoo Lee

Conference: IEEE International Conference on Big Data