M2C: Concise Music Representation for 3D Dance Generation

Citations

Web of Science: 2
Scopus: 6

Abstract

Generating 3D dance motions that are synchronized with music is a difficult task, as it involves modelling the complex interplay between musical rhythms and human body movements. Most existing approaches focus on improving the dance generation network and often overlook the music feature processing stage, which plays a crucial role in dance motion generation. In this paper, we propose music codes, a better latent representation for music features using discrete variables. We present a comprehensive analysis of the music features and propose a different normalization procedure to address the scale imbalance issue within music features. We also introduce the Music-to-Codes (M2C) network, a VQ-VAE-inspired network that serves as a music code extractor to replace existing music feature processors. To evaluate the effectiveness of our approach, we combine M2C with Stochastic Motion GPT (SM-GPT), our modification of a recent state-of-the-art (SoTA) dance generation method. Our extensive evaluation and ablation study demonstrate that our dance generation pipeline (using M2C and SM-GPT) significantly improves the dance generation results both qualitatively and quantitatively across all evaluation metrics. Our work opens up new possibilities for exploring the relationship between music and dance, contributing to more effective music-conditioned 3D dance generation.
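
The two ideas named in the abstract, rescaling music features so that no dimension dominates and mapping each frame to a discrete code, can be illustrated with a toy sketch. This is not the authors' M2C network. The abstract does not specify the normalization procedure, so the per-dimension z-score used here is an assumption. The codebook is hand-picked rather than learned, and the encoder network of a real VQ-VAE is omitted. The function names `normalize_per_dimension` and `quantize` are hypothetical.

```python
import math

def normalize_per_dimension(frames):
    """Z-score each feature dimension separately (assumed scheme, not the paper's)."""
    n = len(frames)
    dims = len(frames[0])
    means = [sum(f[d] for f in frames) / n for d in range(dims)]
    stds = []
    for d in range(dims):
        var = sum((f[d] - means[d]) ** 2 for f in frames) / n
        stds.append(math.sqrt(var) or 1.0)  # fall back to 1.0 for constant dimensions
    return [[(f[d] - means[d]) / stds[d] for d in range(dims)] for f in frames]

def quantize(frames, codebook):
    """Return, for each frame, the index of the nearest codebook vector
    (squared Euclidean distance), as in the lookup step of a VQ-VAE."""
    codes = []
    for f in frames:
        dists = [sum((a - b) ** 2 for a, b in zip(f, c)) for c in codebook]
        codes.append(dists.index(min(dists)))
    return codes

# Toy input: 4 frames, 2 feature dimensions with very different scales.
frames = [[0.0, 100.0], [1.0, 100.0], [0.0, 300.0], [1.0, 300.0]]
# Hand-picked codebook for illustration; a real one would be learned.
codebook = [[-1.0, -1.0], [1.0, -1.0], [-1.0, 1.0], [1.0, 1.0]]

normalized = normalize_per_dimension(frames)
codes = quantize(normalized, codebook)
print(codes)
```

Without the normalization step, the second dimension (values in the hundreds) would dominate every distance computation and the first dimension would have almost no influence on the assigned code, which is the kind of scale imbalance the abstract refers to.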

Keywords

dance prediction; discrete representation; M2C; music conditioned dance prediction; vq vae; MFCC
Title
M2C: Concise Music Representation for 3D Dance Generation
Authors
Marchellus, Matthew; Park, In Kyu
DOI
10.1109/ICCVW60793.2023.00337
Publication date
2023
Type
Proceedings Paper
Journal
IEEE International Conference on Computer Vision Workshop (ICCVW)
Pages
3118-3127