Research

Bridging AI and creativity

How human movement can convey musical intent and direct AI music generation.

Current work

Discrete Diffusion for Symbolic Music

Unlike image diffusion models, which denoise continuous pixel values, ours operates directly on discrete musical tokens. Each gesture influences the denoising process through cross-attention, yielding real-time melodic control.
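To make the token-level denoising concrete, here is a minimal, dependency-free sketch of an absorbing-state ("masked token") reverse process, one common formulation of discrete diffusion. The predictor is a stand-in callable, and all names are illustrative rather than this model's actual implementation:

```python
import random

MASK = -1  # absorbing "masked" token state

def denoise(tokens, predict, steps=4, rng=random.Random(0)):
    """Iteratively commit masked positions to model predictions.

    tokens:  list of token ids, some set to MASK
    predict: fn(seq, i) -> token id for position i (stand-in model)
    At each reverse step, roughly 1/step of the remaining masked
    positions are unmasked, so the sequence is fully denoised by
    the final step.
    """
    seq = list(tokens)
    for step in range(steps, 0, -1):
        masked = [i for i, t in enumerate(seq) if t == MASK]
        if not masked:
            break
        k = max(1, len(masked) // step)
        for i in rng.sample(masked, k):
            seq[i] = predict(seq, i)
    return seq
```

A real model would predict a distribution over the token vocabulary at each masked position; here a constant predictor stands in for it.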

Trained on more than 10 million melodies. Eight gesture types map to melodic direction and intensity.

Try the demo →
Research abstract (PDF) →
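One way the eight-way gesture vocabulary could map to a (direction, intensity) conditioning pair is a simple lookup table. The gesture names and values below are hypothetical, the page does not specify the actual vocabulary:

```python
# Hypothetical gesture vocabulary: names and values are
# illustrative, not the model's actual labels.
GESTURES = {
    "rise":       (+1, 0.5),  # (melodic direction, intensity)
    "fall":       (-1, 0.5),
    "sharp_rise": (+1, 1.0),
    "sharp_fall": (-1, 1.0),
    "hold":       ( 0, 0.2),
    "pulse":      ( 0, 0.8),
    "sweep_up":   (+1, 0.7),
    "sweep_down": (-1, 0.7),
}

def condition(gesture):
    """Look up the conditioning pair for a recognized gesture type."""
    return GESTURES[gesture]
```

In the real system these pairs would be embedded and fed to the cross-attention conditioning described below.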

Architecture

Cross-attention conditioning

Gesture embeddings are attended to at each diffusion step.

# Gesture conditioning: queries come from the token stream,
# keys/values from the gesture embedding
cond = self.gesture_embed(gesture)        # embed the gesture type
attn, _ = self.cross_attn(x, cond, cond)  # query=x, key=value=cond
x = x + attn                              # residual add
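The snippet above presumably wraps a multi-head attention layer; a dependency-free, single-head sketch of the same conditioning pattern (queries from the token stream `x`, keys and values from the gesture embeddings `cond`, with a residual add) looks like this. It is a simplification: no learned projections, one head, plain Python lists:

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    e = [math.exp(v - m) for v in xs]
    s = sum(e)
    return [v / s for v in e]

def cross_attend(x, cond):
    """Single-head cross-attention without learned projections:
    each row of x (a query) attends over rows of cond
    (keys == values); the result is added back residually."""
    out = []
    for q in x:
        # scaled dot-product scores against each gesture embedding
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(len(q))
                  for k in cond]
        w = softmax(scores)
        # weighted sum of the value rows (== cond rows)
        attn = [sum(wj * vj[d] for wj, vj in zip(w, cond))
                for d in range(len(q))]
        out.append([qi + ai for qi, ai in zip(q, attn)])
    return out
```

With a single conditioning row, the attention weight is 1 and the gesture embedding is simply added to each token representation, which makes the residual structure easy to see.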