Distilling-DDSP

Distilling DDSP: Exploring Real-Time Audio Generation on Embedded Systems

View the Project on GitHub gregogiudici/distilling-ddsp

Wavetable

Wavetable synthesis creates sound by cycling through pre-defined waveforms stored in memory. These waveforms are stored in tables, called wavetables, which capture the spectral characteristics of a sound.

An interpolation function $\mathbb{f}(w;\phi[n])$ can be used to retrieve an arbitrary value of the stored waveform. For example, linear interpolation is defined as:

$ \mathbb{f}(w; \phi[n]) = w[i] + \alpha \cdot \left(w[i+1] - w[i]\right) $

Where the index $i$ is the integer part of the phase $\phi[n]$, $w[i]$ and $w[i+1]$ are consecutive values stored in the wavetable, and $\alpha$ is the interpolation factor that blends between these two values, typically equal to the fractional part of the phase. First introduced in the 1980s by PPG and later popularized by synthesizers like Waldorf's Wave, wavetable synthesis offers computational efficiency and rich timbral variety.
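As a minimal sketch of the linear interpolation formula above, the lookup can be written in a few lines of Python. The function name `lookup_linear` and the convention that the phase is expressed in table-index units (so it wraps at the table length) are assumptions made for illustration; they are not taken from the project code.

```python
def lookup_linear(w, phase):
    """Read wavetable `w` at a fractional position using linear interpolation.

    Assumption: `phase` is expressed in table-index units, so one full
    cycle of the waveform corresponds to len(w).
    """
    n = len(w)
    pos = phase % n          # wrap the phase into one cycle of the table
    i = int(pos)             # integer part of the phase -> index i
    alpha = pos - i          # fractional part -> interpolation factor
    i %= n                   # guard against pos rounding up to exactly n
    nxt = w[(i + 1) % n]     # wrap so the last sample blends into the first
    return w[i] + alpha * (nxt - w[i])


# A tiny 4-point table, only for illustration.
table = [0.0, 1.0, 0.0, -1.0]
print(lookup_linear(table, 0.5))   # halfway between w[0] and w[1]
print(lookup_linear(table, 3.5))   # halfway between w[3] and w[0] (wrap-around)
```

Wrapping the index with a modulo treats the table as one period of a periodic waveform, which is what lets an oscillator cycle through it indefinitely.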

By interpolating between multiple wavetables, this synthesis technique enables dynamic timbre transitions and richer generation. Based on this approach, a signal $x[n]$ can be expressed as a sum of multiple wavetable oscillators:

$ x[n] = \sum_{k=1}^M A_k[n]\mathbb{f}(w_k; \phi_k[n]) $

Where $A_k[n]$ is the amplitude of the $k$-th wavetable oscillator, and $M$ is the number of wavetables. Wavetable synthesis is a staple in digital synthesizers thanks to its dynamic sound design capabilities, versatility, and ease of use.
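The sum over wavetable oscillators can be sketched as follows. This is an illustrative example under stated assumptions rather than the project's implementation: the names `lookup_linear` and `render` are hypothetical, amplitudes and phases are given per sample as plain lists, and the phase is in table-index units.

```python
def lookup_linear(w, phase):
    """Linear-interpolation read of table `w`; phase is in table-index units."""
    n = len(w)
    pos = phase % n
    i = int(pos)
    alpha = pos - i
    i %= n
    return w[i] + alpha * (w[(i + 1) % n] - w[i])


def render(tables, amps, phases):
    """Compute x[n] = sum_k A_k[n] * f(w_k; phi_k[n]).

    tables: list of M wavetables w_k
    amps:   list of M per-sample amplitude sequences A_k[n]
    phases: list of M per-sample phase sequences phi_k[n]
    """
    num_samples = len(phases[0])
    return [
        sum(a[n] * lookup_linear(w, p[n]) for w, a, p in zip(tables, amps, phases))
        for n in range(num_samples)
    ]


# Crossfade from the first table to the second over four samples.
tables = [[0.0, 1.0, 0.0, -1.0], [0.0, 0.5, 1.0, 0.5]]
phases = [[0.0, 1.0, 2.0, 3.0], [0.0, 1.0, 2.0, 3.0]]
amps = [[1.0, 0.5, 0.0, 0.0], [0.0, 0.5, 1.0, 1.0]]
x = render(tables, amps, phases)
```

Varying the amplitudes $A_k[n]$ over time, as in the crossfade above, is what produces the timbre transitions described in the text.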

DDSP Implementation

[Figure: DDSP implementation diagram]

The Wavetable architecture employs a decoder, composed of recurrent and fully connected layers, conditioned on a sequence of pitch ($f_0$) and loudness ($L$) frames. The decoder predicts the overall amplitude of the audio signal ($A$), the amplitude of each wavetable oscillator ($a_k$), which is used to mix the individual contributions, and the coefficients of the filter used to model the noise component ($h$). The values stored in the wavetables are also learned during training.
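To make the role of the decoder outputs concrete, here is a rough sketch of how $A$, $a_k$, $h$, and the learned tables could be combined into an output signal. Several things here are assumptions, not details confirmed by the text: all controls are taken to be already upsampled from frame rate to audio rate, all oscillators share one phase accumulated from $f_0$, the noise branch is a plain time-domain FIR filter applied to white noise, and every function name is hypothetical.

```python
import random


def phase_from_f0(f0, sample_rate, table_size):
    """Accumulate a per-sample pitch track (Hz) into a phase in table-index units."""
    phase, out = 0.0, []
    for f in f0:
        out.append(phase)
        phase = (phase + table_size * f / sample_rate) % table_size
    return out


def lookup_linear(w, phase):
    """Linear-interpolation read of table `w`; phase is in table-index units."""
    n = len(w)
    pos = phase % n
    i = int(pos)
    alpha = pos - i
    i %= n
    return w[i] + alpha * (w[(i + 1) % n] - w[i])


def fir(signal, h):
    """Direct-form FIR filter: y[n] = sum_j h[j] * signal[n - j]."""
    return [
        sum(h[j] * signal[n - j] for j in range(len(h)) if n - j >= 0)
        for n in range(len(signal))
    ]


def synthesize(f0, A, a, tables, h, sample_rate, seed=0):
    """Mix the wavetable oscillators and add filtered noise (illustrative only).

    f0, A: per-sample pitch (Hz) and overall amplitude
    a:     a[k][n] is the mixing amplitude of table k at sample n
    h:     FIR coefficients for the noise branch (assumed fixed over time here)
    """
    num_samples = len(f0)
    phase = phase_from_f0(f0, sample_rate, len(tables[0]))
    harmonic = [
        A[n] * sum(a[k][n] * lookup_linear(tables[k], phase[n])
                   for k in range(len(tables)))
        for n in range(num_samples)
    ]
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    noise = fir([rng.uniform(-1.0, 1.0) for _ in range(num_samples)], h)
    return [s + v for s, v in zip(harmonic, noise)]


# One table, constant pitch, and the noise branch silenced with h = [0.0].
y = synthesize(f0=[1.0] * 4, A=[0.5] * 4, a=[[1.0] * 4],
               tables=[[0.0, 1.0, 0.0, -1.0]], h=[0.0], sample_rate=4)
```

In a trained model the table values, $a_k$, $A$, and $h$ would come from the network rather than being set by hand as in this toy call.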

Audio Examples

Reference Anchor (LPC)
🪈 Flute
🎺 Trumpet
🎻 Violin
🎹 Piano

Wavetable

Full Reduced Reduced+AD Reduced+CD Reduced(w/prt)+CD
🪈 Flute
🎺 Trumpet
🎻 Violin
🎹 Piano