AI for Music Composition
MuseGAN
MuseGAN is a project on music generation. In essence, we aim to generate polyphonic music of multiple tracks (instruments) with harmonic and rhythmic structure, multi-track interdependency and temporal structure. To our knowledge, our work represents the first approach that deals with these issues all together.
The models are trained on the Lakh Pianoroll Dataset (LPD), a new multi-track piano-roll dataset, in an unsupervised manner. The proposed models are able to generate music either from scratch or by accompanying a track given by the user. Specifically, we use the model to generate pop song phrases consisting of bass, drums, guitar, piano and strings tracks.
Sample results are available here.
BinaryMuseGAN
BinaryMuseGAN is a follow-up to the MuseGAN project.
In this project, we first investigate how the real-valued piano-rolls produced by the generator may lead to difficulties in training the discriminator for CNN-based models. To overcome this binarization issue, we propose to append to the generator an additional refiner network, which tries to refine the real-valued predictions of the pretrained generator into binary-valued ones. The proposed model is able to directly generate binary-valued piano-rolls at test time.
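To make the binarization issue concrete, the sketch below shows two simple ways of turning a real-valued piano-roll into a binary one after generation: hard thresholding and Bernoulli sampling. It is not the refiner network described above, only an illustration of the post-hoc step that the refiner is meant to replace. The function names, the 0.5 threshold and the random stand-in for generator output are assumptions made for this example.

```python
import numpy as np

def hard_threshold(pianoroll, threshold=0.5):
    """Mark a note as on wherever the real-valued entry exceeds the threshold."""
    return pianoroll > threshold

def bernoulli_sample(pianoroll, rng):
    """Treat each entry as the probability that the note is on and sample it."""
    return rng.random(pianoroll.shape) < pianoroll

rng = np.random.default_rng(0)

# Stand-in for generator output (assumption): values in [0, 1) with
# shape (timestep, pitch) for one bar of one track.
real_valued = rng.random((96, 84))

binary_ht = hard_threshold(real_valued)
binary_bs = bernoulli_sample(real_valued, rng)
print(binary_ht.shape, binary_ht.dtype)
```

Both functions return boolean arrays of the same shape as the input. The discriminator in such a setup would still be trained on real-valued piano-rolls, which is the mismatch the refiner network is proposed to address.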
We trained the network on the Lakh Pianoroll Dataset (LPD). We use the model to generate four-bar musical phrases consisting of eight tracks: Drums, Piano, Guitar, Bass, Ensemble, Reed, Synth Lead and Synth Pad. Audio samples are available here.
Run the code
Prepare Training Data
Prepare your own data or download our training data
The array will be reshaped to (-1, num_bar, num_timestep, num_pitch, num_track). These variables are defined in config.py.
lastfm_alternative_5b_phrase.npy (2.1 GB) contains 12,444 four-bar phrases from 2,074 songs with alternative tags. The shape is (2074, 6, 4, 96, 84, 5). The five tracks are Drums, Piano, Guitar, Bass and Strings.
lastfm_alternative_8b_phrase.npy (3.6 GB) contains 13,746 four-bar phrases from 2,291 songs with alternative tags. The shape is (2291, 6, 4, 96, 84, 8). The eight tracks are Drums, Piano, Guitar, Bass, Ensemble, Reed, Synth Lead and Synth Pad.
Download the data with this script.
(optional) Save the training data to shared memory with this script.
Specify the training data path and location in config.py (see below).
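The reshape mentioned in the data preparation steps can be sketched as follows. The sketch uses a small all-zero stand-in array instead of the real file, and it hard-codes the dimension values that match the five-track file; in the project these variables come from config.py. The helper name to_phrases and the boolean dtype are assumptions made for this example.

```python
import numpy as np

# Values matching the five-track file described above (4 bars, 96 time steps,
# 84 pitches, 5 tracks). In the project they are defined in config.py.
num_bar, num_timestep, num_pitch, num_track = 4, 96, 84, 5

def to_phrases(data):
    """Merge the song and phrase axes so that each row is one four-bar phrase."""
    return data.reshape(-1, num_bar, num_timestep, num_pitch, num_track)

# Small stand-in (assumption) with the same axis layout as the real file:
# (song, phrase, bar, timestep, pitch, track). The real array would be
# loaded with np.load from the downloaded .npy file.
songs = np.zeros((3, 6, num_bar, num_timestep, num_pitch, num_track), dtype=bool)

phrases = to_phrases(songs)
print(phrases.shape)  # (18, 4, 96, 84, 5)
```

Applied to the five-track file, the same reshape turns the shape (2074, 6, 4, 96, 84, 5) into (12444, 4, 96, 84, 5), which matches the 12,444 phrases stated above.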
Configuration
Modify config.py for configuration.
Quick setup
Change the values in the dictionary SETUP for a quick setup. Documentation is provided right after each key.
More configuration options
Four dictionaries EXP_CONFIG, DATA_CONFIG, MODEL_CONFIG and TRAIN_CONFIG define experiment-, data-, model- and training-related configuration variables, respectively.
The automatically determined experiment name is based only on the values defined in the dictionary SETUP. If you change values in any of the other dictionaries, remember to provide the experiment name manually so that you won't overwrite a previously trained model.
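The naming caveat can be illustrated with a minimal sketch. The keys, values and the make_exp_name helper below are hypothetical and do not reproduce the actual contents of config.py; they only show why a name derived from SETUP alone stays the same when a value in another dictionary changes.

```python
# Hypothetical keys and values for illustration only; the real dictionaries
# in config.py define their own keys.
SETUP = {"model": "musegan", "preset": "hybrid", "exp_name": None}
TRAIN_CONFIG = {"num_epoch": 20}

def make_exp_name(setup):
    """Return the manual name if one is given, else derive one from SETUP alone."""
    if setup["exp_name"] is not None:
        return setup["exp_name"]
    return "_".join(str(setup[key]) for key in sorted(setup) if key != "exp_name")

name_before = make_exp_name(SETUP)

# A change outside SETUP does not affect the automatic name, so a second
# run would reuse the name of the first one.
TRAIN_CONFIG["num_epoch"] = 50
name_after = make_exp_name(SETUP)
print(name_before == name_after)  # True

# Providing the name manually keeps the two experiments apart.
SETUP["exp_name"] = "musegan_hybrid_50epochs"
name_manual = make_exp_name(SETUP)
```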
Run
python main.py
Github Repository :