AI Melody Generator for Chord Progressions Using Magenta

How to prepare the data, train an RNN model, and use it, with an example based on Magenta Improv RNN

Alvaro Yepes
8 min read · Dec 1, 2020

Introduction

The objective of this blog is to teach people how to generate melodies using deep learning with Magenta, Google's framework built on TensorFlow. The Magenta model we are going to use is Improv RNN, which uses a recurrent neural network (RNN) to create melodies from two inputs: a chord sequence and a primer melody (which can be just one note). This post covers three main topics: how to prepare the data to build a dataset, how to train your own model, and how to use Magenta's pre-trained Improv RNN model to generate melodies for a specific genre of music. After following this blog, you will have all the tools to use the Improv RNN model and support your composition process with AI. In the end, you will decide which of the two models, the one pre-trained by Magenta or the one trained by you, does a better job composing melodies.

About Magenta Improv RNN

As we can see in the Improv RNN GitHub repository, this model generates melodies with a recurrent neural network, conditioning the melodies on an underlying chord progression. This model is very interesting because it combines Melody RNN with an approach similar to the one in the paper “Melody Generation for Pop Music via Word Representation of Musical Properties” by Andrew Shin and colleagues, which represents melodies and their chords as words for model training.

This model has three configurations, which differ mainly in how complex the chord representation needs to be. The Basic Improv and Attention Improv configurations use a one-hot vector of 48 triads to represent chords, while Chord Pitches Improv uses three vectors for a richer and more versatile chord notation. We will explain other important details later in this post.
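As a rough illustration of where the number 48 comes from (this is only a sketch of the idea, not Magenta's actual encoder), 12 root pitch classes times 4 triad qualities gives 48 chord classes, and each chord maps to one position in a one-hot vector:

# Illustrative sketch of a 48-triad one-hot chord encoding:
# 12 root pitch classes x 4 triad qualities = 48 chord classes.
ROOTS = ['C', 'C#', 'D', 'D#', 'E', 'F', 'F#', 'G', 'G#', 'A', 'A#', 'B']
QUALITIES = ['maj', 'min', 'aug', 'dim']

def triad_one_hot(root, quality):
    # Each (root, quality) pair gets a unique index in a length-48 vector.
    index = ROOTS.index(root) * len(QUALITIES) + QUALITIES.index(quality)
    vector = [0] * (len(ROOTS) * len(QUALITIES))
    vector[index] = 1
    return vector

print(sum(triad_one_hot('D', 'maj')))        # 1, since the vector is one-hot
print(triad_one_hot('D', 'maj').index(1))    # position of D major among the 48 classes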

How does it work?

Improv RNN uses a recurrent neural network with an LSTM architecture to process note sequences. Its memory cells remember properties of the melodies and the chords assigned to them, so it works as a combination of ideas from word-processing and image-processing deep learning algorithms. It learns how the melody relates to the chords, as well as basic rhythmic and pitch patterns. You can define the number of units per layer and the batch size, which directly affect the results.
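To make the idea concrete, here is a minimal sketch of an LSTM that reads, at each step, the current melody event concatenated with the current chord encoding and predicts the next melody event. This is not the actual Improv RNN code, and the event and chord sizes are only illustrative:

import tensorflow as tf

NUM_MELODY_EVENTS = 130   # illustrative: 128 MIDI pitches + note-off + no-event
NUM_CHORD_CLASSES = 48    # the one-hot triads used by the basic/attention configs

# At each time step the input is the melody event one-hot concatenated with
# the chord one-hot; the output is a distribution over the next melody event.
inputs = tf.keras.Input(shape=(None, NUM_MELODY_EVENTS + NUM_CHORD_CLASSES))
x = tf.keras.layers.LSTM(128, return_sequences=True)(inputs)
x = tf.keras.layers.LSTM(128, return_sequences=True)(x)
outputs = tf.keras.layers.Dense(NUM_MELODY_EVENTS, activation='softmax')(x)

model = tf.keras.Model(inputs, outputs)
model.compile(optimizer='adam', loss='categorical_crossentropy')
model.summary()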

Create and Prepare the Data to Build a Dataset

In order to build a dataset to train your own Improv RNN model, you must know a couple of things first. This Magenta model is fed with a music notation format called MusicXML, which essentially represents a music score, with notes and chords laid out on a staff in a standard way. In this part, we are going to transform the data through three representations (MIDI, MusicXML, and NoteSequences) to get it ready for your model.

In this notebook, you will learn how to create your data, even if you are not a musician. Please follow the next steps to generate your data.

Important: For this tutorial, you will need to install a DAW (Digital Audio Workstation), preferably together with a MusicXML editor. For this example, we are going to use Logic Pro X for MIDI input (but any other DAW with MIDI support works, such as GarageBand, Cubase, Pro Tools, etc.) and MuseScore to manipulate the chords in MusicXML. Please install these requirements before following the tutorial. Any version should work fine, but the latest ones are recommended.

Step 1: Record your Midi Files

This is the first transformation of the input data and the most natural one, because it takes the music from its highest-level form to its first digital representation.

To start, open your DAW (in my case Logic Pro X), create a new MIDI track, and record a well-known melody from the genre of your preference. I recommend using a MIDI controller, but in almost any DAW you can also build your take manually by clicking notes into the piano roll. In our case, we used popular reggaeton melodies.

Picture 1: Example of a midi controller.

The process should look something like the next picture:

Picture 2: Logic Pro X session with a midi track recorded.

Once you have recorded enough melodies (in our case, 35), export them as MIDI files to continue with the next step.
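If you want a quick programmatic check that your exported files were captured correctly, a sketch like the following works, assuming the note_seq package is installed (the filename is just a placeholder):

import note_seq

# Load one exported MIDI file and print a quick summary of its contents.
sequence = note_seq.midi_file_to_note_sequence('my_melody.mid')  # placeholder filename
print(len(sequence.notes), 'notes,', round(sequence.total_time, 2), 'seconds')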

Step 2: Convert your Midi Files to MusicXML and Add the Chords

In this step you are going to add the chords to your generated data. First, you need software that can import the MIDI files and let you work with them as a music score. In my case I am using MuseScore, which is my recommendation for editing MIDIs as sheet music and then exporting the files as MusicXML.

First, open MuseScore and import a MIDI file; it should look like the following image:

Picture 3: midi file imported to MuseScore.

Now place the right chord at the beginning of each bar of the score. Be careful and pay attention when placing the chords, because they are the input of the Improv RNN model you are going to train.

Picture 4: Example of how to write a Chord manually in the music sheet.

After placing the chords, click the play button in the toolbar to check that the chords you chose harmonize the melody and stay in sync with it. Finally, export the file as MusicXML.

Picture 5: MuseScore Export option

This last step is very important: you must select the following format before exporting the file:

Picture 6: Correct format to export the file from MuseScore

Repeat these instructions for all the MIDI files you created in the previous step.

Note: generate as many melodies as you can, and it is important to use samples with different harmonizations (chords), because in the end your model will only be able to generate melodies over chords it has previously seen in the training files.

Step 3: Convert your MusicXML Files to NoteSequences

This is the last step in preparing the data for your model. Magenta Improv RNN does not understand the MusicXML format directly, so you have to convert your whole library of MusicXML files into a single file with the .tfrecord extension, which is essentially a collection of NoteSequences holding all the files you generated in one place.

First, you must install all the Magenta dependencies for Python.

This step is very simple: just locate the folder on your drive that contains all your MusicXML files and then run the following command in the terminal (or in a notebook code cell):

convert_dir_to_note_sequences \
  --input_dir='/content/gdrive/My Drive/leadSheets' \
  --output_file='/content/gdrive/My Drive/TMP/notesequences.tfrecord' \
  --recursive

Replace the input_dir parameter with the path to your MusicXML files directory and the output_file parameter with the path where you want to save the generated NoteSequences.
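As an optional check, you can read the generated file back and confirm it contains one NoteSequence per lead sheet. This is just a sketch and assumes the note_seq package and TensorFlow are installed:

import tensorflow as tf
import note_seq

# Read the .tfrecord back and count the NoteSequences stored inside it.
path = '/content/gdrive/My Drive/TMP/notesequences.tfrecord'
sequences = [note_seq.NoteSequence.FromString(raw.numpy())
             for raw in tf.data.TFRecordDataset(path)]
print(len(sequences), 'NoteSequences found')
if sequences:
    first = sequences[0]
    print('First entry:', first.filename, '-', len(first.notes), 'notes')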

Train Your Improv RNN Model with Your Own Generated Data

This is the last topic we are going to cover in this article, and it follows almost the same steps as the Magenta Improv RNN documentation for training and using the model. For this example, we are going to use the attention_improv configuration.

You can find a detailed example in my notebook on Google Colab. Feel free to copy this example and try it by yourself.

Step 1: Create SequenceExamples

You need to create two collections of SequenceExamples: one to train the model and, if you want, one to evaluate it. Each SequenceExample contains a sequence of inputs and a sequence of labels that represent the music sheets you previously created. Run the command below to extract lead sheets from the NoteSequences and save them as SequenceExamples.

improv_rnn_create_dataset \
  --config=attention_improv \
  --input='/content/gdrive/My Drive/TMP/notesequences.tfrecord' \
  --output_dir='/content/gdrive/My Drive/TMP/improv_rnn/sequence_examples' \
  --eval_ratio=0.10

Change the input parameter to the path of your notesequences file and output_dir to the directory where your SequenceExamples should be saved. The key parameter is eval_ratio, which defines what portion of the data will be used for evaluation; in this case, 10%.
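If you want to verify the split, a small sketch like this counts the records in each output file. The eval filename below is my assumption about the default name the tool writes; adjust it to whatever appears in your output directory:

import tensorflow as tf

# Count how many SequenceExamples landed in each split after eval_ratio=0.10.
base = '/content/gdrive/My Drive/TMP/improv_rnn/sequence_examples'
for name in ('training_lead_sheets.tfrecord', 'eval_lead_sheets.tfrecord'):  # assumed names
    count = sum(1 for _ in tf.data.TFRecordDataset(base + '/' + name))
    print(name, '->', count, 'examples')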

Note: be careful with the configuration you choose, because you must use the same value in all the following steps.

Step 2: Train the model with your data

This is going to be the longest step in terms of time, but don't worry; you just have to wait for the model to finish the training process.

Run the following command to train your model; these are the values I used:

improv_rnn_train \
  --config=attention_improv \
  --run_dir='/content/gdrive/My Drive/TMP/improv_rnn/logdir/run1' \
  --sequence_example_file='/content/gdrive/My Drive/TMP/improv_rnn/sequence_examples/training_lead_sheets.tfrecord' \
  --hparams="batch_size=64,rnn_layer_sizes=[128,128]" \
  --num_training_steps=5000

If you want to know in detail what each parameter does, read the documentation carefully.

The parameters here were chosen for my model; generally, if you have a similar amount of data and similar samples, you can use the same values. But I invite you to play with them and experiment with new ones.

Step 3: Use your trained model

To use the model you have trained, run the following command:

improv_rnn_generate \
  --config=attention_improv \
  --run_dir='/content/gdrive/My Drive/TMP/improv_rnn/logdir/run1' \
  --hparams="batch_size=64,rnn_layer_sizes=[128,128]" \
  --output_dir='/content/gdrive/My Drive/TMP/improv_rnn/generated33' \
  --num_outputs=1 \
  --primer_melody="[57]" \
  --backing_chords="Bm G D A" \
  --render_chords

The first important parameter you should change is primer_melody, a vector that represents a series of note and silence events (-2 = no event, -1 = note-off event, values 0 through 127 = note-on event for that MIDI pitch) and gives a starting point for the melody you are going to generate. Instead of primer_melody, you can use primer_midi with the path to a MIDI file containing a simple melody that follows the chord progression you are going to use. The second parameter is backing_chords, where you choose the chord sequence to use (chord names separated by a single space, e.g. “Bm G D A”). Finally, the parameter num_outputs determines the number of melodies you want to generate.
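If you prefer to think in note names rather than MIDI numbers, a small hypothetical helper like this (not part of Magenta) builds the primer_melody value for you; under the common convention where C4 = 60, A3 is pitch 57:

# Hypothetical helper to turn note names into a primer_melody vector.
NOTE_OFFSETS = {'C': 0, 'C#': 1, 'D': 2, 'D#': 3, 'E': 4, 'F': 5,
                'F#': 6, 'G': 7, 'G#': 8, 'A': 9, 'A#': 10, 'B': 11}

def note_to_midi(name, octave):
    # MIDI convention where C4 = 60, so A3 = 57.
    return 12 * (octave + 1) + NOTE_OFFSETS[name]

# Hold an A3 for four steps: -2 means "no event", so the note sustains.
primer = [note_to_midi('A', 3), -2, -2, -2]
print('--primer_melody="' + str(primer) + '"')   # --primer_melody="[57, -2, -2, -2]"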

To finish this post, you can go to my notebook and explore a small embedded app that I built in the last code block so you can try my model. Remember to run all the steps in the notebook.
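If you are working in Colab, you can also preview a generated melody inline with the note_seq helpers. This is only a sketch; the generated filename below is a placeholder, so use whichever file appears in your output directory:

import note_seq

# Load one of the generated MIDI files and preview it inline in the notebook.
generated = note_seq.midi_file_to_note_sequence(
    '/content/gdrive/My Drive/TMP/improv_rnn/generated33/generated_melody.mid')  # placeholder
note_seq.plot_sequence(generated)
note_seq.play_sequence(generated)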

Use the Improv RNN Pre-trained Model

To use the Magenta pre-trained model, you can follow the Magenta Improv RNN instructions in the following section.

In the end, you will decide which model generates better melodies for the genre you are looking for (in my case, reggaeton).

Conclusion

AI keeps getting better every day and can be very useful for music composition. Today we learned how to use Magenta's Improv RNN model to generate melodies for a music genre conditioned on a chord sequence, and how to train your own model. If you want to explore other models and more options, you can go to the Magenta website and try everything the framework offers.
