I tried using magenta / TensorFlow

Automatic melody generation using magenta / TensorFlow

Background

I wanted to be able to manipulate MIDI data with Python, and while browsing various sites I came across a tool called Magenta.

What is Magenta?

Recently, Google opened a new project called Magenta on GitHub. https://github.com/tensorflow/magenta

Magenta is a project that uses neural networks to generate art and music.

It aims to advance the creative capabilities of machine learning and to build a community of artists and machine learning researchers.

A recurrent neural network that composes

As Magenta's first release, a recurrent neural network (RNN) model that composes melodies has been published. It incorporates a technique called LSTM (Long Short-Term Memory). According to the official Magenta website (http://magenta.tensorflow.org/):

It’s purposefully a simple model, so don’t expect stellar music results. We’ll post more complex models soon.

Magenta was released in 2016. Strictly speaking I should study RNNs and LSTMs first, but for now I would like to just try the tool out.

Environment

References

- Extract only the sound of a specific instrument from a MIDI file and save it as a separate file

Extract a specific part

I checked the MIDI data purchased from the YAMAHA online shop (four Hinatazaka46 title songs).

(Screenshot: track listing of one of the purchased MIDI files)

As shown, each file was split into multiple tracks (drum part, accompaniment, and so on). As described later, I first trained on the data as-is without any preprocessing. When I checked the results, it turned out that mixing many different kinds of tracks in the training data does not work well. So the data needs to be narrowed down to just the melody.

So let's extract only the melody part using the Python package pretty_midi.

Terminal


pip install pretty_midi

For reference, the layout of the resource files is as follows.

(Screenshot: layout of the resource files)

cut_out.py



import os

import pretty_midi

# os.chdir() does not expand "~", so expand it explicitly
os.chdir(os.path.expanduser("~/WorkSpace/MIDI_data/cut_out"))


def func():
    midi_names = ["00", "01", "02", "03"]

    for m_name in midi_names:
        midi_data = pretty_midi.PrettyMIDI("resource" + m_name + ".mid")

        # List the instruments (tracks) contained in the file
        for m_track in midi_data.instruments:
            print("midi_track = {0}".format(m_track))

        print("select program number => ")
        i = int(input())

        # Take out the instrument whose program number matches the input
        ins = None
        for instrument in midi_data.instruments:
            if instrument.program == i:
                ins = instrument

        if ins is None:
            print("no instrument with program {0}, skipping".format(i))
            continue

        # Create a new PrettyMIDI object holding only that instrument
        rev_en_chord = pretty_midi.PrettyMIDI()
        rev_en_chord.instruments.append(ins)

        # Save the single-track file
        rev_en_chord.write("./result/ins_" + m_name + ".mid")


func()

This code reads each MIDI file, asks for a program (instrument) number, and saves only the matching track.

Reference: correspondence tables for MIDI note numbers, rhythm instrument names in GM/GS/XG, and program numbers with instrument names and pitch ranges.

See the above site for the instrument (program) codes. Most people who manipulate MIDI data from a program seem to work with these instrument codes.
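If you want to check which GM instrument a program number corresponds to without leaving Python, pretty_midi itself provides a lookup. A minimal sketch (the range 0–7 is just an illustration):

Python

import pretty_midi

# Print the General MIDI instrument name for the first few program numbers
for program in range(8):
    print(program, pretty_midi.program_to_instrument_name(program))
# 0 Acoustic Grand Piano, 1 Bright Acoustic Piano, ...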

Finally, Training

Start Docker and pull the tensorflow/magenta image. Since TensorBoard will also be used, map the port number.

Terminal


docker run -it -p 6006:6006 -v /tmp/magenta:/magenta-data tensorflow/magenta
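Since port 6006 is mapped, TensorBoard can then be launched inside the container and viewed from the host browser. A minimal sketch; the log directory path below is my assumption based on the volume mount above, not from the original procedure:

Terminal


tensorboard --logdir=/magenta-data/logdir --port=6006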

I basically followed the procedure on the following site, so I will not repeat the details here.

PyDataOkinawa/meetup017

However, when I ran the shell script inside the image, it stopped with an error. I had never touched shell scripts before, so deciphering it took a long time. Perhaps because my environment differed, the name of the command it called had changed and some parameter settings were wrong, which caused the error. (It was a good chance to learn shell scripting.)

For reference, training worked in my environment after making the following change.

build_dataset.sh



convert_midi_dir_to_note_sequences \
    --midi_dir=$MIDI_DIR \
    --output_file=$SEQUENCES_TFRECORD \
    --recursive

###### Changed to ######

convert_dir_to_note_sequences \
    --midi_dir=$MIDI_DIR \
    --output_file=$SEQUENCES_TFRECORD \
    --recursive
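For context, after converting the MIDI files to NoteSequences the script builds sequence examples for the melody model. The command below is Magenta's standard melody_rnn_create_dataset invocation; the paths are placeholders I chose for illustration, not values from the article:

Terminal


melody_rnn_create_dataset \
    --config=basic_rnn \
    --input=/magenta-data/notesequences.tfrecord \
    --output_dir=/magenta-data/sequence_examples \
    --eval_ratio=0.10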

Training... This was also my first time seeing TensorBoard. I have no idea what the charts mean yet, so I need to study more.

(Screenshots: TensorBoard training curves)

I tried running it on a MacBook for the time being, but it looks tough... I need to get it running on Deep Infinity next week...

With the number of steps set to 1200, training took about an hour and a half on a MacBook Pro. The model used was Basic RNN.

(Screenshot: training run)

Accuracy increased to 0.8.
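For reference, Magenta's standard training command looks like the following. The paths are placeholders; only the step count matches my run:

Terminal


melody_rnn_train \
    --config=basic_rnn \
    --run_dir=/magenta-data/logdir/run1 \
    --sequence_example_file=/magenta-data/sequence_examples/training_melodies.tfrecord \
    --num_training_steps=1200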

Generate

MIDI files were then generated with the trained model. The number of steps was set to 500, and five files were generated.
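Generation uses Magenta's melody_rnn_generate command. A sketch with placeholder paths; the primer melody is my assumption (Magenta's examples commonly seed with a single middle C):

Terminal


melody_rnn_generate \
    --config=basic_rnn \
    --run_dir=/magenta-data/logdir/run1 \
    --output_dir=/magenta-data/generated \
    --num_outputs=5 \
    --num_steps=500 \
    --primer_melody="[60]"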

The generated MIDI data was converted to WAV with timidity and uploaded to Google Drive.
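The timidity conversion is a one-liner per file (-Ow selects RIFF WAVE output); the file names here are placeholders:

Terminal


timidity ins_00.mid -Ow -o ins_00.wav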

Google Drive

It took quite a while to get this far, but listening to the results, I feel the generated melody lines are faintly reminiscent of the original songs.
