This article is the December 24 entry in the Music Tools / Libraries / Technology Advent Calendar 2019.
(I prepared this one in a hurry, so it may be slightly off-topic. Next year I would like to write about building my own sound source and audio compression.)
I have long done DTM (desktop music) with a combination of Music Studio Producer, which I installed 10 years ago, and Timidity++ with SoundFonts, which I installed 8 years ago. However, because of a compatibility problem between Timidity's no-longer-updated MIDI driver and Windows 10, I finally decided to retire Timidity...
Ten years ago I created the mapping file by hand, but over those 10 years I have learned to program, so this article is about using a program to get a comfortable DTM environment.
To use a SoundFont in a DAW that can only handle MIDI, you use a virtual sound source that can load SoundFonts, such as Timidity, as the MIDI sound source. A tone in a SoundFont is identified by the combination of **bank number and preset number**, and these correspond to MIDI **bank select and program change**, so any tone in the SoundFont can be selected with MIDI messages.
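As a rough illustration of the MIDI side, the sketch below builds the raw bytes for Bank Select (controllers 0 and 32) followed by Program Change. The function name `select_tone` is hypothetical, and which Bank Select byte a given synth maps to the SoundFont bank number depends on the synth, so the bank handling here is an assumption.

```python
def select_tone(channel: int, bank_msb: int, bank_lsb: int, program: int) -> bytes:
    """Build Bank Select MSB/LSB + Program Change for one MIDI channel.

    Hypothetical helper: how a synth maps bank_msb/bank_lsb to the
    SoundFont bank number is synth-dependent (assumption).
    """
    if not 0 <= channel <= 15:
        raise ValueError("channel must be 0..15")
    for value in (bank_msb, bank_lsb, program):
        if not 0 <= value <= 127:
            raise ValueError("MIDI data bytes must be 0..127")
    control_change = 0xB0 | channel   # Control Change status byte
    program_change = 0xC0 | channel   # Program Change status byte
    return bytes([
        control_change, 0x00, bank_msb,  # CC 0: Bank Select MSB
        control_change, 0x20, bank_lsb,  # CC 32: Bank Select LSB
        program_change, program,         # Program Change
    ])


if __name__ == "__main__":
    print(select_tone(0, 1, 0, 5).hex(" "))
```

The bank select messages are sent before the program change because the new bank only takes effect when the next program change arrives.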
However, when multiple SoundFonts are used at the same time, bank and preset numbers may collide, and the arrangement of tones selectable over MIDI can differ greatly from one SoundFont to another. That is inconvenient, so the tones need to be remapped.
Fortunately, Timidity has this feature, and I used it to arrange the tones in a layout that was easy to use. (I was only an end user at the time, so of course I did not know the SoundFont file format, and I was not familiar with MIDI either.)
I arranged the tones in Excel in a 256 * 256 space and wrote both the Timidity configuration file and the DAW tone file by hand. (The tone file is a table of instrument names, program numbers, and bank numbers; I created it because, once it is set, the instrument names are displayed in the DAW GUI.)
There were many SoundFonts I wanted to add, but assigning them in Excel and then editing the config file by hand... was a chore.
After a major Windows 10 update around this summer, Timidity++ finally stopped working. Timidity++ was running on an unsigned driver to begin with and had not been updated, so I figured it had reached its limit and switched.
As the replacement virtual MIDI sound source, I installed Virtual MIDI Synth.
However, **it has no SoundFont mapping function**. It can load SoundFonts from multiple files, but in the event of a collision it seems that the tone from the higher-priority SoundFont file is used. This was a problem.
So I decided to rewrite the SoundFont files to prevent collisions and then generate the tone map automatically. (That was a long introduction.)
I added tone deletion because some SoundFont files contained junk tone data, and I added tone-name abbreviation because some tone names were so long that the DAW display broke.
Note that the Music Studio Producer and Virtual MIDI Synth configuration files are not explained here.
Specifications: http://freepats.zenvoid.org/sf2/sfspec24.pdf
SoundFonts are stored in RIFF format. RIFF stores data in units called chunks, and chunks are made up of IDs, sizes, and data.
**Basic structure of chunks**
Item | Size | Remarks |
---|---|---|
Chunk ID | 4 bytes | Chunk identifier (RIFF, LIST, etc.) |
Data size | 4 bytes | Data size (little endian) |
Data | N bytes | |
In addition, two special chunks are defined: the RIFF chunk, which is the first chunk in the file, and the LIST chunk, which groups multiple chunks. (Chunks other than RIFF and LIST cannot contain other chunks.)
**RIFF chunk structure**
Item | Size | Remarks |
---|---|---|
Chunk ID | 4 bytes | RIFF |
Data size | 4 bytes | N+4 |
File identifier | 4 bytes | Identifier of the data stored in the RIFF file (sfbk for SoundFont) |
Data | N bytes | Contains chunks and LIST chunks |
**LIST chunk structure**
Item | Size | Remarks |
---|---|---|
Chunk ID | 4 bytes | LIST |
Data size | 4 bytes | N+4 |
List identifier | 4 bytes | Identifier of the data stored in the list (INFO/data etc.) |
Data | N bytes | Contains chunks and LIST chunks |
RIFF files can use these chunks to represent nested structure data.
Because every chunk begins with the size of its data part, unneeded chunks can be skipped. The goal can therefore be achieved by processing only the SoundFont chunks related to bank and preset numbers and passing all other chunks through unchanged.
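The skipping idea can be sketched as a small generator that descends into RIFF and LIST chunks and jumps over everything else using the size field. This is a minimal sketch, not the code in the author's riff.py; the name `walk_chunks` and the tuple layout are assumptions made for illustration. It also accounts for the RIFF rule that chunk data is padded to an even length.

```python
import io
import struct

CONTAINER_IDS = {b"RIFF", b"LIST"}  # the only chunks that may contain chunks


def walk_chunks(stream, end, depth=0):
    """Yield (depth, chunk_id, form_type, data_offset, size) for each chunk.

    form_type is the 4-byte file/list identifier for RIFF and LIST chunks
    and None for ordinary chunks. `end` is the offset where parsing stops.
    """
    while stream.tell() + 8 <= end:
        chunk_id, size = struct.unpack("<4sI", stream.read(8))
        data_start = stream.tell()
        if chunk_id in CONTAINER_IDS:
            form_type = stream.read(4)
            yield depth, chunk_id, form_type, data_start, size
            # Recurse into the nested chunks stored after the identifier.
            yield from walk_chunks(stream, data_start + size, depth + 1)
        else:
            yield depth, chunk_id, None, data_start, size
        # Skip to the next chunk; odd-sized data is followed by one pad byte.
        stream.seek(data_start + size + (size & 1))


if __name__ == "__main__":
    # Tiny hand-built RIFF image: RIFF(sfbk) > LIST(INFO) > ifil
    ifil = b"ifil" + struct.pack("<I", 4) + b"\x02\x00\x04\x00"
    info = b"LIST" + struct.pack("<I", 4 + len(ifil)) + b"INFO" + ifil
    riff = b"RIFF" + struct.pack("<I", 4 + len(info)) + b"sfbk" + info
    for depth, cid, form, offset, size in walk_chunks(io.BytesIO(riff), len(riff)):
        print("  " * depth, cid, form, offset, size)
```

For a real file, the same function can be called with an open binary file object and the file size as `end`.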
The RIFF structure of the SoundFont is as follows. Of these, the chunks under pdta contain the instrument name and preset number.
pdta contains sub-chunks for **presets**, **instruments**, and **samples**. An instrument is a unit used inside the SoundFont to combine multiple samples, and a preset is the unit the user actually selects, grouping multiple instruments. Therefore, this time only the preset-related sub-chunks are used.
Note that each sub-chunk is stored as an array of structures whose last element is a special record marking the end. The sub-chunk size is also an integral multiple of sizeof(structure).
The phdr sub-chunk contains header information (preset name, bank, preset number, etc.).
struct phdr {
    char achPresetName[20]; // Preset name, null-terminated ASCII
    WORD wPreset;           // Preset number
    WORD wBank;             // Bank number: 0~127 for musical instruments, 128 for percussion
    WORD wPresetBagNdx;     // Index of the first pbag of this preset
    DWORD dwLibrary;        // Reserved, 0
    DWORD dwGenre;          // Reserved, 0
    DWORD dwMorphology;     // Reserved, 0
};
Note that wPresetBagNdx must increase monotonically from the first phdr record onward.
At first I overlooked this rule and assumed it would be enough to rewrite phdr alone and erase the unneeded records; when I implemented it that way, the tones came out wrong. To satisfy this rule, the pbag, pmod, and pgen sub-chunks must be edited as well.
The value of the end (EOP) of phdr is as follows.
Variable name | value |
---|---|
achPresetName | EOP |
wPreset | 0 |
wBank | 0 |
wPresetBagNdx | index at the end of pbag |
dwLibrary | 0 |
dwGenre | 0 |
dwMorphology | 0 |
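The phdr layout above maps directly onto a `struct` format string. The following is a minimal sketch, assuming the raw bytes of the phdr sub-chunk have already been read; `PresetHeader` and `parse_phdr` are hypothetical names, not those used in the author's script.

```python
import struct
from typing import List, NamedTuple

# char[20], WORD x3, DWORD x3, little endian, no padding: 38 bytes per record
PHDR = struct.Struct("<20sHHHIII")


class PresetHeader(NamedTuple):
    name: str
    preset: int
    bank: int
    bag_index: int
    library: int
    genre: int
    morphology: int


def parse_phdr(data: bytes) -> List[PresetHeader]:
    """Decode every phdr record, including the terminal EOP record."""
    if len(data) % PHDR.size:
        raise ValueError("phdr size is not a multiple of 38 bytes")
    records = []
    for raw in PHDR.iter_unpack(data):
        # The name is null-terminated; drop the terminator and what follows.
        name = raw[0].split(b"\x00", 1)[0].decode("ascii", errors="replace")
        records.append(PresetHeader(name, *raw[1:]))
    return records


if __name__ == "__main__":
    sample = PHDR.pack(b"Piano", 0, 0, 0, 0, 0, 0) + PHDR.pack(b"EOP", 0, 0, 1, 0, 0, 0)
    for header in parse_phdr(sample):
        print(header)
```

The terminal record is kept in the result on purpose, because its bag_index is needed to know where the last real preset's bags end.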
The pbag sub-chunk contains information that indicates which modulators (pmod) and generators (pgen) a preset uses. A preset owns the pbag records from the index given by its own wPresetBagNdx up to the index given by the next preset's wPresetBagNdx minus 1. (Therefore, multiple pbags can be associated with one preset.)
struct pbag {
    WORD wGenNdx; // Index of the first pgen of this bag
    WORD wModNdx; // Index of the first pmod of this bag
};
Like wPresetBagNdx in phdr, wGenNdx and wModNdx must increase monotonically from the beginning of pbag.
The value at the end of pbag is as follows.
Variable name | value |
---|---|
wGenNdx | index at the end of pgen |
wModNdx | index at the end of pmod |
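To make the phdr-to-pbag association concrete, here is a small sketch that decodes pbag records and slices them per preset. The names `parse_pbag` and `bags_for_presets` are made up for this example, and the input list of wPresetBagNdx values is assumed to include the terminal EOP record.

```python
import struct

PBAG = struct.Struct("<HH")  # wGenNdx, wModNdx: 4 bytes per record


def parse_pbag(data: bytes):
    """Decode every pbag record as a (wGenNdx, wModNdx) tuple."""
    if len(data) % PBAG.size:
        raise ValueError("pbag size is not a multiple of 4 bytes")
    return list(PBAG.iter_unpack(data))


def bags_for_presets(bag_indices, bags):
    """Return, for each real preset, the list of pbag records it owns.

    bag_indices holds wPresetBagNdx of every phdr record, EOP included,
    so preset i owns bags[bag_indices[i]:bag_indices[i + 1]].
    """
    owned = []
    for first, after_last in zip(bag_indices, bag_indices[1:]):
        if after_last < first:
            raise ValueError("wPresetBagNdx must not decrease")
        owned.append(bags[first:after_last])
    return owned


if __name__ == "__main__":
    raw = PBAG.pack(0, 0) + PBAG.pack(2, 1) + PBAG.pack(3, 3)
    print(bags_for_presets([0, 2], parse_pbag(raw)))
```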
The pgen sub-chunk stores parameter information (generators) such as instruments, volumes, and filters associated with presets.
The contents are key-value pairs of parameter type and value.
struct pgen {
    WORD sfGenOper; // Parameter type
    WORD genAmount; // Parameter value
};
Note that genAmount holds a pair of bytes (a range), a SHORT, or a WORD depending on the parameter type. (The size is always that of a WORD.)
The value at the end of pgen is as follows.
Variable name | value |
---|---|
sfGenOper | 0 |
genAmount | 0 |
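The three readings of genAmount can be shown with a short decoder. This is a sketch under the assumption that only three generator types need special handling here: instrument (41) is read as a WORD index, keyRange (43) and velRange (44) are read as a low/high byte pair, and everything else is treated as a signed SHORT. The function name `decode_generator` is hypothetical.

```python
import struct

PGEN = struct.Struct("<HH")  # sfGenOper, genAmount (raw 16 bits)

GEN_INSTRUMENT = 41  # WORD: index of the instrument
GEN_KEY_RANGE = 43   # two bytes: lowest and highest key
GEN_VEL_RANGE = 44   # two bytes: lowest and highest velocity


def decode_generator(oper: int, raw: int):
    """Interpret the raw 16-bit genAmount according to the generator type."""
    if oper in (GEN_KEY_RANGE, GEN_VEL_RANGE):
        # Little endian: the low byte is stored first in the file.
        return (raw & 0xFF, raw >> 8)
    if oper == GEN_INSTRUMENT:
        return raw
    # Assumption: treat every other generator as a signed SHORT.
    return raw - 0x10000 if raw & 0x8000 else raw


if __name__ == "__main__":
    oper, raw = PGEN.unpack(bytes([43, 0, 0, 127]))
    print(oper, decode_generator(oper, raw))
```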
The pmod sub-chunk contains information that describes how the sound changes (for example in volume or filtering) in response to dynamic parameters such as MIDI control changes and velocity.
struct pmod {
    WORD sfModSrcOper;    // Modulation source parameter type (CC, velocity, etc.)
    WORD sfModDestOper;   // Type of parameter to operate on (volume, filter strength, etc.)
    SHORT modAmount;      // Operation amount
    WORD sfModAmtSrcOper; // Type of modulation source parameter that changes the modulation amount
    WORD sfModTransOper;  // Transform applied to the input amount (linear, curved)
};
The value at the end of pmod is as follows.
Variable name | value |
---|---|
sfModSrcOper | 0 |
sfModDestOper | 0 |
modAmount | 0 |
sfModAmtSrcOper | 0 |
sfModTransOper | 0 |
Looking at the relationship between each sub-chunk, it looks like this.
For example, in this figure preset 0 is associated with bag0 and bag1, bag0 is associated with gen0, gen1, and mod0, and bag1 is associated with gen2, mod1, and mod2. The generators used by the preset are therefore gen0 to gen2, and the modulators are mod0 to mod2.
(I have not read the specification closely, so this may not be exact, but it seems that generators and modulators are tied to each bag to produce the sound. For the purpose of just editing the file, you do not need to worry about it too much.)
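The example described above can be reproduced with plain index lists. The following sketch assumes the index values implied by that description (preset 0 starts at bag 0, bag0 starts at gen0/mod0, bag1 starts at gen2/mod1, and the terminal records point one past the end); `resolve_preset` is a hypothetical helper.

```python
def resolve_preset(preset_no, phdr_bag_ndx, pbag):
    """Return [(generator indices, modulator indices), ...] for one preset.

    phdr_bag_ndx: wPresetBagNdx of every phdr record, EOP included.
    pbag: (wGenNdx, wModNdx) of every pbag record, terminal record included.
    """
    first_bag = phdr_bag_ndx[preset_no]
    after_last_bag = phdr_bag_ndx[preset_no + 1]
    zones = []
    for bag in range(first_bag, after_last_bag):
        gen_start, mod_start = pbag[bag]
        gen_end, mod_end = pbag[bag + 1]  # the next record marks the end
        zones.append((list(range(gen_start, gen_end)),
                      list(range(mod_start, mod_end))))
    return zones


if __name__ == "__main__":
    phdr_bag_ndx = [0, 2]              # preset 0, then EOP
    pbag = [(0, 0), (2, 1), (3, 3)]    # bag0, bag1, terminal bag
    print(resolve_preset(0, phdr_bag_ndx, pbag))
```

Each record only stores a start index, so the end of its range is always read from the record that follows it. That is why every sub-chunk needs a terminal record.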
Source code: https://github.com/mmitti/sf2conv/blob/master/riff.py
I have created a script using Python and the struct module that can parse (part of) the structure of RIFF and SoundFont.
It supports reading and writing RIFF chunks, LIST chunks, and the phdr, pbag, pmod, and pgen sub-chunks. The other chunks are not edited, so they are written back exactly as they were read.
When a phdr record is deleted, the corresponding pbag, pmod, and pgen records must be deleted too, so the indices are updated at write time.
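One way to get consistent indices after a deletion is to keep the presets as a nested structure in memory and recompute every index while flattening it for output. The sketch below illustrates that idea only; it is not the implementation in riff.py, and the data layout (a list of `(name, zones)` pairs, each zone being a `(generators, modulators)` pair) is an assumption made for this example.

```python
def flatten(presets):
    """Rebuild index tables from nested presets.

    presets: list of (name, zones); zones: list of (gens, mods) lists.
    Returns (phdr_bag_ndx, pbag, pgen, pmod) with the terminal phdr and
    pbag entries appended. The terminal pgen/pmod records are left out
    here and would still have to be written by the caller.
    """
    phdr_bag_ndx, pbag, pgen, pmod = [], [], [], []
    for _name, zones in presets:
        phdr_bag_ndx.append(len(pbag))          # wPresetBagNdx
        for gens, mods in zones:
            pbag.append((len(pgen), len(pmod)))  # wGenNdx, wModNdx
            pgen.extend(gens)
            pmod.extend(mods)
    phdr_bag_ndx.append(len(pbag))               # EOP record
    pbag.append((len(pgen), len(pmod)))          # terminal bag
    return phdr_bag_ndx, pbag, pgen, pmod


if __name__ == "__main__":
    presets = [
        ("Junk", [(["g0", "g1"], ["m0"])]),
        ("Piano", [(["g2"], ["m1", "m2"])]),
    ]
    kept = [p for p in presets if p[0] != "Junk"]  # delete one preset
    print(flatten(kept))
```

Because every index is derived from the current list lengths, removing a preset automatically removes its bags, generators, and modulators and keeps the remaining indices monotonically increasing.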
(I implemented it while waiting around at driving school during the summer vacation, and looking at it now, it is messy. What were RiffRoot and Element supposed to be?)
Source code: https://github.com/mmitti/sf2conv/blob/master/main.py
I made a script that converts SoundFonts using the above RIFF parser (or rather, SoundFont parser) and outputs the tone map for Music Studio Producer and the configuration file for Virtual MIDI Synth.
You simply list in a JSON file the input SoundFonts, the tones to exclude, the replacement rules for tone names, and so on; the script then rewrites the SoundFont files accordingly and writes the tone names, banks, and program numbers to the tone map.
Some tones have strange names (such as ------), and some SoundFonts assign wind instruments to the piano program numbers; as I added configurable rules to deal with these, the number of settings grew. However, once the configuration file is written, tones are assigned to free slots and added to the DAW's tone list, which makes adding new SoundFonts easier.
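The idea of assigning tones to free slots can be sketched as follows. This is only an illustration of collision avoidance, not the allocation logic of main.py: the function names, the request tuple format, and the policy of taking the lowest free (bank, program) pair in melodic banks 0 to 127 are all assumptions.

```python
def next_free(taken):
    """Return the lowest (bank, program) pair not yet in `taken`."""
    for bank in range(128):          # melodic banks only (assumption)
        for program in range(128):
            if (bank, program) not in taken:
                return (bank, program)
    raise RuntimeError("no free slot left")


def assign_slots(requests, used=()):
    """Map tone names to unique (bank, program) slots.

    requests: iterable of (name, bank, program) in priority order.
    A tone keeps its requested slot when it is free and is otherwise
    moved to the lowest free slot.
    """
    taken = set(used)
    assigned = {}
    for name, bank, program in requests:
        slot = (bank, program)
        if slot in taken:
            slot = next_free(taken)
        taken.add(slot)
        assigned[name] = slot
    return assigned


if __name__ == "__main__":
    print(assign_slots([("Piano A", 0, 0), ("Piano B", 0, 0), ("Flute", 0, 1)]))
```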
This time, because my DTM environment broke, I had to look inside SoundFonts, which deepened my understanding a little. Since the script made adding SoundFonts easier, I immediately installed sinfon, which I had been wanting to use, and sequenced one song.
After the FPGA USB MIDI device I'm making is completed, I'd like to make a MIDI sound module that reads SoundFonts, so I may research and write about SoundFonts soon.
See you again