AI recognition of polymetric notation

The aim of the current project is to transcribe a musical input (given as a stream or a table of MIDI events) into the most comprehensible polymetric notation. A later extension would be to accept a sound signal as input.

Polymetric notation is one of the main features of the Bol Processor project, which is here considered as musicological research rather than software development. Other features include tonal modelling and the time-setting of sound-objects.

Polymetric expressions are a convenient product of rule-based music composition. They are represented as strings of symbols that embody a semi-lattice structure, here meaning a tree of sound-objects with ties. Polymetric notation, generally speaking, is a sequence of polymetric expressions that is processed as a single polymetric expression.


To achieve this, we plan to train a transformer (a type of neural network) on sets of polymetric expressions alongside their associated MIDI renderings, both as standard MIDI files and as tables of events. Since we require large datasets covering a wide range of musical styles, the plan is to create these sets from existing musical scores.

The process described on the Importing MusicXML scores page demonstrates the ability to "translate" any Western musical score (in its digital format) into a polymetric expression. This process is irreversible: polymetric expressions cannot be fully converted back into human-readable musical scores, even though they carry time structures that sound "musical". A typical example is the use of undetermined rests, which imply further processing of the structure in order to set exact durations.

Datasets that associate polymetric expressions with their MIDI renderings contain identical pitch and timing information on both sides. Since both descriptions are complete and non-redundant, the matching is a game of full information.

At a later stage, the transformer should also be able to handle streams of MIDI events created by humans or random processes, in which the timings do not follow a simple framework. A quantization of timings is therefore needed to adjust the stream before it is analysed. This quantization is already operational in the Bol Processor; see the Capture MIDI input page.
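The idea of timing quantization can be illustrated with a minimal sketch. This is not the Bol Processor's actual algorithm; the grid size and the four-column event format (time in ms, status byte, two data bytes, as in the tables of events shown later on this page) are assumptions for the example:

```python
def quantize(events, grid_ms=125):
    """Snap each MIDI event's timestamp to the nearest multiple of grid_ms.

    'events' is a list of (time_ms, status, data1, data2) tuples.
    """
    return [(round(t / grid_ms) * grid_ms, s, d1, d2) for (t, s, d1, d2) in events]

# A slightly uneven human performance: note-off at 498 ms, note-on at 503 ms
stream = [(0, 144, 60, 64), (498, 128, 60, 0), (503, 144, 62, 64)]
print(quantize(stream))
# → [(0, 144, 60, 64), (500, 128, 60, 0), (500, 144, 62, 64)]
```

After quantization, the two events at 498 ms and 503 ms land on the same 500 ms grid point, so they can be matched against a regular rhythmic framework.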

The creation of datasets

Datasets are created from musical works imported from MusicXML scores. For instance:

Click the CREATE DATASET FOR AI button.

A dataset is created and can be downloaded as a zip file:
set_-da.Hungarian_Rhapsody_OguzSirin.zip.

More sets can be created from the same musical work. Clicking the CREATE DATASET FOR AI button again would produce the same set, as it is built from a sequence of random numbers that is not reinitialised. To ensure that the new set is different, click the refresh button. When you download it, the system will automatically assign it a different name, e.g.:
set_-da.Hungarian_Rhapsody_OguzSirin (1).zip.

Datasets of minimised polymetric structures

The mini button next to the refresh button modifies the dataset so that all (eligible) polymetric structures are minimised. In a minimised structure, some rests with explicit durations are replaced with undetermined rests (notated '-') without any loss of timing accuracy. Read the Minimising a polymetric structure page for more details. These sets are smaller than the ones they are derived from, because only eligible structures are retained.

Once the samples in the training set have been minimised, the CREATE DATASET FOR AI button changes to CREATE minimised DATASET FOR AI.

Sets of minimised structures are downloaded with specific names that mention the 'mini' feature, such as:
set_-da.Hungarian_Rhapsody_OguzSirin_mini.zip

When training an AI, these sets should be used separately from the standard sets, because they are expected to teach the transformer to guess the proper locations of undetermined rests. Nevertheless, it could be interesting to compare related (standard versus minimised) samples in order to model the changes between standard and minimised polymetric structures.

The content of a dataset

The first dataset created in this demo contains 160 samples. The samples are text files named 1.txt, 2.txt, etc., associated with MIDI files 1.mid, 2.mid, etc., and with tables of events (see below) 1.tab, 2.tab, etc., and 1.tsv, 2.tsv, etc. A text file whose name ends in "_units.txt" contains all the sample text files, enabling the samples to be copied into a data project in a single piece. Clicking on the word "samples" displays it in a pop-up window.
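A downloaded dataset can thus be indexed by sample number, each number pointing to up to four files (.txt, .mid, .tab, .tsv). The following sketch is illustrative, not part of the Bol Processor; the grouping function and its name are assumptions:

```python
from collections import defaultdict

def group_samples(filenames):
    """Group a dataset's files by sample number: 1.txt, 1.mid, 1.tab and
    1.tsv all describe the same sample, under four different formats."""
    samples = defaultdict(dict)
    for name in filenames:
        stem, _, ext = name.rpartition(".")
        if stem.isdigit():                 # skip files such as "..._units.txt"
            samples[int(stem)][ext] = name
    return dict(samples)

files = ["1.txt", "1.mid", "1.tab", "1.tsv", "2.txt"]
print(group_samples(files))
# → {1: {'txt': '1.txt', 'mid': '1.mid', 'tab': '1.tab', 'tsv': '1.tsv'}, 2: {'txt': '2.txt'}}
```

For transformer training, each sample number then yields one input/output pair: the MIDI rendering (or table of events) as input and the polymetric expression in the .txt file as target.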

One of the text files contains, for instance:

{_tempo(4/3) _vel(85){2,{3/16,D6& E6& D7&}{3/16,&D6,&E6,&D7}{1/8,C6,C7}{3/8,B5,B6}{1/8,A5,A6}{3/8,G#5,G#6}{1/8,A5,A6}{3/8,B5,B6}{1/8,G#5,G#6},{3/8,G#3}{1/8,D4}{3/8,B4}{1/8,E4}{3/8,E3}{1/8,B3}{3/8,G#4}{1/8,E4}}} {_tempo(4/3) _vel(85){2,{3/8,A5,A6}{1/8,G#5,G#6}{3/8,A5,A6}{1/8,B5,B6}{3/8,C6,C7}{1/8,B5,B6}{3/8,C6,C7}{1/8,D6,D7},{3/8,A3}{1/8,D4}{3/8,C5}{1/8,A4}{3/8,E3}{1/8,C4}{3/8,A4}{1/8,E4}}}

This polymetric expression covers 2 measures of the musical score. Clicking the Refresh button (or reloading the page) slices the score randomly into chunks containing between 1 and 5 measures. The upper limit of 5 has been set arbitrarily and may be revised at a later date. Every time the Refresh button is clicked, a new slicing is created, resulting in a different dataset.

The idea is twofold: (1) the transformer should be trained to recognise sequences of polymetric expressions, and (2) tied notes may span more than one measure. A tied note in the above example is the pair "D6& &D6". Each sample contains only complete pairs of tied notes: any notes whose ties are not present in the sample are untied for the sake of consistency.
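The random slicing described above can be sketched as a simple partition of the measure numbers. This is a minimal illustration, not the Bol Processor's implementation; the function name and chunk representation are assumptions:

```python
import random

def slice_measures(n_measures, max_chunk=5):
    """Partition measures 1..n_measures into consecutive chunks of random
    length (1 to max_chunk measures each), one chunk per dataset sample."""
    chunks, start = [], 1
    while start <= n_measures:
        size = random.randint(1, min(max_chunk, n_measures - start + 1))
        chunks.append((start, start + size - 1))
        start += size
    return chunks

print(slice_measures(12))  # e.g. [(1, 3), (4, 4), (5, 9), (10, 12)]
```

Each (first, last) pair would become one sample; ties whose partner note falls outside the pair's range are the ones that must be untied.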

Note that the timings of the polymetric expression and the MIDI file are identical, because the metronome is automatically set to a default tempo of 60 beats per minute when the MIDI file samples are created. In the above phrase, the tempo is set to 4/3, which is equivalent to a metronome set at 80 bpm.
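The tempo arithmetic is simply the 60 bpm default multiplied by the _tempo() ratio. A one-line check (the function name is an assumption for illustration):

```python
from fractions import Fraction

DEFAULT_BPM = 60  # metronome default when the MIDI samples are created

def effective_bpm(tempo_ratio):
    """Metronome speed implied by a _tempo() ratio, relative to the 60 bpm default."""
    return DEFAULT_BPM * Fraction(tempo_ratio)

print(effective_bpm("4/3"))   # → 80  (the tempo of the sample above)
print(effective_bpm("9/10"))  # → 54
```

Using exact fractions rather than floats mirrors the way polymetric expressions themselves encode durations as ratios.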

Tables of events

The tables of events provide an easy-to-read representation of the contents of MIDI files.

Two kinds of text files are created.

1) The file with extension "tab" lists MIDI events in four columns. The leftmost column contains the timing of each event in milliseconds. For instance, the following text (from Couperin's Les Ombres errantes):

{_vel(64) {25/6, D5 1/6 {2, G5} F5, {25/6, C5 B4 4/3 B4& &B4 C5& &C5 C5&}, {25/6, G3 1/6 G4 {1/8, Ab4 G4} {7/8, Ab4} _legato(20) A4}}} {_vel(64) {4, {2, &C5 D5& &D5 _legato(20) C5} _legato(0) C5 {1, B4 {2, C5 B4 A4} B4}, F5 Eb5 {1/8, _legato(20) Eb5} {7/8, _legato(0) D5} 1, _legato(0) {1/8, B4 A4} {7/8, B4} C5 F4 G4}} {_tempo(9/10) _vel(64) {4, {G4, C5} Eb4 {1, D4 1} _tempo(26/27) Eb4, Eb4 {3, 1 C4& &C4 B3 1 C4}, C4 C3 G2 C3}}

which can be minimised to:

{_vel(64) {25/6,D5 {2,G5}F5,{25/6,C5 B4 B4& &B4 C5& &C5 C5&},{25/6,G3 G4 {1/8,Ab4 G4}{7/8,Ab4}_legato(20) A4}}}{_vel(64) {4,{2,&C5 D5& &D5 _legato(20) C5}_legato(0) C5 {1,B4 {2,C5 B4 A4}B4},F5 Eb5 {1/8,_legato(20) Eb5}{7/8,_legato(0) D5}1,_legato(0) {1/8,B4 A4}{7/8,B4}C5 F4 G4}}{_tempo(9/10) _vel(64) {4,{G4,C5}Eb4 {1,D4 1}_tempo(26/27) Eb4,Eb4 {3,1 C4& &C4 B3 - C4},C4 C3 G2 C3}}

is converted to:

0 144 74 64
0 144 72 64
0 144 55 64
500 128 72 0
500 144 71 64
1000 128 74 0
1000 128 55 0
1001 128 71 0
1165 144 79 64
1165 144 67 64
1666 144 71 64
etc.
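The four columns above can be decoded with a few lines of code. This sketch is an assumption about how one might read the format, based on the example shown (status 144 is a NoteOn, status 128 a NoteOff, per the standard MIDI convention):

```python
def parse_tab(text):
    """Decode a 'tab' table: one MIDI event per line, four integer columns
    (time in ms, status byte, data byte 1, data byte 2)."""
    events = []
    for line in text.strip().splitlines():
        t, status, data1, data2 = (int(x) for x in line.split())
        kind = "NoteOn" if status == 144 and data2 > 0 else "NoteOff"
        events.append({"time_ms": t, "kind": kind, "pitch": data1, "velocity": data2})
    return events

sample = """0 144 74 64
500 128 72 0
500 144 71 64"""
for event in parse_tab(sample):
    print(event)
```

In the example above, pitch 74 is D5 and pitch 72 is C5 (with middle C = 60), matching the opening notes of the Couperin fragment.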

The "tab" table contains only NoteOn/NoteOff events. However, the MIDI file may also contain controls, such as volume, panoramic (pan), pressure, pitchbend, modulation, pedal on/off, and more in the range 65 to 95, which are performance parameters in the Bol Processor score. A more detailed format of tables of events is therefore needed to list them.

2) The text file with extension "tsv" contains a detailed list of all the MIDI events carried by the MIDI file.

The following example is the opening of Liszt's La Campanella.

{_tempo(97/60) _vel(52) {3,2 {1/2,D#6,D#7} {1/2,D#6,D#7},1/2 {1/2,_switchon(64,1) D#4,D#5} {1/2,D#4,D#5} {1/2,D#4,D#5} 1}} {_tempo(97/60) _vel(52) {3,{1/2,D#6,D#7} 3/2 {1/2,D#6,D#7} {1/2,D#6,D#7},1/2 {1/2,_switchoff(64,1) _switchon(64,1) D#4,D#5} {1/2,D#4,D#5} {1/2,D#4,D#5} -}}

[Screenshot: the "tsv" table of events for this fragment]
Note that at time 1866 ms, the pedal is released and pushed again immediately, as indicated in the score below. The source is '0' by default and can be modified by _part() control parameters.
Note also that the pedal remains pressed at the end of this fragment, as indicated in the "tsv" table.

Since the Bol Processor can capture a MIDI stream and create a "tsv" table of events with quantized timings (see Capture MIDI input), training a transformer to convert these tables into polymetric structures could eliminate the need to decrypt MIDI files and quantize their timings.

A collection of datasets

Several collections of datasets can be found in the folder:
https://bolprocessor.org/misc/AI/samples

These can be used for training a transformer of your choice. We recommend creating more sets from more musical works to achieve better training. Run the Bol Processor and browse the imported MusicXML scores in the ctests/Imported_MusicXML workspace, or import more works from a dedicated MusicXML server.

A first test of the transformer's ability to "translate" MIDI files (or tables of events) into polymetric notation will be to retro-convert all the samples to polymetric expressions. Then, if this is successful, retro-convert the MIDI files of the complete musical works used for the training.

What follows…

Once the correct transformer type and the optimum dataset size have been identified, we will work on the following extensions:

  1. Translate MIDI streams produced by human interpreters, for which precise timing is not guaranteed.
  2. (Optional) Extend the AI recognition to the use of undetermined rests, as these yield simpler polymetric structures.
  3. Use a sound-to-MIDI converter to convert a sound input into polymetric notation.
  4. Assume that the musical input consists of fragments separated by silences. Convert these fragments into polymetric structures, then search for regularities in the rhythmic and tonal structures in order to create variations in the same style. Formal grammars will be employed for this purpose.

🛠 Work in progress! Please contact us to participate.
