Prang: A MIDI Score Sampler on the ESP32S3

Updated on 2022-05-16

Trigger tracks from Type 2 MIDI files to add to your recording or performance

Introduction

I originally made Prang as an FL Studio plugin but there were some limitations, and I really wanted a little hardware device to do what this otherwise required a PC for.

FL Studio

It's easier to show you what it does than tell you, so I'll do that. Forgive my bad timing. These microswitches are hard on the fingers. The action is really stiff, and the final version of Prang will not use them. Instead, it will take its input from another MIDI device. Since I am waiting on the parts for that, it is not implemented yet.

What I'm doing here is playing the individual tracks when I hold the corresponding button. The prang.mid file has 4 tracks in it, and I'm basically triggering them as I want them to compose the simple little beat.

This version uses buttons hardwired to the device, but the final version will have MIDI jacks on it and accept the signals from another MIDI device, like a keyboard. Each key can trigger a different track that way, and it's far more comfortable and responsive than cheap microswitches!

This project currently requires an ESP32-S3 DevkitC from Espressif, but will work on an ESP32-S2 with some modification. It currently emits MIDI over the secondary programmable USB port. I use MIDI-OX to monitor the incoming MIDI on my PC and route it to my sound card.

MIDI-OX

We will be using my pre-release SFX library to facilitate the MIDI functionality and my GFX library to handle the display.

SFX library GFX library

This project also uses ESPTinyUSB and ESP32Encoder under the /lib folder. These portions are not my work. I've distributed them with the project because, as far as I know, they are not in the PlatformIO repository, at least not the versions I need. I wrote a little shim to thunk my SFX midi_output interface to ESPTinyUSB, and that's the extent of my use of it. It handles the USB MIDI output. The other library handles the encoder knob.

ESPTinyUSB ESP32Encoder

Hardware Components

You'll need an ESP32S3 DevkitC board or similar. You'll need an ILI9341 display, although you can swap out for another 320x240 unit like an ST7789 by modifying the code a little. You'll need an encoder, an SD reader, and 4 microswitches. Wiring.txt has the details.

Background

This works a bit like the MIDI streamer I wrote about recently, but it's different enough that it deserves its own article. This device deals primarily with MIDI type 2 files. Currently, it treats all MIDI files as type 2 but that will change before the final version.

MIDI streamer

With MIDI type 2, each track is independent of the other tracks in the file, and may have its own tempo and patch settings, etc. Prang treats each track in the file as its own individually triggerable score. Currently, there are only 4 buttons but you could wire more. As I said, the final version will take MIDI input and so could theoretically support 128 tracks, memory permitting of course.

See the wiring.txt file for instructions on how to wire it. It's fairly simple, but there are still a lot of connections to be made regardless of how simple they are.

If you dig into the project, you'll note there's some nonsense afoot with the platformio.ini. The nonsense is there to wedge Arduino support for the S3 into PlatformIO. Just wave a dead chicken over it, and everything will be fine.

The trick here is to trigger tracks whenever the corresponding button gets pressed, and to stop it when the button gets released.

One wrinkle here is that the MIDI Type 2 specification suggests that each track can contain its own tempo. We use one midi_clock per track in order to facilitate this. The MIDI clock basically takes a timebase in ticks per quarter note and a tempo in beats per minute, and computes the duration of a "MIDI tick" in microseconds. It then uses that duration to call a callback function at each MIDI tick interval. The clock is cooperatively threaded, meaning it must be pumped using its update() method in order to function.
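To make the clock arithmetic concrete, here is a minimal sketch of the conversion. It is not the SFX midi_clock implementation. The function names are mine and purely illustrative, and it assumes the tempo is held as a "microtempo" (microseconds per quarter note), which is how MIDI files store it:

```cpp
#include <cstdint>

// Convert beats per minute to a microtempo (microseconds per quarter note).
// 120 BPM, the MIDI default, works out to 500000.
constexpr uint32_t bpm_to_microtempo(double bpm) {
    return static_cast<uint32_t>(60000000.0 / bpm);
}

// Duration of a single MIDI tick in microseconds.
// timebase is the file's ticks-per-quarter-note value.
constexpr double tick_duration_us(uint32_t microtempo, uint16_t timebase) {
    return static_cast<double>(microtempo) / timebase;
}
```

With a timebase of 500 at the default 120 BPM, tick_duration_us(bpm_to_microtempo(120.0), 500) comes to 1000 microseconds per tick.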

Another complication comes from stopping a track. MIDI signals a note through note on and note off messages. Consequently, if you stop playing after a note on has been sent but before its corresponding note off, the note will be left hanging, playing ad infinitum. This is obviously undesirable. One option would be to send a MIDI kill, but that would stop all playing notes, and we don't want to do that. Instead, we keep a 128-bit buffer with one bit per note, and we keep 16 of those, one for each channel. As we process MIDI messages, we use that array to record when a note starts or stops. When we stop a track, we use that data to determine which notes are still playing, and then we send a note off for each one.
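Here's a small standalone sketch of that bookkeeping. The note_bits name and its methods are made up for illustration, and it leaves out the MIDI message parsing entirely. It only shows one way to pack 16 channels of 128 notes into pairs of 64-bit words:

```cpp
#include <cstdint>

// Hypothetical illustration: one bit per note, 128 notes per channel,
// 16 channels. Notes 0-63 live in low[], notes 64-127 in high[].
struct note_bits {
    uint64_t low[16] = {};
    uint64_t high[16] = {};

    // Mark a note as playing (on == true) or stopped (on == false).
    void set(uint8_t channel, uint8_t note, bool on) {
        uint64_t& word = note < 64 ? low[channel & 15] : high[channel & 15];
        // The one must be 64 bits wide before shifting, or high bit
        // positions overflow a plain int.
        const uint64_t mask = uint64_t(1) << (note & 63);
        if (on) {
            word |= mask;
        } else {
            word &= ~mask;
        }
    }

    // Report whether a note is currently marked as playing.
    bool playing(uint8_t channel, uint8_t note) const {
        const uint64_t word = note < 64 ? low[channel & 15] : high[channel & 15];
        return ((word >> (note & 63)) & 1) != 0;
    }
};
```

The whole table is 256 bytes, which is cheap enough to keep one per track.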

Due to this being time critical, and due to the probably limited length of the tracks, we load each track into memory. We store the current input position, the current event, the clock, the buffer, the output (we may eventually support multiple) and some other information for each track.

Another thing we do a lot is juggle fonts. Particularly, the big messy looking font we use in red all over the place is only loaded when it's being used, and then it's unloaded to save RAM. On the other hand, the "system font" we use is more readable, but less fancy, and it's embedded as a header, again to spare RAM. I could have embedded both as headers, but I prefer to keep my assets on SPIFFS when it's realistic to do so in order to shorten upload times. I didn't want to have the system font ever be "unloaded" because you never know when you'll need to display some text throughout the app. Keeping a system font around makes sense.

All this juggling is necessary because I designed this to work without PSRAM, which I did because finding a release candidate S3 board where you can actually use the PSRAM is not easy. I ended up with two beta boards, and I've given up for now, because it's hard to tell which model/revision is in what stage of release.

Anyway, the core of the code is the midi_sampler class, which does most of what we went over above. We'll be spending most of the article on that, even though it's not really that complicated once you understand the individual components that make it go.

midi_sampler

As promised, we'll be covering midi_sampler. Its job is to load MIDI files, and then spin up tracks in a loop on demand, each at their own tempo and starting position within the performance.

The first thing we need to do is provide a facility to read a MIDI file. We could do this ourselves, or we could just use the SFX's midi_file class to do most of the work. We'll do that. The only downside is it requires a seekable stream to do it this way, but that's not a problem because we're not fetching MIDI files from over the Internet.

Once we have the offsets, we allocate enough memory to hold the track information for every track, and then for each track we allocate yet more memory and copy the track data into it. It's potentially several little allocations, which is maybe less than ideal depending on your situation, but this is the heart of the app, so we're going to give it priority anyway. Nothing else significant should load after it, so heap fragmentation shouldn't be a major issue. Even if it would be, this still behaves a lot better than the STL would, allocation-wise. We also allow you to pass in a custom allocator so you can use your own heap, or possibly PSRAM.
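As a rough idea of what a custom allocator pair could look like, here is a sketch that simply wraps malloc() and free() and counts live blocks. The names are hypothetical. The only thing taken from the project is the shape of the two functions, void*(size_t) and void(void*), which is what read() accepts:

```cpp
#include <cstddef>
#include <cstdlib>

// Hypothetical allocator pair for illustration. Counting live blocks
// makes it easy to check that an error path released everything.
static size_t live_blocks = 0;

void* counting_alloc(size_t size) {
    void* p = malloc(size);
    if (p != nullptr) {
        ++live_blocks;
    }
    return p;
}

void counting_free(void* p) {
    if (p != nullptr) {
        --live_blocks;
        free(p);
    }
}
```

You would pass these as the last two arguments to midi_sampler::read(). For PSRAM, the allocating function could presumably wrap ESP-IDF's heap_caps_malloc() with MALLOC_CAP_SPIRAM instead, but that only builds on the device, so it isn't shown.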

Anyway, as we do that, we have to initialize everything, prime our clock, and set the callback for it. Despite using several clocks, we only need one callback to handle them all.

sfx_result midi_sampler::read(stream* in,
                            midi_sampler* out_sampler,
                            void*(allocator)(size_t),
                            void(deallocator)(void*)) {
    if(in==nullptr ||
            out_sampler==nullptr||
            allocator==nullptr||
            deallocator==nullptr) {
        return sfx_result::invalid_argument;
    }
    if(!in->caps().read || !in->caps().seek) {
        return sfx_result::io_error;
    }
    midi_file file;
    sfx_result res = midi_file::read(in,&file);
    if(res!=sfx_result::success) {
        return res;
    }
    track *tracks =
        (track*)allocator(sizeof(track)*file.tracks_size);
    if(tracks==nullptr) {
        return sfx_result::out_of_memory;
    }
    for(int i = 0;i<file.tracks_size;++i) {
        track& t = tracks[i];
        t.buffer = nullptr;
    }
    for(int i = 0;i<file.tracks_size;++i) {
        track& t = tracks[i];
        midi_track& mt = file.tracks[i];
        t.buffer = (uint8_t*)allocator(mt.size);
        if(t.buffer==nullptr) {
            res= sfx_result::out_of_memory;
            goto free_all;
        }

        if(mt.offset!=in->seek(mt.offset) ||
                mt.size!=in->read(t.buffer,mt.size)) {
            res = sfx_result::io_error;
            goto free_all;
        }
        t.tempo_multiplier = 1.0;
        t.base_microtempo = 500000;
        t.clock.timebase(file.timebase);
        t.clock.microtempo(500000);
        t.clock.tick_callback(callback,&t);
        t.buffer_size = mt.size;
        t.buffer_position = 0;
        t.event.message.status = 0;
        t.event.absolute = 0;
        t.output = nullptr;
    }
    out_sampler->m_allocator = allocator;
    out_sampler->m_deallocator = deallocator;
    out_sampler->m_tracks = tracks;
    out_sampler->m_tracks_size = file.tracks_size;
    return sfx_result::success;
free_all:
    if(tracks!=nullptr) {
        for(size_t i=0;i<file.tracks_size;++i) {
            track& t = tracks[i];
            if(t.buffer!=nullptr)  {
                deallocator(t.buffer);
            }
        }
        deallocator(tracks);
    }
    return res;
}

Finally, if everything went off without a hitch, we assign the memory we just filled to the midi_sampler that was passed in. The last bit of the routine is error handling which just deallocates any allocations before returning an error.

We've filled some track structures but haven't talked about them, so let's segue into that briefly:

struct track {
    sfx::midi_clock clock;
    sfx::midi_event_ex event;
    note_tracker tracker;
    int32_t base_microtempo;
    float tempo_multiplier;
    uint8_t* buffer;
    size_t buffer_size;
    size_t buffer_position;
    sfx::midi_output* output;
};

As I mentioned, we keep a midi_clock for each track. The next member is the last event we pulled out of the stream, along with its absolute position in MIDI ticks. Next we have the note_tracker which handles tracking each pressed note so that note offs can be sent as needed when the track playback is stopped. The base_microtempo is the microtempo currently reported off of the track, before the tempo multiplier is applied to it. The tempo_multiplier indicates how much faster or slower the actual playback is compared to the base microtempo. The buffer and buffer_size fields indicate the buffer for our track data, while buffer_position indicates the input position within the track data. Finally, the output indicates the MIDI device that the messages will be sent to.

Let's cover some of the non-trivial stuff here, like start():

sfx_result midi_sampler::start(size_t index,
        unsigned long long advance) {
    // index is unsigned, so only the upper bound needs checking
    if(index>=m_tracks_size) {
        return sfx_result::invalid_argument;
    }
    track& t = m_tracks[index];
    stop(index);
    if(advance) {
        const_buffer_stream cbs(t.buffer,t.buffer_size);
        t.clock.elapsed(advance);
        t.event.message.status = 0;
        t.event.absolute = 0;
        while(t.event.absolute<advance) {
            size_t sz = midi_stream::decode_event(true,&cbs,&t.event);
            if(sz==0) {
                break;
            }
            t.buffer_position+=sz;
            if(t.event.message.status==0xFF &&
                    t.event.message.meta.type==0x51) {
                int32_t mt = (t.event.message.meta.data[0] << 16) |
                    (t.event.message.meta.data[1] << 8) |
                    t.event.message.meta.data[2];
                // update the clock microtempo
                t.base_microtempo = mt;
                t.clock.microtempo(mt/t.tempo_multiplier);
            } else if(t.output!=nullptr) {
                switch(t.event.message.type()) {
                    case midi_message_type::program_change:
                    case midi_message_type::control_change:
                    case midi_message_type::system_exclusive:
                    case midi_message_type::end_system_exclusive:
                        t.output->send(t.event.message);
                    break;
                default:
                    break;
                }
            }
        }
    }
    t.clock.start();
    return sfx_result::success;
}

If it weren't for the "advance" portion, this method would be trivial. Advance helps us quantize. It allows you to start the playback a specified number of ticks into the track. We do so by setting the clock's elapsed tick count manually, and then we read all the events that come before the advance point. We throw most of them away, except for tempo change, program change, control change, and system exclusive messages. We send or process those to keep our loop cohesive. If we didn't process those, the sound or playback speed might be significantly different than what we expect.
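The tempo handling in the middle of that loop is worth isolating. A MIDI "set tempo" meta event (type 0x51) carries three data bytes holding the microseconds per quarter note, most significant byte first. This is a standalone sketch of that decode, with a helper name of my own choosing rather than anything from SFX:

```cpp
#include <cstdint>

// Decode the three data bytes of a set tempo meta event (FF 51 03 tt tt tt)
// into microseconds per quarter note. The value is big-endian.
constexpr uint32_t decode_microtempo(const uint8_t* data) {
    return (static_cast<uint32_t>(data[0]) << 16) |
           (static_cast<uint32_t>(data[1]) << 8) |
            static_cast<uint32_t>(data[2]);
}
```

The bytes 07 A1 20 decode to 500000, which is the 120 BPM default.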

The stop() method is simpler:

sfx_result midi_sampler::stop(size_t index) {
    // index is unsigned, so only the upper bound needs checking
    if(index>=m_tracks_size) {
        return sfx_result::invalid_argument;
    }
    track& t = m_tracks[index];
    t.clock.stop();
    t.buffer_position = 0;
    t.event.absolute = 0;
    t.event.delta = 0;
    t.event.message.~midi_message();
    t.event.message.status = 0;
    t.base_microtempo = 500000;
    t.clock.microtempo(t.base_microtempo/t.tempo_multiplier);
    if(t.output!=nullptr) {
        t.tracker.send_off(*t.output);
    }
    return sfx_result::success;
}

All we do there is stop the clock, set everything to its initial state and then send note offs to any indicated output.

Now let's get to the magic, which is all handled in callback(). What we do here is play the current event if its position is the current position or earlier, and keep fetching events until that's not the case. When we get an event, we update the tempo if it's a tempo change, otherwise, if it's not an empty message (status of zero), we pass it to the note tracker and then the output. Then we go to read the next event, if there's more available on the stream. If not, we reset the track to its initial state and restart the clock, looping it:

void midi_sampler::callback(uint32_t pending,
        unsigned long long elapsed,
        void* pstate) {
    track *t = (track*)pstate;
    while(t->event.absolute<=elapsed) {
        if (t->event.message.type() ==
                midi_message_type::meta_event) {
            // if it's a tempo event update the clock tempo
            if(t->event.message.meta.type == 0x51) {
                int32_t mt = (t->event.message.meta.data[0] << 16) |
                    (t->event.message.meta.data[1] << 8) |
                    t->event.message.meta.data[2];
                // update the clock microtempo
                t->base_microtempo = mt;
                t->clock.microtempo(mt/t->tempo_multiplier);
            }
        }
        else if(t->event.message.status!=0) {
            t->tracker.process(t->event.message);
            if(t->output!=nullptr) {
                t->output->send(t->event.message);
            }
        }
        bool restarted = false;
        if(t->buffer_position>=t->buffer_size) {
            t->buffer_position = 0;
            t->event.absolute = 0;
            t->event.delta = 0;
            t->event.message.~midi_message();
            t->event.message.status=0;
            t->clock.stop();
            t->clock.microtempo(t->base_microtempo/t->tempo_multiplier);
            t->clock.start();
            restarted = true;
            if(t->output!=nullptr) {
                t->tracker.send_off(*t->output);
            }
        }

        const_buffer_stream cbs(t->buffer,t->buffer_size);
        cbs.seek(t->buffer_position);
        size_t sz = midi_stream::decode_event(true,(stream*)&cbs,&t->event);
        t->buffer_position+=sz;
        if(sz==0) {
            t->clock.stop();
        }
        if(restarted) {
            break;
        }
    }
}

note_tracker

Let's talk about the note_tracker. There isn't a lot to it. We keep two 64-bit unsigned integers around to hold 128 bits - 1 bit for each note. We keep 16 of those - one for each MIDI channel. We handle incoming messages by looking for note offs and note ons, and then setting or clearing that bit at that channel as necessary. When we get a stop request, we go through all of our bits and send a note off for each note that is currently active:

#include "note_tracker.hpp"
#include <string.h>
note_tracker::note_tracker() {
    memset(m_notes,0,sizeof(m_notes));
}
void note_tracker::process(const sfx::midi_message& message) {
    sfx::midi_message_type t = message.type();
    if(t==sfx::midi_message_type::note_off ||
            (t==sfx::midi_message_type::note_on &&
                    message.msb()==0)) {
        uint8_t c = message.channel();
        uint8_t n = message.lsb();
        if(n<64) {
            // shift a 64-bit one: shifting a plain int by 31 or more
            // bits is undefined and would corrupt the upper notes
            const uint64_t mask = ~(uint64_t(1)<<n);
            m_notes[c].low&=mask;
        } else {
            const uint64_t mask = ~(uint64_t(1)<<(n-64));
            m_notes[c].high&=mask;
        }
    } else if(t==sfx::midi_message_type::note_on) {
        uint8_t c = message.channel();
        uint8_t n = message.lsb();
        if(n<64) {
            const uint64_t set = uint64_t(1)<<n;
            m_notes[c].low|=set;
        } else {
            const uint64_t set = uint64_t(1)<<(n-64);
            m_notes[c].high|=set;
        }
    }
}
void note_tracker::send_off(sfx::midi_output& output) {
    for(int i = 0;i<16;++i) {
        for(int j=0;j<64;++j) {
            const uint64_t mask = uint64_t(1)<<j; // 64-bit shift, j goes up to 63
            if(m_notes[i].low&mask) {
                sfx::midi_message msg;
                msg.status =
                    uint8_t(uint8_t(sfx::midi_message_type::note_off)|uint8_t(i));
                msg.lsb(j);
                msg.msb(0);
                output.send(msg);
            }
        }
        for(int j=0;j<64;++j) {
            const uint64_t mask = uint64_t(1)<<j; // 64-bit shift, j goes up to 63
            if(m_notes[i].high&mask) {
                sfx::midi_message msg;
                msg.status =
                    uint8_t(uint8_t(sfx::midi_message_type::note_off)|uint8_t(i));
                msg.lsb(j+64);
                msg.msb(0);
                output.send(msg);
            }
        }
    }
    memset(m_notes,0,sizeof(m_notes));
}

midi_esptinyusb.cpp

This is basically a shim to thunk calls to an SFX midi_output class to the ESPTinyUSB library for transmission of MIDI over USB. All it does is take a message apart and turn it into bytes for sending over the wire:

MIDIusb midi_esptinyusb_midi;
bool midi_esptinyusb_initialized = false;
bool midi_esptinyusb::initialized() const {
    return midi_esptinyusb_initialized;
}
sfx::sfx_result midi_esptinyusb::initialize(const char* device_name) {
    if(!midi_esptinyusb_initialized) {
        midi_esptinyusb_initialized=true;
#ifdef CONFIG_IDF_TARGET_ESP32S3
        midi_esptinyusb_midi.setBaseEP(3);
#endif
        char buf[256];
        strncpy(buf,device_name==nullptr?"SFX MIDI Out":device_name,255);
        // strncpy does not terminate the string if the name fills the buffer
        buf[255]='\0';
        midi_esptinyusb_midi.begin(buf);
#ifdef CONFIG_IDF_TARGET_ESP32S3
        midi_esptinyusb_midi.setBaseEP(3);
#endif
        delay(1000);
        midi_esptinyusb_initialized = true;
    }
    return sfx::sfx_result::success;
}
sfx::sfx_result midi_esptinyusb::send(const sfx::midi_message& message) {
    sfx::sfx_result rr = initialize();
    if(rr!=sfx::sfx_result::success) {
        return rr;
    }
    uint8_t buf[3];
    if(message.type()==sfx::midi_message_type::meta_event &&
            (message.meta.type!=0 || message.meta.data!=nullptr)) {
        return sfx::sfx_result::success;
    }
    if (message.type()==sfx::midi_message_type::system_exclusive) {
        // send a sysex message
        uint8_t* p = (uint8_t*)malloc(message.sysex.size + 1);
        if (p != nullptr) {
            *p = message.status;
            if(message.sysex.size) {
                memcpy(p + 1, message.sysex.data, message.sysex.size);
            }
            tud_midi_stream_write(0, p, message.sysex.size + 1);
            // write the end sysex
            *p=0xF7;
            tud_midi_stream_write(0, p, 1);
            free(p);
        }
    } else {
        // send a regular message
        // build a buffer and send it using raw midi
        buf[0] = message.status;
        switch (message.wire_size()) {
            case 1:
                tud_midi_stream_write(0, buf, 1);
                break;
            case 2:
                buf[1] = message.value8;
                tud_midi_stream_write(0, buf, 2);
                break;
            case 3:
                buf[1] = message.lsb();
                buf[2] = message.msb();
                tud_midi_stream_write(0, buf, 3);
                break;
            default:
                break;
        }
    }
    return sfx::sfx_result::success;
}

That just handles the gritty details of activating MIDI over USB and then as I said, packaging the message for wire transport.
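For reference, this is roughly what "packaging for the wire" means for an ordinary channel message. The sketch below is independent of SFX and ESPTinyUSB, and the helper names are hypothetical. It relies only on the standard MIDI layout, where the high nibble of the status byte is the message type and the low nibble is the channel:

```cpp
#include <cstddef>
#include <cstdint>

// Write a note off for the given channel (0-15) and note (0-127)
// into out, which must hold at least 3 bytes. Returns the byte count.
inline size_t encode_note_off(uint8_t channel, uint8_t note, uint8_t* out) {
    out[0] = uint8_t(0x80 | (channel & 0x0F)); // status: note off + channel
    out[1] = uint8_t(note & 0x7F);             // note number
    out[2] = 0;                                // release velocity
    return 3;
}

// Number of bytes a channel message occupies on the wire,
// including the status byte. Returns 0 for anything else.
inline size_t channel_message_size(uint8_t status) {
    switch (status & 0xF0) {
        case 0xC0: // program change
        case 0xD0: // channel pressure
            return 2;
        case 0x80: // note off
        case 0x90: // note on
        case 0xA0: // key pressure
        case 0xB0: // control change
        case 0xE0: // pitch bend
            return 3;
        default:
            return 0; // system messages are out of scope for this sketch
    }
}
```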

main.cpp

This file is where we tie everything together. It's kind of a jungle, because the abstractions necessary to make it clean won't pay for themselves in the very last mile. What I mean is, you can abstract much of your app, as we have above, but your main.cpp, your "glue", is always going to be glue. You can make it into pretty glue, but it's still glue. In our case, we have forgone any sort of higher level UI widgets which could have cleaned main.cpp up substantially, but not without a lot more effort than is worthwhile for the amount of order it would provide.

One blessing of main is that it's basically all sequential. As the app flows forward, the code moves simply from top to bottom without jumping all over the place. There's no central dispatcher like there is in many of my apps. Instead of multiple screens, it's basically just the one screen, even though it changes periodically.

We've also hijacked setup() instead of using loop(). One of the reasons for this is the setup() task gets significantly more stack space allocated to it than the loop() task. Another reason has to do with the frankly poor design of the Arduino framework in this instance. Separating loop() as a separate routine forces any shared data to be held as globals. Globals are fine, and I'm not complaining about using a bunch of globals here, but it complicates initialization substantially, particularly since many of the calls necessary to initialize those globals cannot be made until after setup() begins! This is why Arduino classes almost all have a begin() method. There are other ways to work around this issue, such as creating a struct to hold shared data, initializing it in setup() using malloc(), and holding a pointer to it as a global, but that doesn't solve the stack space issue.

We have some boilerplate definitions at the top, some of which will look familiar if you've used GFX before. That's mostly just to establish our ILI9341 and SPI bus connections.

Let's start with the more interesting stuff that follows. First, our globals:

lcd_t lcd;
ESP32Encoder encoder;
int64_t encoder_old_count;
float tempo_multiplier;
midi_sampler sampler;
uint8_t* prang_font_buffer;
size_t prang_font_buffer_size;
midi_esptinyusb out;
int switches[4];
int follow_track = -1;
RingbufHandle_t signal_queue;
TaskHandle_t display_task;

The first one is our display. We draw to this.

The second one is our encoder driver. After that, we keep the old count reported by the encoder driver so we detect when it has changed.

We also keep the tempo_multiplier, because we need a pointer to it that will always be valid.

Next is our sampler. This is basically the heart of the app logic.

After that, we have a couple of members that hold our memory buffer for the big messy font we use in red. This font is loaded from SPIFFS into memory whenever it is needed, so this buffer potentially gets allocated and freed more than once over the life of the app.

After that, we have our midi_esptinyusb MIDI output driver. This is where the MIDI data gets sent.

Now we have an array for our switches. These are the pushbuttons we use to control which tracks are playing and when. This array holds the old/current switch value so we can detect a change.

After that, we have follow_track. This is a little weird. Basically, it's for quantization. Since we quantize to the nearest beat, we need a reference track to align the beats of other tracks to. If no tracks are playing, this value is -1, and the first track to be played becomes the reference track that the others sync to. When the reference track stops playing, the reference switches to one of the other playing tracks. This way, we always have a tick reference to synchronize beats to whenever anything is playing.
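This section doesn't show the quantization arithmetic itself, so the following is only a guess at the kind of calculation involved, not the project's actual logic. It assumes the reference track and the new track share a timebase, and the function names are invented for the example:

```cpp
#include <cstdint>

// Ticks elapsed since the most recent beat on the reference track.
// timebase is ticks per quarter note, so one beat is timebase ticks.
constexpr unsigned long long ticks_into_beat(unsigned long long elapsed,
                                             uint16_t timebase) {
    return elapsed % timebase;
}

// Position of the beat boundary nearest to elapsed, in ticks.
constexpr unsigned long long nearest_beat(unsigned long long elapsed,
                                          uint16_t timebase) {
    return ((elapsed + timebase / 2) / timebase) * timebase;
}
```

With a timebase of 480 and a reference track 1000 ticks in, the last beat was 40 ticks ago and the nearest beat sits at tick 960. A value like that 40 could then be handed to start() as the advance so the new track lands on the same grid.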

After that, we keep a ring buffer signal_queue. This is a thread safe message passing scheme used to update the display, since we do that on the second core to avoid interrupting MIDI playback.

The display_task is the actual task used for updating the display as per above.

After the globals, we have a draw_error() helper method that simply draws an error message to the screen using the big messy font if available, otherwise using the system font, which is always available.

Now onto setup()!

First, we initialize our buttons:

pinMode(P_SW1,INPUT_PULLDOWN);
pinMode(P_SW2,INPUT_PULLDOWN);
pinMode(P_SW3,INPUT_PULLDOWN);
pinMode(P_SW4,INPUT_PULLDOWN);
memset(switches,0,sizeof(switches));

Next, we initialize the encoder:

ESP32Encoder::useInternalWeakPullResistors=UP;
encoder_old_count = 0;
encoder.attachFullQuad(ENC_CLK,ENC_DT);

Now, we initialize our Serial, display, SPIFFS and SD:

Serial.begin(115200);
SPIFFS.begin();
// ensure the SPI bus is initialized
lcd.initialize();
SD.begin(SD_CS,spi_container<0>::instance());

Since the SD reader piggybacks the same bus used by the display, we initialize the display first to ensure that the SPI gets initialized to the proper pins. We then pass the SPI instance for SPI host zero - the same one used by the display - to the SD when we initialize it.

Set our tempo multiplier to 1.0:

tempo_multiplier = 1.0;

Create the ring buffer:

signal_queue = xRingbufferCreate(sizeof(float) *
    8 + (sizeof(float) - 1),
    RINGBUF_TYPE_NOSPLIT);
if(signal_queue==nullptr) {
    Serial.println("Unable to create signal queue");
    while(true);
}

This is our messaging system we use to allow the main task/thread to update the display task/thread. The size computation is a guess on my part, because the documentation isn't clear on how much extra space is needed to avoid a split. This formula has served me well in the past, though.
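For what it's worth, my reading of the ESP-IDF ring buffer documentation is that each item in a no-split buffer carries an 8-byte header and has its payload rounded up to a multiple of 4 bytes. Treat both numbers as assumptions to verify against the docs for your IDF version. Under those assumptions, the sizing works out like this (the helper names are hypothetical):

```cpp
#include <cstddef>

// Bytes one item is assumed to occupy in a no-split ring buffer:
// an 8-byte header plus the payload rounded up to a multiple of 4.
constexpr size_t nosplit_item_bytes(size_t payload) {
    return 8 + ((payload + 3) / 4) * 4;
}

// Bytes needed to hold count items of the same payload size at once.
constexpr size_t nosplit_buffer_bytes(size_t payload, size_t count) {
    return nosplit_item_bytes(payload) * count;
}
```

If that reading is right, each queued 4-byte float costs 12 bytes, so room for eight pending floats would want 96 bytes. The display task drains the queue quickly, so a smaller buffer still works in practice.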

Next, we create the display task using a "flat" lambda. In the lambda, we just try to pull a message out of the queue and if we got one, we update the tempo display.

if(pdPASS!=xTaskCreatePinnedToCore([](void* state){
    float scale = Telegrama_otf.scale(20);
    while(true) {
        size_t fs=sizeof(float);
        float* pf=(float*)xRingbufferReceive(signal_queue,&fs,0);
        if(nullptr!=pf) {
            float f = *pf;
            vRingbufferReturnItem(signal_queue,pf);
            char text[64];
            sprintf(text,"tempo x%0.1f",f);
            ssize16 sz = Telegrama_otf.measure_text(ssize16::max(),
                         spoint16::zero(),text,scale);
            srect16 rect = sz.bounds().center_horizontal
                           ((srect16)lcd.bounds()).offset(0,3);
            draw::filled_rectangle(lcd,srect16
            (0,rect.y1,lcd.dimensions().width-1,rect.y2),color_t::white);
            draw::text(lcd,rect,spoint16::zero(),text,Telegrama_otf,
                       scale,color_t::black,color_t::white,false);
        }
    }
},"Display Task",4000,nullptr,0,&display_task,1-xPortGetCoreID())) {
    Serial.println("Unable to create display task");
    while(true);
}

You may have noticed that the tempo is a multiplier rather than a straight beats-per-minute figure. The reason is that each track can have its own tempo, so we don't necessarily have one tempo we can display. We do apply the same multiplier to every track, at least in the current rendition, so we can display that figure instead. When I create these MIDI files, I strip any tempo messages from the file so that they all baseline at 120.0 BPM, the MIDI default. You can then adjust that using the encoder.
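To see how the multiplier relates to an actual tempo, here is a small sketch of the arithmetic. It mirrors the base_microtempo/tempo_multiplier division used in the sampler code, but the function names are mine:

```cpp
#include <cstdint>

// The microtempo handed to the clock: the track's own microtempo
// divided by the multiplier, so a larger multiplier plays faster.
constexpr int32_t scaled_microtempo(int32_t base_microtempo, float multiplier) {
    return static_cast<int32_t>(base_microtempo / multiplier);
}

// The resulting tempo in beats per minute.
constexpr double effective_bpm(int32_t base_microtempo, float multiplier) {
    return 60000000.0 * multiplier / base_microtempo;
}
```

At the 500000 baseline, a multiplier of 1.5 gives 180 BPM and a multiplier of 0.5 gives 60 BPM.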

After that, we have the main application logic, starting with the splash screen. We mark it with a restart label so we can jump back to the beginning in case we need to, such as when there is an error.

After the label, we display the splash screen by drawing the MIDI JPG off of SPIFFS, and then loading the messy font and writing the title:

lcd.fill(lcd.bounds(),color_t::white);
File file = SPIFFS.open("/MIDI.jpg","rb");
draw::image(lcd,rect16(0,0,319,144).center_horizontal(lcd.bounds()),&file);
file.close();
file = SPIFFS.open("/PaulMaul.ttf","rb");
file.seek(0,SeekMode::SeekEnd);
size_t sz = file.position()+1;
file.seek(0);
prang_font_buffer=(uint8_t*)malloc(sz);
if(prang_font_buffer==nullptr) {
    Serial.println("Out of memory loading font");
    while(true);
}
file.readBytes((char*)prang_font_buffer,sz);
prang_font_buffer_size = sz;
file.close();
const_buffer_stream fntstm(prang_font_buffer,prang_font_buffer_size);

open_font prangfnt;
gfx_result gr = open_font::open(&fntstm,&prangfnt);
if(gr!=gfx_result::success) {
    Serial.println("Error loading font.");
    while(true);
}
const char* title = "pr4nG";
float title_scale = prangfnt.scale(200);
ssize16 title_size = prangfnt.measure_text(ssize16::max(),
                                        spoint16::zero(),
                                        title,
                                        title_scale);
draw::text(lcd,
        title_size.bounds().center_horizontal((srect16)lcd.bounds()).offset(0,45),
        spoint16::zero(),
        title,
        prangfnt,
        title_scale,
        color_t::red,
        color_t::white,
        true,
        true);

Now, while the splash screen is displaying, we first check to make sure the user has an SD card inserted, or prompt for one. Using a nasty little trick, we can automatically continue once it's inserted. We spin a loop and in it, we deinitialize and reinitialize the SD card, trying to read the root directory until it succeeds.

Anyway, once we successfully read the SD directory, we walk each file in turn, first counting the actual MIDI files.

Then we do it again. The first time through is just so we can get a count to see how much memory we need to allocate to hold the file list. The second time is so we can load the actual data into the memory we just allocated.

if(SD.cardSize()==0) {
    draw_error("insert SD card");
    while(true) {
        SD.end();
        SD.begin(SD_CS,spi_container<0>::instance());
        file=SD.open("/","r");
        if(!file) {
            delay(1);
        } else {
            file.close();
            break;
        }
    }

    free(prang_font_buffer);
    goto restart;
}
file = SD.open("/","r");
size_t fn_count = 0;
size_t fn_total = 0;
while(true) {
    File f = file.openNextFile();
    if(!f) {
        break;
    }
    if(!f.isDirectory()) {
        const char* fn = f.name();
        size_t fnl = strlen(fn);
        // each length guard must cover all three comparisons for its extension
        if((fnl>5 && (0==strcmp(".midi",fn+fnl-5) ||
                    0==strcmp(".MIDI",fn+fnl-5) ||
                    0==strcmp(".Midi",fn+fnl-5))) ||
        (fnl>4 && (0==strcmp(".mid",fn+fnl-4) ||
                    0==strcmp(".MID",fn+fnl-4) ||
                    0==strcmp(".Mid",fn+fnl-4)))) {
            ++fn_count;
            fn_total+=fnl+1;
        }
    }
    f.close();
}
file.close();
char* fns = (char*)malloc(fn_total+1);
if(fns==nullptr) {
    draw_error("too many files");
    while(1);
}
// skip the first byte so there's room to prepend '/' later
++fns;
midi_file* mfs = (midi_file*)malloc(fn_count*sizeof(midi_file));
if(mfs==nullptr) {
    draw_error("too many files");
    while(1);
}
file = SD.open("/","r");
char* str = fns;
int fi = 0;
while(true) {
    File f = file.openNextFile();
    if(!f) {
        break;
    }
    if(!f.isDirectory()) {
        const char* fn = f.name();
        size_t fnl = strlen(fn);
        // each length guard must cover all three comparisons for its extension
        if((fnl>5 && (0==strcmp(".midi",fn+fnl-5) ||
                    0==strcmp(".MIDI",fn+fnl-5) ||
                    0==strcmp(".Midi",fn+fnl-5))) ||
        (fnl>4 && (0==strcmp(".mid",fn+fnl-4) ||
                    0==strcmp(".MID",fn+fnl-4) ||
                    0==strcmp(".Mid",fn+fnl-4)))) {
            memcpy(str,fn,fnl+1);
            str+=fnl+1;
            file_stream ffs(f);
            midi_file::read(&ffs,&mfs[fi]);
            ++fi;
        }
    }
    f.close();
}
file.close();
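
The extension test appears twice above, once per pass. As a rough sketch of how it could be factored out, here is a standalone helper. The names ends_with_nocase and is_midi_name are hypothetical and not part of the project, and unlike the original this version ignores ASCII case entirely, so it also accepts spellings like ".mId".

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <cstring>

// Hypothetical helper (not in the original project): returns true when
// name ends with ext, ignoring ASCII case. Like the original test, it
// requires at least one character in front of the extension.
static bool ends_with_nocase(const char* name, const char* ext) {
    assert(name != nullptr && ext != nullptr);
    std::size_t nl = std::strlen(name);
    std::size_t el = std::strlen(ext);
    if (nl <= el) {
        return false;
    }
    const char* tail = name + nl - el;
    for (std::size_t i = 0; i < el; ++i) {
        if (std::tolower((unsigned char)tail[i]) !=
            std::tolower((unsigned char)ext[i])) {
            return false;
        }
    }
    return true;
}

// Hypothetical helper: true for names ending in ".mid" or ".midi".
static bool is_midi_name(const char* name) {
    return ends_with_nocase(name, ".mid") || ends_with_nocase(name, ".midi");
}
```

With something like this, both passes could test is_midi_name(fn) instead of repeating the strcmp chain.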

Basically, for each MIDI file on the SD, we hold a midi_file and the filename. We also guarantee that the byte just before any filename is safe to write to: for the first name that's the extra byte we allocated, and for the rest it's the previous name's terminator. Before we open the selected file, we simply set that byte to '/' to make a path out of the filename. This avoids a string copy. It's minor, but it's more about not having to write as much code.
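
To make the layout concrete, here is a minimal standalone sketch of the same idea using plain malloc and no SD card. The helper names pack_names and as_root_path are made up for illustration. The block has one spare byte up front, followed by the names packed back to back with their terminators.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Hypothetical helper: packs count names into one heap block laid out as
//   [spare byte][name0]\0[name1]\0...
// and returns a pointer to name0, one byte past the start of the block.
// Returns nullptr if the allocation fails. Release with free(result - 1).
static char* pack_names(const char* const* names, std::size_t count) {
    assert(names != nullptr || count == 0);
    std::size_t total = 0;
    for (std::size_t i = 0; i < count; ++i) {
        total += std::strlen(names[i]) + 1;  // +1 for each terminator
    }
    char* block = (char*)std::malloc(total + 1);  // +1 for the spare byte
    if (block == nullptr) {
        return nullptr;
    }
    char* out = block + 1;
    for (std::size_t i = 0; i < count; ++i) {
        std::size_t len = std::strlen(names[i]) + 1;
        std::memcpy(out, names[i], len);
        out += len;
    }
    return block + 1;
}

// Hypothetical helper: turns a packed name into a root path in place by
// writing '/' into the byte before it. For any name but the first, that
// byte is the previous name's terminator, so the list should not be
// walked again afterwards.
static char* as_root_path(char* name) {
    --name;
    *name = '/';
    return name;
}
```

Note that the pointer handed back is not the one malloc returned, which is why the matching free has to subtract one again.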

The reason we load the entire list into RAM is that your encoder finger is quite a bit faster than the SD reader is. We want the display to be responsive, so keeping the list in RAM facilitates that. The reason we load each file rather than just retrieving the name is so we can display MIDI type 1 files in blue and MIDI type 2 files in black (anything else shows in red), while also giving you the count of tracks in each one. We actually could use a smaller structure than midi_file here and save some bytes, since we don't need all the information it contains.

Next, we close the root directory. After that, we check to see if there is more than one file. If there is, we display the file selection screen. Otherwise, we just load the single file.

The file selection screen is somewhat involved in practice, but easy to explain:

const char* seltext = "select filE";
float fscale = prangfnt.scale(80);
ssize16 tsz = prangfnt.measure_text(ssize16::max(),spoint16::zero(),seltext,fscale);
srect16 trc = tsz.bounds().center_horizontal((srect16)lcd.bounds());
draw::text(lcd,trc.offset(0,20),spoint16::zero(),seltext,prangfnt,
           fscale,color_t::red,color_t::white,false);
fscale = Telegrama_otf.scale(20);
bool done = false;
size_t fni=0;
int64_t ocount = encoder.getCount()/4;
int osw = digitalRead(P_SW1) || digitalRead(P_SW2) ||
          digitalRead(P_SW3) || digitalRead(P_SW4);

while(!done) {
    tsz= Telegrama_otf.measure_text(ssize16::max(),spoint16::zero(),curfn,fscale);
    trc = tsz.bounds().center_horizontal((srect16)lcd.bounds()).offset(0,110);
    draw::filled_rectangle(lcd,srect16(0,trc.y1,lcd.dimensions().width-1,
                           trc.y2+trc.height()+5),color_t::white);
    rgb_pixel<16> px=color_t::black;
    if(mfs[fni].type==1) {
        px=color_t::blue;
    } else if(mfs[fni].type!=2) {
        px=color_t::red;
    }
    draw::text(lcd,trc,spoint16::zero(),curfn,Telegrama_otf,
               fscale,px,color_t::white,false);
    char szt[64];
    sprintf(szt,"%d tracks",(int)mfs[fni].tracks_size);
    tsz= Telegrama_otf.measure_text(ssize16::max(),spoint16::zero(),szt,fscale);
    trc = tsz.bounds().center_horizontal((srect16)lcd.bounds()).offset(0,133);
    draw::text(lcd,trc,spoint16::zero(),szt,Telegrama_otf,fscale,
               color_t::black,color_t::white,false);
    bool inc;
    while(ocount==(encoder.getCount()/4)) {
        int sw = digitalRead(P_SW1) || digitalRead(P_SW2) ||
                 digitalRead(P_SW3) || digitalRead(P_SW4);
        if(osw!=sw && !sw) {
            // button was released
            done = true;
            break;
        }
        osw=sw;
        delay(1);
    }
    if(!done) {
        int64_t count = (encoder.getCount()/4);
        inc = ocount>count;
        ocount = count;
        if(inc) {
            if(fni<fn_count-1) {
                ++fni;
                curfn+=strlen(curfn)+1;
            }
        } else {
            if(fni>0) {
                --fni;
                curfn=fns;
                for(size_t j = 0;j<fni;++j) {
                    curfn+=strlen(curfn)+1;
                }
            }
        }
    }
}

We start by drawing "select file". After that, the main thing we're doing is drawing the current file's info and waiting for the encoder to change or for one of the buttons to be released.

Each pass through the main loop redraws the current file's name and track count, then sits in an inner loop waiting for the encoder. In that inner loop, we also look for button releases. If the encoder changes, we determine the direction by comparing the current count with the old one. You'll note we divide by 4. This is so the encoder adjusts values when the knob falls from bump to groove. Otherwise, you'd get change signals several times over one "click" of the encoder.
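
The divide-by-4 logic can be tried on its own without an encoder attached. The sketch below assumes 4 raw counts per detent, as the code above does. The function names are hypothetical, and the direction convention (a lower count means "advance") simply mirrors the comparison used in the selection loop.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: collapses a raw encoder count into a detent index.
// counts_per_detent defaults to 4 to match the article's code.
static std::int64_t detent_of(std::int64_t raw,
                              std::int64_t counts_per_detent = 4) {
    assert(counts_per_detent > 0);
    return raw / counts_per_detent;  // integer division truncates toward zero
}

// Hypothetical helper: returns +1 to advance through the list, -1 to go
// back, or 0 if the knob is still on the same detent.
static int detent_step(std::int64_t old_detent, std::int64_t raw) {
    std::int64_t d = detent_of(raw);
    if (d == old_detent) {
        return 0;
    }
    return old_detent > d ? 1 : -1;
}
```

One side effect worth knowing about: because C++ integer division truncates toward zero, raw counts from -3 to 3 all land on detent 0, so the detent around zero is wider than the others.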

Here, curfn holds our current filename string, and fni holds the filename index. Going forward just involves advancing the pointer past the current string. Going backward involves starting from the beginning and then advancing the pointer until we get to where we were minus one. Since it's in RAM, it's all basically instant anyway.
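
Both directions boil down to the same walk from the start of the packed list. As a small sketch (name_at is a hypothetical name, not something in the project):

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: returns the index-th string in a block of
// back-to-back null-terminated strings. The caller must make sure
// index is within the number of strings stored.
static const char* name_at(const char* list, std::size_t index) {
    assert(list != nullptr);
    const char* p = list;
    for (std::size_t i = 0; i < index; ++i) {
        p += std::strlen(p) + 1;  // hop over this string and its terminator
    }
    return p;
}
```

The walk is linear in the index, which is fine for the handful of files that fit on the selection screen's list.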

Once we're finished with that, or if we had just a single file and bypassed it, we get to the next portion of our startup code. We initialize the output now because the MIDI USB device takes a second to spin up. Then we reset our tempo multiplier. I guess we didn't have to earlier, but it didn't hurt. We resample our old encoder value so the next time we read it, it doesn't register a false change. Finally, we prepend our '/' like we covered and then open the file, retrying if the card has gone missing, before freeing the file list data we made earlier.

// avoids the 1 second init delay later
out.initialize();
tempo_multiplier = 1.0;
encoder_old_count = encoder.getCount()/4;
--curfn;
*curfn='/';
file = SD.open(curfn, "rb");
if(!file) {
    draw_error("re-insert SD card");
    while(true) {
        SD.end();
        SD.begin(SD_CS,spi_container<0>::instance());
        file=SD.open(curfn,"rb");
        if(!file) {
            delay(1);
        } else {
            break;
        }
    }
}
::free(fns-1);
::free(mfs);

Now we draw the next screen - the playing screen, and we load the sampler with the file we just opened. If there's not enough memory we error and restart, although in practice, most MIDI files shouldn't have any problem being loaded, even without PSRAM:

draw::filled_rectangle(lcd,lcd.bounds(),color_t::white);
const char* playing_text = "pLay1nG";
float playing_scale = prangfnt.scale(125);
ssize16 playing_size = prangfnt.measure_text(ssize16::max(),spoint16::zero(),
                       playing_text,playing_scale);
draw::text(lcd,playing_size.bounds().center((srect16)lcd.bounds()),
           spoint16::zero(),playing_text,prangfnt,playing_scale,color_t::red,
           color_t::white,false);
free(prang_font_buffer);
file_stream fs(file);
sfx_result r=midi_sampler::read(&fs,&sampler);
if(r!=sfx_result::success) {
    switch(r) {
        case sfx_result::out_of_memory:
            file.close();
            draw_error("file too big");
            delay(3000);
            goto restart;
        default:
            file.close();
            draw_error("not a MIDI file");
            delay(3000);
            goto restart;
    }
}
file.close();

I'm debating moving the above checks to when I read the files off the SD initially. The splash screen would stay up longer, but I wouldn't mind that. The advantage is that the file list wouldn't contain MIDIs that couldn't be loaded. For a musician that's handy, because musicians often aren't tech savvy and it's better to make it hard to do the Wrong Thing(TM).

Now we set the sampler's output and then send our initial message to the display thread to update the tempo multiplier to its initial value. If we didn't do this, the tempo multiplier value wouldn't display until the first time it was changed, if ever.

sampler.output(&out);
xRingbufferSend(signal_queue,&tempo_multiplier,sizeof(tempo_multiplier),0);

Finally, we get to the main application loop. This is where the magic happens.

The first portion of the loop handles encoder changes, updating the tempo multiplier:

int64_t ec = encoder.getCount()/4;
if(ec!=encoder_old_count) {
    bool inc = ec<encoder_old_count;
    encoder_old_count=ec;
    if(inc && tempo_multiplier<=4.9) {
        tempo_multiplier+=.1;
        sampler.tempo_multiplier(tempo_multiplier);
        xRingbufferSend(signal_queue,&tempo_multiplier,sizeof(tempo_multiplier),0);
    } else if(!inc && tempo_multiplier>.1) {
        tempo_multiplier-=.1;
        sampler.tempo_multiplier(tempo_multiplier);
        xRingbufferSend(signal_queue,&tempo_multiplier,sizeof(tempo_multiplier),0);
    }
}
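
Repeatedly adding and subtracting .1 in floating point accumulates rounding error, so after a lot of knob turning the value can drift slightly off the tenths. One possible alternative, shown here only as a sketch and not what the project does, is to keep the tempo as an integer number of tenths and convert when handing it to the sampler. The limits of 1 and 50 tenths are my reading of the .1 and 5.0 bounds implied by the code above.

```cpp
#include <cassert>

// Assumed limits: 0.1x to 5.0x, expressed in tenths.
static const int tempo_min_tenths = 1;
static const int tempo_max_tenths = 50;

// Hypothetical helper: moves the tempo one tenth up or down, staying
// inside the limits.
static int step_tempo_tenths(int tenths, bool inc) {
    assert(tenths >= tempo_min_tenths && tenths <= tempo_max_tenths);
    if (inc) {
        return tenths < tempo_max_tenths ? tenths + 1 : tenths;
    }
    return tenths > tempo_min_tenths ? tenths - 1 : tenths;
}

// Hypothetical helper: converts tenths to the multiplier a sampler
// would be given.
static float tempo_from_tenths(int tenths) {
    return tenths / 10.0f;
}
```

The integer never drifts, and the clamping lives in one place rather than in each branch.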

The latter portion of the loop is longer, but it's mostly repetitive. It handles the four buttons, and the logic is the same for each one, so it could be done in a loop, and eventually will be. I'll cover two of the four below, because the next two are the same:

bool first_track = follow_track == -1;
bool changed = false;
int b=digitalRead(P_SW1);
if(b!=switches[0]) {
    changed = true;
    if(b) {
        if(first_track) {
            follow_track = 0;
            sampler.start(0);
        } else {
            sampler.start(0,sampler.elapsed(follow_track) %
                          sampler.timebase(follow_track));
        }

    } else {
        sampler.stop(0);
    }
    switches[0]=b;
}
b=digitalRead(P_SW2);
if(b!=switches[1]) {
    changed = true;
    if(b) {
        if(first_track) {
            follow_track = 1;
            sampler.start(1);
        } else {
            sampler.start(1,sampler.elapsed(follow_track) %
                          sampler.timebase(follow_track));
        }

    } else {
        sampler.stop(1);
    }
    switches[1]=b;
}

This implements the quantization logic as well as starting and stopping the tracks as needed. The first track you trigger becomes the one the others follow. The quantization currently only works for late presses. It takes the followed track's elapsed ticks modulo its timebase, which gives how far into the current beat we are, and starts the new track that many ticks in so it lands in step. This may mean notes get skipped, because unlike with an audio sampler, you can't coax a MIDI score to play a partial note. That's a limitation of the protocol with no known workaround, so there's nothing to be done in my code to fix it.
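
Since the per-button logic is identical, it could be folded into a loop along these lines. This is only a sketch: fake_sampler is a stand-in I made up so the control flow can be run on a PC, its member names copy the calls used above, and its types are assumptions rather than the real SFX signatures. The button levels are passed in as an array instead of being read with digitalRead.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Stand-in for the real sampler (an assumption for illustration only).
struct fake_sampler {
    std::uint32_t tb = 480;   // ticks per beat reported for every track
    std::uint64_t now = 0;    // elapsed ticks reported for every track
    std::int64_t started_at[4] = {-1, -1, -1, -1};  // -1 means stopped
    void start(std::size_t i, std::uint64_t offset = 0) {
        started_at[i] = (std::int64_t)offset;
    }
    void stop(std::size_t i) { started_at[i] = -1; }
    std::uint64_t elapsed(std::size_t) const { return now; }
    std::uint32_t timebase(std::size_t) const { return tb; }
};

// Hypothetical loop version of the button handling. Returns true if any
// button changed state. follow_track is -1 until the first track starts.
template <typename Sampler>
bool poll_buttons(Sampler& sampler, const int* levels, int* switches,
                  std::size_t count, int& follow_track) {
    assert(levels != nullptr && switches != nullptr);
    const bool first_track = follow_track == -1;  // sampled once, as above
    bool changed = false;
    for (std::size_t i = 0; i < count; ++i) {
        if (levels[i] == switches[i]) {
            continue;
        }
        changed = true;
        if (levels[i]) {
            if (first_track) {
                follow_track = (int)i;
                sampler.start(i);
            } else {
                std::size_t ft = (std::size_t)follow_track;
                // offset into the followed track's current beat
                sampler.start(i, sampler.elapsed(ft) % sampler.timebase(ft));
            }
        } else {
            sampler.stop(i);
        }
        switches[i] = levels[i];
    }
    return changed;
}
```

As a worked case with these assumed numbers: with a timebase of 480 ticks and 1000 ticks elapsed on the followed track, 1000 % 480 is 40, so the newly triggered track starts 40 ticks in, and any events in its first 40 ticks are the ones that get skipped.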

Conclusion

That's the meat of it. Now go forth wielding SFX and GFX to make your own musical MIDI gadgets!

History

  • 16th May, 2022 - Initial submission