Updated on 2022-12-13
Control the sound from your Sonos system in each room of your house

I have a friend who lives on the other side of the world, and not being primarily a coder, he needed some help with the software end of a remote controller for his Sonos speaker system. For the hardware, I recommended a TTGO T-Display v1, since it is relatively cheap, has all the necessary hardware integrated into a single device with no additional wiring required, and can run on LiPo batteries.
Disclaimer: I should warn you that I do not have a Sonos speaker system. With my friend's help, I have tested this code on his setup and he assures me it works. Since I do not have such a system, I cannot help you, gentle reader, should you need to troubleshoot that end of this project.
Update: Cleaned up the code. Code uses my new extended button. Code double buffers for no flicker.
Update 2: Cleaned up the code further. Code uses my new-new extended button. Code now has previous track function.
Update 3: Code uses updated dimmer library. Framebuffer is now partial, only rewriting the bottom of the screen.
Update 4: Now api.txt's first URL is for long click, and each URL following is for successive numbers of short clicks. Also the URLs are URL encoded now.
Update 5: Optimized the code so it reduces the SPI traffic to save battery life. Now only connects when needed and not on startup. Made font and text size selection more modular.
Update 6: Fixed connection code.
You'll need VS Code with PlatformIO installed.
You'll need a Sonos speaker system.
You need to install and configure the Node Sonos HTTP API on your network.
You need a TTGO T-Display v1.
Once you download the project, you'll need to edit a few files under the /data folder.
/%s and anything that follows.
After that, you'll want to choose Upload Filesystem Image under PIO|Project Tasks|Platform.
Once that's complete, you can upload the project and it will connect automatically.
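For reference, here is a sketch of what the three files under /data might contain, inferred from how setup() parses them further down. The SSID, password, room names, and server address are placeholders. The paths assume the Node Sonos HTTP API's default port (5005) and its next, playpause, and previous actions, so adjust them to match your own installation.

```text
wifi.txt (SSID on the first line, password on the second):
MyNetwork
MyPassword

speakers.csv (room names separated by commas, all on one line):
Kitchen,Living Room,Office

api.txt (first line is the long click, then one line per short-click count):
http://192.168.1.10:5005/%s/next
http://192.168.1.10:5005/%s/playpause
http://192.168.1.10:5005/%s/previous
```

The %s is where the URL-encoded room name gets substituted. Since speakers.csv is split only on commas, a trailing newline in that file would end up as part of the last room name, so it is safest to leave it off.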
To orient the thing, hold it so the buttons are to the right of the display, though the text makes the orientation obvious.
The top button changes the current speakers/room you are in.
Clicking the bottom button toggles between play and pause for that room/speaker. Pressing it for a longer period skips to the next track. Clicking two times rapidly will go back a track.
After a time, the thing will dim and then sleep to save battery. Press the top button to wake it up.
I've heavily commented this source code, but we'll go over it some here. This code makes significant use of my IoT ecosystem, including my graphics library, my button library, and my backlight management library.
I should note that both logo.hpp (a JPG) and SonosFont.hpp were generated using my browser-based header code generator here. It's pretty self-explanatory for the most part. Just drag a file on it and generate, and it will produce an object of the appropriate type - either a font or a stream suitable for draw::text<>() or draw::image<>(). Note that if you draw an image this way, you will have to seek the stream back to zero if you want to draw it again. In order to facilitate loading from HTTP and similar, that function does not seek the stream.
The meat of this, including all of the core logic is in main.cpp:
#include <Arduino.h>
#include <config.h>
#include <gfx.hpp>
#include <htcw_button.hpp>
#include <st7789.hpp>
#include <tft_io.hpp>
#include <lcd_miser.hpp>
#include <fonts/SonosFont.hpp>
#include <logo.hpp>
#include <SPIFFS.h>
#include <WiFi.h>
#include <HTTPClient.h>
using namespace arduino;
using namespace gfx;
// configure the display
using bus_t = tft_spi_ex<LCD_HOST, 
                        PIN_NUM_CS, 
                        PIN_NUM_MOSI, 
                        PIN_NUM_MISO, 
                        PIN_NUM_CLK, 
                        SPI_MODE0,
                        true,
                        LCD_WIDTH*LCD_HEIGHT*2+8,2>;
using display_t = st7789<LCD_WIDTH,
                        LCD_HEIGHT, 
                        PIN_NUM_DC, 
                        PIN_NUM_RST, 
                        -1 /* PIN_NUM_BCKL */, 
                        bus_t, 
                        1, 
                        true, 
                        400, 
                        200>;
using color_t = color<typename display_t::pixel_type>;
// background color for the display (24 bit, followed by display's native pixel type)
constexpr static const rgb_pixel<24> bg_color_24(/*R*/12,/*G*/12,/*B*/12);
constexpr static const display_t::pixel_type bg_color = convert<rgb_pixel<24>,display_t::pixel_type>(bg_color_24);
static display_t dsp;
// configure the buttons
using button_1_t = button_ex<PIN_BUTTON_1,
                        10, 
                        true>;
using button_2_t = button_ex<PIN_BUTTON_2,
                        10, 
                        true>;
static button_1_t button_1;
static button_2_t button_2;
// configure the backlight manager
static lcd_miser<PIN_NUM_BCKL> dimmer;
// function prototypes
static void ensure_connected();
static void draw_room(int index);
static const char* room_for_index(int index);
static const char* string_for_index(const char* strings,int index);
static void do_request(int index,const char* url_fmt);
// font
static const open_font& speaker_font = SonosFont;
static const uint16_t speaker_font_height = 35;
// global state
static HTTPClient http;
// current speaker/room
static int speaker_index = 0;
// number of speakers/rooms
static int speaker_count = 0;
// series of concatenated
// null-terminated strings for speakers/rooms
static char* speaker_strings = nullptr;
// how many urls are in api txt
static int format_url_count = 0;
// the format string urls
static char* format_urls = nullptr;
// temp for formatting urls
static char url[1024];
static char url_encoded[1024];
// the Wifi SSID
static char wifi_ssid[256];
// the Wifi password
static char wifi_pass[256];
// temp for using a file
static File file;
// begin fade timestamp
static uint32_t fade_ts=0;
// rather than draw directly to the display, we draw
// to a bitmap, and then draw that to the display
// for less flicker. Here we create the bitmap
using frame_buffer_t = bitmap<typename display_t::pixel_type>;
// reversed due to LCD orientation:
constexpr static const size16 frame_buffer_size({LCD_HEIGHT,speaker_font_height});
static uint8_t frame_buffer_data[frame_buffer_t::sizeof_buffer(frame_buffer_size)];
static frame_buffer_t frame_buffer(frame_buffer_size,frame_buffer_data);
static void button_1_on_click(int clicks,void* state) {
    // if we're dimming/dimmed we don't want 
    // to actually increment
    if(!dimmer.dimmed()) {
        // move to the next speaker
        speaker_index+=clicks;
        while(speaker_index>=speaker_count) {
            // wrap around
            speaker_index -= speaker_count;
        }
        // redraw
        draw_room(speaker_index);
    }
    // reset the dimmer
    dimmer.wake();
}
static void button_2_on_click(int clicks,void* state) {
    if(clicks<format_url_count) {
        const char* fmt_url = string_for_index(format_urls, clicks);
        if(fmt_url!=nullptr) {
            do_request(speaker_index, fmt_url);
        }
    }
    // reset the dimmer
    dimmer.wake();
}
static void button_2_on_long_click(void* state) {
    // play the first URL
    if(format_urls!=nullptr) {
        do_request(speaker_index,format_urls);
    }
    // reset the dimmer
    dimmer.wake();
}
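static char *url_encode(const char *str, char *enc){
    for (; *str; str++){
        // use unsigned char so bytes above 127 don't sign extend
        unsigned char c = (unsigned char)*str;
        if(isalnum(c)|| c == '~' || c == '-' || c == '.' || c == '_') {
            *enc++ = (char)c;
        } else {
            // sprintf returns the number of characters written
            enc += sprintf(enc, "%%%02X", c);
        }
    }
    // terminate explicitly so that stale data left in
    // the buffer by a previous call is never picked up
    *enc = '\0';
    return enc;
}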
static void do_request(int index, const char* url_fmt) {
    const char* room = string_for_index(speaker_strings, index);
    url_encode(room,url_encoded);
    snprintf(url,1024,url_fmt,url_encoded);
    // connect if necessary
    ensure_connected();
    // send the command
    Serial.print("Sending ");
    Serial.println(url);
    http.begin(url);
    http.GET();
    http.end();
}
static void ensure_connected() {
    // if not connected, reconnect
    if(WiFi.status()!=WL_CONNECTED) {
        Serial.printf("Connecting to %s...\n",wifi_ssid);
        WiFi.begin(wifi_ssid,wifi_pass);
        while(WiFi.status()!=WL_CONNECTED) {
            delay(10);
        }
        Serial.println("Connected.");
    }
}
static void draw_center_text(const char* text) {
    // set up the font
    open_text_info oti;
    oti.font = &speaker_font;
    oti.text = text;
    // 35 pixel high font
    oti.scale = oti.font->scale(speaker_font_height);
    // center the text
    ssize16 text_size = oti.font->measure_text(
        ssize16::max(),
        spoint16::zero(),
        oti.text,
        oti.scale);
    srect16 text_rect = text_size.bounds();
    text_rect.center_horizontal_inplace((srect16)frame_buffer.bounds());
    draw::text(frame_buffer,text_rect,oti,color_t::white,bg_color);
}
static const char* string_for_index(const char* strings,int index) {
    if(strings==nullptr) {
        return nullptr;
    }
    // move through the string list 
    // a string at a time until the
    // index is hit, and return
    // the pointer when it is
    const char* sz = strings;
    for(int i = 0;i<index;++i) {
        sz = sz+strlen(sz)+1;
    }
    return sz;
}
static void draw_room(int index) {
    draw::wait_all_async(dsp);
    // clear the frame buffer
    frame_buffer.fill(frame_buffer.bounds(), bg_color);
    // get the room string
    const char* sz = string_for_index(speaker_strings, index);
    // and draw it. Note we are only drawing the text region
    draw_center_text(sz);
    srect16 bmp_rect(0,0,frame_buffer.dimensions().width-1,speaker_font_height-1);
    bmp_rect.center_vertical_inplace((srect16)dsp.bounds());
    bmp_rect.offset_inplace(0,23);
    draw::bitmap_async(dsp,bmp_rect,frame_buffer,frame_buffer.bounds());
}
void setup() {
    // start everything up
    Serial.begin(115200);
    SPIFFS.begin();
    dimmer.initialize();
    button_1.initialize();
    button_2.initialize();
    // set the button callbacks
    button_1.on_click(button_1_on_click);
    button_2.on_click(button_2_on_click);
    button_2.on_long_click(button_2_on_long_click);
    // parse speakers.csv into speaker_strings
    file = SPIFFS.open("/speakers.csv");
    String s = file.readStringUntil(',');
    size_t size = 0;
    while(!s.isEmpty()) {
        if(speaker_strings==nullptr) {
            speaker_strings = (char*)malloc(s.length()+1);
            if(speaker_strings==nullptr) {
                Serial.println("Out of memory loading speakers (malloc)");
                while(true);
            }
        } else {
            speaker_strings = (char*)realloc(
                speaker_strings, 
                size+s.length()+1);
            if(speaker_strings==nullptr) {
                Serial.println("Out of memory loading speakers");
                while(true);
            }
        }
        strcpy(speaker_strings+size,s.c_str());
        size+=s.length()+1;
        s = file.readStringUntil(',');
        ++speaker_count;
    }
    file.close();
    // parse api.txt into our url format strings
    size = 0;
    file = SPIFFS.open("/api.txt");
    s=file.readStringUntil('\n');
    s.trim();
    while(!s.isEmpty()) {
        if(format_urls==nullptr) {
            format_urls = (char*)malloc(s.length()+1);
            if(format_urls==nullptr) {
                Serial.println("Out of memory loading API urls (malloc)");
                while(true);
            }
        } else {
            format_urls = (char*)realloc(
                format_urls, 
                size+s.length()+1);
            if(format_urls==nullptr) {
                Serial.println("Out of memory loading API urls");
                while(true);
            }
        }
        ++format_url_count;
        strcpy(format_urls+size,s.c_str());
        size+=s.length()+1;
        s = file.readStringUntil('\n');
        s.trim();
    }
    file.close();
    // parse wifi.txt
    file = SPIFFS.open("/wifi.txt");
    s = file.readStringUntil('\n');
    s.trim();
    strcpy(wifi_ssid,s.c_str());
    s = file.readStringUntil('\n');
    s.trim();
    strcpy(wifi_pass,s.c_str());
    file.close();
    // when we sleep we store the last room
    // so we can boot with it. it's written
    // to a /state file so we see if it exists
    // and if so, set the speaker_index to the
    // contents
    if(SPIFFS.exists("/state")) {
        file = SPIFFS.open("/state","rb");
        file.read(
            (uint8_t*)&speaker_index,
            sizeof(speaker_index));
        file.close();
        // in case /state is stale relative to speakers.csv:
        if(speaker_index>=speaker_count) {
            speaker_index = 0;
        }
    }
    // initial connect
    ensure_connected();
    // draw logo to screen
    draw::image(dsp,dsp.bounds(),&logo);
    // clear the remainder
    // split the remaining rect by the 
    // rect of the text area, and fill those
    rect16 scrr = dsp.bounds().offset(0,47).crop(dsp.bounds());
    rect16 tr(scrr.x1,0,scrr.x2,speaker_font_height-1);
    tr.center_vertical_inplace(dsp.bounds());
    tr.offset_inplace(0,23);
    rect16 outr[4];
    size_t rc = scrr.split(tr,4,outr);
    // we're only drawing part of the screen
    // we don't draw later
    for(int i = 0;i<rc;++i) {
        draw::filled_rectangle(dsp,outr[i],bg_color);
    }
    
    // initial draw
    draw_room(speaker_index);
}
void loop() {
    // pump all our objects
    dimmer.update();
    button_1.update();
    button_2.update();
    // if we're faded all the way, sleep
    if(dimmer.faded()) {
        // write the state
        file = SPIFFS.open("/state","wb",true);
        file.seek(0);
        file.write((uint8_t*)&speaker_index,sizeof(speaker_index));
        file.close();
        dsp.sleep();
        // make sure we can wake up on button_1
        esp_sleep_enable_ext0_wakeup((gpio_num_t)button_1_t::pin,0);
        // go to sleep
        esp_deep_sleep_start();
        
    } 
}
The basic phases are:
There is some funky business going on with the drawing itself. The thing is I didn't want to write the same pixel twice, in order to save battery life by reducing SPI communication. To that end, I've created a small frame buffer that is just the width of the screen, and the height of the font. That is the only part of the display that changes. Other than that, on setup() we load the JPG, and draw every bit of the background except the area where the text will be drawn. That's what the split() nonsense does. We use it to punch a rectangle shaped hole in the region below the JPG. That rectangle is the size of the frame buffer - the dynamic part of the screen. We then only fill the rectangles it yielded above and below the frame buffer. That's what that loop after split() does.
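To make the hole punching idea concrete, here is a minimal standalone sketch of splitting one rectangle around another. It is not the graphics library's split() implementation: rect_t and split_around are hypothetical names used only for illustration, the coordinates are inclusive, and the hole is assumed to lie entirely inside the outer rectangle.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a screen rectangle; coordinates are inclusive.
struct rect_t {
    int x1, y1, x2, y2;
};

// Splits `outer` around `hole`, writing up to four leftover rectangles
// (above, below, left, right) to `out`. Returns how many were written.
std::size_t split_around(const rect_t& outer, const rect_t& hole, rect_t out[4]) {
    // This sketch assumes the hole is fully contained in the outer rectangle.
    assert(hole.x1 >= outer.x1 && hole.x2 <= outer.x2);
    assert(hole.y1 >= outer.y1 && hole.y2 <= outer.y2);
    std::size_t n = 0;
    // full-width strip above the hole
    if (hole.y1 > outer.y1) out[n++] = {outer.x1, outer.y1, outer.x2, hole.y1 - 1};
    // full-width strip below the hole
    if (hole.y2 < outer.y2) out[n++] = {outer.x1, hole.y2 + 1, outer.x2, outer.y2};
    // strips to the left and right, limited to the hole's rows
    if (hole.x1 > outer.x1) out[n++] = {outer.x1, hole.y1, hole.x1 - 1, hole.y2};
    if (hole.x2 < outer.x2) out[n++] = {hole.x2 + 1, hole.y1, outer.x2, hole.y2};
    return n;
}
```

When the hole spans the full width, as the text band does here, only the strips above and below come back, which is why the fill loop only has to paint two rectangles.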
That leaves the button handling code. These handlers construct web requests to send to the HTTP API in order to run commands. On a long click of the second button, we send the first URL in api.txt (skip to the next track). On one or more short clicks, we send the URL whose position matches the click count: a single click toggles play/pause and a double click goes back a track, as given by the second and third api.txt entries.
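The mapping from a button gesture to a request URL can be sketched in plain C++ that runs off the device. This is an illustration under assumptions rather than the project's code: url_index_for and build_url are hypothetical helpers, std::string stands in for the fixed buffers, and build_url does a simple find and replace on %s instead of handing a format string loaded from a file to snprintf.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <cstdio>
#include <string>

// Percent-encodes every byte except the RFC 3986 unreserved characters.
std::string url_encode(const std::string& in) {
    std::string out;
    for (unsigned char c : in) {
        if (std::isalnum(c) || c == '-' || c == '.' || c == '_' || c == '~') {
            out += static_cast<char>(c);
        } else {
            char buf[4];  // '%' + two hex digits + terminator
            std::snprintf(buf, sizeof(buf), "%%%02X", static_cast<unsigned>(c));
            out += buf;
        }
    }
    return out;
}

// Hypothetical helper: picks the api.txt line for a gesture.
// Line 0 is the long click; line n is n short clicks. Returns -1 if unmapped.
int url_index_for(bool long_click, int clicks, int url_count) {
    assert(url_count >= 0);
    if (long_click) return url_count > 0 ? 0 : -1;
    return (clicks > 0 && clicks < url_count) ? clicks : -1;
}

// Hypothetical helper: swaps the first "%s" in a format URL for the encoded room.
std::string build_url(const std::string& url_fmt, const std::string& room) {
    std::string url = url_fmt;
    const std::size_t pos = url.find("%s");
    if (pos != std::string::npos) url.replace(pos, 2, url_encode(room));
    return url;
}
```

With three lines in api.txt, a triple click maps to nothing and is ignored, which matches the clicks<format_url_count check in the handler.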
Note that we never deallocate. It's not necessary because we don't shut down in IoT - we power off, so a lot of the code you'd find in traditional apps to tear things down is often not necessary when coding for these little platforms, just because there's no operating system to drop back to, and so the program basically never ends.
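The room names and URL formats that get loaded once and never freed live in packed lists: strings stored back to back, each with its own terminator, in a single growing buffer. Here is a small standalone sketch of that layout. packed_append and packed_at are hypothetical names, and unlike string_for_index() above, packed_at takes a count so it can reject an out of range index.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Appends `s` (terminator included) to a packed list of strings.
// `list` may start out as nullptr; `size` tracks the bytes used so far.
// Returns false if allocation fails, leaving the list untouched.
bool packed_append(char*& list, std::size_t& size, const char* s) {
    assert(s != nullptr);
    const std::size_t len = std::strlen(s) + 1;
    // realloc(nullptr, n) behaves like malloc(n), so one call covers both cases
    char* grown = static_cast<char*>(std::realloc(list, size + len));
    if (grown == nullptr) return false;
    std::memcpy(grown + size, s, len);
    list = grown;
    size += len;
    return true;
}

// Walks the list one string at a time until `index` is reached.
// Returns nullptr if the list is empty or the index is out of range.
const char* packed_at(const char* list, int count, int index) {
    if (list == nullptr || index < 0 || index >= count) return nullptr;
    const char* sz = list;
    for (int i = 0; i < index; ++i) {
        sz += std::strlen(sz) + 1;  // skip this string and its terminator
    }
    return sz;
}
```

Lookups are linear in the index, which is fine for a handful of rooms, and the whole list costs a single allocation with no per-string pointers to keep track of.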