When we set out to build the HermesLens, the idea was simple: take the Hermes Agent's pulse — active sessions, agent status, system health — and put it on a physical device you can glance at from across the room. An M5 StickS3 running a compact display, connected over WiFi, polling the backend every 10 seconds. What could possibly go wrong?
From MicroPython to C++
The first iteration was MicroPython. Clean, Pythonic, familiar. But the M5 StickS3's ESP32-S3 with Octal PSRAM was a moving target — the MicroPython build for that specific chipset had quirks. Memory allocation issues, display driver limitations, the PSRAM not being recognized. After enough dead ends, we made the call: rewrite the firmware in C++ with PlatformIO and the M5Unified + LovyanGFX libraries. It was the right decision, but it meant starting almost from scratch.
The PSRAM Nightmare
The M5 StickS3 uses Octal PSRAM. Getting this recognized by the toolchain was a battle. Errors like:
E (221) psram: PSRAM ID read error: 0x00ffffff, PSRAM chip not found or not supported, or wrong PSRAM line mode
This manifested as mysterious crashes, including Guru Meditation errors and null pointer dereferences deep in the WiFi stack. The fix required the right platformio.ini configuration: board_build.psram_type=opi and board_build.arduino.memory_type=qio_opi. Once PSRAM was properly initialized, stability improved dramatically.
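For reference, the two PSRAM settings named above would sit in platformio.ini roughly as follows. This is only a fragment: the environment name hermeslens is a placeholder, and the board, platform, and framework lines are left out because the post does not state them.

```ini
; platformio.ini fragment (environment name is a placeholder)
[env:hermeslens]
; Tell the build that the module carries Octal (OPI) PSRAM
board_build.psram_type = opi
; QIO flash combined with OPI PSRAM
board_build.arduino.memory_type = qio_opi
```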
LittleFS vs SPIFFS: The Corruption Saga
The firmware stores WiFi credentials and backend URL in a config.json file on the flash filesystem. We started with LittleFS — the modern choice. But every flash via M5Burner left stale data on the LittleFS partition. The library returned cryptic error codes:
E (1534) esp_littlefs: Corrupted dir pair at {0x0, 0x1}
LittleFS.cpp:98 begin(): Mounting LittleFS failed! Error: -1
The workaround was brutal but effective: force LittleFS.format() on every boot. Then we discovered that ESP32's libesp_littlefs returns ESP_ERR_INVALID_ARG (258), not ESP_FAIL, when the partition is corrupt, so the auto-format feature never triggers. The solution? Switch to SPIFFS. The SPIFFS.begin(true) call auto-formats seamlessly on corruption. A one-line change, and hours of debugging to find it.
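The underlying lesson, that recovery should not hinge on one specific error code, can be sketched independently of the filesystem. The helper below is a hypothetical illustration, not the HermesLens source: mountOrFormat and MountResult are made-up names, and the two callables stand in for whatever mount and format calls the firmware uses. It formats once on any mount failure and then retries, which is gentler than formatting on every boot.

```cpp
#include <functional>

// Outcome of a mount attempt (hypothetical type for this sketch).
enum class MountResult { Mounted, MountedAfterFormat, Failed };

// Try to mount; on ANY failure, format once and retry.
// The error code is deliberately ignored, so an unexpected code
// cannot cause the recovery path to be skipped.
MountResult mountOrFormat(const std::function<bool()>& mount,
                          const std::function<bool()>& format) {
    if (mount()) return MountResult::Mounted;
    if (!format()) return MountResult::Failed;
    return mount() ? MountResult::MountedAfterFormat : MountResult::Failed;
}
```

On the device, the callables would presumably wrap the filesystem object's begin() and format() methods. A MountedAfterFormat result means config.json is gone, so the setup portal has to run again.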
The Captive Portal Bug
The setup portal is a tiny HTTP server that runs on the M5 when no config is found. You connect to its WiFi, enter your credentials, and submit. Except nothing was saving. Every submit returned "Missing fields" — all three fields (SSID, password, backend URL) were empty.
The culprit? A Content-Length parsing bug. The body-read loop checked for \r\n\r\n (end of headers) and then read whatever bytes had already arrived. But POST body data often arrives in a separate TCP segment, so the body came back empty. On top of that, the header offset was off by two characters: the \r\n that precedes the Content-Length: header. The fix was proper Content-Length parsing with a dedicated loop that reads exactly that many body bytes.
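A minimal sketch of that approach, written as plain C++ so it can be tested off-device. It is an illustration rather than the actual portal code: extractBody is a hypothetical helper that the read loop would call each time more bytes arrive, and it only reports success once the headers and the full Content-Length worth of body bytes are in the buffer.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper: returns true and fills `body` once `buf` holds the
// complete headers plus Content-Length body bytes. Returns false when the
// caller needs to read more data from the socket first.
bool extractBody(const std::string& buf, std::string& body) {
    const std::string sep = "\r\n\r\n";
    const std::size_t headerEnd = buf.find(sep);
    if (headerEnd == std::string::npos) return false;  // headers incomplete

    // Lower-case the header block so the field-name match is case-insensitive.
    std::string headers = buf.substr(0, headerEnd);
    for (char& c : headers) {
        c = static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
    }

    const std::string key = "\r\ncontent-length:";
    std::size_t length = 0;  // no header means an empty body
    const std::size_t pos = headers.find(key);
    if (pos != std::string::npos) {
        // Skip the leading CRLF and the field name in one step.
        std::size_t i = pos + key.size();
        while (i < headers.size() && (headers[i] == ' ' || headers[i] == '\t')) ++i;
        while (i < headers.size() && std::isdigit(static_cast<unsigned char>(headers[i]))) {
            length = length * 10 + static_cast<std::size_t>(headers[i] - '0');
            ++i;
        }
    }

    const std::size_t bodyStart = headerEnd + sep.size();
    if (buf.size() - bodyStart < length) return false;  // body still in flight
    body = buf.substr(bodyStart, length);
    return true;
}
```

The read loop keeps appending received bytes to the buffer and calling the helper until it returns true or a timeout expires, so a body that arrives in a later TCP segment is no longer lost.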
The Display Flicker Fix
For a while, the display had a visible flicker on boot. M5.begin() was setting the backlight brightness, but then setRotation() on the ST7789 controller was resetting the BLCONTR register to its default. The fix: call setBrightness(80) after setRotation(). Three seconds of code change after hours of head-scratching.
Where We Are Now
The firmware compiles and runs. Flash usage sits at about 85% with ~200KB free. RAM is at 15%. The dashboard has 4 pages planned:
- Agents — Live agent roster and status
- Tasks — Current task queue and execution state
- System — WiFi signal, uptime, backend connectivity
- Usage — Session counts, token consumption
The diagnostic overlay is complete: it shows which config fields are empty right on the M5 screen, no serial cable needed. Button navigation (pressing the buttons on GPIO11 and GPIO12) lets you cycle through pages. The portal JSON responses are structured now, so the browser can display proper error messages.
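Cycling through the four pages comes down to wrap-around arithmetic. The sketch below is an assumption about how that could look, not the firmware's code: Page, kPageCount, and stepPage are hypothetical names, and mapping one button to forward and the other to backward is a guess.

```cpp
#include <cstdint>

// The four planned pages, in display order.
enum class Page : std::uint8_t { Agents, Tasks, System, Usage };
constexpr int kPageCount = 4;

// Step forward (+1) or backward (-1), wrapping at both ends.
// Adding kPageCount before the final modulo keeps the index non-negative.
constexpr Page stepPage(Page current, int delta) {
    return static_cast<Page>(
        ((static_cast<int>(current) + delta) % kPageCount + kPageCount) % kPageCount);
}
```

With this in place, a button handler would call stepPage(page, +1) for one button and stepPage(page, -1) for the other, then redraw.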
What's Still Broken
The backend connectivity remains the open issue. The M5 polls http://13.0.0.3:8123 every 10 seconds but gets no response. The pfSense firewall rule should allow traffic between the M5's VLAN and the backend server, but it's not working yet. Without that, the HermesLens displays a perpetual "Backend error" screen instead of live data.
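On the firmware side, a 10-second poll is usually scheduled without blocking the main loop, so the buttons and display stay responsive while the backend is unreachable. Here is a minimal sketch of such a timer, assuming a 32-bit millisecond tick counter like the one Arduino's millis() provides. PollTimer is a hypothetical class, not taken from the HermesLens code.

```cpp
#include <cstdint>

// Hypothetical poll scheduler driven by a millisecond tick counter.
class PollTimer {
public:
    explicit PollTimer(std::uint32_t intervalMs) : intervalMs_(intervalMs) {}

    // Returns true on the first call, and afterwards whenever at least one
    // interval has elapsed since the last time it returned true.
    // Unsigned subtraction keeps the check correct across counter wraparound.
    bool due(std::uint32_t nowMs) {
        if (!started_ || nowMs - lastMs_ >= intervalMs_) {
            started_ = true;
            lastMs_ = nowMs;
            return true;
        }
        return false;
    }

private:
    std::uint32_t intervalMs_;
    std::uint32_t lastMs_ = 0;
    bool started_ = false;
};
```

The main loop would construct PollTimer poll(10000) once and issue the HTTP request only when poll.due(millis()) returns true.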
The content for the four dashboard pages is still stubbed out. When the backend is reachable, the device just shows "All systems OK". The real agent detail overlay, WiFi status widget, and tap coordinate handling need more work.
What's Next
The path forward is clear: get the backend reachable, flesh out the dashboard pages with actual data, add a flash-wipe flag to the deployment script (--wipe, which pre-fills a clean config.json), and polish the boot sequence. The hardware works. The firmware compiles. The hard part (the architecture, the toolchain, the debugging) is done.
HermesLens will be the physical face of the Hermes system. A device you can glance at and know the system's health. We're close. Closer than the bug log suggests.