Overview
Ask a firmware team whether hardware-in-the-loop testing matters and you'll get a yes. Then ask who keeps the rig running. It's usually one engineer, doing it in the gaps between feature deadlines, and the suite drifts. Coverage flatlines. A few months on, the dashboard reads “mostly green, except for the tests we switched off,” which is a polite way of saying nobody trusts it anymore. Embedder runs the whole loop instead: it generates the HIL test cases, runs them on the instruments already on your bench, and patches the firmware when a test catches a regression, then re-runs to prove the fix on real silicon.
The three pillars of closed-loop HIL
- Generate. Test cases drawn from the reference manual, the schematic, and the code you already have.
- Run. Those cases execute on real instruments: debug probes, logic analyzers, power profilers, bench gear.
- Patch. A failure kicks off an autonomous firmware fix that gets re-flashed and re-verified, with nobody at the keyboard.
1. Generating test cases
A HIL test usually costs more effort than the feature it's meant to guard. So teams write one only when something forces their hand (a certification, a customer, an incident nobody wants to relive), and the next ten features ship bare because the deadline won't move. The only real fix is to make the test nearly free to write. Embedder generates the suite from the same documents it reads to write the firmware in the first place:
- Test scaffolds: per peripheral and per feature, parameterized over the actual hardware and pulled from the reference manual.
- Fixtures: power state, init sequence, and the pre- and post-conditions a test expects to hold.
- Regression sweeps: fired whenever a change touches a driver, re-running every test that exercises it.
- Bus-level checks: is the PWM pair really showing up at the output pin? That gets verified against a captured signal, not assumed from the driver code.
- Coverage reports: broken out by peripheral, feature, and condition, with the traceability that regulated programs ask for.
Every test is really a claim about how the hardware should behave. Proving it takes real equipment.
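As a rough illustration of a bus-level check, the sketch below estimates PWM frequency and duty cycle from a captured digital trace and compares them with expected values. It is not Embedder's API: `PwmMeasurement`, `measure_pwm`, `check_pwm`, and the 2% tolerance are hypothetical names and values chosen for the example, and the trace is assumed to arrive as a list of 0/1 samples from a single channel.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class PwmMeasurement:
    frequency_hz: float
    duty: float  # fraction of each period spent high, 0.0 to 1.0


def measure_pwm(samples: Sequence[int], sample_rate_hz: float) -> PwmMeasurement:
    """Estimate frequency and duty cycle from a captured 0/1 trace.

    Only whole periods (first rising edge to last rising edge) are counted,
    so a capture that starts or ends mid-cycle does not skew the result.
    """
    rising = [i for i in range(1, len(samples))
              if samples[i - 1] == 0 and samples[i] == 1]
    if len(rising) < 2:
        raise ValueError("capture needs at least two rising edges")
    first, last = rising[0], rising[-1]
    span = last - first                  # samples covered by whole periods
    periods = len(rising) - 1
    high = sum(samples[first:last])      # high samples inside that span
    return PwmMeasurement(
        frequency_hz=periods * sample_rate_hz / span,
        duty=high / span,
    )


def check_pwm(m: PwmMeasurement, expected_hz: float, expected_duty: float,
              tol: float = 0.02) -> bool:
    """True when frequency is within tol (relative) and duty within tol (absolute)."""
    return (abs(m.frequency_hz - expected_hz) <= tol * expected_hz
            and abs(m.duty - expected_duty) <= tol)
```

A complementary PWM pair could be checked the same way by measuring each channel separately and comparing the two duty cycles; dead-time measurement would need the edge timestamps of both channels and is left out here.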
2. Running them on real test equipment
Run a test in a simulator and it stays blind to the analog front-end, the noisy supply on Rev B, the timing skew on an actual I²C bus. Embedder drives the gear already sitting on your bench, with more than 30 supported test-equipment integrations, through the same Hardware Interaction layer that the rest of the agent runs on:
- Debug probes (GDB): J-Link, ST-Link, and OpenOCD-compatible probes for flashing, running, and reading back target state.
- Logic analyzers: Saleae and Digilent, with native data ingestion so the bus-level checks have something to check against.
- Power profilers: Nordic PPK and Joulescope for current draw and sleep-state assertions.
- Bench equipment: Siglent and Rigol oscilloscopes and programmable power supplies.
- Serial terminal: integrated, so UART boot logs, command responses, and hard-fault dumps stream back as test evidence.
- Custom rigs: extensible adapters for proprietary fixtures, and for anything without a native API, plain ingestion of exported logs, traces, and measurement files.
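A sleep-state assertion of the kind a power profiler enables might look like the sketch below. It assumes the current trace has already been exported as a list of microamp readings; `check_sleep_current` and its parameters are hypothetical and do not correspond to any profiler's real API.

```python
from statistics import mean
from typing import Sequence


def check_sleep_current(samples_ua: Sequence[float], sample_rate_hz: float,
                        settle_s: float, limit_ua: float) -> float:
    """Return the average sleep current in uA, or raise if it exceeds the limit.

    The first settle_s seconds are dropped so the transition into sleep
    (regulators winding down, peripherals powering off) is not counted.
    """
    skip = round(settle_s * sample_rate_hz)
    window = samples_ua[skip:]
    if not window:
        raise ValueError("capture is shorter than the settle time")
    average_ua = mean(window)
    if average_ua > limit_ua:
        raise AssertionError(
            f"sleep current {average_ua:.1f} uA exceeds limit {limit_ua:.1f} uA")
    return average_ua
```

Averaging over a window rather than checking each sample is a deliberate simplification: periodic wake-ups would show as short spikes, and whether those should fail the test depends on the product's power budget.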
Concurrency is where most benches fall apart. Hardware Interaction arbitrates access so two runs don't end up fighting over the same JTAG line. It puts destructive operations like a flash erase or a fuse blow behind an explicit confirmation. And it rate-limits I/O so a runaway loop can't cook a part.
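The three safeguards can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the Hardware Interaction layer itself: `BenchArbiter`, `ConfirmationRequired`, and the resource name `jtag0` are hypothetical, and the arbitration here only covers threads inside one process.

```python
import threading
import time
from contextlib import contextmanager


class ConfirmationRequired(Exception):
    """Raised when a destructive operation is attempted without confirmation."""


class BenchArbiter:
    def __init__(self, min_interval_s: float = 0.0,
                 clock=time.monotonic, sleep=time.sleep):
        self._locks = {}                      # one lock per bench resource
        self._guard = threading.Lock()        # protects the lock table
        self._io_lock = threading.Lock()      # serializes throttle bookkeeping
        self._min_interval_s = min_interval_s
        self._last_io = None
        self._clock = clock
        self._sleep = sleep

    @contextmanager
    def claim(self, resource: str, timeout_s: float = 5.0):
        """Hold exclusive access to a resource, or fail after timeout_s."""
        with self._guard:
            lock = self._locks.setdefault(resource, threading.Lock())
        if not lock.acquire(timeout=timeout_s):
            raise TimeoutError(f"{resource} is busy")
        try:
            yield
        finally:
            lock.release()

    def throttle(self) -> None:
        """Block until at least min_interval_s has passed since the last I/O."""
        with self._io_lock:
            if self._last_io is not None:
                wait = self._min_interval_s - (self._clock() - self._last_io)
                if wait > 0:
                    self._sleep(wait)
            self._last_io = self._clock()

    def destructive(self, name: str, action, confirmed: bool = False):
        """Run a destructive action only when the caller has confirmed it."""
        if not confirmed:
            raise ConfirmationRequired(f"{name} needs explicit confirmation")
        return action()
```

A run would wrap probe access in `with arbiter.claim("jtag0"):`, call `arbiter.throttle()` before each I/O operation, and route a flash erase through `arbiter.destructive(...)`. The clock and sleep functions are injectable so the rate limit can be tested without real delays.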
3. Applying patches autonomously
A green suite tells you nothing you didn't already hope for. The failing one is where the work actually pays off. When a test fails, Embedder won't just paint a red square and walk off; it lines the failure up against the reference manual, the debugger state, the serial output, and the captured waveform, writes a firmware patch, re-flashes the board, and re-runs the failing test along with its regression sweep to confirm the fix on silicon.
- Diagnose: the failed assertion and the live signals together pin the fault to a driver, a register, or a bad timing assumption.
- Patch: the edit is grounded in the documentation rather than a hunch.
- Re-verify: flash, run, watch the hardware, and keep going until the test passes and nothing nearby breaks.
- Review: a person approves the diff. The loop runs on its own; the merge stays human.
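The control flow of the four steps above can be sketched as a bounded retry loop. Everything here is hypothetical scaffolding: `close_the_loop`, `LoopResult`, and the three callables stand in for whatever actually runs tests, proposes patches, and flashes the board, and the cap of three attempts is an assumption made for the example.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class LoopResult:
    fixed: bool
    attempts: int
    patch: Optional[str]  # diff awaiting human review; None if nothing to review


def close_the_loop(run_tests: Callable[[], List[str]],
                   propose_patch: Callable[[List[str]], str],
                   flash: Callable[[str], None],
                   max_attempts: int = 3) -> LoopResult:
    """Diagnose, patch, and re-verify until the suite passes or attempts run out.

    run_tests returns the names of failing tests (the failing test plus its
    regression sweep). The function never merges anything: a passing patch
    is returned to the caller so a person can approve the diff.
    """
    failures = run_tests()
    if not failures:
        return LoopResult(fixed=True, attempts=0, patch=None)
    for attempt in range(1, max_attempts + 1):
        patch = propose_patch(failures)   # diagnose + patch
        flash(patch)                      # re-flash the board
        failures = run_tests()            # re-verify on hardware
        if not failures:
            return LoopResult(fixed=True, attempts=attempt, patch=patch)
    return LoopResult(fixed=False, attempts=max_attempts, patch=None)
```

Bounding the attempts keeps a fault the loop can't diagnose from cycling the board indefinitely; an unfixed result would be escalated to an engineer instead.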
SIL alongside HIL
One test definition, two places to run it. The simulator gives fast feedback on a PR; the real board handles merge gates and the nightly run. No more babysitting two suites that quietly drift apart.
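One way to picture a single test definition with two backends is a small target interface that both the simulator and the board implement. The sketch below is illustrative only: `Target`, `SimTarget`, `BoardTarget`, the probe's `write32`/`read32` methods, and the register address are all hypothetical.

```python
from typing import Dict, Protocol


class Target(Protocol):
    def write_reg(self, addr: int, value: int) -> None: ...
    def read_reg(self, addr: int) -> int: ...


class SimTarget:
    """In-memory register file standing in for a simulator."""

    def __init__(self) -> None:
        self._regs: Dict[int, int] = {}

    def write_reg(self, addr: int, value: int) -> None:
        self._regs[addr] = value & 0xFFFFFFFF

    def read_reg(self, addr: int) -> int:
        return self._regs.get(addr, 0)


class BoardTarget:
    """Forwards register access to a debug-probe object (hypothetical interface)."""

    def __init__(self, probe) -> None:
        self._probe = probe

    def write_reg(self, addr: int, value: int) -> None:
        self._probe.write32(addr, value)

    def read_reg(self, addr: int) -> int:
        return self._probe.read32(addr)


def check_timer_prescaler(target: Target) -> bool:
    """The single test definition: write a prescaler value and read it back."""
    prescaler_addr = 0x4000_0028  # hypothetical register address
    target.write_reg(prescaler_addr, 79)
    return target.read_reg(prescaler_addr) == 79
```

The test body never learns which backend it got, so the PR run and the nightly hardware run exercise exactly the same definition.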
CI integration
The same CLI your engineers run locally also runs inside your CI runner. A setup we see a lot:
- PR open: SIL runs, every test, done in a couple of minutes.
- PR ready for review: HIL runs on a board-farm unit and posts results, plus any autonomous patch, back to the PR.
- Merge: a full regression sweep on hardware before the merge button unlocks.
- Nightly: the long stress and soak tests, spread across the board farm.
Who it's for
- Hardware teams shipping firmware to customers, where every escape costs more than the test ever would have.
- Regulated programs under ISO 26262 (ASIL-rated), IEC 62304, or DO-178C that owe documented test evidence on every release.
- Teams that already have a HIL rig but can't keep coverage growing.
- Silicon vendors checking that generated drivers behave at the bus level.
Getting started
Point us at a probe you already use and a board you already have. We'll run the generate-run-patch loop on a single peripheral, start to finish, so you can watch it work. Talk to an engineer.