Chris is an electrical engineer; he supports the software developers with the hardware part of the story.
The Pengutronix lab environment consists of DUTs, power supply switches, network-connected serial ports, Ethernet switches, and GPIO switches. There is also a central CAN bus to which devices can be attached, and a similar arrangement for USB. A test can span multiple devices. A test server sits in the same rack as the devices, along with a WiFi access point, a Bluetooth device, and USB devices.
There are lots of USB devices in the lab infrastructure. The USB interfaces of the DUTs sometimes misbehave, and the hubs don’t always behave properly either. Because so many USB devices are connected to the server, the bus sometimes runs out of USB addresses.
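To make the address pressure visible, one could count how many devices sit on each bus. The following is a minimal sketch, not part of the talk: it parses text in the format printed by `lsusb` and flags crowded buses. The function names and the warning threshold of 100 are assumptions chosen for illustration; the 127 figure follows from USB's 7-bit device addresses, and a host controller may impose a lower practical limit.

```python
import re
from collections import Counter

# Matches lines in the format printed by `lsusb`, e.g.
# "Bus 001 Device 004: ID 0403:6001 Future Technology Devices ..."
LSUSB_LINE = re.compile(r"^Bus (\d+) Device (\d+): ID ")

# USB device addresses are 7 bits wide and address 0 is reserved for
# enumeration, which leaves 127 assignable addresses per bus.
MAX_ADDRESSES_PER_BUS = 127


def devices_per_bus(lsusb_output: str) -> Counter:
    """Count enumerated devices per bus number (root hubs included)."""
    counts = Counter()
    for line in lsusb_output.splitlines():
        match = LSUSB_LINE.match(line.strip())
        if match:
            counts[int(match.group(1))] += 1
    return counts


def crowded_buses(counts: Counter, threshold: int = 100) -> list:
    """Return bus numbers with at least `threshold` devices.

    The default threshold is a hypothetical early-warning level,
    not a value from the talk.
    """
    return sorted(bus for bus, n in counts.items() if n >= threshold)
```

On the test server, the input could come from `subprocess.run(["lsusb"], capture_output=True, text=True).stdout`; running the check periodically would show whether failures correlate with a bus filling up.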
EMC becomes a problem because the lab is too compact. There are ground loops between the USB, Ethernet shield, serial, 1-wire, CAN, … buses. There is also capacitive coupling through the power supplies. As a result, problems are sometimes triggered by switching big loads; some of the USB problems may be due to that.
The lab uses 1-wire for a lot of switching because it is compact. There are two problems with that. First, it is accessed over USB, so if USB breaks down, so does 1-wire. Second, the owfs server sometimes seems to lose devices. It is also difficult to debug.
There will be more automated testing in the future, but also more interactive work on the central test setup. Therefore, the reliability needs to increase. Some ideas to get there:
To distinguish between a hardware failure and a test failure, there should be some known-good state that the test infrastructure checks. LAVA and labgrid have something like this, but it doesn’t cover everything, e.g. a flaky USB device.
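The known-good idea can be sketched as an inventory comparison around each test run. This is only an illustration under stated assumptions: it uses neither the LAVA nor the labgrid API, and the resource names and the `classify` helper are hypothetical. A failed test is attributed to the infrastructure whenever an expected resource was missing before or after the run.

```python
def missing_resources(expected, observed):
    """Return the expected lab resources that are not currently visible."""
    return frozenset(expected) - frozenset(observed)


def classify(test_passed, expected, observed_before, observed_after):
    """Label a test result, given resource inventories taken around the run.

    Returns a (label, missing) tuple. The labels are hypothetical:
    "pass", "infrastructure-suspect", or "test-failure".
    """
    missing = (missing_resources(expected, observed_before)
               | missing_resources(expected, observed_after))
    if test_passed:
        label = "pass"
    elif missing:
        # Something in the lab disappeared, so the failure may not be
        # the fault of the software under test.
        label = "infrastructure-suspect"
    else:
        label = "test-failure"
    return label, missing


# Hypothetical inventory: identifiers are made up for illustration.
EXPECTED = {"usb:console-3", "1wire:relay-7", "eth:switch-port-12"}
```

For example, `classify(False, EXPECTED, EXPECTED, EXPECTED - {"1wire:relay-7"})` yields the label `"infrastructure-suspect"` and reports the relay as missing, which would also surface a 1-wire device that dropped out during the run.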
The audience offered some ideas about what should be in this test hardware.