How difficult can it be to monitor all your IoT devices and take care of any software or firmware errors before the end user even knows that something is wrong? Turns out, it takes no time at all to get observability set up for Linux. And today, Memfault shows us how it’s done!
In part 3 of ipXchange’s interview series with Memfault, Eamon chats with software engineer Pat Wolfe for a live tutorial on setting up Memfault’s observability platform on a Linux operating system. As a Linux solutions architect, Pat is the perfect person to take us through the process of:
- Installing Memfault on a Raspberry Pi with the Memfault Quickstart installer
- Collecting system and application metrics
- Collecting logs which contain the same output as running journalctl on the device
- Forcing a crash to demonstrate coredump capture, processing, and analysis
But first, let’s have a little bit of a recap…
What is observability?
As Pat reiterates, observability is the ability to monitor the behaviour and metrics of your devices while they are out in the field. It enables software developers to keep track of errors across many devices and debug these errors without the end user having to send the device back for testing, or even having to self-report issues.
Memfault has brought observability to the embedded Linux space with a best-in-class platform that allows engineers to monitor their devices in great detail. All that is required is to integrate the Memfault SDK as part of a device’s firmware or software, which is what Pat demonstrates in the tutorial. The telemetry data packets that the SDK sends are also very small, and its processing and memory requirements are minimal, meaning that resource-constrained devices can still produce accurate reports.
Memfault allows for remote monitoring, debugging, and OTA updates in MCU-, Linux-, and Android-based systems. This time, we are obviously focussing on Linux.
The difference between Memfault observability for MCUs and Linux
Pat explains that at the highest level, Memfault works the same way for providing observability for MCU and Linux embedded architectures. That said, Linux features much stronger process abstractions than bare-metal or RTOS (Real-Time Operating System) environments. This means that Memfault observability for Linux focusses on processes rather than the full system when it comes to crash insights.
Even so, the same three pillars of monitoring, debugging, and OTA updates still stand with Linux, and Memfault still produces key overall system metrics such as CPU utilisation and memory usage. Similarly, these insights scale for statistical analysis of performance across an entire fleet of devices.
Reducing resource requirements for IoT telemetry data
As Pat shows us, Memfault’s SDK consumes a negligible amount of processing resources while providing observability for your Linux build. This means that the end user won’t even know it’s running, and it won’t affect a device’s intended operating capabilities.
For large-scale errors resulting in a core dump, Memfault’s software is selective about what is worthwhile to report back to the cloud for the purpose of debugging. Rest assured, you will not burn through your cellular data allowance when using Memfault for observability in your Linux-based IoT devices.
Installing Memfault observability on a Linux system
While we could write out the instructions in detail, it is far simpler to watch the tutorial. The walkthrough starts at around 5 minutes into the conversation, and Pat makes it look very easy!
For reference, Pat is using a Raspberry Pi 4B running Raspbian, and Pat’s laptop is remotely connected to it via the SSH (Secure Shell) protocol. If you are running a Yocto distribution, the Memfault GitHub has more information, including the recipe for integrating observability with this type of Linux distribution. Memfault’s SDK for Linux is also known as ‘memfaultd’.
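If you want to try the forced-crash step on your own device, one generic way to produce a crash is sketched below. This is not the method used in the video: it is a minimal Python sketch that starts a throwaway child process and has it send SIGSEGV to itself. The names `force_crash` and `describe` are hypothetical helpers. Whether a core dump is actually written and picked up depends on the device's core dump settings and on memfaultd being installed, neither of which this sketch checks.

```python
import signal
import subprocess
import sys

# Code run by the throwaway child process: it sends SIGSEGV to itself,
# which terminates it the same way a real segmentation fault would.
CHILD_CODE = "import os, signal; os.kill(os.getpid(), signal.SIGSEGV)"


def force_crash() -> int:
    """Start a child Python process that crashes itself; return its exit status."""
    result = subprocess.run([sys.executable, "-c", CHILD_CODE])
    return result.returncode


def describe(returncode: int) -> str:
    """Turn a subprocess return code into a short human-readable summary."""
    if returncode < 0:
        # On POSIX systems a negative return code means "killed by signal N".
        return f"killed by {signal.Signals(-returncode).name}"
    return f"exited with status {returncode}"
```

Calling `print(describe(force_crash()))` on a Linux device reports which signal ended the child, confirming that the crash happened before you go looking for it in your observability dashboard.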
In addition to the points outlined above, Eamon and Pat discuss:
- The frequency of telemetry data reporting
- Red flags when it comes to processing and memory usage
- Running statsd for creating datagrams
- Forced Memfault synchronisation
- The next steps after a crash occurs in the field
- Dividing the fleet into cohorts for partial OTA rollout
- Fleet metrics, new performance issues, and error grouping
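For readers unfamiliar with StatsD, the datagrams mentioned in the list above are short plain-text UDP messages of the form `name:value|type`. The sketch below shows how an application might emit one in Python. The metric name, the helper names `format_statsd` and `send_metric`, and the address `127.0.0.1:8125` (the conventional StatsD default port) are assumptions for illustration; check your own memfaultd configuration for where, and whether, it listens for StatsD traffic.

```python
import socket

# Metric types from the basic StatsD line format: counter, gauge, timer.
VALID_TYPES = {"c", "g", "ms"}


def format_statsd(name: str, value: float, metric_type: str = "g") -> bytes:
    """Encode one metric as a StatsD line: <name>:<value>|<type>."""
    if metric_type not in VALID_TYPES:
        raise ValueError(f"unsupported StatsD metric type: {metric_type!r}")
    return f"{name}:{value}|{metric_type}".encode("ascii")


def send_metric(
    name: str,
    value: float,
    metric_type: str = "g",
    host: str = "127.0.0.1",  # assumed: collector runs on the same device
    port: int = 8125,         # assumed: conventional StatsD port
) -> bytes:
    """Send a single metric as one UDP datagram and return the payload sent."""
    payload = format_statsd(name, value, metric_type)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        # UDP is fire-and-forget: nothing confirms that a collector received it.
        sock.sendto(payload, (host, port))
    return payload
```

For example, `send_metric("app.queue_depth", 12, "g")` sends the gauge `app.queue_depth:12|g`. Because the transport is UDP, the application does not block or fail if nothing is listening, which suits devices where telemetry must not interfere with normal operation.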
Getting started with Memfault
As we’ve shown, you can get observability data within just a few minutes of integrating the Memfault SDK on a device. Pat recommends signing up and booking a demo through Memfault’s website, but you can also try Memfault’s Sandbox demo environment if you want to play with the platform but don’t have the hardware ready to go.
If you’re an MCU firmware developer, you’ll likely want to check out part 1 of this video series, where Gillian takes Eamon through the MCU version of the Memfault platform. This demo features a large fleet of devices, so it’s great viewing for seeing Memfault used at scale.
And we can’t forget part 2, where Eamon interviewed Memfault CEO & Founder François Baldassari on the concept of “Build vs. Buy vs. Blend” and why choosing Memfault will save you time and money when it comes to integrating observability.
We hope you enjoy(ed) this tutorial, and as always…
Keep designing!