DESIGNING FOR THE IOT IDENTIFYING BUGS
www.newelectronics.co.uk 9 March 2021 17
Solveig/stock.adobe.com
Author details:
Johan Kraft is CEO
and Founder at
Percepio
value to the developer at the benchtop
is visual trace diagnostics. This
gives a visual timeline of the internal
software events during operation and
is a key tool for a developer tracking
down bugs. The trick is to bring this
data out during deployed operation
and provide the developer team with
detailed information about software
issues in the field.
This is where the cloud integration
comes in. As more nodes are
connected in the IoT, that same
channel can also be used for
diagnostic data. The cloud connection
can be used to alert developers when
an error is first detected and provide
visual trace diagnostics to identify the
root cause.
This is a powerful concept that not
only highlights when a node fails, but
it also provides an explanation – the
timeline of software events just before
the issue was detected, showing
what led up to the problem. This
way, the software can be corrected
and updated quickly, while only a few
customers have been affected by the
bug, thereby avoiding a much larger
problem.
This concept is the heart of a
tool called DevAlert. In combination
with the visual trace diagnostics
technology Tracealyzer and a
sophisticated cloud data management
system, it can be used to monitor any
kind of IoT connected device.
When something unexpected
happens, for example if the system
automatically reboots to recover
from an error, the events that led to
the reboot are captured and available.
But it is the awareness that is the
key. When developers receive an
alert, they can immediately look at the
diagnostic trace information and see
exactly what happened.
By including a trace with the
alert, it becomes much easier to
identify the situation that caused the
problem. When you add visual trace
diagnostics, including many types of
visual overviews, it becomes even
easier for developers to understand
the problem. This allows developers
to find the root cause and fix the
problem quickly. The reaction time
matters. Most bugs in deployment
don’t show up directly for all users;
otherwise they would surely have been
found during testing. So the faster
an update can be provided, the fewer
customers will be affected.
This is particularly helpful as the
customers won’t necessarily report a
problem and provide the information
that a developer needs to reproduce
and fix the bugs, especially for
consumer devices. Even when a bug
is reported, the report may contain
only vague information, leaving a
developer at a loss as to where to start.
The concept involves three
software components. The DevAlert
Firmware Monitor (DFM) is a compact
software library that device developers
embed in their RTOS-based IoT
application. This agent keeps a
trace of recent software events and
provides a way for error-handling
code in the application to report any
condition of relevance to the device
developer: errors, proactive warnings
or other diagnostic information,
whether related to software or
hardware. The alert message is then
uploaded to the cloud account of
the device using an existing secure
connection, such as MQTT over
Transport Layer Security (TLS).
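The device-side behaviour described above can be modelled in a few lines. What follows is a minimal sketch in Python, not Percepio's DFM API: the names FirmwareMonitor, log_event, report_alert and the upload callback are hypothetical, and real firmware would more likely be written in C on top of the RTOS. The sketch keeps a bounded buffer of recent events and, when error-handling code reports a condition, bundles those events with the alert and hands the result to an upload function, which on a real device might publish over the existing MQTT/TLS connection.

```python
from collections import deque
from typing import Callable, Deque, Dict, List, Tuple

Event = Tuple[int, str]  # (timestamp in ticks, description)


class FirmwareMonitor:
    """Hypothetical model of a device-side monitor (not the real DFM API)."""

    def __init__(self, capacity: int, upload: Callable[[Dict], None]) -> None:
        # A deque with maxlen behaves like a ring buffer: once it is full,
        # appending a new event discards the oldest one.
        self._trace: Deque[Event] = deque(maxlen=capacity)
        self._upload = upload

    def log_event(self, timestamp: int, description: str) -> None:
        """Record one software event in the recent-events buffer."""
        self._trace.append((timestamp, description))

    def report_alert(self, error_code: int, symptoms: Dict[str, int]) -> None:
        """Bundle the error, its symptoms and the recent trace, then upload."""
        alert = {
            "error_code": error_code,
            "symptoms": dict(symptoms),
            "trace": list(self._trace),  # snapshot of events before the error
        }
        self._upload(alert)


# Usage: the upload callback is a plain list here; a device would send
# the alert over its existing secure cloud connection instead.
sent: List[Dict] = []
monitor = FirmwareMonitor(capacity=3, upload=sent.append)
for tick, name in enumerate(["boot", "wifi_up", "sensor_read", "flash_write"]):
    monitor.log_event(tick, name)
monitor.report_alert(error_code=17, symptoms={"task_id": 2})
```

Because the buffer is bounded, memory use stays fixed however long the device runs, and the alert always carries the events that immediately preceded the error.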
But the DFM on its own
could just deliver a deluge of data.
The DevAlert cloud service takes
that data, examines the error codes
and any other symptoms, and notifies
the developers only when a new,
unique issue appears, that is, when a
new combination of symptoms occurs.
This avoids duplicate alerts
flooding a developer. Percepio’s
Tracealyzer tool can then be used
by the developer back in the lab to
analyse the provided trace.
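One way to picture the cloud-side filtering is as a lookup keyed on the combination of error code and symptoms. The sketch below is an illustration under that assumption only; the article does not describe how the DevAlert service is implemented, and the AlertFilter class and its should_notify method are hypothetical names. It notifies on the first occurrence of each combination and merely counts the repeats.

```python
from typing import Dict, FrozenSet, Tuple

# An alert "signature": the error code plus the full set of symptoms.
Signature = Tuple[int, FrozenSet[Tuple[str, int]]]


class AlertFilter:
    """Hypothetical duplicate filter; the real service may work differently."""

    def __init__(self) -> None:
        # How many times each signature has been reported so far.
        self.counts: Dict[Signature, int] = {}

    def should_notify(self, error_code: int, symptoms: Dict[str, int]) -> bool:
        """Return True only the first time this combination is seen."""
        signature = (error_code, frozenset(symptoms.items()))
        self.counts[signature] = self.counts.get(signature, 0) + 1
        return self.counts[signature] == 1


alert_filter = AlertFilter()
results = [
    alert_filter.should_notify(17, {"task_id": 2}),  # new combination
    alert_filter.should_notify(17, {"task_id": 2}),  # duplicate
    alert_filter.should_notify(17, {"task_id": 5}),  # same code, new symptom
]
```

Keeping a count per signature, rather than discarding repeats outright, also leaves a record of how widespread each issue is.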
It is important to understand
that DevAlert in no way replaces
conventional testing; you need both,
just as most cars have both seat
belts and air bags. Good, systematic
testing typically removes 95% of
the bugs, as we stated above. Error
reporting in the field will help you
catch those bugs that testing couldn’t
find, the difficult ones that only appear
under certain conditions. There is
often an astronomical number of
potential scenarios in the software,
depending on the inputs, the
software timing, device settings and
other environmental factors (e.g.
the Wi-Fi connection). Any one of
these may hide a latent bug that
causes the device to crash or
produce incorrect data. For example,
a device might fail if the Wi-Fi
connection receives multiple packets
within 5 milliseconds while it is busy
writing data to flash memory.
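To show how narrow such a triggering condition can be, the sketch below checks a recorded set of timestamps for the pattern in that example. It is purely illustrative: the function name, the data layout and the reading of "within 5 milliseconds" as a gap of at most 5 ms between two packets are all assumptions, not part of any Percepio tool.

```python
from typing import List, Tuple


def burst_during_flash_write(
    packet_times_ms: List[float],
    flash_writes_ms: List[Tuple[float, float]],
    burst_window_ms: float = 5.0,
) -> bool:
    """Return True if two packets arrive within burst_window_ms of each
    other while a flash write (start, end) is in progress."""
    for start, end in flash_writes_ms:
        # Packets that arrived while this flash write was in progress.
        during = sorted(t for t in packet_times_ms if start <= t <= end)
        # In sorted order, adjacent packets have the smallest gaps.
        for earlier, later in zip(during, during[1:]):
            if later - earlier <= burst_window_ms:
                return True
    return False


# Two packets 3 ms apart during a write from 95 ms to 120 ms: pattern present.
risky = burst_during_flash_write([100.0, 103.0, 250.0], [(95.0, 120.0)])
# Only one packet falls inside the write window: pattern absent.
harmless = burst_during_flash_write([100.0, 130.0, 250.0], [(95.0, 120.0)])
```

A test suite would have to hit exactly this overlap to expose the bug, which is why such cases tend to surface only in the field.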
Above: DevAlert can
be used to monitor
any kind of IoT
connected device