AI: HOW LOW CAN YOU GO?

Embedded machine learning is driving new accelerator architectures, but there are ways to reduce its thirst for processor power. By Chris Edwards
Markets are subject to fads and the embedded-control sector is far from immune to them. In the 1990s, fuzzy logic seemed to be the way forward and microcontroller (MCU) vendors scrambled to put support into their offerings, only to see it flame out.

Embedded machine learning (ML) is seeing a far bigger feeding frenzy as established MCU players and AI-acceleration start-ups try to demonstrate their commitment to the idea, which mostly goes under the banner of TinyML.

Daniel Situnayake, founding TinyML engineer at software-tools company Edge Impulse and co-author of a renowned book on the technology, says the situation today is very different to that of the 1990s.

“The exciting thing about embedded ML is that machine learning and deep learning are not new, unproven technologies - they’ve in fact been deployed successfully on server-class computers for a relatively long time, and are at the heart of a ton of successful products. Embedded ML is about applying a proven set of technologies to a new context that will enable many new applications that were not previously possible.”
ABI Research predicts the market for low-power AI-enabled MCUs and accelerators for the TinyML market will climb from less than $30m in annual revenues this year to more than $2bn by the start of the next decade.

Despite the rapid growth, ABI principal analyst Lian Jye Su expects competition to become fiercer as large companies such as Bosch enter the market. Already, some start-ups such as Eta Compute have moved away from silicon to software tools. “We do see some consolidation. At the same time, the huge fragmentation in the IoT market means a significant number of providers will survive, like the MCU or IoT chipset markets in general,” he says, pointing to the large number of suppliers who focus on specific vertical markets.

TinyML faces severe constraints. Pete Warden, technical lead of the TensorFlow Micro framework at the search-engine giant and Situnayake’s co-author on “TinyML: Machine Learning with TensorFlow Lite on Arduino and Ultra-Low-Power Microcontrollers”, said at the Linley Group’s Fall Processor Conference that the aim is to take deep-learning models and “get them running on devices that have as little as 20KB of RAM. We want to take models built using this cutting-edge technology and crush them down to run on very low power processors.

“Because it’s open-source software, we get not only to interact with product teams inside Google but also get a lot of requests from product teams all over the world who are trying to build interesting products. And we often have to say: no, that’s not possible yet. We get to see, in aggregate, a lot of unmet requirements,” says Warden.

The core issue is that deep-learning models ported from the server environment call for millions or even billions of multiply-add (MAC) operations to be performed in a short space of time, even for relatively simple models. Linley Gwennap, president of the Linley Group, says relatively simple audio applications, such as picking up words in speech that can activate voice recognition, call for around 2 million MACs per second. Video needs far more.

Silicon vendors have been able to push the MAC count by taking advantage of the relatively low requirement for accuracy in individual calculations when performing inferencing. Whereas training on servers generally demands single- or double-precision floating-point arithmetic, byte-wide integer (int8) calculations seem to be sufficient for most applications.

There are indications that for selected layers in a model, even int8 MACs are unnecessary. Binary
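To illustrate the int8 arithmetic discussed above: the sketch below uses the standard affine scheme (a per-tensor scale mapping real values to 8-bit codes) to run a dot product, the building block of those MAC operations, entirely in integer arithmetic. The values and scale choices are hypothetical, and this is a simplified sketch of the idea rather than any vendor's actual kernel.

```python
import numpy as np

def quantize(x, scale):
    """Affine int8 quantisation (zero-point 0): real value -> int8 code."""
    q = np.round(x / scale)
    return np.clip(q, -128, 127).astype(np.int8)

def dequantize(q, scale):
    """Map int8 codes back to approximate real values."""
    return q.astype(np.int32) * scale

# Hypothetical activation and weight vectors.
x = np.array([0.51, -0.72, 0.33, 0.95], dtype=np.float32)
w = np.array([0.25, 0.10, -0.40, 0.60], dtype=np.float32)

# Per-tensor scales chosen so each tensor's range fits in [-127, 127].
x_scale = np.abs(x).max() / 127
w_scale = np.abs(w).max() / 127

xq = quantize(x, x_scale)
wq = quantize(w, w_scale)

# Round-trip error per element is bounded by half the scale step.
x_back = dequantize(xq, x_scale)

# The MAC loop runs in integer arithmetic (int8 operands, int32
# accumulator); only the final result is rescaled to the real domain.
acc = np.dot(xq.astype(np.int32), wq.astype(np.int32))
approx = acc * x_scale * w_scale
exact = float(np.dot(x, w))

print(f"float32 dot product: {exact:.4f}")
print(f"int8 dot product:    {approx:.4f}")
```

Because the per-element rounding errors are small relative to the tensor ranges, the integer result lands within a fraction of a percent of the float one for this example, which is why int8 inference is usually accurate enough while letting hardware pack four times as many MACs into the same datapath width as float32.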
14 23 February 2021 www.newelectronics.co.uk