AI: HOW LOW CAN YOU GO?

Embedded machine learning is driving new accelerator architectures, but there are ways to reduce its thirst for processor power. By Chris Edwards
Markets are subject to fads and the embedded-control sector is far from immune to them. In the 1990s, fuzzy logic seemed to be the way forward and microcontroller (MCU) vendors scrambled to put support into their offerings, only to see it flame out.

Embedded machine learning (ML) is seeing a far bigger feeding frenzy as established MCU players and AI-acceleration start-ups try to demonstrate their commitment to the idea, which mostly goes under the banner of TinyML.

Daniel Situnayake, founding TinyML engineer at software-tools company Edge Impulse and co-author of a renowned book on the technology, says the situation today is very different to that of the 1990s.

“The exciting thing about embedded ML is that machine learning and deep learning are not new, unproven technologies - they’ve in fact been deployed successfully on server-class computers for a relatively long time, and are at the heart of a ton of successful products. Embedded ML is about applying a proven set of technologies to a new context that will enable many new applications that were not previously possible.”
ABI Research predicts the market for low-power AI-enabled MCUs and accelerators for the TinyML market will climb from less than $30m in annual revenues this year to more than $2bn by the start of the next decade.

Despite the rapid growth, ABI principal analyst Lian Jye Su expects competition to become fiercer as large companies such as Bosch enter the market. Already, some start-ups such as Eta Compute have moved away from silicon to software tools. “We do see some consolidation. At the same time, the huge fragmentation in the IoT market means a significant number of providers will survive, like the MCU or IoT chipset markets in general,” he says, pointing to the large number of suppliers who focus on specific vertical markets.

TinyML faces severe constraints. Pete Warden, technical lead of the TensorFlow Micro framework at the search-engine giant and Situnayake’s co-author on “TinyML: Machine Learning with TensorFlow Lite on Arduino and Ultra-Low-Power Microcontrollers”, said at the Linley Group’s Fall Processor Conference that the aim is to take deep-learning models and “get them running on devices that have as little as 20KB of RAM. We want to take models built using this cutting-edge technology and crush them down to run on very low power processors.

“Because it’s open-source software, we get not only to interact with product teams inside Google but also get a lot of requests from product teams all over the world who are trying to build interesting products. And we often have to say: no, that’s not possible yet. We get to see, in aggregate, a lot of unmet requirements,” says Warden.

The core issue is that deep-learning models ported from the server environment call for millions or even billions of multiply-add (MAC) operations to be performed in a short space of time, even for relatively simple models. Linley Gwennap, president of the Linley Group, says relatively simple audio applications, such as picking up words in speech that can activate voice recognition, call for around 2 million MACs per second. Video needs far more.

Silicon vendors have been able to push the MAC count by taking advantage of the relatively low requirement for accuracy in individual calculations when performing inferencing. Whereas training on servers generally demands single- or double-precision floating-point arithmetic, byte-wide integer (int8) calculations seem to be sufficient for most applications.

There are indications that for selected layers in a model, even int8 MACs are unnecessary. Binary
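To illustrate the int8 arithmetic discussed above: the sketch below uses the standard affine scheme (a per-tensor scale mapping real values to 8-bit codes) to run a dot product, the building block of those MAC operations, entirely in integer arithmetic. The values and scale choices are hypothetical, and this is a simplified sketch of the idea rather than any vendor's actual kernel.

```python
import numpy as np

def quantize(x, scale):
    """Affine int8 quantisation (zero-point 0): real value -> int8 code."""
    q = np.round(x / scale)
    return np.clip(q, -128, 127).astype(np.int8)

def dequantize(q, scale):
    """Map int8 codes back to approximate real values."""
    return q.astype(np.int32) * scale

# Hypothetical activation and weight vectors.
x = np.array([0.51, -0.72, 0.33, 0.95], dtype=np.float32)
w = np.array([0.25, 0.10, -0.40, 0.60], dtype=np.float32)

# Per-tensor scales chosen so each tensor's range fits in [-127, 127].
x_scale = np.abs(x).max() / 127
w_scale = np.abs(w).max() / 127

xq = quantize(x, x_scale)
wq = quantize(w, w_scale)

# Round-trip error per element is bounded by half the scale step.
x_back = dequantize(xq, x_scale)

# The MAC loop runs in integer arithmetic (int8 operands, int32
# accumulator); only the final result is rescaled to the real domain.
acc = np.dot(xq.astype(np.int32), wq.astype(np.int32))
approx = acc * x_scale * w_scale
exact = float(np.dot(x, w))

print(f"float32 dot product: {exact:.4f}")
print(f"int8 dot product:    {approx:.4f}")
```

Because the per-element rounding errors are small relative to the tensor ranges, the integer result lands within a fraction of a percent of the float one for this example, which is why int8 inference is usually accurate enough while letting hardware pack four times as many MACs into the same datapath width as float32.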
14 23 February 2021 www.newelectronics.co.uk