AUTOMATION JULY/AUGUST 2020
TO TRAIN YOUR
MACHINE, REWARD IT
zapp2photo/stock.adobe.com
Can you train a robot in the same way you would train a pet? The answer may surprise you...
www.manufacturingmanagement.co.uk
BY NEIL BALLINGER, HEAD OF EMEA SALES, EU AUTOMATION
What do cobots, dogs and dolphins
have in common? The fact
that they can all be trained by
rewarding desired behaviours
while ignoring undesired ones.
For animals, this training
technique is called positive reinforcement.
For machines, it is known as reinforcement
learning, and falls under the broader umbrella
of machine learning.
Reinforcement learning is a form of machine
learning where a computer learns to complete
a task by having repeated interaction with a
dynamic environment. Through an iterative
trial-and-error approach, the machine explores
the environment. This exploration generates
data, which is used by the machine to determine
the best course of action to complete its job.
This happens without human intervention and
without having to programme the machine to
perform a specific task.
Reinforcement learning differs from
supervised machine learning, in which
algorithms are built using data sets
that contain the correct answer to a given
problem. In reinforcement learning, no answer
is supplied in advance – the machine has to
find one by trying different courses of action
and eventually selecting the one that gives the
most reward with the least effort.
We could say that in the absence of answers,
the machine learns through its own experience.
The component that decides which action
to take is known as the ‘agent’.
How it works
Imagine that a dog in a garden is given a tennis
ball. The dog, which represents the agent,
will first observe the garden and construct its
representation of the environment. It will then
wonder – what can I do with this ball? What
happens if I throw it? Can I
hide it? If so, where?
It will choose a course
of action, such as hiding the
ball, and observe how the
owner responds. If the owner
simply stares at the dog and
doesn’t interact, the dog will
find this dull, receiving a
negative reward.
The dog will repeat the
process until it realises that
bringing the ball back to
the owner will result in a
smile and a treat, that is, a
positive reward. It will then
understand that this action
is the best one to maximise
its rewards.
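
The dog's trial and error can be sketched in a few lines of Python. This is only a minimal illustration, assuming an epsilon-greedy strategy (one common way to balance trying new actions against repeating rewarding ones). The action names, the reward values and the function `train_dog` are invented for this example and do not come from any particular library.

```python
import random

def train_dog(reward_for, actions, episodes=500, epsilon=0.1, seed=0):
    """Epsilon-greedy trial and error: estimate the average reward of each action."""
    rng = random.Random(seed)
    value = {a: 0.0 for a in actions}   # running average reward per action
    count = {a: 0 for a in actions}     # how often each action has been tried
    for _ in range(episodes):
        if rng.random() < epsilon:
            action = rng.choice(actions)           # explore: try something at random
        else:
            action = max(actions, key=value.get)   # exploit: repeat the best so far
        reward = reward_for(action)                # the owner's response
        count[action] += 1
        value[action] += (reward - value[action]) / count[action]
    return value

# Hypothetical owner responses: a stare is -1, a smile and a treat is +1.
OWNER_RESPONSE = {"hide": -1.0, "chew": -1.0, "fetch": 1.0}

values = train_dog(OWNER_RESPONSE.get, list(OWNER_RESPONSE))
best = max(values, key=values.get)
print(best)  # fetch
```

Nothing in the code tells the agent that fetching is the right answer. It only keeps a running score for each action and gradually favours the one that has paid off.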
Reinforcement learning
algorithms encourage a
machine to act in a similar
way, interacting with a
dynamic environment – for
example a factory floor with
several production lines
– until it finds the most
convenient way of proceeding.
Applications in manufacturing
In industrial manufacturing,
reinforcement learning is
used in processes where
complex decision-making
skills are required, especially
where machines need to cope
with changes in dynamic
environments.
For example, a cobot can be
trained to find the best path
to avoid obstructions, such as
objects or the limbs of human
workers, while continuing to
perform its task. This would
be simple for a human, but for
machines it is an incredibly
complex process that requires
a careful analysis of an
unpredictable environment.
If successful, the cobot
will be more productive,
because it won’t need to stop
to avoid impact.
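
To give a feel for how a machine might learn a path around obstructions, here is a hedged sketch using tabular Q-learning, a standard reinforcement learning algorithm. It is not the method used by any particular cobot. The 4x4 floor layout, the reward values and all function names are assumptions made purely for illustration, and a real cobot would face a far larger and less predictable environment.

```python
import random

# Hypothetical floor plan: S = start, G = goal, # = obstruction, . = free cell
FLOOR = ["S...",
         ".##.",
         "...#",
         "#..G"]
MOVES = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right

def step(pos, move):
    """Apply a move and return (new_position, reward, done)."""
    r, c = pos[0] + move[0], pos[1] + move[1]
    off_floor = not (0 <= r < len(FLOOR) and 0 <= c < len(FLOOR[0]))
    if off_floor or FLOOR[r][c] == "#":
        return pos, -5.0, False       # blocked: stay put and take a penalty
    if FLOOR[r][c] == "G":
        return (r, c), 10.0, True     # task complete
    return (r, c), -1.0, False        # small cost per step favours short paths

def train(episodes=2000, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning: learn the value of each move in each cell."""
    rng = random.Random(seed)
    q = {}  # (position, move_index) -> estimated long-term reward
    for _ in range(episodes):
        pos = (0, 0)
        for _ in range(100):          # cap the length of each attempt
            if rng.random() < epsilon:
                a = rng.randrange(4)  # explore
            else:
                a = max(range(4), key=lambda i: q.get((pos, i), 0.0))  # exploit
            new_pos, reward, done = step(pos, MOVES[a])
            best_next = 0.0 if done else max(q.get((new_pos, i), 0.0) for i in range(4))
            old = q.get((pos, a), 0.0)
            q[(pos, a)] = old + alpha * (reward + gamma * best_next - old)
            pos = new_pos
            if done:
                break
    return q

def greedy_path(q, limit=20):
    """Follow the highest-valued move from the start cell."""
    pos, path = (0, 0), [(0, 0)]
    for _ in range(limit):
        a = max(range(4), key=lambda i: q.get((pos, i), 0.0))
        pos, _, done = step(pos, MOVES[a])
        path.append(pos)
        if done:
            break
    return path

q_table = train()
path = greedy_path(q_table)
print(path)
```

The penalty for bumping into an obstruction and the small cost per step are design choices in this sketch: together they push the agent towards a route that is both clear and short.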
Streamlining operations
Reinforcement learning can
also be used to streamline
production, an approach
used by researchers at the
Industrial AI Lab at Hitachi
America. The researchers
designed a virtual shop floor
as a two-dimensional matrix and
used reinforcement learning
algorithms to repeatedly
interact with this virtual
environment. By doing this,
they were able to determine
the best set-up to increase
productivity and reduce delays
in servicing their customers.
Applications of
reinforcement learning
in manufacturing are just
emerging, but the first
experiments are already
offering promising results.
Industrial machines
work hard to increase your
productivity. It’s time to
reward them.