data acquisition systems
“They are virtually the engine around which the entire concept is built.”
NEW STREAM PROCESSING PLATFORM
Apache Kafka is a messaging system that enables the data received from instruments to be queued and made highly available to follow-on systems. Kafka examines every single received value and analyses it for current measures as it arrives, an approach known as “stream processing”.
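The per-value analysis described above can be illustrated with a short sketch. This is not Gantner's implementation, just a minimal stand-in for the kind of incremental statistics a stream processor computes without buffering the whole stream; the channel values are invented.

```python
# Sketch only: incremental per-value analysis of a measurement stream,
# the pattern a Kafka stream processor applies to each arriving sample.
from dataclasses import dataclass


@dataclass
class RunningStats:
    count: int = 0
    mean: float = 0.0
    minimum: float = float("inf")
    maximum: float = float("-inf")

    def update(self, value: float) -> None:
        # Incremental mean: no need to keep the stream in memory
        self.count += 1
        self.mean += (value - self.mean) / self.count
        self.minimum = min(self.minimum, value)
        self.maximum = max(self.maximum, value)


stats = RunningStats()
for sample in [9.8, 10.1, 10.4, 9.9]:  # hypothetical strain-gauge readings
    stats.update(sample)
```

Because each update is O(1), the same logic keeps up regardless of how long the test runs.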
However, capturing, storing, and making these enormous volumes of data accessible to follow-on systems requires a database that offers the appropriate performance, along with interfaces that allow fast and convenient access.
CrateDB is a new kind of distributed SQL database designed to improve the handling of time-series analysis. The use of SQL as the query language simplifies application development and integration, while its NoSQL base technology allows IoT data to be processed in a variety of formats. CrateDB can hold hundreds of terabytes of data and, thanks to its shared-nothing architecture within server clusters, guarantees real-time availability without data loss or downtime.
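As a rough illustration of how such a time-series schema might look, the sketch below shows CrateDB-style DDL with monthly partitions and sharding. The table and column names are invented, not Gantner's actual schema; partitioning keeps old data cheap to drop, while shards spread writes across the cluster.

```python
# Sketch only: a hypothetical CrateDB time-series table definition.
DDL = """
CREATE TABLE IF NOT EXISTS sensor_readings (
    ts      TIMESTAMP WITH TIME ZONE,
    channel TEXT,
    value   DOUBLE PRECISION,
    month   TIMESTAMP WITH TIME ZONE
            GENERATED ALWAYS AS date_trunc('month', ts)
)
CLUSTERED INTO 6 SHARDS
PARTITIONED BY (month)
"""

# With a running cluster (assumption), the statement could be executed
# over CrateDB's SQL interface, e.g. via its Python client:
#   from crate import client
#   with client.connect("localhost:4200") as conn:
#       conn.cursor().execute(DDL)
```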
Condition-based inspection of the test specimen, instead of interval-based inspection, is a potential way to reduce the total fatigue test duration and to detect abnormalities quickly. One implication is that more sensors are required to monitor the behaviour of the test specimen and to detect or predict structural failures. As a full-scale fatigue test can generate data at rates of up to 10MB/s, totalling hundreds of terabytes at completion, data processing and analysis have become a major bottleneck.
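A back-of-envelope check shows how quickly the quoted 10MB/s rate adds up. The campaign length here is an assumption for illustration, but a year-long test lands squarely in the hundreds-of-terabytes range the text describes.

```python
# Rough sizing check for a sustained 10 MB/s acquisition rate.
# Test duration (one year) is an assumption, not a figure from the article.
RATE_MB_S = 10
SECONDS_PER_DAY = 86_400

per_day_tb = RATE_MB_S * SECONDS_PER_DAY / 1_000_000  # MB -> TB (decimal)
per_year_tb = per_day_tb * 365

print(f"{per_day_tb:.3f} TB/day, {per_year_tb:.0f} TB/year")
```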
In order to capture, analyse, and store this enormous volume of data, and to ensure it is available to applications, Gantner turned to a combination of Apache Kafka (data streaming) and CrateDB (a distributed SQL database built on NoSQL technology for IoT/industrial use cases). CrateDB is used for real-time hot storage and Kafka for cost-effective, document-based storage.
“After extensive research and
comparisons, we decided to use the
combination of Apache Kafka and CrateDB
for the design of the data backend,”
explains Jürgen Sutterlüti, head of cloud
and data analytics at Gantner Instruments.
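The hand-off between the two systems can be sketched as a small decoding step: a message as it might arrive on a Kafka topic is turned into the row tuple a CrateDB INSERT would take. The message schema, topic, and channel names are invented here; the article does not specify Gantner's actual wire format.

```python
# Sketch only: decoding a hypothetical Kafka measurement message into a
# row suitable for a parameterised CrateDB INSERT.
import json


def decode_measurement(raw: bytes) -> tuple:
    """Turn a message payload into (timestamp_ms, channel, value)."""
    msg = json.loads(raw)
    return (int(msg["ts"]), str(msg["channel"]), float(msg["value"]))


row = decode_measurement(
    b'{"ts": 1700000000000, "channel": "strain_042", "value": 10.7}'
)

# With both services running (assumption), the surrounding loop would
# consume from a topic and batch rows into CrateDB, e.g.:
#   cursor.executemany(
#       "INSERT INTO sensor_readings (ts, channel, value) VALUES (?, ?, ?)",
#       rows)
```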
AEROSPACETESTINGINTERNATIONAL.COM // SHOWCASE 121
The backend scales with the infrastructure in the test lab, whilst maintaining the necessary computing performance for test-critical data analysis tasks.
Aircraft engine testing is a typical use
case where a scalable data backend offers
major advantages. Engine testing
generates a lot of data, especially when
engine transient responses must be
recorded. Data rates can vary from 10
samples/second up to 100,000
samples/second. The challenge is to store
massive amounts of sensor data, keep it
available on a 24/7 basis and allow rapid
data analysis.
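The quoted sample-rate range translates into very different storage demands, as a quick sizing sketch shows. The bytes-per-sample figure (an 8-byte float, no framing overhead) and the single-channel view are assumptions for illustration.

```python
# Rough per-channel sizing for the sample rates quoted above.
# 8 bytes/sample is an assumption (one double-precision value).
BYTES_PER_SAMPLE = 8
SECONDS_PER_DAY = 86_400


def gb_per_day(sample_rate_hz: int, channels: int = 1) -> float:
    """Decimal gigabytes produced per day at a given sample rate."""
    return sample_rate_hz * BYTES_PER_SAMPLE * channels * SECONDS_PER_DAY / 1e9


low = gb_per_day(10)        # slow channels: a few MB per day
high = gb_per_day(100_000)  # transient recording: tens of GB per day
```

Multiplied across hundreds of channels, the upper rate is what pushes an engine test campaign into terabyte territory.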
Another example where a scalable data backend proves its advantages is the fatigue testing of large components or full-scale structures. A typical fatigue test programme is divided into a number of flight blocks. At the end of each flight block the test is stopped and the test specimen is inspected for cracks. These manual inspections are time consuming, and the time interval between them is relatively large. Structural abnormalities may be detected too late and may result in the retrofitting of in-service aircraft.
1 // Depending on the type of test, an overwhelming avalanche of data can be generated
2 // The Kafka stream processing engine comes with an extensive set of APIs to integrate third-party data streams
“The challenge ahead is not only to acquire the data, but to store and preserve large volumes of data”