data acquisition systems
“They are virtually the engine around which the entire concept is built.”
NEW STREAM PROCESSING PLATFORM
Apache Kafka is a messaging system that enables the data received from instruments to be queued and made highly available to follow-on systems. Kafka examines every single received value and analyses it for current measures as it arrives, an approach known as “stream processing”.
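The per-value analysis described above can be illustrated with a short sketch. This is not Gantner's implementation, just a minimal stand-in for the kind of incremental statistics a stream processor computes without buffering the whole stream; the channel values are invented.

```python
# Sketch only: incremental per-value analysis of a measurement stream,
# the pattern a Kafka stream processor applies to each arriving sample.
from dataclasses import dataclass


@dataclass
class RunningStats:
    count: int = 0
    mean: float = 0.0
    minimum: float = float("inf")
    maximum: float = float("-inf")

    def update(self, value: float) -> None:
        # Incremental mean: no need to keep the stream in memory
        self.count += 1
        self.mean += (value - self.mean) / self.count
        self.minimum = min(self.minimum, value)
        self.maximum = max(self.maximum, value)


stats = RunningStats()
for sample in [9.8, 10.1, 10.4, 9.9]:  # hypothetical strain-gauge readings
    stats.update(sample)
```

Because each update is O(1), the same logic keeps up regardless of how long the test runs.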
However, capturing, storing, and making these enormous volumes of data accessible to follow-on systems requires a database that offers the appropriate performance, along with interfaces that allow fast and convenient access.
CrateDB is a new kind of distributed SQL database designed to improve the handling of time-series analysis. The use of SQL as the query language simplifies application development and integration, while its NoSQL base technology allows IoT data to be processed in a variety of formats. CrateDB can hold hundreds of terabytes of data and, thanks to its shared-nothing architecture within server clusters, guarantees real-time availability without data loss or downtime.
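As a rough illustration of how such a time-series schema might look, the sketch below shows CrateDB-style DDL with monthly partitions and sharding. The table and column names are invented, not Gantner's actual schema; partitioning keeps old data cheap to drop, while shards spread writes across the cluster.

```python
# Sketch only: a hypothetical CrateDB time-series table definition.
DDL = """
CREATE TABLE IF NOT EXISTS sensor_readings (
    ts      TIMESTAMP WITH TIME ZONE,
    channel TEXT,
    value   DOUBLE PRECISION,
    month   TIMESTAMP WITH TIME ZONE
            GENERATED ALWAYS AS date_trunc('month', ts)
)
CLUSTERED INTO 6 SHARDS
PARTITIONED BY (month)
"""

# With a running cluster (assumption), the statement could be executed
# over CrateDB's SQL interface, e.g. via its Python client:
#   from crate import client
#   with client.connect("localhost:4200") as conn:
#       conn.cursor().execute(DDL)
```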
Condition-based inspection of the test specimen, instead of interval-based inspection, is a potential way to reduce the total fatigue test duration and to detect abnormalities quickly. One implication is that more sensors are required to monitor the behaviour of the test specimen and to detect or predict structural failures. As a full-scale fatigue test can generate data at rates of up to 10MB/s, totalling hundreds of terabytes at completion, data processing and analysis have become a major bottleneck.
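A back-of-envelope check shows how quickly the quoted 10MB/s rate adds up. The campaign length here is an assumption for illustration, but a year-long test lands squarely in the hundreds-of-terabytes range the text describes.

```python
# Rough sizing check for a sustained 10 MB/s acquisition rate.
# Test duration (one year) is an assumption, not a figure from the article.
RATE_MB_S = 10
SECONDS_PER_DAY = 86_400

per_day_tb = RATE_MB_S * SECONDS_PER_DAY / 1_000_000  # MB -> TB (decimal)
per_year_tb = per_day_tb * 365

print(f"{per_day_tb:.3f} TB/day, {per_year_tb:.0f} TB/year")
```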
In order to capture, analyse, and store this enormous volume of data, and to ensure it is available to applications, Gantner turned to a combination of Apache Kafka (data streaming) and CrateDB (a distributed SQL database built on NoSQL technology for IoT/industrial use cases). CrateDB is used for real-time hot storage and Kafka for cost-effective, document-based storage.
“After extensive research and
comparisons, we decided to use the
combination of Apache Kafka and CrateDB
for the design of the data backend,”
explains Jürgen Sutterlüti, head of cloud
and data analytics at Gantner Instruments.
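The hand-off between the two systems can be sketched as a small decoding step: a message as it might arrive on a Kafka topic is turned into the row tuple a CrateDB INSERT would take. The message schema, topic, and channel names are invented here; the article does not specify Gantner's actual wire format.

```python
# Sketch only: decoding a hypothetical Kafka measurement message into a
# row suitable for a parameterised CrateDB INSERT.
import json


def decode_measurement(raw: bytes) -> tuple:
    """Turn a message payload into (timestamp_ms, channel, value)."""
    msg = json.loads(raw)
    return (int(msg["ts"]), str(msg["channel"]), float(msg["value"]))


row = decode_measurement(
    b'{"ts": 1700000000000, "channel": "strain_042", "value": 10.7}'
)

# With both services running (assumption), the surrounding loop would
# consume from a topic and batch rows into CrateDB, e.g.:
#   cursor.executemany(
#       "INSERT INTO sensor_readings (ts, channel, value) VALUES (?, ?, ?)",
#       rows)
```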
AEROSPACETESTINGINTERNATIONAL.COM // SHOWCASE 121
The backend scales with the infrastructure in the test lab, whilst maintaining the necessary computing performance for test-critical data analysis tasks.
Aircraft engine testing is a typical use
case where a scalable data backend offers
major advantages. Engine testing
generates a lot of data, especially when
engine transient responses must be
recorded. Data rates can vary from 10
samples/second up to 100,000
samples/second. The challenge is to store
massive amounts of sensor data, keep it
available on a 24/7 basis and allow rapid
data analysis.
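The quoted sample-rate range translates into very different storage demands, as a quick sizing sketch shows. The bytes-per-sample figure (an 8-byte float, no framing overhead) and the single-channel view are assumptions for illustration.

```python
# Rough per-channel sizing for the sample rates quoted above.
# 8 bytes/sample is an assumption (one double-precision value).
BYTES_PER_SAMPLE = 8
SECONDS_PER_DAY = 86_400


def gb_per_day(sample_rate_hz: int, channels: int = 1) -> float:
    """Decimal gigabytes produced per day at a given sample rate."""
    return sample_rate_hz * BYTES_PER_SAMPLE * channels * SECONDS_PER_DAY / 1e9


low = gb_per_day(10)        # slow channels: a few MB per day
high = gb_per_day(100_000)  # transient recording: tens of GB per day
```

Multiplied across hundreds of channels, the upper rate is what pushes an engine test campaign into terabyte territory.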
Another example where a scalable data backend proves its advantages is the fatigue testing of large components or full-scale structures. A typical fatigue test programme is divided into a number of flight blocks. At the end of each flight block the test is stopped and the test specimen is inspected for cracks. These manual inspections are time consuming, and the time interval between them is relatively large. Structural abnormalities may be detected too late and may result in the retrofitting of in-service aircraft.
1 // Depending on the type of test, an overwhelming avalanche of data can be generated
2 // The Kafka stream processing engine comes with an extensive set of APIs to integrate third-party data streams
“The challenge ahead is not only to acquire the data, but to store and preserve large volumes of data”