Looking for the needle in the haystack

Steinbeis software recognizes big data anomalies

Data mining experts often speak of the “needle in the haystack” problem to convey the complexity of their field. But finding anomalies in huge volumes of data is actually harder than the proverbial haystack search, because at the outset it is not even clear what the needle looks like. The first task is to work out whether anything stands out from the mass of existing data, and then to use this as a basis for further investigation. The Esslingen-based Steinbeis Transfer Center for Software Engineering is conducting research into big data analysis.

These days, automotive companies capture measurements during test drives or in the laboratory as a matter of course; it is a standard part of vehicle fault analysis. As part of a long-standing research project, the Esslingen-based Steinbeis Transfer Center for Software Engineering has been investigating different ways to automatically extract anomalies from the large volumes of data generated during vehicle testing.

Vehicles are fitted with a variety of embedded systems which communicate with each other and interact with the vehicle environment via sensors and actuators. Measurements are taken during the testing of research and development vehicles and used for subsequent failure analysis. As a result, millions of measurements are logged, far more than can be evaluated by conventional means. Yet a single anomaly identified in the measurement data can be enough to reveal an error in the vehicle’s software, electronics or mechanics. So there are compelling financial reasons to recognize anomalies before cars enter series production: partly to protect the reputation of manufacturers, partly to avoid expensive recalls.

The Steinbeis experts at the Esslingen-based Transfer Center for Software Engineering examined two fundamental alternatives for recognizing anomalies.

  • The first involves the smart mapping and user-managed evaluation of data, based on visual analytical methods (so-called visual data mining). This works well with smaller numbers of recorded data points and one-off analyses.
  • The other method they examined involves classification methods used in artificial intelligence. An autonomous classifier was created to recognize anomalies automatically and a self-learning system was developed based on one-class support vector machines (support vector data description, or SVDD). This makes it possible to learn from previously gathered reference data and classify new test data automatically.
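
The article does not describe how the classifier is implemented, so the following is only a minimal sketch of the one-class idea behind the second method: learn a boundary from reference data alone, then flag new measurements that fall outside it. It is a deliberately simplified stand-in, not kernel-based SVDD. Instead of solving the SVDD optimization problem, it standardizes each signal and takes the radius that encloses a chosen share of the reference points. The signal names, values, and the helper functions `fit_sphere` and `is_anomaly` are hypothetical.

```python
import math
import random
import statistics


def scaled_distance(sample, means, stdevs):
    """Distance of a sample from the reference centre, in units of standard deviations."""
    return math.sqrt(sum(((x - m) / s) ** 2 for x, m, s in zip(sample, means, stdevs)))


def fit_sphere(reference, quantile=0.95):
    """Learn a data description (centre, scale, radius) from reference data only."""
    columns = [list(col) for col in zip(*reference)]
    means = [statistics.fmean(col) for col in columns]
    stdevs = [statistics.pstdev(col) for col in columns]
    distances = sorted(scaled_distance(row, means, stdevs) for row in reference)
    # Radius: the distance below which roughly `quantile` of the reference points fall.
    index = min(len(distances) - 1, int(quantile * len(distances)))
    return means, stdevs, distances[index]


def is_anomaly(sample, model):
    """A new measurement outside the learned sphere is classified as an anomaly."""
    means, stdevs, radius = model
    return scaled_distance(sample, means, stdevs) > radius


# Hypothetical reference data: (engine speed in rpm, coolant temperature in deg C).
rng = random.Random(0)
reference = [(rng.gauss(2000, 50), rng.gauss(90, 2)) for _ in range(500)]

model = fit_sphere(reference)

print(is_anomaly((2010, 91), model))   # measurement close to the reference data
print(is_anomaly((2000, 140), model))  # coolant temperature far outside the reference range
```

In a real setting, the sphere would typically be replaced by a kernel one-class classifier (for example, the `OneClassSVM` class in scikit-learn), which can describe reference data that does not form a single round cluster.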

The team is currently transferring its research findings to industry, and their use is not restricted to the automotive sector. The new classification technique drawn from artificial intelligence offers new ways to evaluate data in all areas of industry where technical measurements are made, for example in automation technology or on test rigs. The experts at the Steinbeis Transfer Center are integrating each successful technique, one by one, into a new kind of measurement analysis software called Tedradis-DataMiner.
