UBITECH presents a scientific paper on crowd-sensing data trustworthiness at the IEEE MeditCom main track

The paper “A Perfect Match: Deep Learning Towards Enhanced Data Trustworthiness in Crowd-Sensing Systems” has been accepted for presentation at the main track of the IEEE International Mediterranean Conference on Communications and Networking (MeditCom), taking place 7-10 September 2021 in Athens, Greece. Dr. Thanassis Giannetsos, Head of UBITECH’s Digital Security and Trusted Computing Research Group, and his co-authors propose the hybrid use of Deep Learning schemes (i.e., LSTMs) and conventional Machine Learning classifiers (i.e., one-class classifiers) for detecting and filtering out false data points stemming from malevolent input and faulty sensors. The goal is to cope with the presence of strong, colluding adversaries while efficiently managing the high influx of incoming user data in the emerging Mobile Crowd Sensing (MCS) paradigm.

In particular, the paper proposes a novel data verification framework, called False Sequential data Detection (FSD), that employs deep learning techniques in the form of neural networks to address possible data poisoning in MCS environments while demonstrating high accuracy and scalability. The co-authors define malicious data as falsified data exhibiting statistical properties different from the real data provided by benign users, and their approach leverages the sequential relationships of sensory data to detect malicious samples generated by adversaries or faulty components. More specifically, FSD offers:

(i) Data verification that combines the Deep Learning sequential architecture of Long Short-Term Memory (LSTM) networks with conventional one-class classifiers to distinguish between “false” and “real” samples.

(ii) A proof-of-concept implementation evaluated under various testing scenarios using both real and synthetic datasets, allowing more flexibility in the type of experiments. FSD demonstrates high accuracy even when the adversarial data distribution closely resembles legitimate user data.

(iii) Consideration of very strong colluding adversaries, modeled through two main attack strategies that represent different aspects of adversarial machine learning. For data poisoning attacks (pre-training and post-training), the co-authors define cases where adversaries collude to maximize their impact on the classification phase; FSD still copes with these extreme adversarial strategies and achieves high accuracy.
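To make the hybrid design more concrete, below is a minimal Python sketch of the general idea of pairing an LSTM with a one-class classifier to filter false sensory data. It is not the authors’ implementation: the window length, model sizes, the toy benign stream, and the choice of scikit-learn’s OneClassSVM as the one-class classifier are all illustrative assumptions.

```python
# Minimal sketch of LSTM + one-class classification for false-data filtering.
# All hyperparameters and the synthetic "benign" stream are assumptions for
# illustration only, not the configuration used in the FSD paper.
import numpy as np
import tensorflow as tf
from sklearn.svm import OneClassSVM

WINDOW = 20  # length of each sensory sub-sequence (assumption)

def make_windows(series):
    """Slice a 1-D reading stream into overlapping fixed-length windows."""
    return np.stack([series[i:i + WINDOW] for i in range(len(series) - WINDOW)])

# 1) An LSTM trained on benign data only, learning the sequential structure
#    of legitimate readings via one-step-ahead prediction.
lstm = tf.keras.Sequential([
    tf.keras.Input(shape=(WINDOW, 1)),
    tf.keras.layers.LSTM(32),
    tf.keras.layers.Dense(1),
])
lstm.compile(optimizer="adam", loss="mse")

# Toy benign stream: a noisy periodic signal standing in for real sensor data.
benign = np.sin(np.linspace(0, 60, 2000)) + 0.05 * np.random.randn(2000)
X = make_windows(benign)[:, :, None]   # (n_windows, WINDOW, 1)
y = benign[WINDOW:]                    # next reading after each window
lstm.fit(X, y, epochs=3, verbose=0)

# 2) A one-class classifier fitted on the LSTM's prediction residuals over
#    benign data: samples whose sequential behavior deviates from the learned
#    model yield large residuals and fall outside the benign region.
residuals = (lstm.predict(X, verbose=0).ravel() - y).reshape(-1, 1)
occ = OneClassSVM(nu=0.05).fit(residuals)

def is_false_sample(window, next_value):
    """Return True if a reading looks falsified given its preceding window."""
    pred = lstm.predict(window.reshape(1, WINDOW, 1), verbose=0).ravel()[0]
    return occ.predict([[pred - next_value]])[0] == -1
```

In this sketch the LSTM plays the role of the sequential model that captures how legitimate readings evolve, while the one-class classifier draws the boundary between “real” and “false” samples; readings injected by an adversary that break the learned temporal pattern produce out-of-distribution residuals and are flagged.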