Hierarchical Federated Learning-Based Anomaly Detection Using Digital Twins For Internet of Medical Things (IoMT)

On Jan 1, 2022

Smart healthcare services can be provided through Internet of Things (IoT) technologies that monitor patients’ health conditions and vital body parameters. Most IoT solutions used to enable such services are wearable devices, such as smartwatches, ECG monitors, and blood pressure monitors. The huge amount of data collected from smart medical devices raises major security and privacy issues in the IoT domain. Within Remote Patient Monitoring (RPM) applications, we focus on Anomaly Detection (AD) models, whose purpose is to identify events that differ from typical user behavior patterns. When designing centralized AD models, researchers generally face security and privacy challenges (e.g., patient data privacy, training data poisoning).

To overcome these issues, the researchers of this paper propose an Anomaly Detection (AD) model based on Federated Learning (FL). FL allows different devices to collaborate by training locally, so that AD models can be built without sharing patients’ data. Specifically, the researchers propose a hierarchical FL framework that enables collaboration among different organizations by building separate AD models for groups of patients with similar health conditions.

*source: https://arxiv.org/pdf/2111.12241.pdf*

The figure above depicts a centralized Anomaly Detection (AD) model that can be used by clinicians to remotely monitor their patients. In this architecture, the Anomaly Detection (AD) model and the dataset are both in the same health organization cloud, a solution that facilitates model training and read/write operations. However, this approach has several weaknesses, such as a single point of failure and data privacy issues, since data from multiple sources are stored at the same location. Let’s discuss some of the possible threat scenarios associated with centralized Anomaly Detection (AD) solutions.

1. Privacy Leakage: malicious users may gain access to the sensitive health information of any patient. At the same time, the patients should have control over their data. The proposed hierarchical Federated Learning (FL) solution ensures that only the local Anomaly Detection (AD) model assigned to a group of patients has access to their data, without sharing them with the Anomaly Detection (AD) models of the other patients.

2. Training Data Poisoning: an attacker could poison the training set by introducing data samples that degrade the recognition rate of the model. Since with the Federated Learning (FL) paradigm the training data are stored locally on each client, it is harder to poison such localized data.

3. Model Drift: generally, model drift occurs when the underlying statistical structure of the data changes over time. An attacker could alter this statistical structure by introducing specific data points within the training set. The proposed hierarchical FL solution relies on a disease-based grouping mechanism, ensuring that data generated by each patient will not dramatically affect the statistical structure of the data.

4. Performance Overhead: a single point of failure could lead to data loss and high response times, which can affect the performance of an Anomaly Detection (AD) model. A Federated Learning (FL) solution, for example, can improve the response time by performing decentralized training on each client’s device.

The figure above depicts the proposed Anomaly Detection (AD) model based on Federated Learning (FL). Each of the N participants has its own locally stored dataset that includes data collected by a set of smart IoT devices (e.g., wearable glucose meter, smartwatch). The samples of the dataset are labeled as “normal” or “abnormal” observations. Each participant trains a local model and then sends the model weights to the federated cloud server, which therefore receives the weights of all the participants’ local models. The server then aggregates these weights based on specific attributes (e.g., user’s age, disease name) and sends the result back to the participants as global weights.
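The server-side aggregation step described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the function name `aggregate_by_group`, the flat-list representation of weights, and the sample-count weighting (the usual FedAvg convention) are all assumptions made here for the example.

```python
from collections import defaultdict


def aggregate_by_group(updates):
    """Average weight vectors separately for each group label.

    `updates` is a list of (group, weights, n_samples) tuples, where
    `weights` is a flat list of floats. Weighting each update by its
    sample count is an assumption (FedAvg-style), not taken from the paper.
    """
    sums = {}
    totals = defaultdict(int)
    for group, weights, n_samples in updates:
        if group not in sums:
            sums[group] = [0.0] * len(weights)
        for i, w in enumerate(weights):
            sums[group][i] += w * n_samples
        totals[group] += n_samples
    # One global weight vector per group (e.g., per disease name).
    return {g: [s / totals[g] for s in vec] for g, vec in sums.items()}


# Hypothetical updates: (disease name, local weights, local sample count).
updates = [
    ("OSA", [1.0, 2.0], 100),
    ("OSA", [3.0, 4.0], 300),
    ("DB", [10.0, 20.0], 50),
]
global_weights = aggregate_by_group(updates)
print(global_weights)
```

Each participant would then receive only the entry for its own group, so an OSA patient's model is never averaged with a diabetes patient's model.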

*source: https://arxiv.org/pdf/2111.12241.pdf*

The neural network used for Anomaly Detection (AD) is a Time Distributed LSTM, depicted in the figure above. The proposed model has two stacked layers of four LSTM cells arranged sequentially. Local training on each device is therefore performed on sequences of four input samples fed to the neural network. This training process is carried out until the recognition error falls below a specific threshold, or until H epochs have elapsed. As described above, the weights of the local models are then uploaded to the server for aggregation. After receiving the global weights from the federated server, each device continues the training. The whole process shown in the figure below continues until the N local models are optimized.
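Two mechanical parts of this local training step lend themselves to a short sketch: cutting a stream of readings into sequences of four samples, and stopping when the error drops below a threshold or after H epochs. The code below is a sketch under stated assumptions. The helper names, the use of overlapping windows, and the `HalvingError` stand-in (which replaces a real LSTM training epoch with a fake error that halves each time) are hypothetical and only serve to make the control flow runnable.

```python
def make_windows(samples, window=4):
    """Split readings into overlapping sequences of `window` samples.

    Overlapping (stride 1) windows are an assumption; the text only says
    that sequences of four samples are fed to the network.
    """
    return [samples[i:i + window] for i in range(len(samples) - window + 1)]


def train_locally(step, max_epochs, error_threshold):
    """Call `step()` once per epoch until the returned error is below
    `error_threshold` or `max_epochs` (H in the text) epochs have run.

    Returns (epochs_run, last_error).
    """
    error = float("inf")
    epochs_run = 0
    for epoch in range(1, max_epochs + 1):
        error = step()
        epochs_run = epoch
        if error < error_threshold:
            break
    return epochs_run, error


class HalvingError:
    """Stand-in for one training epoch: the error simply halves each call."""

    def __init__(self, start):
        self.error = start

    def __call__(self):
        self.error /= 2
        return self.error


windows = make_windows([1, 2, 3, 4, 5, 6])
stopped_early = train_locally(HalvingError(1.0), max_epochs=10, error_threshold=0.1)
hit_epoch_cap = train_locally(HalvingError(1.0), max_epochs=2, error_threshold=0.1)
print(windows, stopped_early, hit_epoch_cap)
```

In a real client, `step` would run one epoch of LSTM training on the windows and return the measured error; the loop around it would stay the same.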

*source: https://arxiv.org/pdf/2111.12241.pdf*

Finally, let’s focus on the RPM use case that uses the proposed Federated Learning (FL) framework, as illustrated in the figure above. This use case presents a scenario where patients are continuously monitored by clinicians. We consider two smart healthcare organizations. Bob, Alice, John, and Paul belong to smart healthcare organization-1, while Susan and Max belong to smart healthcare organization-2. Bob, Alice, Susan, and Max have been diagnosed with Obstructive Sleep Apnea (OSA), while John and Paul have diabetes (DB). For each patient, the data captured by the IoT devices are sent to a Digital Twins service that builds a Digital Twin (DT), a digital representation of the patient. The clinicians have access to their patients’ data through the DTs. The proposed hierarchical framework allows collaboration among multiple health organizations. For instance, Bob and Alice, who are OSA patients of smart healthcare organization-1, collaborate by exchanging their local weights to build an Anomaly Detection (AD) model. At the same time, they also collaborate with the OSA patients of smart healthcare organization-2 (Susan and Max). This approach allows participants to enhance the recognition rate of their local models by incorporating weights learned from the data of other patients with similar characteristics, without the data themselves being shared.
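The grouping in this use case can be illustrated with a two-level averaging sketch: first per organization and condition, then per condition across organizations. This is only one plausible reading of the hierarchy, not the paper's algorithm. The patient names and conditions come from the text, while the weight values, the helper names, and the choice to weight the second level by patient count are assumptions for illustration.

```python
def mean_vectors(vectors, counts=None):
    """Weighted element-wise mean of equal-length weight vectors."""
    counts = counts or [1] * len(vectors)
    total = sum(counts)
    dim = len(vectors[0])
    return [sum(v[i] * c for v, c in zip(vectors, counts)) / total
            for i in range(dim)]


def hierarchical_aggregate(patients):
    """Two-level aggregation over (name, organization, condition, weights)."""
    # Level 1: average within each (organization, condition) group.
    by_org = {}
    for _name, org, cond, weights in patients:
        by_org.setdefault((org, cond), []).append(weights)
    org_models = {key: (mean_vectors(ws), len(ws)) for key, ws in by_org.items()}

    # Level 2: average per condition across organizations,
    # weighting each organization's model by its number of patients.
    by_cond = {}
    for (_org, cond), (weights, n) in org_models.items():
        by_cond.setdefault(cond, []).append((weights, n))
    return {cond: mean_vectors([w for w, _ in items], [n for _, n in items])
            for cond, items in by_cond.items()}


# Local weights below are made-up numbers, used only to trace the flow.
patients = [
    ("Bob", "org-1", "OSA", [1.0, 1.0]),
    ("Alice", "org-1", "OSA", [3.0, 1.0]),
    ("John", "org-1", "DB", [10.0, 0.0]),
    ("Paul", "org-1", "DB", [20.0, 0.0]),
    ("Susan", "org-2", "OSA", [5.0, 3.0]),
    ("Max", "org-2", "OSA", [7.0, 3.0]),
]
condition_models = hierarchical_aggregate(patients)
print(condition_models)
```

With these inputs, the OSA model combines contributions from both organizations, while the DB model is built only from John and Paul, since organization-2 has no diabetes patients in this scenario.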