Abstract
In recent years, Federated Learning (FL) and Blockchain (BC) have emerged as pivotal technologies for enabling decentralized and privacy-preserving machine learning (ML) across various domains. FL allows models to be trained locally on user devices without sharing raw data, thereby preserving privacy, but its reliance on a central aggregator raises concerns regarding trust, transparency, and vulnerability to data poisoning attacks. Recent studies have also shown that the weight updates and gradients commonly exchanged in FL can reveal sensitive information about the underlying datasets. To address these issues, a blockchain-based FL framework is introduced in which the blockchain manages the entire federated learning process. In this framework, participants train their local models, apply the novel hash-comb technique to privatize their model updates, upload the hashed updates to IPFS, and record the corresponding IPFS hashes on the blockchain. Oracles, along with the participants themselves, validate these updates to detect suspicious behavior. The blockchain maintains a record of how often each participant is flagged as suspicious per round, enabling dynamic participant evaluation. Based on this feedback, oracles and participants collaboratively perform aggregation by combining the selected model weights. The blockchain then stores and announces the updated global model weights. This design ensures a fully decentralized, transparent, and secure FL process managed entirely on-chain, while storing the hash-comb-privatized updates via IPFS provides tamper-resistant and scalable data management.
Smart contracts validate gradient or weight submissions using covariance estimation when datasets change in each round, or statistical methods such as Jaccard similarity and Isolation Forest when data is static. They also communicate with decentralized oracles for external validation and enforce a dual-layer incentive and reputation system. Participants who behave honestly receive incentives based on a fairness score, while those excluded from training rounds lose reputation in proportion to their harmful impact on the global model. An extensive evaluation of poisoning attacks was conducted under different heterogeneous FL environments, varying data distributions, number of attackers, dataset poisoning rates, and ML models, and covering both targeted and untargeted attacks ranging from simple to sophisticated. The findings demonstrate that heterogeneity in FL significantly impacts the overall performance of the global model. In several cases, targeted and sophisticated attacks led to lower accuracy or caused significant degradation with fewer malicious participants or lower poisoning rates, making the system more vulnerable to them than to untargeted or random attacks. The study further investigates the use of hash-comb techniques and their impact on model performance after aggregation. Results demonstrate that hash-combs effectively preserve the integrity of the original global model, ensuring accuracy is maintained despite the application of hashing. Additionally, for detecting suspicious behavior based on analyzing hashed model updates, we examined two approaches for static data: (1) Jaccard similarity combined with KL divergence, and (2) Jaccard similarity combined with Isolation Forest. Both approaches saw a decline in effectiveness as the number of attackers increased to more than half of the participants, but the second approach performed significantly better in all scenarios.
For dynamic data, covariance estimation was employed and exhibited a high detection rate, even with many participants. Nonetheless, these methods provide promising foundations for further enhancing robustness in poisoning detection. Furthermore, the practical realization of the blockchain implementation was explored using a simulated blockchain environment, offering insights into real-world deployment and system integration challenges. Compared to the literature, the blockchain-based FL framework offers: 1) A simple and lightweight mechanism for private model updates using hash-combs; 2) Secure and scalable storage of hashed updates on IPFS, reducing blockchain load and avoiding direct on-chain storage of all updates; 3) Two statistical detection techniques tailored for both static and dynamic data scenarios; 4) A compatible framework that supports both privacy preservation and poisoning attack detection; 5) A novel double-sided poisoning check mechanism, allowing participants and blockchain-oracle interaction for poisoning validation; and 6) A distinctive incentive and reputation scheme tightly integrated with the proposed system. To summarize, this work presents a blockchain-based federated learning architecture that introduces the novel hash-comb approach for securing and verifying model updates, combined with advanced poisoning detection techniques and an adaptive incentive and reputation system. Unlike existing frameworks, our architecture offers a more comprehensive and robust solution by integrating dynamic participant evaluation, enhanced attack detection using both static and dynamic analysis, a double-sided poisoning check mechanism, and fairness-based incentives. This combination provides a promising direction for the real-world deployment of decentralized, secure, and privacy-preserving federated learning systems.
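As a rough illustration of the Jaccard-similarity step mentioned above for static data, the following minimal sketch compares participants' hashed updates pairwise. It assumes, hypothetically, that each hashed update can be represented as a set of integer codes; the abstract does not specify the hash-comb output format. The median-based cutoff (`tolerance`) is a simple stand-in chosen for this sketch and is not the KL-divergence or Isolation Forest stage evaluated in the thesis.

```python
from statistics import median


def jaccard(a: set, b: set) -> float:
    """Jaccard similarity: size of the intersection over size of the union."""
    union = a | b
    if not union:
        return 1.0  # two empty sets are treated as identical
    return len(a & b) / len(union)


def mean_similarity(updates: dict) -> dict:
    """Average Jaccard similarity of each participant to every other one."""
    if len(updates) < 2:
        raise ValueError("need at least two participants to compare")
    scores = {}
    for name, codes in updates.items():
        others = [jaccard(codes, c) for n, c in updates.items() if n != name]
        scores[name] = sum(others) / len(others)
    return scores


def flag_suspicious(updates: dict, tolerance: float = 0.5) -> list:
    """Flag participants whose mean similarity falls below a fraction of the
    median score. Hypothetical rule for illustration only."""
    scores = mean_similarity(updates)
    cutoff = tolerance * median(scores.values())
    return sorted(n for n, s in scores.items() if s < cutoff)


# Hypothetical hashed updates: three similar participants and one outlier.
updates = {
    "p1": {1, 2, 3, 4},
    "p2": {1, 2, 3, 5},
    "p3": {2, 3, 4, 5},
    "p4": {10, 11, 12, 13},
}
flagged = flag_suspicious(updates)
```

In this toy input, the outlier shares no codes with the others, so its mean similarity is zero and it is the only participant flagged. A median-based rule like this one would also lose reliability once attackers make up more than half of the participants.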
| Date of Award | 2025 |
|---|---|
| Original language | American English |
| Supervisor | ERNESTO Damiani (Supervisor) |
Keywords
- Federated Learning
- Blockchain
- Privacy Preservation
- Poisoning Attacks
- Robustness
- Aggregation Techniques
- Decentralized