LearnChain: Transparent and cooperative reinforcement learning on Blockchain

    Research output: Contribution to journal › Article › peer-review

    7 Scopus citations

    Abstract

    We consider multi-agent reinforcement learning (MARL) with the popular paradigm of centralized training and decentralized execution (CTDE). CTDE lets agents in different environments share knowledge to update a shared model. A wide range of applications is supported through CTDE in MARL, such as self-driving vehicle coordination, traffic light synchronization, and cooperation in various aspects of the Internet of Things (IoT), including resource management. Beyond the drawbacks of relying on a central authority for handling model updates, incorporating multiple sources of data raises concerns about the trustworthiness of the process. For instance, participating agents could provide data that favors their own experiences in order to shift the model towards certain behaviors. Similarly, sending falsified data for updates could lead to adversarial attacks. To overcome these challenges, it is essential to integrate Ethereum Blockchain technology into the CTDE paradigm, handling model updates through decentralized storage and a consensus mechanism. In the literature, there exist multiple efforts that propose using reinforcement learning (RL) on Blockchain; however, none has considered updating a CTDE-based MARL model on-chain, which would allow a transparent and auditable record of the training process. Therefore, we propose LearnChain, a framework that integrates the CTDE mechanism with a Consortium Blockchain built between authorized participants, thus avoiding gas costs. At the core of LearnChain, RL is integrated with Quorum, offering separate smart contracts for deployment, data handling with incentive mechanisms, training, target update, and inference. Based on a real use case entailing the management of Vehicular Edge Computing tasks through multi-agent synchronization, we implement LearnChain and evaluate its performance and cost in different settings. Our results show the ability to improve learning from shared experiences and to adapt to environment changes on the Quorum Blockchain.
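The idea of training a shared model from an auditable record of agent experiences can be illustrated with a minimal sketch. It is not LearnChain's implementation: it uses neither Quorum nor smart contracts, and every name in it (`ExperienceLedger`, `train_shared_q`, `act`, `build_demo`, the agent labels, and the two-state toy environment) is a hypothetical stand-in. An in-memory hash chain plays the role of on-chain storage, and tabular Q-learning plays the role of the shared model.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder hash that precedes the first record


def _digest(agent, transition, prev_hash):
    """SHA-256 over a canonical JSON encoding of one record."""
    payload = json.dumps(
        {"agent": agent, "transition": transition, "prev": prev_hash},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode()).hexdigest()


class ExperienceLedger:
    """Append-only, hash-chained log (assumed stand-in for on-chain storage)."""

    def __init__(self):
        self.blocks = []

    def submit(self, agent, transition):
        """Append one [state, action, reward, next_state] record."""
        prev_hash = self.blocks[-1]["hash"] if self.blocks else GENESIS
        block = {
            "agent": agent,
            "transition": transition,
            "prev": prev_hash,
            "hash": _digest(agent, transition, prev_hash),
        }
        self.blocks.append(block)
        return block["hash"]

    def verify(self):
        """Recompute every hash; any edited record breaks the chain."""
        prev_hash = GENESIS
        for block in self.blocks:
            expected = _digest(block["agent"], block["transition"], prev_hash)
            if block["prev"] != prev_hash or block["hash"] != expected:
                return False
            prev_hash = block["hash"]
        return True


def train_shared_q(ledger, n_states, n_actions, alpha=0.5, gamma=0.9):
    """Centralized training: replay all recorded transitions into one Q-table."""
    q = [[0.0] * n_actions for _ in range(n_states)]
    for block in ledger.blocks:
        s, a, r, s2 = block["transition"]
        target = r + gamma * max(q[s2])
        q[s][a] += alpha * (target - q[s][a])
    return q


def act(q, state):
    """Decentralized execution: each agent acts greedily on the shared table."""
    return max(range(len(q[state])), key=lambda a: q[state][a])


def build_demo(rounds=50):
    """Toy setup: two agents each observe one state; action 1 pays reward 1."""
    ledger = ExperienceLedger()
    for _ in range(rounds):
        for agent, state in (("vehicle-a", 0), ("vehicle-b", 1)):
            for action in (0, 1):
                next_state = 1 - state
                ledger.submit(agent, [state, action, float(action), next_state])
    return ledger, train_shared_q(ledger, n_states=2, n_actions=2)


if __name__ == "__main__":
    ledger, q = build_demo()
    print("ledger intact:", ledger.verify())
    print("greedy actions:", [act(q, s) for s in range(2)])
```

In this sketch each agent only ever observes one of the two states, yet both can act in either state after training, because the Q-table is built from the pooled record. Editing any stored transition afterwards makes `verify()` fail, which is the auditability property the abstract attributes to on-chain storage; a real deployment would rely on the blockchain's consensus rather than a single in-memory list.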

    Original language: British English
    Pages (from-to): 255-271
    Number of pages: 17
    Journal: Future Generation Computer Systems
    Volume: 150
    State: Published - Jan 2024

    Keywords

    • Blockchain
    • Cooperative artificial intelligence (AI)
    • Ethereum
    • Quorum
    • Reinforcement learning
    • Transparency
    • Trust
    • Vehicular edge computing
