Abstract
With the exponential growth of data and the high demand for analysis of large datasets, the MapReduce framework has been widely used to process data in a timely, cost-effective manner. It is well known that the performance of MapReduce is limited by its default configuration parameters, and a few studies have focused on finding optimal configurations to improve it. Recently, machine-learning-based approaches have received growing attention as a way to auto-configure the MapReduce parameters and account for the dynamic nature of applications. In this article, we propose and develop a reinforcement learning (RL)-based scheme, named RL-MRCONF, to automatically configure the MapReduce parameters. Specifically, we explore and experiment with two variations of RL-MRCONF: one based on the traditional RL algorithm and one based on the deep RL algorithm. Simulation results show that RL-MRCONF can successfully and effectively auto-configure the MapReduce parameters dynamically in response to changes in job types and computing resources. Moreover, simulation results show that our proposed RL-MRCONF scheme outperforms the traditional RL-based implementation. Using datasets provided by MR-Perf, simulation results show that our proposed scheme improves execution time by around 50% compared with MapReduce using default settings.
Original language | British English |
---|---|
Article number | 8910465 |
Pages (from-to) | 4183-4196 |
Number of pages | 14 |
Journal | IEEE Transactions on Systems, Man, and Cybernetics: Systems |
Volume | 50 |
Issue number | 11 |
State | Published - Nov 2020 |
Keywords
- Deep learning
- deep Q-network (DQN)
- machine learning
- MapReduce
- neural networks
- reinforcement learning (RL)
- self-configuration