International Journal of Advanced Technology and Engineering Exploration (IJATEE) ISSN (Print): 2394-5443 ISSN (Online): 2394-7454 Volume - 11 Issue - 121 December - 2024

Quantum prioritized experience replay with MaDi-based priority and quantum circuit mechanisms for optimizing reinforcement learning

R. Palanivel and P. Muthulakshmi

Abstract

Reinforcement learning (RL) faces significant challenges in scalability and computational efficiency, particularly in complex decision-making environments. Traditional RL algorithms often struggle with large-scale tasks, which calls for new approaches that improve adaptability and performance. This study introduces the quantum circuit-based priority replay (QCPR) algorithm, which applies quantum-inspired techniques from quantum computing (QC) to improve learning efficiency and decision-making quality. QCPR prioritizes experiences through two mechanisms: the magnitude and direction (MaDi) based priority, which assesses the significance and directionality of each experience, and QCPR-specific methods that make experience replay efficient. Both mechanisms are integrated into the Q-learning (QL) framework to optimize the RL process. QCPR was evaluated against four well-established RL algorithms, namely QL, deep Q-networks (DQN), prioritized experience replay (PER), and dueling DQN (DDQN), using standard decision-making benchmarks. The algorithm was further tested in simulation and real-world environments, including Atari games, Qiskit integration, and Raspberry Pi hardware and software setups. These evaluations demonstrated QCPR’s adaptability and robustness and its suitability for dynamic, large-scale applications. QCPR outperformed the four baselines on accuracy, precision, recall, F1-score, and cumulative rewards. In accuracy, QCPR improved on QL by 15.29%, on DQN by 9.98%, on PER by 6.33%, and on DDQN by 8.23%. This study highlights the potential of quantum-inspired approaches to advance RL by offering scalable and efficient solutions for complex decision-making tasks.
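
The abstract does not give the MaDi formula, so the following is only an illustrative, classical sketch of how a magnitude-and-direction priority could drive experience replay inside tabular Q-learning. The function names (`madi_priority`, `sample_index`, `q_update`), the `direction_bonus` parameter, and the rule of weighting positive temporal-difference errors more heavily are all assumptions made for illustration; they are not the authors' definition, and no quantum circuit is involved.

```python
import random


def madi_priority(td_error, direction_bonus=0.5, eps=1e-3):
    """Hypothetical priority (assumption, not the paper's formula).

    Magnitude: absolute TD error.
    Direction: positive TD errors are scaled up by (1 + direction_bonus).
    eps keeps every experience sampleable.
    """
    magnitude = abs(td_error)
    direction = 1.0 + direction_bonus if td_error > 0 else 1.0
    return magnitude * direction + eps


def sample_index(priorities, rng):
    """Pick a buffer index with probability proportional to its priority."""
    return rng.choices(range(len(priorities)), weights=priorities, k=1)[0]


def q_update(q, transition, alpha=0.1, gamma=0.99):
    """Standard tabular Q-learning update; returns the TD error."""
    s, a, r, s_next, done = transition
    best_next = 0.0 if done else max(q[s_next])
    td_error = r + gamma * best_next - q[s][a]
    q[s][a] += alpha * td_error
    return td_error


if __name__ == "__main__":
    rng = random.Random(0)
    q = [[0.0, 0.0], [0.0, 0.0]]           # 2 states x 2 actions
    buffer = [(0, 1, 0.0, 1, False),        # (s, a, r, s_next, done)
              (1, 0, 1.0, 0, True)]
    priorities = [1.0 for _ in buffer]      # new experiences start equal

    for _ in range(200):
        i = sample_index(priorities, rng)
        td = q_update(q, buffer[i])
        priorities[i] = madi_priority(td)   # refresh priority after replay

    print(q)
```

In this sketch, an experience whose value estimate is still rising is replayed more often than one with an equally large negative error; a different directional rule could be swapped in by changing `madi_priority` alone.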

Keywords

Quantum computing, MaDi priority, Quantum circuit-based priority replay, Quantum reinforcement learning, Quantum priority experience replay.

Cite this article

Palanivel R, Muthulakshmi P. Quantum prioritized experience replay with MaDi-based priority and quantum circuit mechanisms for optimizing reinforcement learning. International Journal of Advanced Technology and Engineering Exploration. 2024;11(121):1664-1680. DOI:10.19101/IJATEE.2024.111100094
