Multi-UAV Multi-Target Cooperative Interception via Multi-Agent RL with Reward Shaping and LLM-Based Curriculum Learning
3 December 2025, Wednesday, 2:00pm
Speaker: Mr. Song Yunze, Master’s student, National University of Singapore
Venue: Seminar Room 8D-1, Level 8, Temasek Laboratories
Event Organiser Host: Dr. Chin Yao Wei
ABSTRACT
Multi-UAV cooperative interception of multiple evasive targets is a critical problem in defense and security applications. We address the challenge of maximizing the interception success rate within a limited time horizon using a team of autonomous UAVs against maneuvering targets headed toward a protected base. We propose a multi-agent reinforcement learning (MARL) approach based on multi-agent Proximal Policy Optimization (MAPPO), a PPO variant with centralized training and decentralized execution. Our method incorporates potential-based reward shaping to inject domain knowledge and a four-stage curriculum learning scheme, based on a large language model (LLM), to gradually increase task difficulty. A local greedy heuristic serves as a baseline for comparison. Simulation results in a high-fidelity environment demonstrate that the proposed curriculum-guided MARL significantly improves the target neutralization rate and reduces missed targets compared to the greedy baseline (approximately 81.1% vs. 66.2% success rate). The learned policy exhibits stable training curves and coordinated UAV behaviors that efficiently distribute targets among team members. These findings highlight the benefit of combining reward shaping and LLM-based curriculum learning in MARL for complex multi-UAV multi-target interception tasks.
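For readers unfamiliar with potential-based reward shaping, the general idea can be sketched as follows. This is a minimal illustration only, not the speaker's implementation: the potential function (negative UAV-to-target distance), the discount value, and the names `potential` and `shaped_reward` are all assumptions made for the example. The shaping term has the standard form F(s, s') = γΦ(s') − Φ(s), which is added to the environment reward.

```python
import math

GAMMA = 0.99  # discount factor; an assumed value for illustration


def potential(uav_xy, target_xy):
    """Hypothetical potential: negative Euclidean distance from UAV to target."""
    return -math.dist(uav_xy, target_xy)


def shaped_reward(env_reward, uav_xy, next_uav_xy, target_xy, next_target_xy,
                  gamma=GAMMA):
    """Return env_reward plus the shaping term F = gamma * phi(s') - phi(s)."""
    phi_now = potential(uav_xy, target_xy)
    phi_next = potential(next_uav_xy, next_target_xy)
    return env_reward + gamma * phi_next - phi_now


if __name__ == "__main__":
    # A UAV that closes the distance to a stationary target gets a positive bonus.
    print(shaped_reward(0.0, (0, 0), (3, 4), (3, 4), (3, 4)))
```

Under this assumed potential, a step that moves a UAV toward its target yields a positive shaping bonus and a step away yields a penalty, while the environment's own reward is left unchanged.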
ABOUT THE SPEAKER
Yunze Song is a Master’s student in Robotics at the National University of Singapore, supervised by Prof. Khoo Boo Cheong and Dr. Sutthiphong Srigrarom. Their research focuses on cooperative multi-UAV interception and trustworthy LLM-based systems. They received a B.Eng. in Computer Science and Technology with First Class Honours from Xi’an Jiaotong-Liverpool University and the University of Liverpool, and have contributed as first author or co-author to work on large language models and data evaluation in venues such as EMNLP, NAACL, VLDB Journal and MDPI Drones.
