Full list of publications. Also available on Google Scholar.
2025
LLM-Assisted Semantically Diverse Teammate Generation for Efficient Multi-agent Coordination
Lihe Li, Lei Yuan, Pengsen Liu, Tao Jiang, Yang Yu
The 42nd International Conference on Machine Learning (ICML), 2025
pdf / bibtex
@inproceedings{semdiv,
title = {LLM-Assisted Semantically Diverse Teammate Generation for Efficient Multi-agent Coordination},
author = {Lihe Li and Lei Yuan and Pengsen Liu and Tao Jiang and Yang Yu},
booktitle = {Proceedings of the Forty-second International Conference on Machine Learning},
year = {2025}
}
Instead of discovering novel teammates only at the policy level, we utilize LLMs to propose novel
coordination behaviors described in natural language and then transform them into teammate policies.
This enhances teammate diversity and interpretability, eventually yielding agents with language
comprehension ability and stronger collaboration skills.
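A minimal sketch of this generation loop as we read it from the abstract; propose_behavior, behavior_to_reward, and train_policy are hypothetical stand-ins, not the paper's actual API:

import random

def propose_behavior(archive: list[str]) -> str:
    """Hypothetical LLM call: propose a coordination behavior in natural
    language that differs semantically from everything in `archive`."""
    return f"behavior-{len(archive)}: e.g. 'always pass the ball to the nearest ally'"

def behavior_to_reward(description: str):
    """Hypothetical grounding step: compile the description into a shaped
    reward function for teammate training."""
    return lambda state, action: random.random()  # placeholder signal

def train_policy(reward_fn):
    """Hypothetical RL training loop; returns an opaque teammate policy."""
    return {"reward_fn": reward_fn}

archive: list[str] = []
teammates = []
for _ in range(4):  # build a small pool of semantically diverse teammates
    desc = propose_behavior(archive)                 # diversity at the language level
    policy = train_policy(behavior_to_reward(desc))  # ground language in a policy
    archive.append(desc)
    teammates.append((desc, policy))
# The ego agent would then be trained against `teammates`, conditioning on `desc`.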
@inproceedings{lapse,
title = {Learning to Reuse Policies in State Evolvable Environments},
author = {Ziqian Zhang and Bohan Yang and Lihe Li and Yuqi Bian and Ruiqi Xue and Feng Chen and Yi-Chen Li and Lei Yuan and Yang Yu},
booktitle = {Proceedings of the Forty-second International Conference on Machine Learning},
year = {2025}
}
We address the performance degradation of RL policies when state features (e.g., sensor data) evolve unpredictably
by proposing Lapse, which reuses old policies by combining them with a state reconstruction model for vanished sensors, and leverages past policy experience for offline training of new policies.
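A toy illustration of the policy-reuse idea, assuming (as we read the abstract) that a reconstruction model fills in vanished sensor features so an old policy can keep acting; the linear reconstructor and old_policy below are illustrative placeholders, not the paper's architecture:

import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(3, 5))  # pretend: maps surviving features -> vanished ones

def reconstruct(surviving: np.ndarray) -> np.ndarray:
    """Predict the 3 vanished sensor readings from the 5 surviving ones."""
    return W @ surviving

def old_policy(full_state: np.ndarray) -> int:
    """Placeholder for the pre-change policy, which expects all 8 features."""
    return int(full_state.sum() > 0)

surviving = rng.normal(size=5)            # what the evolved environment emits
full_state = np.concatenate([surviving, reconstruct(surviving)])
action = old_policy(full_state)           # old policy reused unchanged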
@inproceedings{madits,
title = {Efficient Multi-agent Offline Coordination via Diffusion-based Trajectory Stitching},
author = {Lei Yuan and Yuqi Bian and Lihe Li and Ziqian Zhang and Cong Guan and Yang Yu},
booktitle = {The Thirteenth International Conference on Learning Representations},
year = {2025}
}
We propose a data augmentation technique for offline cooperative MARL, utilizing diffusion models to improve the quality of the datasets.
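A sketch of diffusion-based trajectory stitching on a toy dataset; the diffusion_bridge stub stands in for the trained diffusion model (in the real method the generated transition would also be checked for dynamics consistency), and all names are ours, not the paper's:

import numpy as np

rng = np.random.default_rng(1)
# Two high-return segments that do not connect: seg_a ends where seg_b doesn't start.
seg_a = rng.normal(size=(10, 4))  # (timestep, state_dim)
seg_b = rng.normal(size=(10, 4))

def diffusion_bridge(s_end: np.ndarray, s_start: np.ndarray, steps: int = 3) -> np.ndarray:
    """Placeholder for the diffusion model: inpaint a short state sequence
    connecting s_end to s_start (here, naive linear interpolation)."""
    alphas = np.linspace(0.0, 1.0, steps + 2)[1:-1, None]
    return (1 - alphas) * s_end + alphas * s_start

bridge = diffusion_bridge(seg_a[-1], seg_b[0])
stitched = np.concatenate([seg_a, bridge, seg_b])  # one longer, higher-quality trajectory
print(stitched.shape)  # (23, 4) -> added to the augmented offline dataset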
@inproceedings{smacot,
title = {Safe Multi-task Pretraining with Constraint Prioritized Decision Transformer},
author = {Ruiqi Xue and Ziqian Zhang and Lihe Li and Lei Yuan and Yang Yu},
year = {2025},
url = {https://openreview.net/forum?id=CbPifku2Un}
}
SMACOT enables safe offline RL by prioritizing constraints in Decision Transformers, achieving 2× higher safety compliance than baselines across tasks.
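A rough illustration of how such a constraint-prioritized input might be laid out; the token ordering below (cost-to-go before return-to-go at every step) is our assumption about what "constraint prioritized" means operationally, not the paper's published architecture:

from dataclasses import dataclass

@dataclass
class Step:
    cost_to_go: float    # remaining constraint budget
    return_to_go: float  # remaining reward target
    state: tuple
    action: int

def to_tokens(trajectory: list[Step]) -> list:
    tokens = []
    for step in trajectory:
        # cost token first: constraints take priority over returns
        tokens += [("C", step.cost_to_go), ("R", step.return_to_go),
                   ("s", step.state), ("a", step.action)]
    return tokens

traj = [Step(5.0, 10.0, (0, 0), 1), Step(4.0, 8.5, (0, 1), 0)]
print(to_tokens(traj))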
@inproceedings{haland,
title = {Haland: Human-{AI} Coordination via Policy Generation from Language-guided Diffusion},
author = {Lei Yuan and Kunmin Lin and Ziqian Zhang and Lihe Li and Feng Chen and Jingyu Ru and Cong Guan and Yang Yu},
year = {2025},
url = {https://openreview.net/forum?id=XCUTFbC3Rh}
}
By compressing diverse best-response policies into a language-conditioned diffusion model, Haland efficiently aligns human preferences with AI behavior for seamless collaboration.
@inproceedings{diffmorl,
title = {Boosting Offline Multi-Objective Reinforcement Learning via Preference Conditioned Diffusion Models},
author = {Lei Yuan and Yuchen Xiao and Lihe Li and Ziqian Zhang and Yi-Chen Li and Yang Yu},
booktitle = {The Thirteenth International Conference on Learning Representations},
year = {2025}
}
DiffMORL advances offline multi-objective RL with a diffusion-based planning framework, enhancing generalization via data mixup and outperforming baselines on out-of-distribution (OOD) preferences.
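A sketch of the data-mixup idea for preference-conditioned training, under our reading of the abstract: interpolate preference vectors (and the vector returns of their trajectories) to densify preference coverage. The details below are assumptions:

import numpy as np

rng = np.random.default_rng(2)

def mixup_preferences(w1, r1, w2, r2, alpha=0.4):
    """Blend two (preference, vector-return) pairs; renormalize the preference."""
    lam = rng.beta(alpha, alpha)
    w = lam * w1 + (1 - lam) * w2
    return w / w.sum(), lam * r1 + (1 - lam) * r2

w_a, r_a = np.array([0.9, 0.1]), np.array([10.0, 1.0])  # speed-seeking data
w_b, r_b = np.array([0.2, 0.8]), np.array([2.0, 9.0])   # safety-seeking data
w_mix, r_mix = mixup_preferences(w_a, r_a, w_b, r_b)
# (w_mix, r_mix) conditions the diffusion planner on a preference never seen verbatim.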
@inproceedings{madoc,
title = {Multi-Agent Domain Calibration with a Handful of Offline Data},
author = {Tao Jiang and Lei Yuan and Lihe Li and Cong Guan and Zongzhang Zhang and Yang Yu},
booktitle = {Advances in Neural Information Processing Systems 37},
pages = {69607--69636},
year = {2024}
}
We formulate domain calibration as a cooperative MARL problem to improve calibration efficiency and fidelity.
@inproceedings{dasar,
title = {Dynamics Adaptive Safe Reinforcement Learning with a Misspecified Simulator},
author = {Ruiqi Xue and Ziqian Zhang and Lihe Li and Feng Chen and Yi-Chen Li and Yang Yu and Lei Yuan},
booktitle = {Joint European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases},
pages = {74--91},
year = {2024}
}
We propose DASaR, which expands the trust region in sim-to-real RL by aligning simulator and real-world value functions through inverse dynamics-based relabeling of rewards and costs.
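A toy version of inverse dynamics-based relabeling as we interpret it: for a real transition (s, s'), infer the action the simulator would need to reproduce s', then charge the simulator's reward and cost for that action. All models below are one-line stand-ins:

import numpy as np

def sim_inverse_dynamics(s: np.ndarray, s_next: np.ndarray) -> np.ndarray:
    """Placeholder: action that moves s to s_next under the (misspecified) sim."""
    return s_next - s

def sim_reward(s, a): return -float(np.linalg.norm(a))     # stand-in reward
def sim_cost(s, a):   return float(np.abs(a).max() > 0.5)  # stand-in cost

s, s_next = np.zeros(2), np.array([0.3, 0.1])  # one real-world transition
a_hat = sim_inverse_dynamics(s, s_next)
relabeled = (s, a_hat, sim_reward(s, a_hat), sim_cost(s, a_hat), s_next)
# Training on relabeled tuples keeps sim and real value functions aligned.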
@inproceedings{core3,
title = {Continual Multi-Objective Reinforcement Learning via Reward Model Rehearsal},
author = {Lihe Li and Ruotong Chen and Ziqian Zhang and Zhichao Wu and Yi-Chen Li and Cong Guan and Yang Yu and Lei Yuan},
booktitle = {Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence},
pages = {4434--4442},
year = {2024}
}
We study multi-objective reinforcement learning (MORL) with continually evolving learning
objectives, and propose CORe3 to enable the MORL agent to rapidly learn new objectives while
avoiding catastrophic forgetting of old objectives whose reward signals are no longer available.
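A minimal sketch of reward-model rehearsal as we understand it: keep one learned reward model per finished objective, and when a new objective arrives without the old reward signals, relabel fresh transitions with the stored models so old objectives are rehearsed rather than forgotten. All components are illustrative stubs:

import numpy as np

rng = np.random.default_rng(3)
reward_models = {}  # objective name -> learned reward predictor

def fit_reward_model(true_reward_fn):
    """Placeholder 'learning': just wrap the current reward function."""
    return lambda s, a: true_reward_fn(s, a)

# Objective 1 finishes: store its reward model before its signal disappears.
reward_models["speed"] = fit_reward_model(lambda s, a: float(s[0]))

# Objective 2 arrives; its data carries only the new reward.
s, a, r_new = rng.normal(size=2), 0, 1.0
relabeled = {"energy": r_new}
relabeled.update({k: m(s, a) for k, m in reward_models.items()})  # rehearsal
print(relabeled)  # the agent trains on a full reward vector again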
@inproceedings{haplan,
title = {Efficient Human-{AI} Coordination via Preparatory Language-based Convention},
author = {Cong Guan and Lichao Zhang and Chunpeng Fan and Yi-Chen Li and Feng Chen and Lihe Li and Yunjia Tian and Lei Yuan and Yang Yu},
booktitle = {ICLR 2024 Workshop on Large Language Model (LLM) Agents},
year = {2024}
}
We propose employing the large language models (LLMs) to develop an action plan (or equivalently, a
convention) that effectively guides both human and AI for coordination.
@inproceedings{costa,
title = {Cost-aware Offline Safe Meta Reinforcement Learning with Robust In-Distribution Online Task Adaptation},
author = {Cong Guan and Ruiqi Xue and Ziqian Zhang and Lihe Li and Yi-Chen Li and Lei Yuan and Yang Yu},
booktitle = {Proceedings of the International Conference on Autonomous Agents and Multiagent Systems},
pages = {743--751},
year = {2024}
}
We propose COSTA to deal with offline safe meta RL problems. We develop a cost-aware task
inference module using contrastive learning to distinguish tasks based on safety constraints, and
propose a safe in-distribution online adaptation mechanism.
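A sketch of the cost-aware contrastive objective as we read it: embeddings of transition batches from the same task (same safety constraint) are pulled together, and different tasks pushed apart. The encoder here is a fixed random projection; the real method learns it:

import numpy as np

rng = np.random.default_rng(4)
proj = rng.normal(size=(6, 3))

def encode(batch: np.ndarray) -> np.ndarray:
    z = batch.mean(axis=0) @ proj
    return z / np.linalg.norm(z)

def info_nce(anchor, positive, negatives, temp=0.1):
    sims = np.array([anchor @ positive] + [anchor @ n for n in negatives]) / temp
    sims -= sims.max()
    return -np.log(np.exp(sims[0]) / np.exp(sims).sum())

task_a1, task_a2 = rng.normal(size=(8, 6)), rng.normal(size=(8, 6))  # same constraint
task_b = rng.normal(size=(8, 6)) + 3.0                               # different constraint
loss = info_nce(encode(task_a1), encode(task_a2), [encode(task_b)])
print(float(loss))  # minimized -> embeddings separate tasks by safety constraint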
@article{survey,
title = {A Survey of Progress on Cooperative Multi-agent Reinforcement Learning in Open Environment},
author = {Lei Yuan and Ziqian Zhang and Lihe Li and Cong Guan and Yang Yu},
journal = {Science China Information Sciences},
year = {2023}
}
We review multi-agent cooperation from closed-environment to open-environment settings, and provide
prospects for future development and research directions of cooperative MARL in open environments.
@inproceedings{macop,
title = {Learning to Coordinate with Anyone},
author = {Lei Yuan and Lihe Li and Ziqian Zhang and Feng Chen and Tianyi Zhang and Cong Guan and Yang Yu and Zhi-Hua Zhou},
booktitle = {Proceedings of the Fifth International Conference on Distributed Artificial Intelligence},
year = {2023}
}
We propose Multi-agent Compatible Policy Learning (MACOP), which adopts an agent-centered
teammate generation process to gradually and efficiently generate diverse teammates covering the
teammate policy space, and uses continual learning to train the ego agents to coordinate with them
and acquire strong coordination ability.
@inproceedings{fastap,
title = {Fast Teammate Adaptation in the Presence of Sudden Policy Change},
author = {Ziqian Zhang and Lei Yuan and Lihe Li and Ke Xue and Chengxing Jia and Cong Guan and Chao Qian and Yang Yu},
booktitle = {Uncertainty in Artificial Intelligence},
pages = {2465--2476},
year = {2023}
}
We formulate the Open Dec-POMDP and propose Fast teammate adaptation (Fastap) to enable the
controllable agents in a multi-agent system to rapidly adapt to uncontrollable teammates whose
policies may change within a single episode.
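A toy change detector in the spirit of fast teammate adaptation: monitor the likelihood of the observed teammate behavior under the current context and re-infer the context when it drops. The context bank and threshold are invented for illustration:

import numpy as np

rng = np.random.default_rng(7)
contexts = {"aggressive": 0.8, "passive": 0.2}  # P(teammate moves forward)

def loglik(move_forward: bool, p: float) -> float:
    return float(np.log(p if move_forward else 1 - p))

current = "aggressive"
recent: list[bool] = []
for t in range(30):
    obs = bool(rng.random() < (0.8 if t < 15 else 0.2))  # teammate switches at t=15
    recent = (recent + [obs])[-5:]
    fit = sum(loglik(o, contexts[current]) for o in recent)
    if len(recent) == 5 and fit < 5 * np.log(0.35):
        # behavior no longer fits: re-infer which context explains the window best
        current = max(contexts, key=lambda c: sum(loglik(o, contexts[c]) for o in recent))
print("final context:", current)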
@inproceedings{romance,
title = {Robust Multi-agent Coordination via Evolutionary Generation of Auxiliary Adversarial Attackers},
author = {Lei Yuan and Ziqian Zhang and Ke Xue and Hao Yin and Feng Chen and Cong Guan and Lihe Li and Chao Qian and Yang Yu},
booktitle = {Proceedings of the AAAI Conference on Artificial Intelligence},
pages = {11753--11762},
year = {2023}
}
We formulate the Limited Policy Adversary Dec-POMDP and propose ROMANCE to expose the trained
agents to diverse and strong auxiliary adversarial attacks during training, achieving high
robustness under various policy perturbations.
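A toy loop for the evolutionary attacker generation, with our own stand-ins: maintain a population of attack policies, score each by attack strength plus novelty, and replace the weakest. The real method attacks with a limited budget; here we just mutate a parameter vector:

import numpy as np

rng = np.random.default_rng(5)

def attack_strength(att: np.ndarray) -> float:
    """Placeholder: how much this attacker degrades the team's return."""
    return float(np.linalg.norm(att))

def novelty(att: np.ndarray, pop: list) -> float:
    """Distance to the nearest other attacker, to keep the population diverse."""
    dists = [float(np.linalg.norm(att - p)) for p in pop if p is not att]
    return min(dists) if dists else 1.0

population = [rng.normal(size=4) for _ in range(6)]
for _ in range(20):  # one evolutionary round per ego-training iteration
    parent = population[rng.integers(len(population))]
    child = parent + 0.1 * rng.normal(size=4)          # Gaussian mutation
    fit = lambda a: attack_strength(a) + novelty(a, population)
    worst = min(range(len(population)), key=lambda i: fit(population[i]))
    if fit(child) > fit(population[worst]):
        population[worst] = child                      # quality-diversity replacement
# The ego agents then train against attacks sampled from `population`.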
@article{cromac,
title = {Robust Multi-agent Communication via Multi-view Message Certification},
author = {Lei Yuan and Tao Jiang and Lihe Li and Feng Chen and Zongzhang Zhang and Yang Yu},
journal = {Science China Information Sciences},
volume = {67},
number = {4},
pages = {142102:1--142102:15},
year = {2024}
}
We propose CroMAC to enable agents to obtain guaranteed lower bounds on state-action values, so
they can identify and choose the optimal action under a worst-case deviation when the received
messages are perturbed.
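A toy version of acting on certified lower bounds, in the spirit of the abstract: for each action, take the worst Q-value over an uncertainty set of plausible (possibly perturbed) message interpretations, then act greedily on that bound. The Q-table and perturbation set are invented for illustration:

import numpy as np

rng = np.random.default_rng(6)
Q = rng.normal(size=(3, 4))  # rows: candidate message interpretations, cols: actions

def certified_action(q_table: np.ndarray) -> int:
    lower_bounds = q_table.min(axis=0)  # worst case over the message set
    return int(lower_bounds.argmax())   # best action under the worst case

a = certified_action(Q)
print(a, Q.min(axis=0))  # chosen action maximizes the guaranteed value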
@article{macpro,
title = {Multi-agent Continual Coordination via Progressive Task Contextualization},
author = {Lei Yuan and Lihe Li and Ziqian Zhang and Fuxiang Zhang and Cong Guan and Yang Yu},
journal = {IEEE Transactions on Neural Networks and Learning Systems},
volume = {36},
number = {4},
pages = {6326--6340},
year = {2025}
}
We formulate the continual coordination framework and propose MACPro to enable agents to
continually coordinate with each other as the dynamics of the training task and the multi-agent
system itself change over time.