Lihe -> lee-huh.
Li -> lee.
You can just call me Lee.
Hi there, thanks for visiting my website! I am an M.Sc. student (Sep. 2023 - now) at the School of
Artificial Intelligence at Nanjing
University, where I am fortunate to be advised by Prof. Yang Yu and affiliated with the LAMDA Group led by Prof.
Zhi-Hua Zhou. Specifically, I am a member of the LAMDA-RL Group, which focuses on reinforcement learning research.
Prior to that, I obtained my bachelor's degree from the same school in June 2023.
Unity makes strength. Currently, my research interest is Reinforcement Learning (RL), especially Multi-agent
Reinforcement Learning (MARL) that enables agents to coordinate efficiently, robustly, and safely with other agents 🤖 and even humans 👨‍👩‍👧‍👦.
Please feel free to drop me an email for any form of communication or collaboration!
Email:  lilh [at] lamda [dot] nju [dot] edu [dot] cn
Publications
Multi-Agent Domain Calibration with a Handful of Offline Data
Tao Jiang, Lei Yuan, Lihe Li, Cong Guan, Zongzhang Zhang, Yang Yu
Advances in Neural Information Processing Systems (NeurIPS), 2024
pdf / bibtex
@inproceedings{madoc,
title = {Multi-Agent Domain Calibration with a Handful of Offline Data},
author = {Tao Jiang and Lei Yuan and Lihe Li and Cong Guan and Zongzhang Zhang and Yang Yu},
booktitle = {Advances in Neural Information Processing Systems 38},
year = {2024}
}
We formulate domain calibration as a cooperative MARL problem to improve efficiency and fidelity.
Dynamics Adaptive Safe Reinforcement Learning with a Misspecified Simulator
Ruiqi Xue, Ziqian Zhang, Lihe Li, Feng Chen, Yi-Chen Li, Yang Yu, Lei Yuan
Joint European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML PKDD), 2024
pdf / bibtex
@inproceedings{dasar,
title = {Dynamics Adaptive Safe Reinforcement Learning with a Misspecified Simulator},
author = {Ruiqi Xue and Ziqian Zhang and Lihe Li and Feng Chen and Yi-Chen Li and Yang Yu and Lei Yuan},
booktitle = {Joint European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases},
year = {2024}
}
We propose DASaR, which expands the trust region in sim-to-real RL by aligning simulator and real-world value functions through inverse dynamics-based relabeling of rewards and costs.
Continual Multi-Objective Reinforcement Learning via Reward Model Rehearsal
Lihe Li, Ruotong Chen, Ziqian Zhang, Zhichao Wu, Yi-Chen Li, Cong Guan, Yang Yu, Lei Yuan
Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence (IJCAI), 2024
@inproceedings{core3,
title = {Continual Multi-Objective Reinforcement Learning via Reward Model Rehearsal},
author = {Lihe Li and Ruotong Chen and Ziqian Zhang and Zhichao Wu and Yi-Chen Li and Cong Guan and Yang Yu and Lei Yuan},
booktitle = {Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence},
pages = {4434--4442},
year = {2024}
}
We study the problem of multi-objective reinforcement learning (MORL) with continually evolving
learning objectives, and propose CORe3 to enable the MORL agent to rapidly learn new objectives while
avoiding catastrophic forgetting of old objectives whose reward signals are no longer available.
Efficient Human-AI Coordination via Preparatory Language-based Convention
Cong Guan, Lichao Zhang, Chunpeng Fan, Yi-Chen Li, Feng Chen, Lihe Li, Yunjia Tian, Lei Yuan, Yang Yu
ICLR 2024 Workshop on Large Language Model (LLM) Agents, 2024
@inproceedings{haplan,
title = {Efficient Human-{AI} Coordination via Preparatory Language-based Convention},
author = {Cong Guan and Lichao Zhang and Chunpeng Fan and Yi-Chen Li and Feng Chen and Lihe Li and Yunjia Tian and Lei Yuan and Yang Yu},
booktitle = {ICLR 2024 Workshop on Large Language Model (LLM) Agents},
year = {2024}
}
We propose employing large language models (LLMs) to develop an action plan (or, equivalently, a
convention) that effectively guides both humans and AI agents toward coordination.
Cost-aware Offline Safe Meta Reinforcement Learning with Robust In-Distribution Online Task Adaptation
Cong Guan, Ruiqi Xue, Ziqian Zhang, Lihe Li, Yi-Chen Li, Lei Yuan, Yang Yu
Proceedings of the International Conference on Autonomous Agents and Multiagent Systems (AAMAS), 2024
@inproceedings{costa,
title = {Cost-aware Offline Safe Meta Reinforcement Learning with Robust In-Distribution Online Task Adaptation},
author = {Cong Guan and Ruiqi Xue and Ziqian Zhang and Lihe Li and Yi-Chen Li and Lei Yuan and Yang Yu},
booktitle = {Proceedings of the International Conference on Autonomous Agents and Multiagent Systems},
year = {2024}
}
We propose COSTA to deal with offline safe meta RL problems. We develop a cost-aware task
inference module using contrastive learning to distinguish tasks based on safety constraints, and
propose a safe in-distribution online adaptation mechanism.
A Survey of Progress on Cooperative Multi-agent Reinforcement Learning in Open Environment
Lei Yuan, Ziqian Zhang, Lihe Li, Cong Guan, Yang Yu
arXiv preprint arXiv:2312.01058, 2023
@article{survey,
title = {A Survey of Progress on Cooperative Multi-agent Reinforcement Learning in Open Environment},
author = {Lei Yuan and Ziqian Zhang and Lihe Li and Cong Guan and Yang Yu},
journal = {arXiv preprint arXiv:2312.01058},
year = {2023}
}
We review multi-agent cooperation from closed-environment to open-environment settings, and provide
prospects for future development and research directions of cooperative MARL in open environments.
Learning to Coordinate with Anyone
Lei Yuan, Lihe Li, Ziqian Zhang, Feng Chen, Tianyi Zhang, Cong Guan, Yang Yu, Zhi-Hua Zhou
Proceedings of the Fifth International Conference on Distributed Artificial Intelligence (DAI), 2023
@inproceedings{macop,
title = {Learning to Coordinate with Anyone},
author = {Lei Yuan and Lihe Li and Ziqian Zhang and Feng Chen and Tianyi Zhang and Cong Guan and Yang Yu and Zhi-Hua Zhou},
booktitle = {Proceedings of the Fifth International Conference on Distributed Artificial Intelligence},
year = {2023}
}
We propose Multi-agent Compatible Policy Learning (MACOP), which adopts an agent-centered
teammate generation process to gradually and efficiently generate diverse teammates covering the
teammate policy space, and uses continual learning to train the ego agents to coordinate with them
and acquire strong coordination ability.
Fast Teammate Adaptation in the Presence of Sudden Policy Change
Ziqian Zhang, Lei Yuan, Lihe Li, Ke Xue, Chengxing Jia, Cong Guan, Chao Qian, Yang Yu
Uncertainty in Artificial Intelligence (UAI), 2023
@inproceedings{fastap,
title = {Fast Teammate Adaptation in the Presence of Sudden Policy Change},
author = {Ziqian Zhang and Lei Yuan and Lihe Li and Ke Xue and Chengxing Jia and Cong Guan and Chao Qian and Yang Yu},
booktitle = {Uncertainty in Artificial Intelligence},
pages = {2465--2476},
year = {2023}
}
We formulate the Open Dec-POMDP and propose Fast teammate adaptation (Fastap) to enable the controllable
agents in a multi-agent system to quickly adapt to uncontrollable teammates whose policies may
change within a single episode.
Robust Multi-agent Coordination via Evolutionary Generation of Auxiliary Adversarial Attackers
Lei Yuan, Ziqian Zhang, Ke Xue, Hao Yin, Feng Chen, Cong Guan, Lihe Li, Chao Qian, Yang Yu
Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), 2023
@inproceedings{romance,
title = {Robust Multi-agent Coordination via Evolutionary Generation of Auxiliary Adversarial Attackers},
author = {Lei Yuan and Ziqian Zhang and Ke Xue and Hao Yin and Feng Chen and Cong Guan and Lihe Li and Chao Qian and Yang Yu},
booktitle = {Proceedings of the AAAI Conference on Artificial Intelligence},
pages = {11753--11762},
year = {2023}
}
We formulate the Limited Policy Adversary Dec-POMDP and propose ROMANCE, which exposes the trained agents to
diverse and strong auxiliary adversarial attacks during training, achieving high
robustness under various policy perturbations.
Robust Multi-agent Communication via Multi-view Message Certification
Lei Yuan, Tao Jiang, Lihe Li, Feng Chen, Zongzhang Zhang, Yang Yu
SCIENCE CHINA Information Sciences, 2023
@article{cromac,
title = {Robust Multi-agent Communication via Multi-view Message Certification},
author = {Lei Yuan and Tao Jiang and Lihe Li and Feng Chen and Zongzhang Zhang and Yang Yu},
journal = {SCIENCE CHINA Information Sciences},
year = {2023}
}
We propose CroMAC to enable agents to obtain guaranteed lower bounds on state-action values to
identify and choose the optimal action under a worst-case deviation when the received messages are
perturbed.
Multi-agent Continual Coordination via Progressive Task Contextualization
Lei Yuan, Lihe Li, Ziqian Zhang, Fuxiang Zhang, Cong Guan, Yang Yu
IEEE Transactions on Neural Networks and Learning Systems (TNNLS), 2024
@article{macpro,
title = {Multi-agent Continual Coordination via Progressive Task Contextualization},
author = {Lei Yuan and Lihe Li and Ziqian Zhang and Fuxiang Zhang and Cong Guan and Yang Yu},
journal = {IEEE Transactions on Neural Networks and Learning Systems},
year = {2024}
}
We formulate the continual coordination framework and propose MACPro to enable agents to
continually coordinate with each other when the dynamics of the training task and the multi-agent
system itself change over time.
Education
Nanjing University 2023.09 - present
M.Sc. in Computer Science and Technology. Advisor: Prof. Yang Yu
Nanjing University 2019.08 - 2023.07
B.E. in Artificial Intelligence. Advisor: Prof. Yang Yu
Awards & Honors
National Scholarship, 2024.
Best Paper Award of The Fifth Distributed Artificial Intelligence Conference (DAI), 2023.
Outstanding Bachelor's Thesis of Nanjing University, 2023.
I have been fortunate to work with brilliant people during my
research journey, and I am truly grateful for their guidance and help!
My Chinese name is 李立和 (Li Lihe), which can be pronounced as /liː ˈliː hɜː/ in Mandarin or /lei
ˈlʌb wɔː/ in Cantonese. 李 is one of the most common surnames in China, 立 means "stand" or "establish", and 和 means "harmony" and "peace".