
Research Highlights

EE Prof. Chang D. Yoo’s research team develops next-generation reinforcement learning frameworks for Physical AI: ‘ERL-VLM’ and ‘PLARE’

< (From left) PhD candidate Luu Minh Tung, MS student Younghwan Lee, MS student Donghoon Lee, and Professor Chang D. Yoo >

With recent advancements in artificial intelligence’s ability to understand both language and visual information, there is growing interest in Physical AI: AI systems that can comprehend high-level human instructions and perform physical tasks, such as object manipulation or navigation, in the real world. Physical AI integrates large language models (LLMs), vision-language models (VLMs), reinforcement learning (RL), and robot control technologies, and is expected to become a cornerstone of next-generation intelligent robotics.

 

To advance research in Physical AI, an EE research team led by Professor Chang D. Yoo (U-AIM: Artificial Intelligence & Machine Learning Lab) has developed two novel reinforcement learning frameworks leveraging large vision-language models. The first, presented at ICML 2025, is titled ERL-VLM (Enhancing Rating-based Learning to Effectively Leverage Feedback from Vision-Language Models). In this framework, a VLM provides absolute rating-based feedback on robot behavior, which is used to train a reward function. The learned reward is then used to train a robot control AI model through reinforcement learning. This method removes the need to manually craft complex reward functions and enables the efficient collection of large-scale feedback, significantly reducing the time and cost required for training.
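
To make the training signal concrete, here is a minimal PyTorch sketch of rating-based reward learning in the spirit of ERL-VLM. Everything in it is an illustrative assumption rather than the paper’s implementation: the three-level rating scale, the RewardNet architecture, the fixed bin centers used to turn segment rewards into rating logits, and the random toy batch standing in for actual VLM ratings of rendered trajectory segments.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

NUM_RATINGS = 3          # assumed discrete VLM rating scale: bad / neutral / good
OBS_DIM, ACT_DIM = 8, 2  # toy dimensions for the sketch

class RewardNet(nn.Module):
    """Maps a state-action pair to a scalar reward estimate."""
    def __init__(self, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(OBS_DIM + ACT_DIM, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, obs, act):
        return self.net(torch.cat([obs, act], dim=-1)).squeeze(-1)

def rating_logits(reward_net, obs_seq, act_seq, bin_centers):
    """Convert a segment's mean predicted reward into logits over rating
    classes: the closer the segment reward is to a bin center, the higher
    that class's logit. (A simple stand-in for the paper's rating model.)"""
    seg_reward = reward_net(obs_seq, act_seq).mean(dim=-1, keepdim=True)  # (B, 1)
    return -(seg_reward - bin_centers).abs()                              # (B, NUM_RATINGS)

reward_net = RewardNet()
optimizer = torch.optim.Adam(reward_net.parameters(), lr=3e-4)
bin_centers = torch.linspace(-1.0, 1.0, NUM_RATINGS)  # assumed fixed bins

# Toy batch: 16 trajectory segments of 10 steps each. In ERL-VLM these
# ratings would come from querying a VLM with rendered segments and the
# task instruction; random labels are used here only to keep this runnable.
B, T = 16, 10
obs = torch.randn(B, T, OBS_DIM)
act = torch.randn(B, T, ACT_DIM)
vlm_ratings = torch.randint(0, NUM_RATINGS, (B,))

loss = F.cross_entropy(rating_logits(reward_net, obs, act, bin_centers), vlm_ratings)
optimizer.zero_grad()
loss.backward()
optimizer.step()

# The trained reward_net would then supply rewards to a standard RL
# algorithm that learns the robot control policy.
```

One practical appeal of absolute ratings, as the article notes, is scale: each query asks the VLM about a single segment rather than a pair, so large amounts of feedback can be gathered cheaply.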

 

<Figure 1. ERL-VLM framework>

 

The second, presented at IROS 2025, is titled PLARE (Preference-based Learning from Vision-Language Model without Reward Estimation). Unlike previous approaches, PLARE skips reward modeling entirely and instead uses pairwise preference feedback from a VLM to directly train the robot control AI model. This makes training simpler and more computationally efficient, without compromising performance.
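
The sketch below illustrates one way such reward-free preference learning can look, assuming a contrastive objective that raises the policy’s likelihood of the segment the VLM prefers in each pair. The Gaussian policy, the scale factor alpha, and the toy preference pairs are hypothetical stand-ins, not PLARE’s actual objective or data pipeline.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

OBS_DIM, ACT_DIM = 8, 2  # toy dimensions for the sketch

class GaussianPolicy(nn.Module):
    """Simple Gaussian policy; its log-likelihood is all the loss needs."""
    def __init__(self, hidden=64):
        super().__init__()
        self.mean = nn.Sequential(
            nn.Linear(OBS_DIM, hidden), nn.ReLU(),
            nn.Linear(hidden, ACT_DIM),
        )
        self.log_std = nn.Parameter(torch.zeros(ACT_DIM))

    def log_prob(self, obs, act):
        dist = torch.distributions.Normal(self.mean(obs), self.log_std.exp())
        return dist.log_prob(act).sum(dim=-1)  # per-step log-likelihood: (B, T)

def preference_loss(policy, seg_pos, seg_neg, alpha=0.1):
    """Reward-free preference objective: make the VLM-preferred segment
    more likely under the policy than the dispreferred one."""
    lp_pos = policy.log_prob(*seg_pos).sum(dim=-1)  # segment log-likelihood: (B,)
    lp_neg = policy.log_prob(*seg_neg).sum(dim=-1)
    return -F.logsigmoid(alpha * (lp_pos - lp_neg)).mean()

policy = GaussianPolicy()
optimizer = torch.optim.Adam(policy.parameters(), lr=3e-4)

# Toy batch: 16 preference pairs of 10-step segments. In PLARE the labels
# would come from asking a VLM which of two rendered segments better
# follows the task instruction; random tensors keep this sketch runnable.
B, T = 16, 10
seg_pos = (torch.randn(B, T, OBS_DIM), torch.randn(B, T, ACT_DIM))  # preferred
seg_neg = (torch.randn(B, T, OBS_DIM), torch.randn(B, T, ACT_DIM))  # dispreferred

loss = preference_loss(policy, seg_pos, seg_neg)
optimizer.zero_grad()
loss.backward()
optimizer.step()
```

Note how the policy itself is the only trained network here: dropping the intermediate reward model is what makes the pipeline simpler and cheaper, as described above.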

 

<Figure 2. PLARE framework>

 

Both frameworks demonstrated superior performance not only in simulation environments but also in real-world experiments using physical robots, achieving higher success rates and more stable behavior than existing methods—thereby verifying their practical applicability.

 

<Figure 3. (From left) PLARE experimental results (success rate) and an example of the real-world robot experiment setup>

 

This research provides a more efficient and practical approach to enabling robots to understand and act upon human language instructions by leveraging large vision-language models, bringing us a step closer to the realization of Physical AI. Moving forward, Professor Chang D. Yoo’s team plans to continue advancing research in robot control, vision-language-based interaction, and scalable feedback learning to further develop key technologies in Physical AI.