KAIST EE

School of Electrical Engineering
We thrive to be the world’s top IT powerhouse.

Our mission is to lead innovations
in information technology, create lasting impact,
and educate next-generation leaders of the world.

AI in EE
AI and machine learning are a key thrust in EE research.

AI/machine learning efforts are already a big part of ongoing
research in all 6 divisions - Computer, Communication, Signal,
Wave, Circuit and Device - of KAIST EE


Prof. Jae-Woong Jeong’s Team Develops Phase-Change Metal Ink

Prof. Kyeongha Kwon’s Team Enables Battery-Free CO₂ Monitoring

Prof. Yongdae Kim and Insu Yun’s Team Uncovers Risks in Mandatory KSA Tools

Prof. Hyun Myung’s Team Wins Global Robotics Challenge

Prof. Sung-Ju Lee’s Team Develops ‘Amuse,’ an AI Songwriting Companion

Prof. Hyunchul Shim’s Team Wins 3rd Place at A2RL Autonomous Drone Racing Competition

Prof. Minsoo Rhu’s Team Develops a Simulation Framework Called vTrain

Prof. Jun-Bo Yoon’s Team Achieves Human-Level Tactile Sensing with Breakthrough Pressure Sensor

Prof. Seungwon Shin’s Team Validates Cyber Risks of LLMs

Prof. Seunghyup Yoo’s Team Develops Wearable Carbon Dioxide Sensor to Enable Real-time Apnea Diagnosis

Highlights

AWARD

EE Professor Jung-Woo Choi’s Research Team Wins the IEEE DCASE 2025 Challenge, the World’s Leading Acoustic AI Competition

<(Left to right) Younghoo Kwon (integrated Master’s and Ph.D. program), Dohwan Kim (Master’s program), Professor Jung-Woo Choi, Dongheon Lee (Ph.D.)>

Acoustic source separation and classification is a key next-generation AI technology for early detection of anomalies in drone operations, piping faults, or border surveillance, and for enabling spatial audio editing in AR/VR content production.

Professor Jung-Woo Choi’s research team from the School of Electrical Engineering won first place in the “Spatial Semantic Segmentation of Sound Scenes” task of the “IEEE DCASE Challenge 2025.”

This year’s challenge featured 86 teams competing across six tasks. In their first-ever participation, KAIST’s team ranked first in Task 4: Spatial Semantic Segmentation of Sound Scenes—a highly demanding task requiring the analysis of spatial information in multi-channel audio signals with overlapping sound sources. The goal was to separate individual sounds and classify them into 18 predefined categories. The team, composed of Dr. Dongheon Lee, integrated MS-PhD student Younghoo Kwon, and MS student Dohwan Kim, will present their results at the DCASE Workshop in Barcelona this October.

Earlier this year, Dr. Dongheon Lee developed a state-of-the-art sound source separation AI combining Transformer and Mamba architectures. Furthermore, for the challenge, the team, led by Younghoo Kwon, established a chain-of-inference architecture that first estimates waveforms and source types, then refines the estimate by using those waveforms and classes as clues for target signal extraction in the next stage.

< Figure 1. Example of an acoustic scene with multiple mixed sounds >

This chain-of-inference approach is inspired by the human auditory scene analysis mechanism, which isolates individual sounds by focusing on incomplete clues such as sound type, rhythm, or direction.

In the evaluation metric CA-SDRi (Class-aware Signal-to-distortion Ratio improvement)*, the team was the only participant to achieve a double-digit improvement, at 11 dB, demonstrating their technical excellence.

* CA-SDRi measures how much clearer and less distorted the target sound is compared with the original mix.
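As a rough illustration of what an SDR-improvement score measures, the sketch below computes plain SDRi for a single source on synthetic signals. It is not the challenge's CA-SDRi implementation: the class-aware handling is omitted, and the function names, the toy signals, and the hypothetical separator output are all assumptions made for illustration.

```python
import math
import random


def sdr_db(reference, estimate, eps=1e-12):
    """Signal-to-distortion ratio (dB) of an estimate against a reference."""
    signal_power = sum(r * r for r in reference)
    distortion_power = sum((r - e) ** 2 for r, e in zip(reference, estimate))
    return 10.0 * math.log10((signal_power + eps) / (distortion_power + eps))


def sdr_improvement_db(reference, mixture, estimate):
    """How many dB closer to the reference the estimate is than the raw mixture."""
    return sdr_db(reference, estimate) - sdr_db(reference, mixture)


# Toy data (assumed): a 440 Hz tone buried in noise, sampled at 16 kHz.
random.seed(0)
sample_rate = 16000
target = [math.sin(2 * math.pi * 440.0 * n / sample_rate) for n in range(sample_rate)]
noise = [0.5 * random.gauss(0.0, 1.0) for _ in range(sample_rate)]
mixture = [t + n for t, n in zip(target, noise)]

# Hypothetical separator output that leaves 10% of the noise amplitude behind.
estimate = [t + 0.1 * n for t, n in zip(target, noise)]

print(f"SDRi: {sdr_improvement_db(target, mixture, estimate):.1f} dB")
```

In this toy case, cutting the residual noise amplitude to one tenth reduces the distortion power by a factor of 100, which corresponds to an improvement of 20 dB.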

Professor Choi remarked, “I am proud that our team’s world-leading acoustic separation AI models, developed over the past three years, have now received formal recognition. Despite the greatly increased difficulty and the limited development window due to other conference schedules and final exams, each member demonstrated focused research that led to first place.”

< Figure 2. Time-frequency patterns of separated sound sources >

The “IEEE DCASE Challenge 2025” was held online, with submissions open from April 1 to June 15 and results announced on June 30. Since its inception in 2013 under the IEEE Signal Processing Society, the challenge has served as a global stage for AI models in the acoustic field.

This research was supported by the National Research Foundation of Korea’s Mid-Career Researcher Program and STEAM Research Project, funded by the Ministry of Education, and the Future Defense Research Center, funded by the Defense Acquisition Program Administration and the Agency for Defense Development.

< Figure 3. AI architecture for sound separation and classification >

< Competition results and rankings. Higher CA-SDRi indicates a better score (unit: dB) >

AWARD

Ph.D. candidate Se Jin Park from Professor Yong Man Ro’s lab develops ‘SpeechSSM,’ opening up possibilities for a 24-hour AI voice assistant

<(From left) Prof. Yong Man Ro and Ph.D. candidate Se Jin Park>

Recently, Spoken Language Models (SLMs) have been spotlighted as next-generation technology that surpasses the limitations of text-based language models by learning human speech without text to understand and generate linguistic and non-linguistic information. However, existing models showed significant limitations in generating long-duration content required for podcasts, audiobooks, and voice assistants. Now, a KAIST researcher has succeeded in overcoming these limitations by developing ‘SpeechSSM,’ which enables consistent and natural speech generation without time constraints.

Ph.D. candidate Se Jin Park from Professor Yong Man Ro’s research team in the School of Electrical Engineering has developed ‘SpeechSSM,’ a spoken language model capable of generating long-duration speech.

<Figure 1. Overview of SpeechSSM. The hybrid state-space model of SpeechSSM is trained with a language modeling objective on semantic tokens (USM-v2) that are encoded using overlapping fixed-size windows. The non-autoregressive speech decoder (SoundStorm) converts these overlapping semantic token windows into acoustic codec tokens (SoundStream), conditioned on speaker identity.>

A major advantage of Spoken Language Models (SLMs) is their ability to directly process speech without intermediate text conversion, leveraging the unique acoustic characteristics of human speakers, allowing for the rapid generation of high-quality speech even in large-scale models.

However, existing models struggled to maintain semantic and speaker consistency in long-duration speech: breaking speech into fine fragments to capture very detailed information increases the ‘speech token resolution’ and, with it, memory consumption.

To solve this problem, Se Jin Park developed ‘SpeechSSM,’ a spoken language model using a Hybrid State-Space Model, designed to efficiently process and generate long speech sequences.

This model employs a ‘hybrid structure’ that alternately places ‘attention layers’ focusing on recent information and ‘recurrent layers’ that remember the overall narrative flow (long-term context). This allows the story to flow smoothly without losing coherence even when generating speech for a long time. Furthermore, memory usage and computational load do not increase sharply with input length, enabling stable and efficient learning and the generation of long-duration speech.
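The alternating layout can be pictured with a toy sketch. This is not the SpeechSSM architecture: the scalar inputs, the exponential-decay recurrence, the window size, the layer order, and all class and function names are assumptions chosen only to show why the per-layer state stays a fixed size regardless of sequence length.

```python
import math
from collections import deque


class RecurrentLayer:
    """Keeps one running state, so memory does not grow with sequence length."""

    def __init__(self, decay=0.9):
        self.decay = decay
        self.state = 0.0

    def step(self, x):
        self.state = self.decay * self.state + (1.0 - self.decay) * x
        return self.state


class LocalAttentionLayer:
    """Softmax attention restricted to the most recent `window` inputs."""

    def __init__(self, window=4):
        self.recent = deque(maxlen=window)  # older inputs are dropped automatically

    def step(self, x):
        self.recent.append(x)
        scores = [x * key for key in self.recent]  # query is the current input
        top = max(scores)
        weights = [math.exp(s - top) for s in scores]
        total = sum(weights)
        return sum(w * v for w, v in zip(weights, self.recent)) / total


def build_hybrid_stack(num_layers, window=4, decay=0.9):
    """Alternate recurrent and local-attention layers (order is an assumption)."""
    return [
        RecurrentLayer(decay) if i % 2 == 0 else LocalAttentionLayer(window)
        for i in range(num_layers)
    ]


def run(stack, sequence):
    outputs = []
    for x in sequence:
        for layer in stack:
            x = layer.step(x)
        outputs.append(x)
    return outputs


stack = build_hybrid_stack(num_layers=4)
outputs = run(stack, [math.sin(0.1 * t) for t in range(1000)])
window_sizes = [len(l.recent) for l in stack if isinstance(l, LocalAttentionLayer)]
print(len(outputs), window_sizes)
```

After 1,000 steps each attention layer still holds only its last four inputs and each recurrent layer holds a single state value, which is the property the paragraph above attributes to the hybrid design.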

SpeechSSM effectively processes unbounded speech sequences by dividing speech data into short, fixed units (windows), processing each unit independently, and then combining them to create long speech.
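A minimal sketch of this windowing idea follows, assuming integer token IDs and a fixed overlap between neighbouring windows. The function names, window size, and overlap are hypothetical, and the simple drop-the-overlap merge stands in for whatever combination step the actual decoder uses.

```python
def split_into_windows(tokens, window_size, overlap):
    """Cut a token sequence into fixed-size windows that share `overlap` tokens."""
    if not 0 <= overlap < window_size:
        raise ValueError("overlap must be non-negative and smaller than window_size")
    step = window_size - overlap
    windows = []
    start = 0
    while True:
        windows.append(tokens[start:start + window_size])
        if start + window_size >= len(tokens):
            break
        start += step
    return windows


def merge_windows(windows, overlap):
    """Rebuild one long sequence, keeping each overlapped region only once."""
    merged = list(windows[0])
    for window in windows[1:]:
        merged.extend(window[overlap:])
    return merged


tokens = list(range(10))  # stand-in for a stream of speech tokens
windows = split_into_windows(tokens, window_size=4, overlap=1)
restored = merge_windows(windows, overlap=1)
print(windows)
print(restored == tokens)
```

Because each window has a fixed size, the work per window is bounded no matter how long the full sequence becomes.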

Additionally, in the speech generation phase, it uses a ‘Non-Autoregressive’ audio synthesis model (SoundStorm), which rapidly generates multiple parts at once instead of slowly creating one character or one word at a time, enabling the fast generation of high-quality speech.

While prior work typically evaluated short speech samples of about 10 seconds, Se Jin Park created new evaluation tasks for speech generation based on a self-built benchmark dataset, ‘LibriSpeech-Long,’ which supports generation of up to 16 minutes of speech.

Compared to PPL (Perplexity), an existing speech model evaluation metric that only indicates grammatical correctness, she proposed new evaluation metrics such as ‘SC-L (semantic coherence over time)’ to assess content coherence over time, and ‘N-MOS-T (naturalness mean opinion score over time)’ to evaluate naturalness over time, enabling more effective and precise evaluation.

< Figure 2. Maximum sequence length considered in various Spoken Language Models (SLMs).
Whereas conventional SLMs have been trained and evaluated on sequences up to 200 seconds in length, SpeechSSM is capable of training and evaluating speech up to 16 minutes. While the proposed model can theoretically generate speech of infinite length with constant memory usage, the experiments were limited to 16 minutes for evaluation purposes.>

Through these new evaluations, it was confirmed that speech generated by SpeechSSM consistently featured the specific individuals mentioned in the initial prompt, and that new characters and events unfolded naturally and in keeping with the context, even over long-duration generation. This contrasts sharply with existing models, which tended to lose their topic and fall into repetition during long-duration generation.

< Figure 3. Semantic similarity between a 10-second prompt and each 100-word segment of 16-minute generated speech, measured using embedding similarity (SC-L). Unlike prior models whose semantic consistency degrades as the length of generated speech increases, SpeechSSM maintains semantic coherence over long durations, exhibiting trends similar to real human speech.>
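The per-segment comparison described in this caption can be sketched as follows. This is a toy illustration, not the paper's SC-L implementation: `toy_embed` is a hypothetical bag-of-words stand-in for a real sentence-embedding model, and the function names, segment length, and example text are assumptions.

```python
import math
from collections import Counter


def toy_embed(text):
    """Hypothetical stand-in for a sentence-embedding model: plain word counts."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    dot = sum(a[key] * b[key] for key in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def coherence_over_time(prompt, transcript, segment_words=100):
    """Similarity between the prompt and each consecutive transcript segment."""
    words = transcript.split()
    prompt_vec = toy_embed(prompt)
    scores = []
    for start in range(0, len(words), segment_words):
        segment = " ".join(words[start:start + segment_words])
        scores.append(cosine_similarity(prompt_vec, toy_embed(segment)))
    return scores


prompt = "captain ship"
transcript = "captain ship captain ship banana apple banana apple"
scores = coherence_over_time(prompt, transcript, segment_words=4)
print([round(s, 2) for s in scores])
```

A curve that stays high across segments indicates the generated speech is still on the prompt's topic, while a falling curve, as in the second toy segment here, indicates drift.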

Ph.D. candidate Se Jin Park explained, “Existing spoken language models had limitations in long-duration generation, so our goal was to develop a spoken language model capable of generating long-duration speech for actual human use.” She added, “This research achievement is expected to greatly contribute to various types of voice content creation and voice AI fields like voice assistants, by maintaining consistent content in long contexts and responding more efficiently and quickly in real time than existing methods.”

This research, with Se Jin Park as the first author, was conducted in collaboration with Google DeepMind and is scheduled to be presented as an oral presentation at ICML (International Conference on Machine Learning) 2025 on July 16th.

Paper Title: Long-Form Speech Generation with Spoken Language Models
DOI: 10.48550/arXiv.2412.18603

Ph.D. candidate Se Jin Park has demonstrated outstanding research capabilities as a member of Professor Yong Man Ro’s MLLM (multimodal large language model) research team, through her work integrating vision, speech, and language. Her achievements include a spotlight paper presentation at CVPR (Computer Vision and Pattern Recognition) 2024 and an Outstanding Paper Award at ACL (Association for Computational Linguistics) 2024.

<Figure 4. Computational Efficiency of SpeechSSM. (Left) Maximum batch decoding throughput by model and generation length on TPU v5e. (Right) Time taken to decode a single sample (batch size 1) up to the target length on TPU v5e.>

For more information, you can refer to the publication and accompanying demo: SpeechSSM Publications.

AWARD

Three Graduate Students from the School of Electrical Engineering Selected as Recipients of the 2nd Presidential Science Scholarship for Graduate Students

<Left to right: Ph.D. students Kyeongha Rho, Seokjun Park, and Juntaek Lim>

Three Ph.D. students from EE—Kyeongha Rho (advisor: Joon Son Chung), Seokjun Park (advisor: Jinseok Choi), and Juntaek Lim (advisor: Minsoo Rhu)—have been selected as recipients of the 2nd Presidential Science Scholarship for Graduate Students.

Kyeongha Rho is conducting research on multimodal self-supervised learning, as well as multimodal perception and generation models. Seokjun Park is currently researching optimization techniques for low-power beamforming in satellite and multi-access systems for next-generation 6G wireless communications, as well as predictive beamforming utilizing artificial intelligence in integrated sensing and communication (ISAC) systems. Juntaek Lim focuses on developing high-performance, secure computing systems by integrating security across both hardware and software.

The Presidential Science Scholarship for Graduate Students is a new initiative launched last year by the Korea Student Aid Foundation to foster world-class research talent in science and engineering. Final awardees receive a certificate of scholarship in the name of the President, along with financial support—KRW 1.5 million per month (KRW 18 million annually) for master’s students and KRW 2 million per month (KRW 24 million annually) for Ph.D. students.

This year’s selection process for the scholarship was highly competitive, with 2,355 applicants vying for 120 spots, resulting in a competition ratio of approximately 20:1.

NEWS

Professor Young Min Song Joins Our Faculty

<Professor Young Min Song>

We are pleased to announce that Professor Young Min Song will be joining the KAIST School of Electrical Engineering as of July 1, 2025. Congratulations!

Professor Song’s temporary office is located in Room 1410, Saenel-dong (E3-4). His primary research areas include flexible optoelectronic devices and nanophotonics. He is actively working on biomimetic cameras for intelligent robotics, opto-neuromorphic devices and systems, nanophotonics-based reflective displays, and radiative cooling devices through infrared control. For more details on his research, please visit his homepage.

Homepage: https://www.ymsong.net

Click here to read Professor Song’s recent interview in Nature

AWARD

EE Prof. Dong Eui Chang’s Team Wins Third Prize at Hugging Face LeRobot Worldwide Hackathon for Developing VLA-Based Collaborative Robot Object Transfer System

<(From left) Master’s students Kyeongdon Lee, Hojun Kwon, and Seokjoon Kwon; Professor Dong Eui Chang; Ph.D. student Hee-Deok Jang; Master’s student Guining Pertin>

‘Team ACE’ from Professor Dong Eui Chang’s lab in our department achieved outstanding results by winning a Third Prize at the ‘Hugging Face LeRobot Worldwide Hackathon’, held over three days from June 14 to 16.

Composed of Seokjoon Kwon (Master’s Program, Team Leader), Hee-Deok Jang (Ph.D. Program), Hojun Kwon (Master’s Program), Guining Pertin (Master’s Program), and Kyeongdon Lee (Master’s Program) from Professor Dong Eui Chang’s lab, ‘Team ACE’ developed a VLA-based collaborative robot object transfer system and placed 20th out of more than 600 teams worldwide, earning a Third Prize (awarded to teams ranked 6th-24th). In addition, the team also received the KIRIA President’s Award (awarded by the Korea Institute for Robot Industry Advancement) from the local organizing committee in Daegu, South Korea.

‘Hugging Face’ is a U.S.-based AI startup known as one of the world’s largest platforms for artificial intelligence, offering widely used machine learning libraries such as Transformers and Datasets. More recently, the company has also been actively providing AI resources for robotics applications.

Hugging Face regularly hosts global hackathons that bring together researchers and students from around the world to compete and collaborate on innovative AI-driven solutions.

This year’s ‘LeRobot Worldwide Hackathon’ gathered over 2,500 AI and robotics experts from 45 countries. Participants were challenged to freely propose and implement solutions to real-world problems in industry and everyday life by applying technologies such as VLA (Vision Language Action) models and reinforcement learning to robotic arms.

Through their achievement in the competition, ‘Team ACE’ was recognized for their technical excellence and creativity by both the global robotics community and experts in South Korea.

The team’s performance at the competition drew considerable attention from local media and was actively reported in regional news outlets.

AWARD

Professor Myoungsoo Jung’s Team Presents Its Work on Next-Generation Interconnect/Semiconductor Technologies in IEEE Micro

<Professor Myoungsoo Jung’s Research Team>

Professor Myoungsoo Jung’s research team will present its work on next-generation interconnect and semiconductor technologies in IEEE Micro, a leading journal in computer architecture.

IEEE Micro, established in 1981, is a bimonthly publication featuring recent advances in computer architecture and semiconductor technologies. Professor Jung’s team will present a total of five papers in the upcoming May-June issue on “Cache Coherent Interconnects and Resource Disaggregation Technology”.

Among them, three papers focus on applying Compute Express Link (CXL) to storage systems. In particular, the team introduces solutions to improve the performance of CXL-SSD, a concept for which Professor Jung suggested a practical implementation in early 2022. These technologies enable memory expansion through large-capacity SSDs while providing performance comparable to DRAM.

<List of papers to be published in the May-June issue of IEEE Micro>

The team also explored storage architectures incorporating In-Storage Processing (ISP), which performs computation directly inside the storage pool. By processing data within storage, this approach reduces data movement and thereby improves efficiency in large-scale applications such as large language models (LLMs).

These papers, conducted in collaboration with the faculty-led startup Panmnesia, will be published through IEEE Micro’s official website and in its regular print issue.

※ Click on the title to view the early access link.

Early Access Link #1: From Blocks to Byte: Transforming PCIe SSDs with CXL Memory Protocol and Instruction Annotation
Early Access Link #2: CXL Topology-Aware and Expander-Driven Prefetching: Unlocking SSD Performance
Early Access Link #3: CXL-GPU: Pushing GPU Memory Boundaries with the Integration of CXL Technologies
Early Access Link #4: Containerized In-Storage Processing and Computing-Enabled SSD Disaggregation
Early Access Link #5: Efficient Disaggregated Cloud Storage for Cold Videos with Neural Enhancement

AWARD

Song-I Cheon, Ph.D. Candidate in Prof. Minkyu Je’s Lab, Receives IEEE CASS Pre-Doctoral Grant

Song-I Cheon, a Ph.D. student in the Department of Electrical and Electronic Engineering under the supervision of Professor Minkyu Je, has been selected as a recipient of the 2025 IEEE Circuits and Systems Society (CASS) Pre-Doctoral Grant.

This prestigious grant is awarded annually to a select number of doctoral students worldwide who have demonstrated outstanding research potential in the field of system semiconductor circuit design. In 2025, only four students globally were selected, with Song-I being one of the recipients.

Song-I has published a total of 19 papers in international journals and conferences, including one paper at ISSCC (as co-first author) and four in IEEE journals (as first or co-first author). Notably, nine of these papers were presented at journals and conferences sponsored by IEEE CASS. Her research contributions in impedance measurement circuit design—particularly in the areas of optimization, high accuracy, and low power consumption—were highly recognized in the selection process.

AWARD

Professor Youngchul Sung’s Lab’s PhD Student Jeonghye Kim Contributes to Industrial Site Optimization with AI

Alongside large language models, autonomous driving and humanoid robots, AI-driven optimization of industrial manufacturing has emerged as a major application of AI. In 2024, Jeonghye Kim, a Ph.D. student advised by Professor Youngchul Sung, interned on LG AI Research’s reinforcement-learning team, where she tackled a range of process-optimization challenges across LG Group’s production facilities.

The team applied optimization based primarily on reinforcement learning to the naphtha-cracking facility (NCC) at LG Chem’s Daesan plant, improving production efficiency by 3%, far beyond the initial expectation of 0.1%, and yielding an extra KRW 10 billion in annual profit for that plant alone. Because training reinforcement learning agents through online interaction in a production environment is impractical, such optimization typically relies on offline reinforcement learning, which optimizes policies with pre-collected data.

Jeonghye contributed to the development of PARS, a novel offline reinforcement learning algorithm that significantly outperforms existing methods. By enhancing the neural network’s feature resolution through reward scaling combined with layer normalization, this new approach better differentiates between in-sample and out-of-distribution data, eliminating Q-value divergence, the core issue of offline reinforcement learning. This advancement promises to accelerate future production-process optimizations as well as many RL applications where online interaction with the environment is difficult.

This research result will be presented as a Spotlight paper at the International Conference on Machine Learning (ICML) 2025.

Related Yonhap News article: https://www.yna.co.kr/view/AKR20250613153400003

SEMINAR

Pathological Findings of Breast and Thyroid Cancer for Innovation in Ultrasound Diagnostics

Date:

2025. 05. 28.(WED), 3pm

Date:

Speaker:

Professor Yong-Jin Kim

Place:

School of Electrical Engineering(E3-2), 2219

Date:

Speaker:

Prof. Nam Sung Kim

Place:

Wooribyul Seminar Room 2201, E3-2, KAIST

Date:

Speaker:

Istvan Szerdahelyi (the Ambassador of Hungary to the Republic of Korea)

Place:

KI Building(E4), Matrix Hall(2nd Floor)

Full-Time Faculty Recruitment