
<(From left) Ph.D. candidate Taewoong Kang, Ph.D. candidate Junha Hyung, Professor Jaegul Choo, and Ph.D. candidate Minho Park; (top-right inset, from left) Ph.D. candidate Kinam Kim and Seoul National University undergraduate researcher Dohyeon Kim>
What if, while watching The Dark Knight, you weren’t just observing the Joker on screen, but actually seeing Gotham City through his eyes? The video technology that allows viewers to experience the world through a character’s perspective, rather than as a mere observer, is becoming a reality. Researchers at our university have developed a new AI model that generates first-person viewpoint videos from standard footage.
KAIST announced on February 23rd that Professor Jaegul Choo’s research team at the Kim Jaechul Graduate School of AI has developed ‘EgoX,’ an AI model that utilizes observer-perspective (exocentric) video to precisely generate the scenes that a person in the video would actually be seeing.
With the rapid advancement of Augmented Reality (AR), Virtual Reality (VR), and AI robotics, the importance of “egocentric video”—which captures scenes as one directly sees them—is growing. However, obtaining high-quality first-person footage previously required users to wear expensive action cameras or smart glasses. Furthermore, there were significant technical limitations in naturally converting existing standard (third-person or exocentric) video into a first-person perspective.
A key feature of this technology is that it goes beyond simply rotating the screen; it comprehensively understands the person’s position, posture, and the 3D structure of the surrounding space to reconstruct the first-person viewpoint.

< Example of converting a third-person perspective video into a first-person perspective video >
Existing technologies could often only convert still images, or required footage from four or more cameras; they also frequently produced awkward visual artifacts in videos with complex lighting or rapid movement.
In contrast, EgoX can generate high-quality first-person video from just a single third-person video source. Specifically, the research team succeeded in realistically implementing natural shifts in vision—such as when a person turns their head—by precisely modeling the correlation between head movement and the actual field of view.
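To make the head-movement idea concrete, the sketch below shows how an estimated head pose (position plus yaw/pitch orientation) can be turned into a camera extrinsic that defines the first-person viewpoint. This is an illustrative simplification, not the team's actual model: the function name, the yaw-pitch parameterization, and the coordinate conventions are assumptions for the example.

```python
import numpy as np

def egocentric_camera(head_pos, yaw, pitch):
    """Build a world-to-camera extrinsic from an estimated head pose.

    head_pos: (3,) head position in world coordinates
    yaw, pitch: head orientation in radians (yaw about the vertical
                axis, then pitch about the lateral axis)
    Returns R (3x3) and t (3,) such that x_cam = R @ x_world + t.
    """
    cy, sy = np.cos(yaw), np.sin(yaw)
    cp, sp = np.cos(pitch), np.sin(pitch)
    # Orientation of the head (camera) frame expressed in world coordinates.
    R_wc = np.array([[ cy, 0.0,  sy],
                     [0.0, 1.0, 0.0],
                     [-sy, 0.0,  cy]]) @ np.array([[1.0, 0.0, 0.0],
                                                   [0.0,  cp, -sp],
                                                   [0.0,  sp,  cp]])
    R = R_wc.T                      # invert: world -> camera rotation
    t = -R @ np.asarray(head_pos)   # camera center maps to the origin
    return R, t

# A point one meter directly in front of the head lands on the optical axis.
R, t = egocentric_camera([0.0, 1.6, 0.0], yaw=0.0, pitch=0.0)
x_cam = R @ np.array([0.0, 1.6, 1.0]) + t
print(x_cam)  # → [0. 0. 1.]
```

In this toy setup, turning the head (changing yaw or pitch) rotates the camera frame, which shifts which parts of the 3D scene fall inside the rendered field of view, the same correlation the researchers model to produce natural viewpoint shifts.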
This technology demonstrated stable performance across various daily scenarios, including cooking, exercising, and working, without being limited to specific environments. It is being evaluated as a breakthrough that opens new possibilities for securing high-quality first-person data from existing video archives without the need for wearable devices.
EgoX is expected to have a significant impact across various industries. In the fields of AR, VR, and the Metaverse, it can maximize user experience by transforming standard videos into immersive content that makes users feel as if they are experiencing the scene firsthand.
Furthermore, it is projected to contribute to the fields of robotics and AI training by serving as core data for “Imitation Learning,” where robots learn by watching human actions. New types of video services, such as switching sports broadcasts or vlogs to the perspective of the athlete or the protagonist, are also anticipated.

< EgoX technology that converts a third-person perspective into a first-person perspective (AI-generated image) >
Distinguished Professor Jaegul Choo stated, “This research is significant in that AI has moved beyond simple video conversion to learning and reconstructing human ‘vision’ and ‘spatial understanding.’ We expect an environment to open up where anyone can create and experience immersive content using only previously recorded videos.” He added, “KAIST will continue to secure global competitiveness in the field of generative AI-based video technology.”
This research was led by first authors Taewoong Kang, Kinam Kim, and Dohyeon Kim. The paper was pre-released on arXiv on December 9, 2025, garnering significant attention from AI industry giants such as NVIDIA and Meta, as well as from academia. It is scheduled for official presentation at the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), an international academic conference to be held in Colorado, USA, on June 3, 2026.
Paper Title: EgoX: Egocentric Video Generation from a Single Exocentric Video
Paper Link: https://keh0t0.github.io/EgoX/
Meanwhile, this research was supported by the Ministry of Science and ICT through the National Research Foundation of Korea’s individual basic research project, “Research on User-Centered Content Generation and Editing Technology through Generative AI,” and the Supercomputer No. 5 High-Performance Computing-based R&D Innovation Support project, “Research on Video Filming Viewpoint Conversion Based on Diffusion Models.”
