<div class="csl-bib-body">
<div class="csl-entry">Stanovcic, S. (2026). <i>Real-World Human-Robot Interaction Behavior Generation using Latent Diffusion Models</i> [Diploma Thesis, Technische Universität Wien]. reposiTUm. https://doi.org/10.34726/hss.2026.123206</div>
</div>
-
dc.identifier.uri
https://doi.org/10.34726/hss.2026.123206
-
dc.identifier.uri
http://hdl.handle.net/20.500.12708/228390
-
dc.description
Thesis not yet received by the library - data not verified
-
dc.description.abstract
Natural social interactions involve two agents exhibiting smooth and diverse behaviors that align with each other's intent in real time. Creating this level of expressiveness in human–robot interaction (HRI) requires a robot to go beyond simple reactive behaviors and instead anticipate the rich distribution of possible human actions, enabling responses that are diverse, human-like, and socially aligned. This thesis bridges the gap between complex generative modeling and actual robotic deployment by integrating visual perception, context-aware motion generation, and physical-hardware execution into a single coherent system. At the core of the system lies a latent diffusion framework designed for the joint generation of two-person social interactions. Given past context and a high-level interaction description, our model generates potential future motions for both agents in an interdependent manner. By operating within a temporally coherent latent space, the framework ensures smooth, aligned motion segments while significantly reducing computational overhead to support live interaction. To achieve real-time generation, the model is integrated into a continuous streaming pipeline that combines chunked diffusion inference with real-time SMPL-X pose estimation from a single RGBD camera, eliminating the need for restrictive motion capture systems and enabling continuous prediction from live human input. The framework is demonstrated both in simulation and through real-world experiments with Tiago++ and Unitree G1 robots, with generated reactor motion retargeted online to each platform's embodiment. Ultimately, this thesis provides a robust solution for diverse and responsive motion generation, advancing the development of socially aware robots capable of engaging with humans naturally and adaptively under realistic conditions.
en
dc.language
English
-
dc.language.iso
en
-
dc.rights.uri
http://rightsstatements.org/vocab/InC/1.0/
-
dc.subject
Human Robot Interaction
en
dc.subject
Robot Behavior Generation
en
dc.subject
Deep Learning
en
dc.subject
Diffusion Models
en
dc.title
Real-World Human-Robot Interaction Behavior Generation using Latent Diffusion Models