<div class="csl-bib-body">
<div class="csl-entry">Nguyen, H. H., Vu, M. N., Beck, F., Ebmer, G., Nguyen, A., Kemmetmüller, W., & Kugi, A. (2025). Language-driven closed-loop grasping with model-predictive trajectory optimization. <i>Mechatronics</i>, <i>109</i>, Article 103335. https://doi.org/10.1016/j.mechatronics.2025.103335</div>
</div>
-
dc.identifier.issn
0957-4158
-
dc.identifier.uri
http://hdl.handle.net/20.500.12708/215629
-
dc.description.abstract
Integrating a vision module into a closed-loop control system so that a robot moves seamlessly during a manipulation task is challenging because the modules involved run at inconsistent update rates. The task is even more difficult in a dynamic environment, e.g., when objects are moving. This paper presents a modular zero-shot framework for language-driven manipulation of (dynamic) objects through a closed-loop control system with real-time trajectory replanning and online 6D object pose localization. We segment an object within 0.5 s by leveraging a vision language model via language commands. Then, guided by natural language commands, a closed-loop system, comprising unified pose estimation and tracking together with online trajectory planning, continuously tracks this object and computes the optimal trajectory in real time. Our proposed zero-shot framework provides a smooth trajectory that avoids jerky movements and ensures the robot can grasp a non-stationary object. Experimental results demonstrate the real-time capability of the proposed zero-shot modular framework to accurately and efficiently grasp moving objects. The framework achieves update rates of up to 30 Hz for the online 6D pose localization module and 10 Hz for the receding-horizon trajectory optimization. These advantages highlight the modular framework’s potential applications in robotics and human–robot interaction; see the video at language-driven-grasping.github.io.
en
dc.language.iso
en
-
dc.publisher
PERGAMON-ELSEVIER SCIENCE LTD
-
dc.relation.ispartof
Mechatronics
-
dc.subject
language-driven grasp detection
en
dc.subject
pose estimation
en
dc.subject
grasping
en
dc.subject
trajectory optimization
en
dc.title
Language-driven closed-loop grasping with model-predictive trajectory optimization
en
dc.type
Article
en
dc.type
Artikel
de
dc.contributor.affiliation
Austrian Institute of Technology, Austria
-
dc.contributor.affiliation
University of Liverpool, United Kingdom of Great Britain and Northern Ireland