CAGT: Sim-to-Real Depth Completion with Interactive Embedding Aggregation and Geometry Awareness for Transparent Objects

Jing, Xingshuo; Qian, Kun; Vincze, Markus

doi:10.1109/TCSVT.2025.3543288

Record link:

http://hdl.handle.net/20.500.12708/215626
https://doi.org/10.34726/9499

Title:

CAGT: Sim-to-Real Depth Completion with Interactive Embedding Aggregation and Geometry Awareness for Transparent Objects

Citation:

Jing, X., Qian, K., & Vincze, M. (2025). CAGT: Sim-to-Real Depth Completion with Interactive Embedding Aggregation and Geometry Awareness for Transparent Objects. IEEE Transactions on Circuits and Systems for Video Technology. https://doi.org/10.34726/9499

reposiTUm DOI:

10.34726/9499

Publisher DOI:

10.1109/TCSVT.2025.3543288

CatalogPlus:

AC17580857

Publication Type:

Article - Original Research Article

Language:

English

Authors:

Jing, Xingshuo
Qian, Kun
Vincze, Markus

Organisational Unit:

E376-02 - Research Unit of Complex Dynamical Systems

Journal:

IEEE Transactions on Circuits and Systems for Video Technology

ISSN:

1051-8215

Date (published):

2025

Publisher:

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC

Peer reviewed:

Yes

Keywords:

contrastive learning; depth completion; geometry-aware; sim-to-real

Abstract:

Robust depth completion of transparent objects would benefit industrial automation tasks such as vision-based robotic grasping and manipulation. Although some methods learn a compact intra-layer feature representation with the help of attention mechanisms or the vision Transformer, they neglect corner regions and sparse geometry information, both of which are important for accurate depth completion. To tackle these issues, we propose CAGT, a novel sim-to-real transferable model with interactive embedding aggregation and geometry awareness that reconstructs severely sparse depth maps of transparent objects. We design a Depth-clue Interaction Aggregation Module (DIAM) to enhance the Transformer's ability to extract boundary corner features and thus supplement depth clues. We then propose a Geometric Information Augmentation Module (GIAM) to fuse geometry-aware features containing shape and surface details. Moreover, we introduce a contrastive learning mechanism to facilitate the sim-to-real generalization of the completion model. Extensive experimental results on two challenging datasets, ClearGrasp and TransCG, demonstrate that CAGT achieves superior performance over state-of-the-art methods. A robotic grasping generalization experiment further demonstrates that CAGT can improve the grasp accuracy of transparent objects.

Research Areas:

Automation and Robotics: 100%

Science Branch:

2020 - Electrical Engineering, Electronics, Information Engineering: 100%

License:

In Copyright

Appears in Collections:

Article