TrackAgent: 6D Object Tracking via Reinforcement Learning

Röhrl, Konstantin; Bauer, Dominik; Patten, Timothy Michael; Vincze, Markus

doi:10.1007/978-3-031-44137-0_27

Record link:

http://hdl.handle.net/20.500.12708/194649

Title:

TrackAgent: 6D Object Tracking via Reinforcement Learning

Citation:

Röhrl, K., Bauer, D., Patten, T. M., & Vincze, M. (2023). TrackAgent: 6D Object Tracking via Reinforcement Learning. In Computer Vision Systems: 14th International Conference, ICVS 2023, Vienna, Austria, September 27–29, 2023, Proceedings (pp. 323–335). https://doi.org/10.1007/978-3-031-44137-0_27

Publisher DOI:

10.1007/978-3-031-44137-0_27

Publication Type:

Inproceedings - Full-Paper Contribution

Language:

English

Authors:

Röhrl, Konstantin
Bauer, Dominik
Patten, Timothy Michael
Vincze, Markus

Organisational Unit:

E376-02 - Forschungsbereich Komplexe Dynamische Systeme

Published in:

Computer Vision Systems: 14th International Conference, ICVS 2023, Vienna, Austria, September 27–29, 2023, Proceedings

ISBN:

978-3-031-44137-0

Volume:

14253

Date (published):

2023

Event name:

International Conference on Computer Vision Systems (ICVS 2023)

Event date:

27-Sep-2023 - 29-Sep-2023

Event place:

Vienna, Austria

Number of Pages:

13

Peer reviewed:

Yes

Keywords:

3D Vision; Object Pose Tracking; Reinforcement Learning; Robotics

Abstract:

Tracking an object’s 6D pose, while either the object itself or the observing camera is moving, is important for many robotics and augmented reality applications. While exploiting temporal priors eases this problem, object-specific knowledge is required to recover when tracking is lost. Under the tight time constraints of the tracking task, RGB(D)-based methods are often conceptually complex or rely on heuristic motion models. In contrast, we propose to simplify object tracking to a reinforced point cloud (depth only) alignment task. This allows us to train a streamlined approach from scratch with limited amounts of sparse 3D point clouds, compared to the large datasets of diverse RGBD sequences required in previous works. We incorporate temporal frame-to-frame registration with object-based recovery by frame-to-model refinement using a reinforcement learning (RL) agent that jointly solves for both objectives. We also show that the RL agent’s uncertainty and a rendering-based mask propagation are effective reinitialization triggers.

Research Areas:

Automation and Robotics: 100%

Science Branch:

2020 - Electrical Engineering, Electronics, Information Engineering: 100%

Appears in Collections:

Conference Paper
