Single-Stage 3D Pose Estimation of Vulnerable Road Users Using Pseudo-Labels

Windbacher, Fabian; Hödlmoser, Michael; Gelautz, Margrit

doi:10.1007/978-3-031-31438-4_27

Record link:

http://hdl.handle.net/20.500.12708/188289

Title:

Single-Stage 3D Pose Estimation of Vulnerable Road Users Using Pseudo-Labels

Citation:

Windbacher, F., Hödlmoser, M., & Gelautz, M. (2023). Single-Stage 3D Pose Estimation of Vulnerable Road Users Using Pseudo-Labels. In R. Gade, M. Felsberg, & J.-K. Kämäräinen (Eds.), Image Analysis. 22nd Scandinavian Conference, SCIA 2023, Sirkka, Finland, April 18–21, 2023, Proceedings, Part II (pp. 401–417). Springer. https://doi.org/10.1007/978-3-031-31438-4_27

Publisher DOI:

10.1007/978-3-031-31438-4_27

Publication Type:

Inproceedings - Full-Paper Contribution

Language:

English

Authors:

Windbacher, Fabian
Hödlmoser, Michael
Gelautz, Margrit

Organisational Unit:

E193-01 - Forschungsbereich Computer Vision
E193 - Institut für Visual Computing and Human-Centered Technology

Published in:

Image Analysis. 22nd Scandinavian Conference, SCIA 2023, Sirkka, Finland, April 18–21, 2023, Proceedings, Part II

ISBN:

978-3-031-31437-7

Volume:

13886

DOI of the book:

10.1007/978-3-031-31438-4

Date (published):

27-Apr-2023

Event name:

SCIA 2023. Scandinavian Conference on Image Analysis

Event date:

18-Apr-2023 - 21-Apr-2023

Event place:

Sirkka, Finland

Number of Pages:

17

Publisher:

Springer, Cham

Peer reviewed:

Yes

Keywords:

computer vision; machine learning; human pose estimation; autonomous driving; RGB images; LiDAR; evaluation

Abstract:

Human pose estimation of vulnerable road users is an important perception task for autonomous vehicles, and its results can be exploited for intention prediction in order to guide the vehicle’s actions. Single-stage human pose estimation approaches, despite their potential in terms of simplicity and efficiency, have shown only mediocre results in 2D and have hardly been investigated in 3D in the autonomous driving domain so far. We tackle this challenge with the 2D single-stage human pose estimator KAPAO. We find that KAPAO achieves state-of-the-art performance in our evaluation on domain-specific 2D benchmark datasets, which motivates its extension for application in 3D. To overcome a lack of ground-truth vulnerable road user data for 3D pose estimation, we first extend the Waymo Open Dataset with additional 3D pseudo-labels. We create more than one million 3D poses, which we estimate using the dataset’s exhaustive person bounding boxes and associated LiDAR point clouds. Evaluating their quality, we report a mean per joint position error of less than 10 cm. Having access to large-scale domain-specific 3D pose data, we propose a 3D variant of KAPAO that additionally predicts the depths of joints. We evaluate it on our extended Waymo Open Dataset and compare its performance to that of a LiDAR uplifting baseline. The proposed approach is low-latency and produces plausible poses but struggles to estimate absolute depth precisely, particularly at large distances. We alleviate that limitation by implementing a conditional LiDAR-based depth correction.
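
The mean per joint position error (MPJPE) cited in the abstract is commonly defined as the average Euclidean distance between corresponding predicted and reference joints. The sketch below illustrates that common definition only. This record does not describe the paper's evaluation protocol (joint set, alignment, handling of occluded joints), so the function name `mpjpe`, the three-joint pose, and all coordinate values are hypothetical and not taken from the paper.

```python
import math


def mpjpe(predicted, reference):
    """Average Euclidean distance between corresponding 3D joints.

    Both arguments are sequences of (x, y, z) tuples in the same unit;
    the result is expressed in that unit.
    """
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("poses must be non-empty and have the same number of joints")
    distances = [math.dist(p, r) for p, r in zip(predicted, reference)]
    return sum(distances) / len(distances)


# Hypothetical three-joint pose in metres (illustrative values only).
reference = [(0.0, 0.0, 10.0), (0.0, 1.0, 10.0), (0.5, 1.0, 10.0)]
predicted = [(0.0, 0.0, 10.1), (0.0, 1.0, 10.0), (0.5, 1.0, 9.8)]

# Per-joint errors are about 0.1 m, 0.0 m and 0.2 m, so the mean is about 0.1 m.
error_m = mpjpe(predicted, reference)
print(f"MPJPE: {error_m * 100:.1f} cm")
```

In this toy example the error comes entirely from the depth coordinate, which mirrors the abstract's remark that absolute depth is the hard part to estimate precisely.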

Project title:

Multimodales Sensor-Lichtsystem zum Schutz von verletzlichen Verkehrsteilnehmern: 879642 (FFG - Österr. Forschungsförderungsgesellschaft mbH)

Research Areas:

Visual Computing and Human-Centered Technology: 100%

Science Branch:

1020 - Informatik: 100%

Appears in Collections:

Conference Paper
