Windbacher, F., Hödlmoser, M., & Gelautz, M. (2023). Single-Stage 3D Pose Estimation of Vulnerable Road Users Using Pseudo-Labels. In R. Gade, M. Felsberg, & J.-K. Kämäräinen (Eds.), Image Analysis. 22nd Scandinavian Conference, SCIA 2023, Sirkka, Finland, April 18–21, 2023, Proceedings, Part II (pp. 401–417). Springer. https://doi.org/10.1007/978-3-031-31438-4_27
Human pose estimation of vulnerable road users is an important perception task for autonomous vehicles, and it can be exploited for intention prediction to guide the vehicle’s actions. Single-stage human pose estimation approaches, despite their potential for simplicity and efficiency, have shown only mediocre results in 2D and have hardly been investigated in 3D in the autonomous driving domain so far. We tackle this challenge with the 2D single-stage human pose estimator KAPAO. We find that KAPAO achieves state-of-the-art performance in our evaluation on domain-specific 2D benchmark datasets, which motivates its extension to 3D. To overcome the lack of ground-truth vulnerable road user data for 3D pose estimation, we first extend the Waymo Open Dataset with additional 3D pseudo-labels. We create more than one million 3D poses, which we estimate using the dataset’s exhaustive person bounding boxes and associated LiDAR point clouds. Evaluating their quality, we report a mean per-joint position error of less than 10 cm. Having access to large-scale domain-specific 3D pose data, we propose a 3D variant of KAPAO that additionally predicts the depths of joints. We evaluate it on our extended Waymo Open Dataset and compare its performance to that of a LiDAR uplifting baseline. The proposed approach is low-latency and produces plausible poses but struggles to estimate absolute depth precisely, particularly at large distances. We alleviate that limitation by implementing a conditional LiDAR-based depth correction.
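For readers unfamiliar with the metric reported above, the mean per-joint position error is commonly computed as the average Euclidean distance between predicted and reference joint positions. The following is a minimal sketch of that general definition, not the authors' evaluation code: the function name `mpjpe`, the array layout `(poses, joints, 3)`, the use of meters, and the optional `valid` mask for unlabeled joints are all assumptions made for illustration. The abstract does not state whether any alignment is applied before measuring the error, so none is applied here.

```python
import numpy as np


def mpjpe(pred, ref, valid=None):
    """Mean per-joint position error between two sets of 3D poses.

    pred, ref: arrays of shape (num_poses, num_joints, 3), assumed in meters.
    valid:     optional boolean array of shape (num_poses, num_joints);
               joints marked False are excluded (hypothetical convention).
    """
    pred = np.asarray(pred, dtype=float)
    ref = np.asarray(ref, dtype=float)
    if pred.shape != ref.shape or pred.shape[-1] != 3:
        raise ValueError("pred and ref must share shape (N, J, 3)")

    # Euclidean distance per joint, shape (num_poses, num_joints).
    dists = np.linalg.norm(pred - ref, axis=-1)

    if valid is None:
        return float(dists.mean())

    valid = np.asarray(valid, dtype=bool)
    if not valid.any():
        raise ValueError("no valid joints to evaluate")
    return float(dists[valid].mean())


# Toy example: every joint is offset by (0.06, 0.00, 0.08) m,
# so each per-joint distance is 0.10 m.
ref_poses = np.zeros((2, 4, 3))
pred_poses = ref_poses + np.array([0.06, 0.0, 0.08])
error_m = mpjpe(pred_poses, ref_poses)
```

In this toy example `error_m` is approximately 0.10 m, i.e. 10 cm, which is the order of magnitude the abstract cites as the upper bound for the pseudo-label error.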
Multimodal sensor-light system for the protection of vulnerable road users (Multimodales Sensor-Lichtsystem zum Schutz von verletzlichen Verkehrsteilnehmern): 879642 (FFG - Österreichische Forschungsförderungsgesellschaft mbH)
Visual Computing and Human-Centered Technology: 100%