Mucha, W., & Kampel, M. (2024). In My Perspective, in My Hands: Accurate Egocentric 2D Hand Pose and Action Recognition. In 2024 IEEE 18th International Conference on Automatic Face and Gesture Recognition (FG) (pp. 1–9). https://doi.org/10.1109/FG59268.2024.10582035
-
dc.identifier.uri
http://hdl.handle.net/20.500.12708/204354
-
dc.description.abstract
Action recognition is essential for egocentric video understanding, allowing automatic and continuous monitoring of Activities of Daily Living (ADLs) without user effort. The existing literature focuses on 3D hand pose input, which requires computationally intensive depth estimation networks or wearing an uncomfortable depth sensor. In contrast, 2D hand pose for egocentric action recognition has received insufficient attention, despite the availability of user-friendly smart glasses on the market capable of capturing a single RGB image. Our study aims to fill this research gap by exploring 2D hand pose estimation for egocentric action recognition, making two contributions. First, we introduce two novel approaches for 2D hand pose estimation: EffHandNet for single-hand estimation and EffHandEgoNet, tailored to the egocentric perspective and capturing interactions between hands and objects. Both methods outperform state-of-the-art models on the H2O and FPHA public benchmarks. Second, we present a robust action recognition architecture based on 2D hand and object poses, which combines EffHandEgoNet with a transformer-based action recognition module. Evaluated on the H2O and FPHA datasets, our architecture has a faster inference time and achieves accuracies of 91.32% and 94.43%, respectively, surpassing the state of the art, including 3D-based methods. Our work demonstrates that 2D skeletal data is a robust input for egocentric action understanding. Extensive evaluation and ablation studies show the impact of the hand pose estimation approach and how each input affects overall performance. The code is available at https://github.com/wiktormucha/effhandegonet.
en
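Illustrative note: the abstract describes an architecture in which 2D hand and object keypoints are fed to a transformer for action classification. The sketch below is not the authors' implementation (see the linked repository for the actual code); it is a minimal, hedged example of this general idea, and all module names, keypoint counts, frame counts, and class counts are assumptions chosen for illustration only.

# Minimal illustrative sketch (assumptions only, not the paper's model):
# a transformer encoder classifies an action from a sequence of 2D keypoints.
import torch
import torch.nn as nn

class Pose2DActionTransformer(nn.Module):
    def __init__(self, num_keypoints=42, num_frames=20, num_classes=36, d_model=128):
        super().__init__()
        # Each frame is flattened to (num_keypoints * 2) 2D coordinates.
        self.embed = nn.Linear(num_keypoints * 2, d_model)
        # Learned temporal positional embedding over the frame sequence.
        self.pos = nn.Parameter(torch.zeros(1, num_frames, d_model))
        layer = nn.TransformerEncoderLayer(d_model=d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(d_model, num_classes)

    def forward(self, keypoints):  # keypoints: (batch, frames, num_keypoints, 2)
        b, t, k, _ = keypoints.shape
        x = self.embed(keypoints.reshape(b, t, k * 2)) + self.pos[:, :t]
        x = self.encoder(x)               # temporal self-attention over frames
        return self.head(x.mean(dim=1))   # pool over time, then classify

# Example: a batch of 2 clips, 20 frames, 42 keypoints (two hands) in 2D.
logits = Pose2DActionTransformer()(torch.randn(2, 20, 42, 2))
print(logits.shape)  # torch.Size([2, 36])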
dc.description.sponsorship
European Commission
-
dc.language.iso
en
-
dc.subject
Egocentric Vision
en
dc.subject
Action Recognition
en
dc.subject
Smart glasses
en
dc.title
In My Perspective, in My Hands: Accurate Egocentric 2D Hand Pose and Action Recognition
en
dc.type
Inproceedings
en
dc.type
Konferenzbeitrag
de
dc.relation.isbn
9798350394948
-
dc.description.startpage
1
-
dc.description.endpage
9
-
dc.relation.grantno
861091
-
dc.type.category
Full-Paper Contribution
-
tuw.booktitle
2024 IEEE 18th International Conference on Automatic Face and Gesture Recognition (FG)
-
tuw.peerreviewed
true
-
tuw.project.title
Privacy-Aware and Acceptable Video-Based Technologies and Services for Active and Assisted Living
-
tuw.researchTopic.id
I5
-
tuw.researchTopic.name
Visual Computing and Human-Centered Technology
-
tuw.researchTopic.value
100
-
tuw.publication.orgunit
E193-01 - Forschungsbereich Computer Vision
-
tuw.publisher.doi
10.1109/FG59268.2024.10582035
-
dc.description.numberOfPages
9
-
tuw.author.orcid
0000-0002-6048-3425
-
tuw.author.orcid
0000-0002-5217-2854
-
tuw.event.name
2024 IEEE 18th International Conference on Automatic Face and Gesture Recognition (FG)
en
tuw.event.startdate
27-05-2024
-
tuw.event.enddate
31-05-2024
-
tuw.event.online
On Site
-
tuw.event.type
Event for scientific audience
-
tuw.event.place
Istanbul
-
tuw.event.country
TR
-
tuw.event.presenter
Mucha, Wiktor
-
wb.sciencebranch
Informatik
-
wb.sciencebranch
Mathematik
-
wb.sciencebranch.oefos
1020
-
wb.sciencebranch.oefos
1010
-
wb.sciencebranch.value
90
-
wb.sciencebranch.value
10
-
item.languageiso639-1
en
-
item.openairetype
conference paper
-
item.grantfulltext
restricted
-
item.fulltext
no Fulltext
-
item.cerifentitytype
Publications
-
item.openairecristype
http://purl.org/coar/resource_type/c_5794
-
crisitem.author.dept
E193-01 - Forschungsbereich Computer Vision
-
crisitem.author.dept
E193-01 - Forschungsbereich Computer Vision
-
crisitem.author.orcid
0000-0002-6048-3425
-
crisitem.author.orcid
0000-0002-5217-2854
-
crisitem.author.parentorg
E193 - Institut für Visual Computing and Human-Centered Technology
-
crisitem.author.parentorg
E193 - Institut für Visual Computing and Human-Centered Technology