<div class="csl-bib-body">
<div class="csl-entry">Valls Mascaro, E., Ahn, H., & Lee, D. (2023). Intention-Conditioned Long-Term Human Egocentric Action Forecasting. In <i>Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)</i> (pp. 6048–6057). https://doi.org/10.1109/WACV56688.2023.00599</div>
</div>
-
dc.identifier.uri
http://hdl.handle.net/20.500.12708/189698
-
dc.description.abstract
To anticipate how a person will act in the future, it is essential to understand the person's intention, since it guides them towards a certain action. In this paper, we propose a hierarchical architecture which assumes that a sequence of human actions (low-level) is driven by the human intention (high-level). Based on this, we address the long-term action anticipation task in egocentric videos. Our framework first extracts this low- and high-level human information from the observed human actions in a video through a Hierarchical Multi-task Multi-Layer Perceptrons Mixer (H3M). Then, we constrain the uncertainty of the future through an Intention-Conditioned Variational Auto-Encoder (I-CVAE) that generates multiple stable predictions of the next actions that the observed human might perform. By leveraging human intention as high-level information, we claim that our model is able to anticipate more time-consistent actions in the long term, thus improving the results over the baseline on the Ego4D dataset. This work achieves the state of the art for the Long-Term Anticipation (LTA) task in Ego4D by providing more plausible anticipated sequences, improving the anticipation scores of nouns and actions. Our work ranked first in both the CVPR@2022 and ECCV@2022 Ego4D LTA Challenges.
en
dc.description.sponsorship
European Commission
-
dc.language.iso
en
-
dc.subject
Long-Term Anticipation
en
dc.subject
Computer Vision
en
dc.subject
Egocentric Video
en
dc.subject
Generative Models
en
dc.title
Intention-Conditioned Long-Term Human Egocentric Action Forecasting
en
dc.type
Inproceedings
en
dc.type
Konferenzbeitrag
de
dc.contributor.affiliation
Artificial Intelligence Graduate School (AIGS), Ulsan National Institute of Science and Technology (UNIST), Ulsan, Korea
-
dc.relation.isbn
978-1-6654-9346-8
-
dc.relation.issn
2642-9381
-
dc.description.startpage
6048
-
dc.description.endpage
6057
-
dc.relation.grantno
H2020-MSCA-ITN-2019
-
dcterms.dateSubmitted
2022-08
-
dc.rights.holder
Computer Vision Foundation
-
dc.type.category
Full-Paper Contribution
-
tuw.booktitle
Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)
-
tuw.project.title
PErsonalized Robotics as SErvice Oriented applications