A real-time approach to human behavior analysis using depth and thermal imaging

Heitzinger, Thomas

doi:10.34726/hss.2025.133225

Record link:

https://doi.org/10.34726/hss.2025.133225
http://hdl.handle.net/20.500.12708/220531

Citation:

Heitzinger, T. (2025). A real-time approach to human behavior analysis using depth and thermal imaging [Dissertation, Technische Universität Wien]. reposiTUm. https://doi.org/10.34726/hss.2025.133225

CatalogPlus:

AC17685568

Publication Type:

Thesis - Dissertation

Language:

English

Authors:

Heitzinger, Thomas

Advisor:

Kampel, Martin

Organisational Unit:

E193 - Institut für Visual Computing and Human-Centered Technology

Date (published):

2025

Number of Pages:

179

Keywords:

Deep Learning; Computer Vision; Machine Learning; 3D Vision; Depth Imaging; Thermal Imaging; Thermal Data; Human Behavior Analysis

Abstract:

This thesis explores the use of depth and thermal imaging as an alternative to RGB in the context of 3D human behavior analysis, together with the implementation of lightweight models that support real-time inference on resource-constrained edge devices. The difference in the sensing principle of the proposed modalities is seen as an opportunity to support use cases where relying solely on RGB data may lead to unsatisfactory results. Examples are crowded environments (which benefit from depth information or human thermal characteristics to separate humans from backgrounds), scenes with low or inconsistent illumination, and applications that favor depth and thermal sensing for their reduced capture of personal information. The choice to target inference near the sensor further serves to reduce storage and dissemination of personal information, since only low-dimensional model outputs rather than raw sensor data are transmitted from the device. The behavior analysis capabilities developed in this work target 3D multi-person detection in space and time in the form of 3D object detection and online tracking. Person-centric attributes of interest also include poses and actions, which are implemented as classification tasks. In the case of action classification, the particular focus lies on the differentiation between friendly and aggressive behavior. Contributions of the work span in equal parts the creation of multimodal datasets for 3D human behavior analysis and methodological research in the domains of detection and tracking, culminating in the creation of a novel system termed FUS3D that combines all of the developed capabilities into a single end-to-end trainable network. Further contributions are the fabrication of a portable, lightweight recording and inference platform, and a task-agnostic 3D annotation tool supporting arbitrary image-like data modalities.

Additional information:

Title differs from the original, as translated by the author

License:

In Copyright

Appears in Collections:

Thesis