Fink, L., Kostolani, D., Trautner, T. F., & Schlund, S. (2024). Make some Noise: Acoustic Classification of Manual Work Steps Towards Adaptive Assistance Systems. In 10th CIRP Conference on Assembly Technology and Systems (CIRP CATS 2024) (pp. 135–140). https://doi.org/10.1016/j.procir.2024.07.024
Volume:
127
Date (published):
2024
Event name:
10th CIRP Conference on Assembly Technology and Systems (CIRP CATS 2024)
Event date:
24-Apr-2024 - 26-Apr-2024
Event place:
Karlsruhe, Germany
Number of Pages:
6
Peer reviewed:
Yes
Keywords:
Adaptive Assistance; CNN; Deep Learning; Human Action Recognition; Log-Mel Spectrogram
Abstract:
With 32 million people working in the European manufacturing sector, human work still plays a crucial role in industry. However, due to lot-size-one manufacturing and increased quality requirements, the complexity of products and processes is growing. Therefore, numerous approaches introduce adaptive assistance systems in assembly to support workers during complex work tasks and adapt their level of assistance to the current situation. To enable adaptivity, human action recognition must be incorporated into the consideration of context. Until now, research has focused on providing context to the machine through wearable sensors or cameras. However, wearable sensors hinder workers' movements, and cameras have difficulty distinguishing between work steps of high visual similarity. To mitigate these challenges, we present a new method to classify manual work steps solely by their typical acoustic characteristics. The proposed method feeds log-Mel spectrograms of work sounds into a convolutional neural network (CNN), which learns their characteristic structure. Moreover, we present a new public dataset for the acoustic classification of manual work steps. The dataset includes typical sources of sound in manufacturing, such as working with a bench grinder or cordless screwdriver, filing, or grabbing screws. Before feeding the data to the CNN, we apply various pre-processing and data augmentation techniques to increase generalisation capabilities. Our method can detect work steps with reliable accuracy while requiring fewer parameters than other techniques, showing that detecting work context through acoustics is possible and feasible.
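For readers unfamiliar with the input representation named in the abstract, the following is a minimal sketch of how a log-Mel spectrogram can be computed from a mono audio signal using NumPy only. It is an illustration under stated assumptions, not the authors' pipeline: the function names, the frame length (`n_fft=512`), hop size (`hop=256`), number of Mel bands (`n_mels=40`), the Hann window, and the common `2595 * log10(1 + f/700)` Mel formula are all choices made here for the example. The paper's actual parameters, pre-processing, augmentation, and CNN architecture are not given in this record.

```python
import numpy as np


def hz_to_mel(f):
    # Common Mel-scale formula (assumed here; other variants exist).
    return 2595.0 * np.log10(1.0 + np.asarray(f, dtype=float) / 700.0)


def mel_to_hz(m):
    # Inverse of hz_to_mel.
    return 700.0 * (10.0 ** (np.asarray(m, dtype=float) / 2595.0) - 1.0)


def mel_filterbank(sr, n_fft, n_mels):
    # Triangular filters whose edges are equally spaced on the Mel scale
    # between 0 Hz and the Nyquist frequency.
    mel_edges = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2)
    hz_edges = mel_to_hz(mel_edges)
    fft_freqs = np.fft.rfftfreq(n_fft, d=1.0 / sr)
    fb = np.zeros((n_mels, fft_freqs.size))
    for i in range(n_mels):
        lo, centre, hi = hz_edges[i], hz_edges[i + 1], hz_edges[i + 2]
        rising = (fft_freqs - lo) / (centre - lo)
        falling = (hi - fft_freqs) / (hi - centre)
        fb[i] = np.clip(np.minimum(rising, falling), 0.0, None)
    return fb


def log_mel_spectrogram(signal, sr, n_fft=512, hop=256, n_mels=40, eps=1e-10):
    # Hypothetical parameter defaults, chosen only for illustration.
    signal = np.asarray(signal, dtype=float)
    n_frames = 1 + (signal.size - n_fft) // hop
    # Slice the signal into overlapping frames and apply a Hann window.
    idx = np.arange(n_fft)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = signal[idx] * np.hanning(n_fft)
    # Power spectrum per frame, then projection onto the Mel filterbank.
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    mel_power = power @ mel_filterbank(sr, n_fft, n_mels).T
    # Convert to decibels; eps avoids log(0). Result: (n_mels, n_frames).
    return 10.0 * np.log10(mel_power + eps).T


if __name__ == "__main__":
    sr = 16000
    t = np.arange(sr) / sr
    tone = np.sin(2.0 * np.pi * 1000.0 * t)  # synthetic 1 kHz test tone
    spec = log_mel_spectrogram(tone, sr)
    print(spec.shape)
```

The resulting two-dimensional array (Mel bands by time frames) can be treated like a single-channel image, which is what makes it a natural input for a CNN classifier.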
Research facilities:
Pilotfabrik
Research Areas:
Digital Transformation in Manufacturing: 60%; Visual Computing and Human-Centered Technology: 40%