Iglesias Vazquez, F. (2023, June 15). Unsupervised Learning in Streaming Data Analysis: Insights and Challenges [Presentation]. Expert Days Seminar, Orleans, France.
streaming data analysis; unsupervised learning; concept drift
en
Abstract:
Streaming data analysis faces the challenge of processing data that is continuously acquired and must be analyzed on the fly. This involves managing datasets that have no boundaries and are constantly growing and evolving. The modern consolidation of communications, pervasive computing and cyber-physical systems has made this scenario increasingly prevalent, and it is gradually becoming a common environment for the application of AI.
Despite the extent of such technological settings, machine learning today mostly assumes (consciously or unconsciously) static and stationary worlds. However, the time dimension has a decisive impact on data analysis, both in terms of computational costs and accuracy, rendering traditional options unfeasible or unsatisfactory. The effects of concept drift, one of the fundamental causes of aging in machine learning models, are only beginning to be understood and addressed.
At the same time, the description and modeling of any phenomenon captured in the form of data requires unsupervised learning (either as a central part or as a support) because of the need to face novelty and the unknown. Here, beyond any algorithmic challenge, at the core of unsupervised methods lies a semantic problem in the definition of targeted objects (e.g., "class", "cluster", "normality", "anomaly"). This problem must be addressed with sophisticated algorithms as well as data-centric perspectives.
Overall, the consolidation of streaming data analysis is key to the next generation of AI, as are better evaluation and the characterization of data properties, combined with a deeper understanding of established methodologies.
en
Project (external):
ARD CVL JUNON (environmental metrology and digital twins)