Isychev, A., Wüstholz, V., & Christakis, M. (2025). Lazy Testing of Machine-Learning Models. In Proceedings of the Thirty-Fourth International Joint Conference on Artificial Intelligence (pp. 7428–7436). https://doi.org/10.24963/ijcai.2025/826
Proceedings of the Thirty-Fourth International Joint Conference on Artificial Intelligence
-
ISBN:
978-1-956792-06-5
-
Date (published):
2025
-
Event name:
Thirty-Fourth International Joint Conference on Artificial Intelligence (IJCAI 2025)
Event date:
16-Aug-2025 - 22-Aug-2025
-
Event place:
Montreal, Canada
-
Number of Pages:
9
-
Peer reviewed:
Yes
-
Keywords:
machine learning; testing; static analysis
Abstract:
Checking the reliability of machine-learning models is a crucial but challenging task. Nomos is an existing, automated framework for testing general, user-provided functional properties of models, including so-called hyperproperties expressed over more than one model execution. Nomos aims to find model inputs that expose "bugs", that is, property violations. However, performing thousands of model invocations during testing is costly in terms of both time and money (for metered APIs, such as OpenAI's). We present LaZ (pronounced "lazy"), an extension of Nomos that automatically minimizes the number of model invocations to boost the test throughput and thereby find bugs more efficiently. During test execution, LaZ automatically identifies redundant invocations (invocations where the model output does not affect the final test outcome) and skips them, much like lazy evaluation in certain programming languages. This optimization enables a second one that dynamically reorders model invocations to skip the more expensive ones. As a result, LaZ finds the same number of bugs as Nomos, but does so a median of 33% and up to 60% faster.
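The idea of skipping redundant invocations can be illustrated with a minimal Python sketch. It is not the Nomos or LaZ API: the names `LazyCall`, `CountingModel`, `monotonic_property`, and `run_tests` are hypothetical, and the threshold "model" is only a stand-in for an expensive or metered model. The sketch assumes a hyperproperty over two executions (if the lower input is accepted, the higher one must be accepted too) and defers each invocation until its output is actually read.

```python
class LazyCall:
    """Defers a model invocation until its output is actually read."""

    def __init__(self, model, x):
        self._model = model
        self._x = x
        self._done = False
        self._value = None

    def value(self):
        # Invoke the model at most once, and only on first access.
        if not self._done:
            self._value = self._model(self._x)
            self._done = True
        return self._value


class CountingModel:
    """Hypothetical stand-in for an expensive model; counts invocations."""

    def __init__(self, threshold):
        self.threshold = threshold
        self.calls = 0

    def __call__(self, x):
        self.calls += 1
        return x >= self.threshold


def monotonic_property(out_low, out_high):
    # Property: if the lower input is accepted, the higher one must be too.
    # `or` short-circuits, so out_high is forced only when out_low is True;
    # otherwise the second invocation cannot change the test outcome.
    return (not out_low.value()) or out_high.value()


def run_tests(model, pairs):
    """Returns the number of property violations found over the input pairs."""
    violations = 0
    for low, high in pairs:
        if not monotonic_property(LazyCall(model, low), LazyCall(model, high)):
            violations += 1
    return violations


model = CountingModel(threshold=50)
pairs = [(10, 20), (30, 60), (55, 70), (40, 45)]
violations = run_tests(model, pairs)
print(f"violations={violations}, model calls={model.calls} of {2 * len(pairs)}")
```

In this toy run, three of the four tests never read the second output, so the model is invoked 5 times rather than 8 while the test verdicts stay the same. The second optimization mentioned in the abstract, dynamically reordering invocations so that the more expensive ones are the ones skipped, is not shown here.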
Project title:
Structured Doctoral Program on Automated Reasoning: DOC1345324 (FWF - Österr. Wissenschaftsfonds)