UnbiasedNets: a dataset diversification framework for robustness bias alleviation in neural networks

Naseer, Mahum; Prabakaran, Bharath Srinivas; Hasan, Osman; Shafique, Muhammad

doi:10.1007/s10994-023-06314-z

Record link:

http://hdl.handle.net/20.500.12708/192607

Title:

UnbiasedNets: a dataset diversification framework for robustness bias alleviation in neural networks

Citation:

Naseer, M., Prabakaran, B. S., Hasan, O., & Shafique, M. (2023). UnbiasedNets: a dataset diversification framework for robustness bias alleviation in neural networks. Machine Learning. https://doi.org/10.1007/s10994-023-06314-z

Publisher DOI:

10.1007/s10994-023-06314-z

CatalogPlus:

AC17205029

Publication Type:

Article - Original Research Article

Language:

English

Authors:

Naseer, Mahum
Prabakaran, Bharath Srinivas
Hasan, Osman
Shafique, Muhammad

Organisational Unit:

E191-01 - Forschungsbereich Cyber-Physical Systems
E191-02 - Forschungsbereich Embedded Computing Systems

Journal:

Machine Learning

ISSN:

0885-6125

Date (published):

2023

Publisher:

Springer

Peer reviewed:

Yes

Keywords:

Bias; Data-centric bias alleviation; K-means clustering; Neural networks; Noise tolerance

Abstract:

The testing accuracy of trained neural network (NN) models has improved remarkably over the past several years, especially with the advent of deep learning. However, even the most accurate NNs can be biased toward a specific output class due to inherent bias in the available training datasets, and this bias may propagate to real-world implementations. This paper deals with robustness bias, i.e., the bias a trained NN exhibits when its robustness to noise is significantly larger for one output class than for the remaining classes. The bias is shown to result from imbalanced datasets, i.e., datasets in which not all output classes are equally represented. To address this, we propose the UnbiasedNets framework, which leverages K-means clustering and the NN’s noise tolerance to diversify the given training dataset, even when that dataset is relatively small. This generates balanced datasets and reduces the bias within the datasets themselves. To the best of our knowledge, this is the first framework to address the robustness bias problem in NNs. We use real-world datasets to demonstrate the efficacy of UnbiasedNets for data diversification, for both binary and multi-label classifiers. The results are compared with well-known tools for generating balanced datasets and illustrate that existing works have limited success in addressing robustness bias. In contrast, UnbiasedNets provides a notable improvement over existing works, and in some cases reduces the robustness bias significantly, as observed by comparing NNs trained on the diversified and original datasets.

Project title:

Accelerating Innovation in Microfabricated Medical Devices: 876190 (European Commission)

Research Areas:

Computer Engineering and Software-Intensive Systems: 100%

Science Branch:

1020 - Computer Sciences: 100%

License:

CC BY 4.0