NoiER: An Approach for Training more Reliable Fine-Tuned Downstream Task Models

Jang, Myeongjun; Lukasiewicz, Thomas

doi:10.1109/TASLP.2022.3193292

Record link:

http://hdl.handle.net/20.500.12708/146156

Title:

NoiER: An Approach for Training more Reliable Fine-Tuned Downstream Task Models

Citation:

Jang, M., & Lukasiewicz, T. (2022). NoiER: An Approach for Training more Reliable Fine-Tuned Downstream Task Models. IEEE/ACM Transactions on Audio, Speech and Language Processing, 30, 2514–2525. https://doi.org/10.1109/TASLP.2022.3193292

Publisher DOI:

10.1109/TASLP.2022.3193292

Publication Type:

Article - Original Research Article

Language:

English

Authors:

Jang, Myeongjun
Lukasiewicz, Thomas

Organisational Unit:

E192-07 - Forschungsbereich Artificial Intelligence Techniques

Journal:

IEEE/ACM Transactions on Audio, Speech and Language Processing

ISSN:

2329-9290

Date (published):

22-Jul-2022

Number of Pages:

12

Publisher:

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC

Peer reviewed:

Yes

Keywords:

natural language processing; out-of-distribution detection; reliability of language model; text classification

Abstract:

The recent development of pretrained language models trained in a self-supervised fashion, such as BERT, is driving rapid progress in natural language processing. However, their strong performance relies on exploiting syntactic artefacts of the training data rather than on fully understanding the intrinsic meaning of language. The excessive exploitation of spurious artefacts causes a problematic issue, the distribution collapse problem: a model fine-tuned on a downstream task is unable to distinguish out-of-distribution sentences and still assigns them high-confidence scores. In this paper, we argue that distribution collapse is a prevalent issue in pretrained language models and propose noise entropy regularisation (NoiER) as an efficient learning paradigm that solves the problem without auxiliary models or additional data. The proposed approach improved traditional out-of-distribution detection evaluation metrics by 55% on average compared to the original fine-tuned models.

Research Areas:

Computer Engineering and Software-Intensive Systems: 100%

Science Branch:

1020 - Informatik: 80%
1010 - Mathematik: 20%

Appears in Collections:

Article
