<div class="csl-bib-body">
<div class="csl-entry">Styll, P., Campillos-Llanos, L., Kusa, W., & Hanbury, A. (2024). Cross-Linguistic Disease and Drug Detection in Cardiology Clinical Texts: Methods and Outcomes. In <i>Working Notes of the Conference and Labs of the Evaluation Forum (CLEF 2024)</i> (pp. 223–244). http://hdl.handle.net/20.500.12708/210253</div>
</div>
-
dc.identifier.uri
http://hdl.handle.net/20.500.12708/210253
-
dc.description.abstract
This paper presents our approach to the MultiCardioNER lab at CLEF 2024, focusing on disease detection in Spanish texts and drug detection in Italian, Spanish, and English texts. We enhance model performance through several strategies: (1) fine-tuning on automatically translated TREC Clinical Trials admission notes using Masked Language Modeling (MLM); (2) data augmentation with translated MTSamples processed through a Spanish medical lexicon (MedLexSp) for accurate vocabulary matching; and (3) employing sliding windows with overlap to improve data capture. Additionally, we use transfer learning with a clinical trials corpus (CT-EMB-SP) to refine the outcomes. We further fine-tune several already established disease and drug extraction models to leverage their extensive vocabulary and compare their performance to models trained from scratch. Our methods and experiments demonstrate notable improvements in multilingual clinical NER, as evidenced by our track results.
en
dc.language.iso
en
-
dc.relation.ispartofseries
CEUR Workshop Proceedings
-
dc.subject
Clinical Named Entity Recognition
en
dc.subject
Transfer Learning
en
dc.subject
Data Augmentation
en
dc.subject
Cardiology
en
dc.title
Cross-Linguistic Disease and Drug Detection in Cardiology Clinical Texts: Methods and Outcomes
en
dc.type
Inproceedings
en
dc.type
Konferenzbeitrag
de
dc.contributor.affiliation
TU Wien, Austria
-
dc.contributor.affiliation
Consejo Superior de Investigaciones Científicas, Spain
-
dc.relation.issn
1613-0073
-
dc.description.startpage
223
-
dc.description.endpage
244
-
dc.type.category
Full-Paper Contribution
-
tuw.booktitle
Working Notes of the Conference and Labs of the Evaluation Forum (CLEF 2024)
-
tuw.container.volume
3740
-
tuw.peerreviewed
true
-
tuw.researchTopic.id
I4
-
tuw.researchTopic.name
Information Systems Engineering
-
tuw.researchTopic.value
100
-
tuw.publication.orgunit
E194-04 - Forschungsbereich Data Science
-
dc.description.numberOfPages
22
-
tuw.author.orcid
0000-0003-3040-1756
-
tuw.author.orcid
0000-0003-4420-4147
-
tuw.author.orcid
0000-0002-7149-5843
-
tuw.event.name
Conference and Labs of the Evaluation Forum (CLEF 2024) : Information Access Evaluation meets Multilinguality, Multimodality, and Visualization
-
tuw.event.startdate
09-09-2024
-
tuw.event.enddate
12-09-2024
-
tuw.event.online
On Site
-
tuw.event.type
Event for scientific audience
-
tuw.event.place
Grenoble
-
tuw.event.country
FR
-
tuw.event.institution
University of Grenoble Alpes
-
tuw.event.presenter
Styll, Patrick
-
tuw.event.track
Multi Track
-
wb.sciencebranch
Informatik
-
wb.sciencebranch
Wirtschaftswissenschaften
-
wb.sciencebranch.oefos
1020
-
wb.sciencebranch.oefos
5020
-
wb.sciencebranch.value
90
-
wb.sciencebranch.value
10
-
item.openairecristype
http://purl.org/coar/resource_type/c_5794
-
item.openairetype
conference paper
-
item.fulltext
no Fulltext
-
item.languageiso639-1
en
-
item.grantfulltext
none
-
item.cerifentitytype
Publications
-
crisitem.author.dept
TU Wien
-
crisitem.author.dept
Consejo Superior de Investigaciones Científicas, Spain
-
crisitem.author.dept
E194-04 - Forschungsbereich Data Science
-
crisitem.author.dept
E194-04 - Forschungsbereich Data Science
-
crisitem.author.orcid
0000-0003-3040-1756
-
crisitem.author.orcid
0000-0003-4420-4147
-
crisitem.author.orcid
0000-0002-7149-5843
-
crisitem.author.parentorg
E194 - Institut für Information Systems Engineering
-
crisitem.author.parentorg
E194 - Institut für Information Systems Engineering