MvCo-DoT: Multi-View Contrastive Domain Transfer Network for Medical Report Generation

Wang, Ruizhi; Wang, Xiangtao; Xu, Zhenghua; Xu, Wenting; Chen, Junyang; Lukasiewicz, Thomas

doi:10.1109/ICASSP49357.2023.10095254

DC Field

Value

Language

dc.contributor.author

Wang, Ruizhi

dc.contributor.author

Wang, Xiangtao

dc.contributor.author

Xu, Zhenghua

dc.contributor.author

Xu, Wenting

dc.contributor.author

Chen, Junyang

dc.contributor.author

Lukasiewicz, Thomas

dc.date.accessioned

2024-01-23T10:32:37Z

dc.date.available

2024-01-23T10:32:37Z

dc.date.issued

2023

dc.identifier.citation

Wang, R., Wang, X., Xu, Z., Xu, W., Chen, J., & Lukasiewicz, T. (2023). MvCo-DoT: Multi-View Contrastive Domain Transfer Network for Medical Report Generation. In ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). 2023 International Conference on Acoustics, Speech, and Signal Processing, Rhodes, Greece. IEEE. https://doi.org/10.1109/ICASSP49357.2023.10095254

dc.identifier.uri

http://hdl.handle.net/20.500.12708/192514

dc.description.abstract

In clinical scenarios, multiple medical images with different views are usually generated at the same time, and they have high semantic consistency. However, existing medical report generation methods cannot exploit the rich multi-view mutual information of medical images. Therefore, in this work, we propose the first multi-view medical report generation model, called MvCo-DoT. Specifically, MvCo-DoT first proposes a multi-view contrastive learning (MvCo) strategy to help the deep reinforcement learning based model utilize the consistency of multi-view inputs for better model learning. Then, to close the performance gap between multi-view and single-view inputs, a domain transfer network is further proposed so that MvCo-DoT achieves almost the same performance with single-view inputs as with multi-view inputs. Extensive experiments on the public IU X-Ray dataset show that MvCo-DoT outperforms the SOTA medical report generation baselines in all metrics.

dc.language.iso

dc.subject

medical report generation

dc.subject

multi-view medical report generation

dc.title

MvCo-DoT: Multi-View Contrastive Domain Transfer Network for Medical Report Generation

dc.type

Inproceedings

dc.type

Konferenzbeitrag

dc.relation.publication

ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

dc.contributor.affiliation

Hebei University of Technology, China

dc.contributor.affiliation

Hebei University of Technology, China

dc.contributor.affiliation

Hebei University of Technology, China

dc.contributor.affiliation

Hebei University of Technology, China

dc.contributor.affiliation

Shenzhen University, China

dc.relation.isbn

978-1-7281-6327-7

dc.relation.doi

10.1109/ICASSP49357.2023

dc.relation.issn

1520-6149

dc.type.category

Full-Paper Contribution

dc.relation.eissn

2379-190X

tuw.booktitle

ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

tuw.peerreviewed

true

tuw.relation.publisher

IEEE

tuw.relation.publisherplace

Piscataway

tuw.researchTopic.id

tuw.researchTopic.name

Information Systems Engineering

tuw.researchTopic.value

100

tuw.publication.orgunit

E192-07 - Forschungsbereich Artificial Intelligence Techniques

tuw.publication.orgunit

E192-03 - Forschungsbereich Knowledge Based Systems

tuw.publisher.doi

10.1109/ICASSP49357.2023.10095254

dc.description.numberOfPages

tuw.event.name

2023 International Conference on Acoustics, Speech, and Signal Processing

tuw.event.startdate

04-06-2023

tuw.event.enddate

10-06-2023

tuw.event.online

On Site

tuw.event.type

Event for scientific audience

tuw.event.place

Rhodes

tuw.event.country

tuw.event.presenter

Wang, Ruizhi

wb.sciencebranch

Informatik

wb.sciencebranch

Mathematik

wb.sciencebranch.oefos

1020

wb.sciencebranch.oefos

1010

wb.sciencebranch.value

item.languageiso639-1

item.openairetype

conference paper

item.grantfulltext

none

item.fulltext

no Fulltext

item.cerifentitytype

Publications

item.openairecristype

http://purl.org/coar/resource_type/c_5794

crisitem.author.dept

E192-07 - Forschungsbereich Artificial Intelligence Techniques

crisitem.author.dept

Hebei University of Technology

crisitem.author.dept

Hebei University of Technology

crisitem.author.dept

Hebei University of Technology

crisitem.author.dept

Shenzhen University

crisitem.author.dept

E192-07 - Forschungsbereich Artificial Intelligence Techniques

crisitem.author.parentorg

E192 - Institut für Logic and Computation

crisitem.author.parentorg

E192 - Institut für Logic and Computation

Appears in Collections:

Conference Paper

Page view(s)

165

checked on Jan 23, 2024
