Maximizing Data Efficiency of HTR Models by Synthetic Text

Muth, Markus; Peer, Marco; Kleber, Florian; Sablatnig, Robert

doi:10.1007/978-3-031-70442-0_18

DC Field

Value

Language

dc.contributor.author

Muth, Markus

dc.contributor.author

Peer, Marco

dc.contributor.author

Kleber, Florian

dc.contributor.author

Sablatnig, Robert

dc.date.accessioned

2024-10-28T16:55:22Z

dc.date.available

2024-10-28T16:55:22Z

dc.date.issued

2024

dc.identifier.citation

Muth, M., Peer, M., Kleber, F., & Sablatnig, R. (2024). Maximizing Data Efficiency of HTR Models by Synthetic Text. In Document Analysis Systems (pp. 295–311). Springer, Cham. https://doi.org/10.1007/978-3-031-70442-0_18

dc.identifier.uri

http://hdl.handle.net/20.500.12708/203694

dc.description.abstract

The usability of synthetic handwritten text for improving machine learning models is assessed for the domain of Handwritten Text Recognition (HTR). Synthetic handwritten text is generated using an existing model based on a generative adversarial network (GAN). The output of this model is used to train a state-of-the-art HTR model, which is then applied to recognize real datasets. Training on synthetic data alone results in a character error rate (CER) of 28.3% and a word error rate (WER) of 65.5% for line images of the IAM dataset, more than three times higher than the state-of-the-art result. However, our experiments show that the amount of real data in a mixed training set can be reduced significantly (70–80%) while achieving CER and WER comparable to those obtained with real data. Using only 10% of the training data (113 images) from the CVL dataset results in a CER of 54.5% and a WER of 88.8%, whereas pre-training the model with synthetic data lowers these to a CER of 14.6% and a WER of 43.4%.

dc.language.iso

dc.relation.ispartofseries

Lecture Notes in Computer Science

dc.subject

Handwritten Text Recognition

dc.subject

Synthetic Data

dc.subject

Synthetic Text

dc.title

Maximizing Data Efficiency of HTR Models by Synthetic Text

dc.type

Inproceedings

dc.type

Conference contribution

dc.relation.isbn

978-3-031-70442-0

dc.description.startpage

295

dc.description.endpage

311

dc.type.category

Full-Paper Contribution

tuw.booktitle

Document Analysis Systems

tuw.container.volume

14994

tuw.peerreviewed

true

tuw.relation.publisher

Springer, Cham

tuw.researchTopic.id

tuw.researchTopic.name

Visual Computing and Human-Centered Technology

tuw.researchTopic.value

100

tuw.publication.orgunit

E193-01 - Forschungsbereich Computer Vision

tuw.publisher.doi

10.1007/978-3-031-70442-0_18

dc.description.numberOfPages

tuw.author.orcid

0000-0001-6843-0830

tuw.author.orcid

0000-0001-8351-5066

tuw.author.orcid

0000-0003-4195-1593

tuw.event.name

16th IAPR International Workshop on Document Analysis Systems (DAS 2024)

tuw.event.startdate

30-08-2024

tuw.event.enddate

31-08-2024

tuw.event.online

On Site

tuw.event.type

Event for scientific audience

tuw.event.place

Athens

tuw.event.country

tuw.event.presenter

Kleber, Florian

wb.sciencebranch

Computer Science

wb.sciencebranch

Mathematics

wb.sciencebranch.oefos

1020

wb.sciencebranch.oefos

1010

wb.sciencebranch.value

item.languageiso639-1

item.openairetype

conference paper

item.grantfulltext

none

item.fulltext

no Fulltext

item.cerifentitytype

Publications

item.openairecristype

http://purl.org/coar/resource_type/c_5794

crisitem.author.dept

E193-01 - Forschungsbereich Computer Vision

crisitem.author.dept

E193-01 - Forschungsbereich Computer Vision

crisitem.author.dept

E193 - Institut für Visual Computing and Human-Centered Technology

crisitem.author.orcid

0000-0001-6843-0830

crisitem.author.orcid

0000-0001-8351-5066

crisitem.author.orcid

0000-0003-4195-1593

crisitem.author.parentorg

E193 - Institut für Visual Computing and Human-Centered Technology

crisitem.author.parentorg

E193 - Institut für Visual Computing and Human-Centered Technology

crisitem.author.parentorg

E180 - Fakultät für Informatik

Appears in Collections:

Conference Paper
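
The abstract reports its results as character error rate (CER) and word error rate (WER). The record does not state how the authors computed them, so the following is only a minimal sketch of the common definition: Levenshtein edit distance between reference and recognized text, divided by the reference length, over characters for CER and over whitespace-separated words for WER. The function names `edit_distance`, `cer`, and `wer` are illustrative and not taken from the paper.

```python
from typing import Sequence


def edit_distance(ref: Sequence, hyp: Sequence) -> int:
    """Levenshtein distance: insertions, deletions, substitutions cost 1 each."""
    prev = list(range(len(hyp) + 1))  # distances from the empty reference prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to the empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # delete r from the reference
                            curr[j - 1] + 1,      # insert h into the reference
                            prev[j - 1] + cost))  # substitute, or match if equal
        prev = curr
    return prev[-1]


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate, normalized by the reference length (assumed convention)."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


def wer(reference: str, hypothesis: str) -> float:
    """Word error rate over whitespace-separated tokens (assumed tokenization)."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)
```

For example, `wer("the quick brown fox", "the quick brwn fox")` counts one wrong word out of four, while `cer` on the same pair counts one missing character out of nineteen. This also illustrates why a WER is typically much higher than the CER for the same output, as in the figures quoted in the abstract.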
