<div class="csl-bib-body">
<div class="csl-entry">Scheidl, H., Fiel, S., & Sablatnig, R. (2018). Word Beam Search: A Connectionist Temporal Classification Decoding Algorithm. In <i>16th International Conference on Frontiers in Handwriting Recognition (ICFHR 2018)</i>. https://doi.org/10.1109/ICFHR-2018.2018.00052</div>
</div>
Recurrent Neural Networks (RNNs) are used for sequence recognition tasks such as Handwritten Text Recognition (HTR) or speech recognition. If trained with the Connectionist Temporal Classification (CTC) loss function, the output of such an RNN is a matrix containing character probabilities for each time-step. A CTC decoding algorithm maps these character probabilities to the final text. Token passing is such an algorithm and is able to constrain the recognized text to a sequence of dictionary words. However, the running time of token passing depends quadratically on the dictionary size and it is not able to decode arbitrary character strings like numbers. This paper proposes word beam search decoding, which is able to tackle these problems. It constrains words to those contained in a dictionary, allows arbitrary non-word character strings between words, optionally integrates a word-level language model and has a better running time than token passing. The proposed algorithm outperforms best path decoding, vanilla beam search decoding and token passing on the IAM and Bentham HTR datasets. An open-source implementation is provided.
en
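To illustrate what "mapping character probabilities to the final text" means, the following is a minimal sketch of best path decoding, the simplest baseline named in the abstract. It is not the word beam search algorithm proposed in the paper and is not taken from the authors' implementation. The function name `best_path_decode`, the parameter names, and the convention that the CTC blank occupies the last column of the matrix are assumptions made for this example.

```python
def best_path_decode(mat, chars, blank_index=None):
    """Decode a CTC output matrix with best path (greedy) decoding.

    mat:         list of T rows, each holding C probabilities (one per label).
    chars:       string with the C-1 real characters, in column order.
    blank_index: column of the CTC blank; assumed to be the last column
                 when not given (an assumption of this sketch).
    """
    if not mat:
        return ""
    if blank_index is None:
        blank_index = len(mat[0]) - 1

    decoded = []
    previous = None
    for row in mat:
        # Step 1: pick the most probable label at this time-step.
        best = max(range(len(row)), key=row.__getitem__)
        # Step 2: collapse repeated labels, then drop blanks.
        if best != previous and best != blank_index:
            decoded.append(chars[best])
        previous = best
    return "".join(decoded)


# Toy example: columns are 'a', 'b', blank; five time-steps.
toy_matrix = [
    [0.8, 0.1, 0.1],  # a
    [0.7, 0.2, 0.1],  # a (repeat, collapsed)
    [0.1, 0.1, 0.8],  # blank (separates the two 'a's)
    [0.6, 0.3, 0.1],  # a
    [0.2, 0.7, 0.1],  # b
]
print(best_path_decode(toy_matrix, "ab"))
```

Best path decoding considers each time-step independently and uses neither a dictionary nor a language model, which is the gap that dictionary-constrained decoders such as token passing and word beam search aim to close.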
dc.description.sponsorship
European Union's Horizon 2020
-
dc.language
English
-
dc.language.iso
en
-
dc.rights.uri
http://rightsstatements.org/vocab/InC-EDU/1.0/
-
dc.subject
connectionist temporal classification
en
dc.subject
decoding
en
dc.subject
language model
en
dc.subject
recurrent neural network
en
dc.subject
speech recognition
en
dc.subject
handwritten text recognition
en
dc.title
Word Beam Search: A Connectionist Temporal Classification Decoding Algorithm
en
dc.type
Inproceedings
en
dc.type
Konferenzbeitrag
de
dc.rights.license
In Copyright - Educational Use Permitted
en
dc.rights.license
Urheberrechtsschutz - Nutzung zu Bildungszwecken erlaubt
de
dc.relation.publication
16th International Conference on Frontiers in Handwriting Recognition (ICFHR 2018)
-
dc.relation.isbn
9781538658758
-
dc.relation.grantno
674943
-
dc.rights.holder
2018 IEEE
-
dc.type.category
Full-Paper Contribution
-
tuw.relation.publisherplace
Niagara Falls, New York, USA
-
tuw.version
am
-
tuw.publication.orgunit
E193 - Institut für Visual Computing and Human-Centered Technology
-
tuw.publisher.doi
10.1109/ICFHR-2018.2018.00052
-
dc.identifier.libraryid
AC15148695
-
dc.identifier.urn
urn:nbn:at:at-ubtuw:3-3778
-
tuw.author.orcid
0000-0001-5033-6723
-
tuw.author.orcid
0000-0003-4195-1593
-
dc.rights.identifier
In Copyright - Educational Use Permitted
en
dc.rights.identifier
Urheberrechtsschutz - Nutzung zu Bildungszwecken erlaubt
de
item.openaccessfulltext
Open Access
-
item.openairecristype
http://purl.org/coar/resource_type/c_5794
-
item.grantfulltext
open
-
item.mimetype
application/pdf
-
item.languageiso639-1
en
-
item.openairetype
conference paper
-
item.fulltext
with Fulltext
-
item.cerifentitytype
Publications
-
crisitem.author.dept
E193-01 - Forschungsbereich Computer Vision
-
crisitem.author.dept
E193 - Institut für Visual Computing and Human-Centered Technology
-
crisitem.author.orcid
0000-0003-4195-1593
-
crisitem.author.parentorg
E193 - Institut für Visual Computing and Human-Centered Technology