<div class="csl-bib-body">
<div class="csl-entry">Mucha, W., Cuconasu, F., Etori, N. A., Kalokyri, V., & Trappolini, G. (2024). TEXT2TASTE: A Versatile Egocentric Vision System for Intelligent Reading Assistance Using Large Language Model. In <i>Computers Helping People with Special Needs</i> (pp. 285–291). https://doi.org/10.1007/978-3-031-62849-8_35</div>
</div>
-
dc.identifier.uri
http://hdl.handle.net/20.500.12708/204357
-
dc.description.abstract
The ability to read, understand and find important information from written text is a critical skill in our daily lives for our independence, comfort and safety. However, a significant part of our society is affected by partial vision impairment, which leads to discomfort and dependency in daily activities. To address the limitations of this part of society, we propose an intelligent reading assistant based on smart glasses with embedded RGB cameras and a Large Language Model (LLM), whose functionality goes beyond corrective lenses. The video recorded from the egocentric perspective of a person wearing the glasses is processed to localise text information using object detection and optical character recognition methods. The LLM processes the data and allows the user to interact with the text and responds to a given query, thus extending the functionality of corrective lenses with the ability to find and summarize knowledge from the text. To evaluate our method, we create a chat-based application that allows the user to interact with the system. The evaluation is conducted in a real-world setting, such as reading menus in a restaurant, and involves four participants. The results show robust accuracy in text retrieval. The system not only provides accurate meal suggestions but also achieves high user satisfaction, highlighting the potential of smart glasses and LLMs in assisting people with special needs.
en
dc.description.sponsorship
European Commission
-
dc.language.iso
en
-
dc.relation.ispartofseries
Lecture Notes in Computer Science
-
dc.subject
AAL
en
dc.subject
Assistive Technology (AT)
en
dc.subject
egocentric vision
en
dc.subject
LLM
en
dc.subject
reading assistance
en
dc.title
TEXT2TASTE: A Versatile Egocentric Vision System for Intelligent Reading Assistance Using Large Language Model
en
dc.type
Inproceedings
en
dc.type
Konferenzbeitrag
de
dc.relation.isbn
978-3-031-62849-8
-
dc.description.startpage
285
-
dc.description.endpage
291
-
dc.relation.grantno
861091
-
dc.type.category
Full-Paper Contribution
-
tuw.booktitle
Computers Helping People with Special Needs
-
tuw.container.volume
14751
-
tuw.peerreviewed
true
-
tuw.project.title
Privacy-Aware and Acceptable Video-Based Technologies and Services for Active and Assisted Living
-
tuw.researchTopic.id
I5
-
tuw.researchTopic.name
Visual Computing and Human-Centered Technology
-
tuw.researchTopic.value
100
-
tuw.publication.orgunit
E193-01 - Forschungsbereich Computer Vision
-
tuw.publisher.doi
10.1007/978-3-031-62849-8_35
-
dc.description.numberOfPages
7
-
tuw.author.orcid
0000-0002-6048-3425
-
tuw.author.orcid
0009-0008-9768-1047
-
tuw.author.orcid
0000-0001-7772-1103
-
tuw.author.orcid
0000-0002-5245-8238
-
tuw.author.orcid
0000-0002-5515-634X
-
tuw.event.name
19th International Conference on Computers Helping People with Special Needs (ICCHP 2024)
en
tuw.event.startdate
08-07-2024
-
tuw.event.enddate
12-07-2024
-
tuw.event.online
Hybrid
-
tuw.event.type
Event for scientific audience
-
tuw.event.place
Linz
-
tuw.event.country
AT
-
tuw.event.presenter
Mucha, Wiktor
-
tuw.presentation.online
Online
-
wb.sciencebranch
Informatik
-
wb.sciencebranch
Mathematik
-
wb.sciencebranch.oefos
1020
-
wb.sciencebranch.oefos
1010
-
wb.sciencebranch.value
90
-
wb.sciencebranch.value
10
-
item.languageiso639-1
en
-
item.openairetype
conference paper
-
item.grantfulltext
restricted
-
item.fulltext
no Fulltext
-
item.cerifentitytype
Publications
-
item.openairecristype
http://purl.org/coar/resource_type/c_5794
-
crisitem.author.dept
E193-01 - Forschungsbereich Computer Vision
-
crisitem.author.orcid
0000-0002-6048-3425
-
crisitem.author.orcid
0009-0008-9768-1047
-
crisitem.author.orcid
0000-0001-7772-1103
-
crisitem.author.orcid
0000-0002-5245-8238
-
crisitem.author.orcid
0000-0002-5515-634X
-
crisitem.author.parentorg
E193 - Institut für Visual Computing and Human-Centered Technology