Publication
Empowering deaf-hearing communication: exploring synergies between predictive and generative AI-based strategies towards (Portuguese) Sign Language interpretation
dc.contributor.author | Adão, Telmo | |
dc.contributor.author | Oliveira, João | |
dc.contributor.author | Shahrabadi, Somayeh | |
dc.contributor.author | Jesus, Hugo | |
dc.contributor.author | Fernandes, Marco | |
dc.contributor.author | Costa, Ângelo | |
dc.contributor.author | Ferreira, Vânia | |
dc.contributor.author | Gonçalves, Martinho Fradeira | |
dc.contributor.author | Guevara Lopez, Miguel Angel | |
dc.contributor.author | Peres, Emanuel | |
dc.date.accessioned | 2023-11-03T16:10:23Z | |
dc.date.available | 2023-11-03T16:10:23Z | |
dc.date.issued | 2023 | |
dc.description.abstract | Communication between Deaf and hearing individuals remains a persistent challenge requiring attention to foster inclusivity. Despite notable efforts in the development of digital solutions for sign language recognition (SLR), several issues persist, such as cross-platform interoperability and strategies for tokenizing signs to enable continuous conversations and coherent sentence construction. To address such issues, this paper proposes a non-invasive Portuguese Sign Language (Língua Gestual Portuguesa or LGP) interpretation system-as-a-service, leveraging skeletal posture sequence inference powered by long short-term memory (LSTM) architectures. To address the scarcity of examples during machine learning (ML) model training, dataset augmentation strategies are explored. Additionally, a buffer-based interaction technique is introduced to facilitate LGP term tokenization. This technique provides real-time feedback to users, allowing them to gauge the time remaining to complete a sign, which aids in the construction of grammatically coherent sentences based on inferred terms/words. To support human-like conditioning rules for interpretation, a large language model (LLM) service is integrated. Experiments reveal that LSTM-based neural networks, trained with 50 LGP terms and subjected to data augmentation, achieved accuracy levels ranging from 80% to 95.6%. Users unanimously reported that the buffer-based interaction strategy for term/word tokenization was highly intuitive. Furthermore, tests with an LLM (specifically ChatGPT) demonstrated promising semantic correlation rates in generated sentences, comparable to expected sentences. | pt_PT |
dc.description.version | info:eu-repo/semantics/publishedVersion | pt_PT |
dc.identifier.citation | Adão, T., Oliveira, J., Shahrabadi, S., Jesus, H., Fernandes, M., Costa, Â., Ferreira, V., et al. (2023). Empowering Deaf-Hearing Communication: Exploring Synergies between Predictive and Generative AI-Based Strategies towards (Portuguese) Sign Language Interpretation. Journal of Imaging, 9(11), 235. https://doi.org/10.3390/jimaging9110235 | pt_PT |
dc.identifier.doi | https://doi.org/10.3390/jimaging9110235 | pt_PT |
dc.identifier.issn | 2313-433X | |
dc.identifier.uri | http://hdl.handle.net/10400.26/47818 | |
dc.language.iso | eng | pt_PT |
dc.peerreviewed | yes | pt_PT |
dc.relation.publisherversion | https://www.mdpi.com/2313-433X/9/11/235 | pt_PT |
dc.title | Empowering deaf-hearing communication: exploring synergies between predictive and generative AI-based strategies towards (Portuguese) Sign Language interpretation | pt_PT |
dc.type | journal article | |
dspace.entity.type | Publication | |
person.familyName | GUEVARA LÓPEZ | |
person.givenName | MIGUEL ANGEL | |
person.identifier | A-3126-2011 | |
person.identifier.ciencia-id | 8910-E298-D967 | |
person.identifier.orcid | 0000-0001-7814-1653 | |
person.identifier.scopus-author-id | 36999281000 | |
rcaap.rights | openAccess | pt_PT |
rcaap.type | article | pt_PT |
relation.isAuthorOfPublication | 38c91a9b-1db6-4515-9462-b0a031edc325 | |
relation.isAuthorOfPublication.latestForDiscovery | 38c91a9b-1db6-4515-9462-b0a031edc325 |