Capturing the narrative : deep learning models for comics sequences

Marouvo, Gonçalo Ventura Lourenço

Publication

2025-02-03 Master's dissertation

dc.contributor.advisor	Pereira, Francisco José Batista
dc.contributor.author	Marouvo, Gonçalo Ventura Lourenço
dc.date.accessioned	2025-03-17T16:30:39Z
dc.date.available	2025-03-17T16:30:39Z
dc.date.issued	2025-02-03
dc.description.abstract	Comics represent the complex way humans can communicate and expose ideas, which poses additional challenges for image-to-text deep learning models. In this project, we investigate how multimodal deep learning architectures perform in describing a comics vignette. We investigate how current state-of-the-art models (GIT and BLIP-2) are able to describe the narrative in 4-image comics sequences from a dataset we created. We find that some prompting can produce acceptable results. We also assess how to propagate information across a sequence's images by adding to each prompt the previous outputs for the images from the same sequence. The results show limited improvements from this strategy. While the overall meaning of the predicted descriptions is close to the semantic space of the real descriptions, they are still far from human-level descriptions. Therefore, we propose several future experiments, among which we highlight using reinforcement learning to train a large language model as a policy function for prompt generation.	pt_PT
dc.identifier.tid	203894898	pt_PT
dc.identifier.uri	http://hdl.handle.net/10400.26/57302
dc.language.iso	eng	pt_PT
dc.subject	Comics
dc.subject	Computer vision
dc.subject	Image captioning
dc.subject	Multimodal Deep Learning Models
dc.subject	Prompt engineering
dc.title	Capturing the narrative : deep learning models for comics sequences	pt_PT
dc.type	master thesis
dspace.entity.type	Publication
rcaap.rights	openAccess	pt_PT
rcaap.type	masterThesis	pt_PT

Files

Main

Name: Goncalo-Ventura-Lourenco-Marouvo.pdf
Size: 8.93 MB
Format: Adobe Portable Document Format

License

Name: license.txt
Size: 1.85 KB
Format: Item-specific license agreed upon to submission

Collections

ISEC - Trabalhos de Projeto | Relatórios de Estágio | Projetos de Investigação