Enhancing breast cancer diagnosis : a mammogram retrieval system and ground truth application

Roriz, Cátia Inês Melo

http://hdl.handle.net/10400.26/51441

Use this identifier to reference this record.

Name:	Description:	Size:	Format:
Cátia Inês Melo_Roriz.pdf		8.75 MB	Adobe PDF

Authors

Roriz, Cátia Inês Melo

Advisor(s)

Domingues, Inês Campos Monteiro Sabino

Abstract(s)

Breast cancer is a significant global health concern, affecting thousands of individuals, primarily women, and the number of cases is expected to climb by 2040. Early-stage diagnosis is essential for effective treatment and better patient outcomes. This dissertation presents a mammogram retrieval system based on the aggregation of image classifiers to aid specialists in diagnosing breast cancer. The system uses a retrieval model that combines the output of multiple classifiers, each targeting a different dimension related to breast cancer diagnosis: breast density, asymmetries, BIRADS classification, calcifications, distortions, laterality, masses, and image incidence. This dissertation also describes the creation of an application to collect ground truth data to aid engineers in the development of a mammography retrieval system. The application is built on OutSystems, a low-code application platform. Its key features include allowing experts to view probe images and associate them with relevant images from the database, and filtering images by the eight mammogram dimensions. While the ultimate goal is to create a system for medical specialists, the current platform represents a step in that process, facilitating the acquisition of ground truth. The individual models, one per dimension, reach an average accuracy of around 99.3% on the training set and around 78% on the test set. Four approaches were then developed for the final retrieval model: one assigning equal weights to every dimension, another with empirically defined weights, a third with weights defined according to the literature, and a final one with weights defined by a specialist.
The quantitative results of the final retrieval model under the four approaches are reported as the similarity between the probe image and the most similar retrieved image (the first image in the top-5). The similarity is obtained by combining the individual models in a weighted sum. The first approach scored 0.319, the second 0.191, the third 0.197, and the last 0.292.
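
The weighted-sum aggregation described above can be illustrated with a minimal Python sketch. This is not the dissertation's implementation. The per-dimension similarity (one minus the total variation distance between class-probability vectors), the normalization by the sum of the weights, and all names (`DIMENSIONS`, `dimension_similarity`, `weighted_similarity`, `top_k`, `equal_weights`) are assumptions made for illustration only.

```python
from typing import Dict, List, Tuple

# The eight dimensions named in the abstract (identifier spellings are hypothetical).
DIMENSIONS = [
    "density", "asymmetry", "birads", "calcification",
    "distortion", "laterality", "mass", "incidence",
]

# One classifier output per dimension: a list of class probabilities.
Prediction = Dict[str, List[float]]


def dimension_similarity(p: List[float], q: List[float]) -> float:
    """Assumed similarity: 1 minus the total variation distance, in [0, 1]."""
    return 1.0 - 0.5 * sum(abs(a - b) for a, b in zip(p, q))


def weighted_similarity(probe: Prediction, candidate: Prediction,
                        weights: Dict[str, float]) -> float:
    """Weighted sum of per-dimension similarities, normalized by total weight."""
    total = sum(weights[d] for d in DIMENSIONS)
    score = sum(
        weights[d] * dimension_similarity(probe[d], candidate[d])
        for d in DIMENSIONS
    )
    return score / total


def top_k(probe: Prediction, database: Dict[str, Prediction],
          weights: Dict[str, float], k: int = 5) -> List[Tuple[str, float]]:
    """Return the k database images most similar to the probe, best first."""
    scored = [
        (image_id, weighted_similarity(probe, pred, weights))
        for image_id, pred in database.items()
    ]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:k]


# First approach from the abstract: equal weights for every dimension.
equal_weights = {d: 1.0 for d in DIMENSIONS}
```

Under these assumptions, the other three approaches (empirical, literature-based, and specialist-defined weights) would differ only in the `weights` dictionary passed to `top_k`. The actual weight values are not given in the abstract and are not reproduced here.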