Please use this identifier to cite or link to this item: http://artemis.cslab.ece.ntua.gr:8080/jspui/handle/123456789/19577
Title: Knowledge Transfer from Large Vision-Language Models for Localization and Segmentation in 2D Medical Imaging
Authors: Τριανταφύλλης, Γεώργιος
Βουλόδημος, Αθανάσιος
Keywords: Medical Imaging, Grounded Segmentation, Large Vision-Language Models, Fine-Tuning, GroundingDINO, SAM2, MedSAM2, MRI, CT
Issue Date: 26-Mar-2025
Abstract: Grounded segmentation of medical images is a challenging task requiring expert-annotated datasets, which are scarce. To address this problem, we employ Large Vision-Language Models (LVLMs) as well as deterministic algorithms to generate the missing textual descriptions for organ masks. For the grounded segmentation task, a pipeline is developed consisting of GroundingDINO and SAM2 or MedSAM2, with only GroundingDINO being fine-tuned. The dataset used for this study is RAOS, which includes CT scans and synthetic MRI images. Our experiments assess the accuracy of LLaVA-Med’s responses and the performance of the proposed fine-tuned pipeline under various prompting strategies on both in-distribution and out-of-distribution images. The results indicate that LLaVA-Med alone cannot reliably generate the textual descriptions due to its limited reasoning ability. Additionally, our results show that the proposed pipeline performs well within the closed setting in which it was applied, while acknowledging its inherent limitations.
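The two-stage pipeline described in the abstract (a fine-tuned GroundingDINO detector prompted with text, feeding box prompts into a frozen SAM2/MedSAM2 segmenter) can be illustrated with the sketch below. This is a structural outline only: `detect_regions` and `segment_box` are hypothetical stand-ins for the real model calls, whose actual APIs differ, and the dummy box/mask logic exists solely to make the data flow concrete.

```python
from dataclasses import dataclass
from typing import List
import numpy as np

@dataclass
class Box:
    x0: int
    y0: int
    x1: int
    y1: int

def detect_regions(image: np.ndarray, prompt: str) -> List[Box]:
    """Stand-in for the fine-tuned GroundingDINO stage: maps a textual
    organ description (e.g. "liver") to candidate bounding boxes.
    Here it just returns one dummy box in the image center."""
    h, w = image.shape[:2]
    return [Box(w // 4, h // 4, 3 * w // 4, 3 * h // 4)]

def segment_box(image: np.ndarray, box: Box) -> np.ndarray:
    """Stand-in for the frozen SAM2 / MedSAM2 stage: converts a box
    prompt into a binary segmentation mask. Here the "mask" is simply
    the box interior."""
    mask = np.zeros(image.shape[:2], dtype=bool)
    mask[box.y0:box.y1, box.x0:box.x1] = True
    return mask

def grounded_segmentation(image: np.ndarray, prompt: str) -> List[np.ndarray]:
    """Text prompt -> boxes -> one mask per box."""
    return [segment_box(image, b) for b in detect_regions(image, prompt)]

# Stand-in for a single 2D CT/MRI slice.
scan = np.zeros((256, 256), dtype=np.float32)
masks = grounded_segmentation(scan, "liver")
print(len(masks), int(masks[0].sum()))  # → 1 16384 (one 128x128 dummy mask)
```

Only the detector is fine-tuned in the thesis pipeline; the segmenter consumes its boxes unchanged, which is why the two stages are kept as separate functions here.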
URI: http://artemis.cslab.ece.ntua.gr:8080/jspui/handle/123456789/19577
Appears in Collections:Διπλωματικές Εργασίες - Theses

Files in This Item:
File                      Size     Format
DIPLOMATIKI_english.pdf   7.68 MB  Adobe PDF


Items in Artemis are protected by copyright, with all rights reserved, unless otherwise indicated.