Latent diffusion for generative visual attribution in medical image diagnostics

Masters thesis

Siddiqui, A. 2023. Latent diffusion for generative visual attribution in medical image diagnostics. Masters thesis. Middlesex University, Computer Science.
Type: Masters thesis
Title: Latent diffusion for generative visual attribution in medical image diagnostics
Authors: Siddiqui, A.

Visual attribution in medical imaging seeks to make evident the diagnostically relevant components of a medical image, in contrast to the more common detection of diseased tissue deployed in conventional machine vision pipelines (being learned models, the latter are typically not easily interpretable or explainable to clinicians). State-of-the-art techniques in visual attribution generally consist of variants of deep neural networks implemented as classifiers or segmenters. However, they have not thus far included an explicit linguistic component.

We here present a novel generative visual attribution technique that leverages latent diffusion models in combination with domain-specific large language models to generate normal counterparts of abnormal images. The discrepancy between an abnormal image and its generated normal counterpart then yields a map indicating the diagnostically relevant image components. To achieve this, we deploy image priors in conjunction with appropriate conditioning mechanisms to control the image generation process, including natural language text prompts acquired from medical science and applied radiology. We perform experiments on the COVID-19 Radiography Database, which contains labelled chest X-rays with differing pathologies, and quantitatively evaluate our results using the Fréchet Inception Distance (FID), Structural Similarity (SSIM) and Multi-Scale Structural Similarity (MS-SSIM) metrics computed between real and generated images.
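As a rough illustration of the discrepancy step described above, the following minimal sketch computes a per-pixel attribution map from an abnormal image and a normal counterpart. It is not the thesis implementation: the function names `attribution_map` and `normalised_magnitude` are hypothetical, the images are assumed to be already aligned grayscale arrays of equal shape, and the counterpart is assumed to have been produced elsewhere by the generative model.

```python
import numpy as np


def attribution_map(abnormal, normal_counterpart):
    """Signed per-pixel discrepancy between an abnormal image and its
    generated normal counterpart (hypothetical helper, not from the thesis)."""
    abnormal = np.asarray(abnormal, dtype=np.float64)
    normal_counterpart = np.asarray(normal_counterpart, dtype=np.float64)
    if abnormal.shape != normal_counterpart.shape:
        raise ValueError("images must have the same shape")
    # Positive values: brighter in the abnormal image; negative: darker.
    return abnormal - normal_counterpart


def normalised_magnitude(diff):
    """Absolute discrepancy rescaled to [0, 1] for display as a heat map."""
    magnitude = np.abs(diff)
    peak = magnitude.max()
    # Avoid division by zero when the two images are identical.
    return magnitude / peak if peak > 0 else magnitude


# Toy example: a single bright "lesion" pixel absent from the counterpart.
abnormal = np.zeros((4, 4))
abnormal[1, 2] = 0.8
normal = np.zeros((4, 4))

diff = attribution_map(abnormal, normal)
heat = normalised_magnitude(diff)
```

In this toy case the only non-zero entry of the map sits at the altered pixel; on real images the map would typically need thresholding or smoothing before being shown to a clinician.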

The resulting system also exhibits a range of latent capabilities, including super-resolution and zero-shot localized disease induction, which are evaluated with real examples from the CheXpert dataset.

Keywords: Visual Attribution; Explainable AI; Diffusion models; Medical imaging
Sustainable Development Goals: 9 Industry, innovation and infrastructure; 3 Good health and well-being
Middlesex University Theme: Health & Wellbeing
Department name: Computer Science; Science and Technology
Institution name: Middlesex University
Publisher: Middlesex University Research Repository
Publication dates: Online 28 Mar 2024
Publication process dates: Accepted 18 Sep 2023; Deposited 28 Mar 2024
Output status: Published
Accepted author manuscript
File Access Level

Restricted files

Accepted author manuscript
