Multi-scale feature enhancement in multi-task learning for medical image analysis

  • Sungkyunkwan University

Research output: Contribution to journal › Article › peer-review

Abstract

Traditional deep learning approaches in medical image analysis usually focus on either segmentation or classification, which limits their ability to exploit information shared between these interrelated tasks. Recent multi-task learning (MTL) methods aim to address this limitation by combining both tasks within a single model through shared feature representations. However, existing MTL models often fall short of optimal performance on both tasks, as they struggle to capture simultaneously the local contextual information essential for segmentation and the global context needed for classification. In this paper, we propose a simple yet effective UNet-based MTL model in which features extracted by the encoder are used to predict classification labels, while the decoder produces the segmentation mask. The model uses an encoder built around a novel ResFormer block that integrates local context from convolutional feature extraction with long-range dependencies modeled by a Transformer. This design captures both broad contextual relationships and fine-grained details, improving classification and segmentation accuracy. To enhance classification performance, multi-scale features from different encoder levels are combined to exploit the hierarchical representation of the input image. For segmentation, the features passed to the decoder via skip connections are refined using a novel dilated feature enhancement (DFE) module, which captures information at different scales through three parallel convolution branches with varying dilation rates. This allows the decoder to detect lesions of varying sizes with greater accuracy. Experimental results across multiple medical datasets confirm the superior performance of our model in both segmentation and classification tasks, compared to state-of-the-art single-task and multi-task learning methods.
These findings highlight the potential of our approach to advance disease diagnosis and treatment through improved medical image analysis. The code will be available at https://github.com/nguyenpbui/ResFormer.
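
The idea behind the DFE module, three parallel convolution branches with different dilation rates, can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the dilation rates (1, 2, 3), the 3x3 kernel size, the averaging of branches, and the residual addition are all assumptions made for illustration, and the function names are hypothetical. The sketch only shows how a larger dilation rate widens the receptive field of a fixed-size kernel, which is what lets parallel branches respond to structures of different sizes.

```python
def dilated_conv2d(image, kernel, dilation):
    """'Same'-size 2D convolution with zero padding and a given dilation rate."""
    h, w = len(image), len(image[0])
    k = len(kernel)      # square kernel with odd side length
    r = k // 2
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for i in range(k):
                for j in range(k):
                    # Kernel taps are spaced `dilation` pixels apart.
                    yy = y + (i - r) * dilation
                    xx = x + (j - r) * dilation
                    if 0 <= yy < h and 0 <= xx < w:
                        acc += kernel[i][j] * image[yy][xx]
            out[y][x] = acc
    return out


def dfe_sketch(feature, kernels, dilations=(1, 2, 3)):
    """Hypothetical fusion: average the parallel branches, then add the input.

    The dilation rates and the fusion rule are assumptions, not taken
    from the paper.
    """
    branches = [dilated_conv2d(feature, k, d) for k, d in zip(kernels, dilations)]
    h, w = len(feature), len(feature[0])
    return [
        [feature[y][x] + sum(b[y][x] for b in branches) / len(branches)
         for x in range(w)]
        for y in range(h)
    ]


def receptive_field(kernel_size, dilation):
    """Side length of the input window seen by one dilated kernel."""
    return dilation * (kernel_size - 1) + 1


# Usage: a single bright pixel in a 7x7 map, with all-ones 3x3 kernels.
feature = [[0.0] * 7 for _ in range(7)]
feature[3][3] = 1.0
ones = [[1.0] * 3 for _ in range(3)]
enhanced = dfe_sketch(feature, [ones, ones, ones])
```

With these assumed rates, the three branches cover 3x3, 5x5, and 7x7 windows while each still uses only nine weights, so the fused output responds to the bright pixel from one, two, and three pixels away.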

Original language: English
Article number: 103338
Journal: Artificial Intelligence in Medicine
Volume: 173
State: Published - Mar 2026

Keywords

  • Attention mechanism
  • Convolutional neural networks
  • Dilated blocks
  • Medical image classification
  • Medical image segmentation
  • Multi-task learning
  • Transformer
