D-MASTER: Mask Annealed Transformer for Unsupervised Domain Adaptation in Breast Cancer Detection from Mammograms

MICCAI 2024

¹Indian Institute of Technology Delhi, ²AIIMS Delhi

Fig. (a) and (b) depict false positive predictions by current teacher-student models in cross-domain BCDM. Red boxes indicate ground truth, yellow boxes show Adaptive Teacher predictions, and green boxes indicate predictions from D-MASTER. As shown in (c), our approach effectively mitigates the domain gap and makes accurate predictions.

Abstract

We focus on the problem of Unsupervised Domain Adaptation (UDA) for breast cancer detection from mammograms (BCDM). Recent advancements have shown that masked image modeling serves as a robust pretext task for UDA. However, when applied to cross-domain BCDM, these techniques struggle with breast abnormalities such as masses, asymmetries, and micro-calcifications, in part because the regions of interest are typically much smaller than those in natural images. This often results in more false positives per image (FPI) and significant noise in the pseudo-labels typically used to bootstrap such techniques. Recognizing these challenges, we introduce a transformer-based Domain-invariant Mask Annealed Student Teacher autoencoder (D-MASTER) framework. D-MASTER adaptively masks and reconstructs multiscale feature maps, enhancing the model’s ability to capture reliable target domain features. D-MASTER also includes adaptive confidence refinement to filter pseudo-labels, ensuring only high-quality detections are considered. We also provide a bounding box annotated subset of 1000 mammograms from the RSNA Breast Screening Dataset (referred to as RSNA-BSD1K) to support further research in BCDM. We evaluate D-MASTER on multiple BCDM datasets acquired from diverse domains. Experimental results show a significant improvement of 9% and 13% in sensitivity at 0.3 FPI over state-of-the-art UDA techniques on the publicly available benchmark INBreast and DDSM datasets, respectively. We also report an improvement of 11% and 17% on the In-house and RSNA-BSD1K datasets, respectively. To promote reproducible research and address the scarcity of accessible resources in BCDM, we will publicly release the source code and pre-trained D-MASTER model, along with the RSNA-BSD1K annotations.

D-MASTER Architecture

We introduce D-MASTER, a transformer-based Domain-invariant Mask Annealed Student Teacher Autoencoder Framework for cross-domain breast cancer detection from mammograms (BCDM), which integrates a novel mask-annealing technique and an adaptive confidence refinement module. Unlike pretraining with masked autoencoders (MAEs) [12], which leverages massive datasets for training followed by fine-tuning on smaller datasets, we present a novel learnable masking technique for the MAE branch that generates masks of different complexities, which are reconstructed by the DefDETR [44] encoder and decoder. Applied as a self-supervised task on target images, our approach enables the encoder to acquire domain-invariant features and learn better target representations.

Mask Annealing Algorithm

Mask Annealing Algorithm (left) and Adaptive Confidence Refinement (right). The flowchart depicts the gradual transition of the confidence threshold from soft to hard.
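The soft-to-hard transition of the confidence threshold can be illustrated with a minimal sketch. The page does not give the actual schedule, so the linear ramp, the endpoint values (0.5 and 0.9), and the function names `confidence_threshold` and `filter_pseudo_labels` are all assumptions made for illustration, not the authors' implementation.

```python
def confidence_threshold(step, total_steps, soft=0.5, hard=0.9):
    """Hypothetical linear ramp from a permissive (soft) threshold to a
    strict (hard) one. The endpoints and the linear shape are assumptions."""
    frac = min(max(step / total_steps, 0.0), 1.0)  # clamp progress to [0, 1]
    return soft + frac * (hard - soft)


def filter_pseudo_labels(detections, threshold):
    """Keep only teacher detections whose score reaches the threshold.
    Each detection is a dict with at least a 'score' key (assumed format)."""
    return [d for d in detections if d["score"] >= threshold]


# Example: the same teacher outputs, filtered early and late in training.
teacher_outputs = [
    {"box": (10, 10, 40, 40), "score": 0.95},
    {"box": (60, 60, 90, 90), "score": 0.70},
    {"box": (5, 80, 20, 95), "score": 0.55},
]
early = filter_pseudo_labels(teacher_outputs, confidence_threshold(0, 1000))
late = filter_pseudo_labels(teacher_outputs, confidence_threshold(1000, 1000))
```

With these assumed values, all three detections pass the soft threshold at the start of training, while only the highest-scoring one survives the hard threshold at the end.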

Quantitative Results on Mammogram Datasets

Evaluation Metric

We use Free-Response Receiver Operating Characteristic (FROC) curves [8] for reporting our results. The curves provide a graphical representation of sensitivity/recall values at different false positives per image (FPI). We follow related works in this area [27] and consider a prediction a true positive if the center of the predicted bounding box lies within the ground-truth box.
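The center-in-box criterion and a single operating point on the FROC curve can be sketched as follows. This is a minimal illustration, not the authors' evaluation code: the `(x1, y1, x2, y2)` box format, the input layout, the function names, and the choice to count each ground-truth lesion at most once are assumptions.

```python
def center_in_box(pred, gt):
    """True if the center of pred lies inside gt; boxes are (x1, y1, x2, y2)."""
    cx = (pred[0] + pred[2]) / 2
    cy = (pred[1] + pred[3]) / 2
    return gt[0] <= cx <= gt[2] and gt[1] <= cy <= gt[3]


def sensitivity_at_fpi(images, max_fpi):
    """Best sensitivity over all score thresholds whose false positives
    per image do not exceed max_fpi.
    images: list of dicts with 'preds' [(score, box)] and 'gts' [box]."""
    total_gt = sum(len(img["gts"]) for img in images)
    if total_gt == 0 or not images:
        return 0.0
    # Every distinct score is a candidate threshold.
    thresholds = sorted({s for img in images for s, _ in img["preds"]}, reverse=True)
    best = 0.0
    for t in thresholds:
        hits, false_pos = 0, 0
        for img in images:
            detected = set()  # each ground-truth box is counted at most once
            for score, box in img["preds"]:
                if score < t:
                    continue
                matched = [i for i, g in enumerate(img["gts"]) if center_in_box(box, g)]
                if matched:
                    detected.update(matched)
                else:
                    false_pos += 1
            hits += len(detected)
        if false_pos / len(images) <= max_fpi:
            best = max(best, hits / total_gt)
    return best


# Toy example: two images, three lesions in total.
images = [
    {"preds": [(0.9, (10, 10, 30, 30)), (0.4, (100, 100, 120, 120))],
     "gts": [(5, 5, 35, 35)]},
    {"preds": [(0.8, (50, 50, 70, 70)), (0.3, (0, 0, 10, 10))],
     "gts": [(40, 40, 80, 80), (200, 200, 220, 220)]},
]
sens = sensitivity_at_fpi(images, max_fpi=0.3)
```

In this toy example, two of the three lesions are detected without any false positive, so the sensitivity at 0.3 FPI is 2/3; lowering the threshold further only adds false positives.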

Table 1 shows the comparative results with other domain adaptation techniques, including those proposed for natural images. Fig. 4 compares the corresponding FROC curves with the nearest competitors only (to avoid clutter).

Quantitative Results

Qualitative result comparison on in-house, DDSM, and RSNA-BSD1K datasets. Red boxes show the ground truth, and blue boxes show the predictions.

Ablation Study

Ablation study to understand the impact of each proposed module for In-house to INBreast adaptation. “Source” denotes the source-only trained model, “Baseline” the basic teacher-student architecture, “MA” the proposed mask annealing technique, and “ACR” the adaptive confidence refinement module. The figures from left to right correspond to qualitative results from row 1 to row 6, respectively. Red boxes denote the ground truth, and blue boxes show the predicted regions.

BibTeX

@article{ashraf2024dmastermaskannealedtransformer,
  title={D-MASTER: Mask Annealed Transformer for Unsupervised Domain Adaptation in Breast Cancer Detection from Mammograms},
  author={Tajamul Ashraf and Krithika Rangarajan and Mohit Gambhir and Richa Gabha and Chetan Arora},
  year={2024},
  eprint={2407.06585},
  archivePrefix={arXiv},
  primaryClass={cs.CV},
  url={https://arxiv.org/abs/2407.06585},
}