TY - JOUR
T1 - CervixFormer
T2 - A Multi-Scale Swin Transformer-Based Cervical Pap-Smear WSI Classification Framework
AU - Khan, Anwar
AU - Han, Seunghyeon
AU - Ilyas, Naveed
AU - Lee, Yong Moon
AU - Lee, Boreom
N1 - Publisher Copyright: © 2023
PY - 2023/10
Y1 - 2023/10
N2 - Background and Objectives: Cervical cancer affects around 0.5 million women per year, resulting in over 0.3 million fatalities. Therefore, repeated screening for cervical cancer is of utmost importance. Computer-assisted diagnosis is key for scaling up cervical cancer screening. Current recognition algorithms, however, perform poorly on whole-slide image (WSI) analysis, fail to generalize across different staining methods and uneven subtype distributions, and provide sub-optimal clinical-level interpretations. Herein, we developed CervixFormer, an end-to-end, multi-scale Swin transformer-based adversarial ensemble learning framework to assess pre-cancerous and cancer-specific cervical malignant lesions on WSIs. Methods: The proposed framework consists of (1) a self-attention generative adversarial network (SAGAN) for generating synthetic images during patch-level training to address the class imbalance problem; (2) a multi-scale transformer-based ensemble learning method for cell identification at various stages, including atypical squamous cells (ASC) and atypical squamous cells of undetermined significance (ASCUS), which have not been demonstrated in previous studies; and (3) a fusion model for concatenating ensemble-based results and producing final outcomes. Results: The proposed method was first evaluated on a private dataset of 717 annotated samples from six classes, obtaining recall and precision of 0.940 and 0.934, respectively, in roughly 1.2 minutes. To further examine the generalizability of CervixFormer, we evaluated it on four independent, publicly available datasets, namely, the CRIC cervix, Mendeley LBC, SIPaKMeD Pap Smear, and Cervix93 Extended Depth of Field image datasets. CervixFormer obtained moderately better performance on two-, three-, four-, and six-class classification of smear- and cell-level datasets. For clinical interpretation, we used GradCAM to visualize a coarse localization map, highlighting important regions in the WSI. Notably, CervixFormer extracts features mostly from the cell nucleus and partially from the cytoplasm. Conclusions: CervixFormer outperforms existing state-of-the-art benchmark methods in terms of recall, accuracy, and computing time.
KW - Cervical cancer
KW - Image classification
KW - Medical data augmentation
KW - Swin transformer
KW - WSI analysis
UR - http://www.scopus.com/inward/record.url?scp=85164675420&partnerID=8YFLogxK
U2 - 10.1016/j.cmpb.2023.107718
DO - 10.1016/j.cmpb.2023.107718
M3 - Article
C2 - 37451230
AN - SCOPUS:85164675420
SN - 0169-2607
VL - 240
JO - Computer Methods and Programs in Biomedicine
JF - Computer Methods and Programs in Biomedicine
M1 - 107718
ER -