Abstract
Nearly two billion chest X-rays (CXRs) are performed annually, making them the most widely used imaging examination in radiology for the diagnosis of pulmonary disorders. The report accompanying a chest X-ray, which describes its findings, forms a crucial part of the examination; an accurate report enables healthcare professionals to make better decisions about the care being provided. To this end, we propose an end-to-end radiology report generation framework built on transformers, trained on text reports in conjunction with visual characteristics of the chest X-ray, to generate a reliable report that accurately describes the findings from a single CXR taken from either the Anterior-Posterior or the Posterior-Anterior position. A foundation model is utilised to perform Knowledge Distillation (KD) in conjunction with the encoder, which is fine-tuned during the training phase. In addition, pre-training the foundation model in an unsupervised manner on a large corpus of radiology reports is shown to improve performance on smaller datasets. This training methodology yields performance comparable to architectures that employ far more parameters. The proposed framework is evaluated on multiple datasets, including the Indiana University, MIMIC, MIMIC-PRO, and BRAX datasets. Incorporating KD increases the BLEU-1 score on the Indiana dataset by 4% and the BERTScore by 7.5%. Similarly, pre-training on larger datasets in combination with KD further increases the BLEU-1 score on the Indiana dataset by 7.2% and the BERTScore by 3%. On the MIMIC dataset, comparable performance is achieved for the Findings and Impression sections of the report, while the proposed framework outperforms other techniques when the two sections are combined. On the MIMIC-PRO dataset, an s_emb score of 0.4069 and a RadGraph F1 score of 0.1165 are achieved, outperforming other techniques in the literature.
Finally, the proposed framework is also evaluated on a locally gathered dataset and a BRAX subset without any re-training or fine-tuning, resulting in a BLEU-1 score of 0.3827 and a BERTScore of 0.4392 for the former, and a BLEU-1 score of 0.1671 and a BERTScore of 0.2186 for the latter, demonstrating its generalisation ability.
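The knowledge-distillation step described in the abstract can be illustrated with a minimal feature-matching sketch: a frozen foundation-model encoder acts as the teacher, and the trainable CXR encoder is the student whose projected features are pulled toward the teacher's. All names, dimensions, and the use of a simple mean-squared loss below are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

# Hypothetical feature dimensions: the frozen teacher (foundation model)
# emits 768-d features, the trainable student CXR encoder emits 512-d.
TEACHER_DIM, STUDENT_DIM = 768, 512

rng = np.random.default_rng(0)
# A learned linear projection maps student features into the teacher's
# space so the two representations can be compared directly.
proj = rng.standard_normal((STUDENT_DIM, TEACHER_DIM)) * 0.02

def distillation_loss(student_feats, teacher_feats, proj):
    """Mean-squared feature-matching loss between the projected student
    features and the frozen teacher features.

    student_feats: (batch, STUDENT_DIM) array from the student encoder.
    teacher_feats: (batch, TEACHER_DIM) array from the frozen teacher.
    """
    projected = student_feats @ proj  # (batch, TEACHER_DIM)
    return float(np.mean((projected - teacher_feats) ** 2))

# Toy batch of 4 images standing in for real encoder outputs.
student = rng.standard_normal((4, STUDENT_DIM))
teacher = rng.standard_normal((4, TEACHER_DIM))
kd_loss = distillation_loss(student, teacher, proj)
```

In training, a term like `kd_loss` would typically be added, with some weighting, to the report-generation cross-entropy objective, so the student encoder learns both from the text supervision and from the foundation model's representations.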
| Original language | British English |
|---|---|
| Article number | 108340 |
| Journal | Biomedical Signal Processing and Control |
| Volume | 111 |
| DOIs | |
| State | Published - Jan 2026 |
Keywords
- Knowledge distillation
- Medical imaging
- Pre-training
- Report generation
Fingerprint
Dive into the research topics of 'Radiology report generation from a singular perspective using transformers with Knowledge Distillation'.