Chest X-ray Foundation Models: A Survey and Future Directions

Recent years have seen an increase in the development of foundation models for chest X-ray (CXR) analysis. Such foundation models provide robust, generalizable feature extraction abilities and can be adapted to a wide range of downstream tasks. However, no survey in the literature yet compiles the technologies behind foundation models for CXR analysis. This survey aims to fill this gap by providing a comprehensive review of both vision foundation models (VFMs) and vision-language foundation models (VLFMs) for CXR analysis. Specifically, we compiled a list of recently developed, high-performance CXR foundation models, discussed the commonly used pretraining techniques and datasets, compared model architectures and parameters, analyzed downstream adaptation methods and corresponding tasks, and summarized model performance on common datasets for downstream tasks. Based on this thorough summary of CXR foundation models, we further highlighted their limitations, open challenges, and development trends. Finally, we discussed interesting future directions for CXR foundation model research, from reasoning capability and agentic workflows to efficiency and interpretability, which would spawn next-generation artificial intelligence (AI) models for computer-aided CXR interpretation.

Reference

A. Wang, Z. Yang, S. Rais-Bahrami, and P. Yan, "Chest X-ray Foundation Models: A Survey and Future Directions," Meta-Radiology, vol. 4, issue 1, March 2026, article 100203.