TY - JOUR
T1 - Predicting treatment response in multicenter non-small cell lung cancer patients based on federated learning
AU - Liu, Yuan
AU - Huang, Jinzao
AU - Chen, Jyh Cheng
AU - Chen, Wei
AU - Pan, Yuteng
AU - Qiu, Jianfeng
N1 - Publisher Copyright: © The Author(s) 2024.
PY - 2024/12
Y1 - 2024/12
N2 - Background: Multicenter non-small cell lung cancer (NSCLC) patient data is information-rich. However, its direct integration becomes exceptionally challenging due to constraints involving different healthcare organizations and regulations. Traditional centralized machine learning methods require centralizing these sensitive medical data for training, posing risks of patient privacy leakage and data security issues. In this context, federated learning (FL) has attracted much attention as a distributed machine learning framework. It effectively addresses this contradiction by preserving data locally, conducting local model training, and aggregating model parameters. This approach enables the utilization of multicenter data with maximum benefit while ensuring privacy safeguards. Based on pre-radiotherapy planning target volume images of NSCLC patients, a multicenter treatment response prediction model is designed using FL for predicting the probability of remission of NSCLC patients. This approach ensures medical data privacy, high prediction accuracy and computing efficiency, offering valuable insights for clinical decision-making. Methods: We retrospectively collected CT images from 245 NSCLC patients undergoing chemotherapy and radiotherapy (CRT) in four Chinese hospitals. In a simulation environment, we compared the performance of the centralized deep learning (DL) model with that of the FL model using data from two sites. Additionally, due to the unavailability of data from one hospital, we established a real-world FL model using data from three sites. Assessments were conducted using measures such as accuracy, receiver operating characteristic curve, and confusion matrices. Results: The model’s prediction performance obtained using FL methods outperforms that of traditional centralized learning methods. In the comparative experiment, the DL model achieves an AUC of 0.718/0.695, while the FL model demonstrates an AUC of 0.725/0.689, with the real-world FL model achieving an AUC of 0.698/0.672. Conclusions: We demonstrate that the performance of an FL predictive model, developed by combining convolutional neural networks (CNNs) with data from multiple medical centers, is comparable to that of a traditional DL model obtained through centralized training. It can efficiently predict CRT treatment response in NSCLC patients while preserving privacy.
KW - Chemotherapy and radiotherapy
KW - Federated learning
KW - Non-small cell lung cancer
KW - Treatment response
UR - http://www.scopus.com/inward/record.url?scp=85195334444&partnerID=8YFLogxK
U2 - 10.1186/s12885-024-12456-7
DO - 10.1186/s12885-024-12456-7
M3 - Article
C2 - 38840081
AN - SCOPUS:85195334444
SN - 1471-2407
VL - 24
JO - BMC Cancer
JF - BMC Cancer
IS - 1
M1 - 688
ER -