[Survey notes] Federated learning for medical image analysis: A survey

2025/7/7 11:09:06

Paper link: Federated learning for medical image analysis: A survey - ScienceDirect

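The English was typed entirely by hand! These notes summarize and paraphrase the original paper. Spelling and grammar mistakes are hard to avoid; if you spot any, please point them out in the comments! This post is closer to personal notes, so read with caution.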

1. TL;DR

1.1. Takeaways

2. Section-by-section close reading of the paper

2.1. Abstract

2.2. Introduction

2.2.1. Related surveys

2.2.2. Searching and analysis process

2.3. Background

2.3.1. Motivation

2.3.2. Problem formulation of federated learning

2.3.3. Typical process of federated learning

2.3.4. Types of federated learning

2.4. Federated learning for medical image analysis

2.4.1. Methods overview: A system perspective

2.4.2. Client-end learning

2.4.3. Server-end learning

2.4.4. Client–server communication

2.5. Software platforms and tools

2.6. Medical image datasets for federated learning

2.6.1. Medical image data usage overview

2.6.2. Brain images

2.6.3. Chest/lung/heart images

2.6.4. Skin images

2.6.5. Others

2.7. Experiment

2.7.1. Experimental setup

2.7.2. Result and analysis

2.8. Discussion

2.8.1. Challenges of federated learning for medical image analysis

2.8.2. Future research directions

2.9. Conclusion

4. Reference

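1. TL;DR

1.1. Takeaways

（1）Worth reading if you are not familiar with how federated learning is applied to medical imaging; otherwise it is not a particularly deep paper. Treat it as a light primer. Readers who already know federated learning probably will not find new ideas or new papers here.

2. Section-by-section close reading of the paper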

2.1. Abstract

①Conundrum in machine learning (ML): scarcity of samples

②Solving method: federated learning (FL)

③3 aspects of FL: client end, server end, and communication techniques

2.2. Introduction

①Privacy regulations that limit data sharing between sites: Health Insurance Portability and Accountability Act (HIPAA) and General Data Protection Regulation (GDPR)

②FL in medical image field:

intractable adj. thorny; very hard to deal with (or handle)

2.2.1. Related surveys

①Papers covered: from 1 January 2017 to 31 October 2023

②Chapters/content included: Software Platforms, Experimental Study, Future Direction and New Arisen Problems, Different Perspective and Organization

2.2.2. Searching and analysis process

①Describes how the papers were selected and how the chapters of this work are arranged

②Collected/contained papers:

2.3. Background

2.3.1. Motivation

（1）Privacy protection in medical image analysis

①Violating privacy protection laws may result in high fines for hospitals and institutions

（2）Challenges of medical image analysis

①Limited sites/data

②Class imbalance

③Population statistics and distribution differences

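skew v. to slant; to deviate; to distort; to misinterpret; to affect the accuracy of; to make unfair n. oblique crossing; twist; skew stone (masonry); skewed wheel adj. crooked; slanted; misinterpreted; misused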

2.3.2. Problem formulation of federated learning

①There are $N$ sites with their own datasets $\left \{D_{1},D_2, ... , D_N \right \}$

②Define the shared model as $M^*$ and the local model as $M$
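
As a rough sketch of how this setup is commonly written as an optimization problem (this is the usual FedAvg-style objective, not copied from the survey; $w$, $f_w$, $\ell$, $F_i$ and $n_i$ are notation assumed here for illustration):

```latex
\min_{w}\; F(w) = \sum_{i=1}^{N} \frac{n_i}{n}\, F_i(w), \qquad
F_i(w) = \frac{1}{n_i} \sum_{(x,y)\in D_i} \ell\big(f_w(x),\, y\big), \qquad
n_i = |D_i|,\quad n = \sum_{i=1}^{N} n_i
```

Each site $i$ only ever evaluates its own $F_i$ on $D_i$, so the raw data stays local and only model parameters (or gradients) are exchanged.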

2.3.3. Typical process of federated learning

（1）Client Selection and Initialization: including specific clients

（2）Local Training: advanced local training

（3）Model Upload: only weights are uploaded/shared to the server

（4）Aggregation: different strategies

（5）Broadcast: synchronous update

（6）Iteration and Convergence: synchronous update

（7）Deployment: compatibility with existing hospital systems, integration challenges, and user adoption hurdles
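
The steps (1) to (6) above can be sketched as a toy simulation. This is a minimal sketch, not the survey's implementation: the linear model, the function names (`local_train`, `aggregate`, `run_federated`) and the synthetic data are all assumptions made for illustration, and the aggregation is a plain sample-size-weighted average.

```python
import random

def local_train(weights, data, lr=0.1, epochs=5):
    """(2) Local training: gradient descent for y ~ w*x + b on one client's data."""
    w, b = weights
    n = len(data)
    for _ in range(epochs):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        w, b = w - lr * grad_w, b - lr * grad_b
    return (w, b)

def aggregate(updates):
    """(4) Aggregation: average client weights, weighted by local sample count."""
    total = sum(n for _, n in updates)
    w = sum(weights[0] * n for weights, n in updates) / total
    b = sum(weights[1] * n for weights, n in updates) / total
    return (w, b)

def run_federated(clients, rounds=50, clients_per_round=2, seed=0):
    rng = random.Random(seed)
    global_weights = (0.0, 0.0)                              # (1) initialization
    for _ in range(rounds):                                  # (6) iterate
        selected = rng.sample(clients, clients_per_round)    # (1) client selection
        updates = []
        for data in selected:
            local = local_train(global_weights, data)        # (2) start from the broadcast model
            updates.append((local, len(data)))               # (3) only weights are uploaded
        global_weights = aggregate(updates)                  # (4), then (5) broadcast next round
    return global_weights

# Three toy "sites" that all follow y = 2x + 1 but cover different x ranges.
clients = [
    [(x / 10, 2 * x / 10 + 1) for x in range(-10, 0)],
    [(x / 10, 2 * x / 10 + 1) for x in range(0, 10)],
    [(x / 10, 2 * x / 10 + 1) for x in range(-5, 5)],
]
w, b = run_federated(clients)
print(w, b)
```

Because every toy site shares the same underlying relation, the global model should approach w = 2 and b = 1 even though no site sees the full x range. Step (7), deployment, is an engineering concern and is not modelled here.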

2.3.4. Types of federated learning

（1）Horizontal federated learning

①Equivalent to homogeneous federated learning

②Different medical institutions hold medical imaging data of different patients

（2）Vertical federated learning

①Also called heterogeneous federated learning

②Different institutions have different types of data for the same group of patients
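
The difference between the two settings can be illustrated as two ways of cutting the same patient-by-feature table. This is a minimal sketch: the patient IDs, feature names and values below are made up, and `horizontal_split` / `vertical_split` are hypothetical helper names.

```python
# Toy table: one row per patient, one column per modality/feature (values are made up).
records = {
    "patient_1": {"mri": 0.91, "ct": 0.15, "lab": 4.2},
    "patient_2": {"mri": 0.34, "ct": 0.77, "lab": 5.1},
    "patient_3": {"mri": 0.58, "ct": 0.42, "lab": 3.9},
    "patient_4": {"mri": 0.12, "ct": 0.66, "lab": 4.8},
}

def horizontal_split(records, patient_groups):
    """Horizontal FL: each site holds all features, but only for its own patients."""
    return [{p: dict(records[p]) for p in group} for group in patient_groups]

def vertical_split(records, feature_groups):
    """Vertical FL: each site holds all patients, but only its own features."""
    return [
        {p: {f: row[f] for f in group} for p, row in records.items()}
        for group in feature_groups
    ]

hospital_a, hospital_b = horizontal_split(
    records, [["patient_1", "patient_2"], ["patient_3", "patient_4"]]
)
imaging_center, laboratory = vertical_split(records, [["mri", "ct"], ["lab"]])

print(sorted(hospital_a), sorted(hospital_b))
print(imaging_center["patient_1"], laboratory["patient_1"])
```

In the horizontal case the sites share a feature space and differ in patients; in the vertical case they share patients and differ in feature space.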

2.4. Federated learning for medical image analysis

2.4.1. Methods overview: A system perspective

①Overview:

2.4.2. Client-end learning

（1）Client end: Domain shift among clients

①Distribution in different sites:

②Domain-specific learning: fine-tune the global model on each client's own data

③Domain adaptation: align distribution differences

④Image harmonization: image-to-image translation model

（2）Client end: Limited data and labels

①Contrastive learning: distinguish between similar and dissimilar data

②Multi-task learning: data augmentation

③Weakly-supervised learning

④Knowledge distillation: a small student model is trained to mimic a larger teacher model

⑤Data synthesis: generative model
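
To make the knowledge distillation item above more concrete, here is a minimal sketch of the classic soft-label distillation loss (temperature-softened KL divergence between teacher and student outputs). It is the generic textbook form, not a method taken from any specific paper in the survey; the function names and the temperature value are assumptions.

```python
import math

def softmax(logits, temperature=1.0):
    """Numerically stable softmax over temperature-scaled logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on softened distributions, scaled by T^2."""
    p = softmax(teacher_logits, temperature)   # soft targets from the teacher
    q = softmax(student_logits, temperature)   # student predictions
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return temperature ** 2 * kl

teacher = [1.0, 2.0, 3.0]
print(distillation_loss([1.0, 2.0, 3.0], teacher))  # student matches teacher
print(distillation_loss([3.0, 2.0, 1.0], teacher))  # student disagrees with teacher
```

The loss is zero when the student reproduces the teacher's distribution and grows as the two diverge, which is what lets a smaller model be trained against a larger one's outputs.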

（3）Client end: Heterogeneous environments (computation resource & data scale)

①Each site needs a different number of training epochs because sample sizes and devices differ across sites:

2.4.3. Server-end learning

（1）Server end: Weight aggregation

（2）Server end: Domain shift among clients

（3）Server end: Client corruption/anomaly detection

2.4.4. Client–server communication

（1）Data leakage and attack

①Partial Weights Sharing: only employ feature extraction in local model

②Differential Privacy: noise added

③Attack and Defense: generate fake images
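
For the "Differential Privacy: noise added" item above, a common pattern is to clip each client's update and add Gaussian noise before it leaves the site. The following is a minimal sketch of that pattern only: `privatize_update`, `clip_norm` and `noise_multiplier` are illustrative names, and no privacy accounting (the actual epsilon/delta guarantee) is computed here.

```python
import math
import random

def privatize_update(update, clip_norm=1.0, noise_multiplier=0.5, rng=None):
    """Clip the update to an L2 norm of at most clip_norm, then add Gaussian noise."""
    rng = rng or random.Random()
    norm = math.sqrt(sum(v * v for v in update))
    scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
    clipped = [v * scale for v in update]
    sigma = noise_multiplier * clip_norm       # noise scale tied to the clipping bound
    return [v + rng.gauss(0.0, sigma) for v in clipped]

raw_update = [3.0, 4.0]                        # L2 norm 5, will be clipped to norm 1
noisy_update = privatize_update(raw_update, rng=random.Random(0))
print(noisy_update)
```

Clipping bounds how much any single client can influence the aggregate, and the noise scale is set relative to that bound; a larger `noise_multiplier` gives stronger protection at the cost of model accuracy.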

（2）Communication efficiency

①Dynamic weight aggregation

2.5. Software platforms and tools

（1）PySyft

（2）OpenFL

（3）PriMIA

（4）Fed-BioMed

2.6. Medical image datasets for federated learning

2.6.1. Medical image data usage overview

①Using data from different sites directly

②Dividing one dataset into several sub-datasets
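
The second option, dividing one dataset into several sub-datasets to simulate clients, can be sketched as follows. Both splitting strategies and the helper names are assumptions for illustration; the notes do not say which partitioning schemes the surveyed papers use.

```python
import random

def split_iid(samples, num_clients, seed=0):
    """Shuffle, then deal samples round-robin so every client sees a similar mix."""
    rng = random.Random(seed)
    shuffled = list(samples)
    rng.shuffle(shuffled)
    return [shuffled[i::num_clients] for i in range(num_clients)]

def split_label_skewed(samples, num_clients):
    """Sort by label, then cut contiguous chunks so each client is dominated by few labels."""
    ordered = sorted(samples, key=lambda s: s[1])
    chunk = -(-len(ordered) // num_clients)    # ceiling division
    return [ordered[i * chunk:(i + 1) * chunk] for i in range(num_clients)]

# Hypothetical dataset: (sample_id, label) pairs with two balanced classes.
dataset = [(i, i % 2) for i in range(100)]
iid_clients = split_iid(dataset, num_clients=4)
skewed_clients = split_label_skewed(dataset, num_clients=4)

for name, parts in [("iid", iid_clients), ("skewed", skewed_clients)]:
    print(name, [sum(label for _, label in part) for part in parts])
```

The IID split gives every simulated client roughly the same class mix, while the label-skewed split mimics the class imbalance and distribution differences between real sites.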

2.6.2. Brain images

（1）ADNI

（2）ABIDE

（3）BraTS

（4）RSNA brain CT

（5）UK Biobank

（6）IXI

2.6.3. Chest/lung/heart images

（1）CheXpert

（2）ChestX-ray

（3）COVID-19 Chest X-ray

（4）COVIDx

（5）ACDC

（6）M&M

2.6.4. Skin images

（1）HAM10000

（2）ISIC

2.6.5. Others

（1）Eye: Kaggle Diabetic Retinopathy (Retina)

（2）Abdomen: PROMISE12

（3）Histology: TCGA

（4）Knee: fastMRI

（5）MedMNIST

2.7. Experiment

①Dataset: T1-weighted images from ADNI

Dataset	Total samples	AD	NC
ADNI 1	428	177	229
ADNI 2	360	159	201

②Atlas: AAL 90

③Feature: mean gray matter volumes

④Dataset split: 80% for training and 20% for testing

⑤Cross-validation: 5-fold
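
One way to read the last two items together is that each of the 5 folds holds out 20% of the samples for testing and trains on the remaining 80%. The sketch below assumes exactly that (the notes do not state how the split and the cross-validation relate), and `k_fold_indices` is a hypothetical helper. The sample count of 360 is the ADNI 2 total from the table above.

```python
import random

def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_indices, test_indices) pairs for k-fold cross-validation."""
    rng = random.Random(seed)
    indices = list(range(n_samples))
    rng.shuffle(indices)
    folds = [indices[i::k] for i in range(k)]          # k roughly equal folds
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test

folds = list(k_fold_indices(360, k=5))
for train, test in folds:
    print(len(train), len(test))
```

With 360 samples and 5 folds, each round trains on 288 samples and tests on 72, and every sample is used for testing exactly once.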

2.7.1. Experimental setup

①Compared models

Category	Method	Description
Traditional machine learning	Cross	train on one client's data, test on another client
Traditional machine learning	Single	train and test on each client separately
Traditional machine learning	Mix	train on pooled data from all clients, then test
Popular FL methods	FedAVG	aggregate weights
Popular FL methods	FedSGD	aggregate gradients
Popular FL methods	FedProx	each client trains its own model with an additional proximal term (the coefficient μ is set to 0.1)
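
The difference between the three FL methods in the table can be sketched in a few lines. This is a simplified illustration on plain lists of floats, not the experimental code from the paper: the function names are hypothetical, and only μ = 0.1 comes from the table.

```python
def weighted_average(vectors, sizes):
    """Average equally long vectors, weighting each by its client's sample count."""
    total = sum(sizes)
    return [sum(v[k] * n for v, n in zip(vectors, sizes)) / total
            for k in range(len(vectors[0]))]

def fedavg_aggregate(client_weights, client_sizes):
    """FedAvg: clients train locally, the server averages their weights."""
    return weighted_average(client_weights, client_sizes)

def fedsgd_step(global_weights, client_grads, client_sizes, lr=0.1):
    """FedSGD: clients send gradients, the server averages them and takes one step."""
    avg_grad = weighted_average(client_grads, client_sizes)
    return [w - lr * g for w, g in zip(global_weights, avg_grad)]

def fedprox_gradient(local_grad, local_weights, global_weights, mu=0.1):
    """FedProx: local gradient plus the gradient of (mu / 2) * ||w - w_global||^2."""
    return [g + mu * (w - wg)
            for g, w, wg in zip(local_grad, local_weights, global_weights)]

print(fedavg_aggregate([[1.0, 2.0], [3.0, 4.0]], [1, 3]))
print(fedsgd_step([1.0, 1.0], [[1.0, 0.0], [0.0, 1.0]], [1, 1], lr=0.5))
print(fedprox_gradient([0.2, -0.1], [1.0, 2.0], [0.0, 2.0], mu=0.1))
```

The proximal term pulls each client's weights back toward the current global model, which limits how far local training can drift when client data distributions differ.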