Federated Partially Supervised Learning with Limited Decentralized Medical Images
Nanqing Dong, Michael Kampffmeyer, Irina Voiculescu and Eric Xing
Data governance has played an instrumental role in securing the privacy-critical infrastructure in the medical domain and has led to an increased need for federated learning (FL). While decentralization can limit the effectiveness of standard supervised learning, the impact of decentralization on partially supervised learning remains unclear. Moreover, due to data scarcity, each client may have access to only a limited amount of partially labeled data. As a remedy, this work formulates and discusses a new learning problem, federated partially supervised learning (FPSL), for limited decentralized medical images with partial labels. We study the impact of decentralized partially labeled data on deep learning-based models via an exemplar of FPSL, namely, federated partially supervised multi-label classification. By dissecting FedAvg, a seminal FL framework, we formulate and analyze two major challenges of FPSL and propose a simple yet robust FPSL framework, FedPSL, which addresses these challenges. In particular, FedPSL contains two modules, task-dependent model aggregation and task-agnostic decoupling learning, where the first module addresses the weight assignment and the second module improves the generalization ability of the feature extractor. We provide a comprehensive empirical understanding of FPSL under data scarcity with simulated experiments. The empirical results not only indicate that FPSL is an under-explored problem with practical value but also show that the proposed FedPSL achieves robust performance relative to baseline methods under data challenges such as data scarcity and domain shifts. The findings of this study also point to a new research direction towards label-efficient learning on medical images.
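For readers unfamiliar with the baseline that the abstract dissects, the following is a minimal sketch of the standard FedAvg aggregation step, in which the server averages client parameters weighted by local sample counts. It is not the task-dependent aggregation of FedPSL, whose details are not given here. The function name `fedavg`, the dictionary-of-lists parameter layout, and the example values are assumptions made purely for illustration.

```python
from typing import Dict, List

# Hypothetical layout: each model is a dict mapping a layer name
# to a flat list of float parameters.
Weights = Dict[str, List[float]]


def fedavg(client_weights: List[Weights], client_sizes: List[int]) -> Weights:
    """Average client parameters, weighted by each client's sample count."""
    if not client_weights or len(client_weights) != len(client_sizes):
        raise ValueError("need one sample count per client")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total number of samples must be positive")

    aggregated: Weights = {}
    for name in client_weights[0]:
        length = len(client_weights[0][name])
        # Weighted sum over clients for every parameter position.
        aggregated[name] = [
            sum(w[name][i] * n for w, n in zip(client_weights, client_sizes)) / total
            for i in range(length)
        ]
    return aggregated


# Illustrative values only: two clients holding 1 and 3 samples.
clients = [{"fc": [1.0, 2.0]}, {"fc": [3.0, 4.0]}]
global_weights = fedavg(clients, [1, 3])
print(global_weights)
```

In a partially labeled setting, clients may hold labels for different tasks, so a single sample count per client may not reflect how much supervision each task-specific parameter received. This is the kind of weight-assignment question that the first FedPSL module is described as addressing.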