PaddleFL is an open-source federated learning framework based on PaddlePaddle. With PaddleFL, researchers can easily replicate and compare different federated learning algorithms. Developers also benefit from PaddleFL because it makes federated learning systems easy to deploy on large distributed clusters.
PaddleFL provides many federated learning strategies and their applications in computer vision, natural language processing, recommendation, and more. In addition, PaddleFL will provide applications of traditional machine learning training strategies, such as multi-task learning and transfer learning, in federated learning settings. Thanks to PaddlePaddle's large-scale distributed training and Kubernetes' flexible scheduling of training tasks, PaddleFL can be deployed easily on a full stack of open-source software.
Federated learning
Today, data is increasingly valuable, and sharing raw data across organizations is difficult. Federated learning aims to solve the problems of data isolation and secure sharing of data-derived knowledge among organizations. The concept of federated learning was proposed by researchers at Google.
In PaddleFL, horizontal and vertical federated learning strategies are implemented according to this classification. PaddleFL will also provide application examples in areas such as natural language processing, computer vision, and recommendation.
Federated learning strategies
- Vertical federated learning: logistic regression with PrivC; neural networks with third-party PrivC
- Horizontal federated learning: federated averaging; differential privacy
- Transfer learning
- Active learning
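To make the horizontal strategy concrete, the core of federated averaging (FedAvg) is a weighted average of locally trained models, where each organization's weight is proportional to its local sample count. The sketch below is illustrative only and is not PaddleFL's implementation; the function name and data layout are assumptions for the example.

```python
# A minimal sketch of federated averaging (FedAvg).
# Each client trains locally and reports (weights, num_samples);
# the server averages the weights, weighted by sample counts.
# Illustrative only -- not the PaddleFL implementation.

def fed_avg(client_updates):
    """Average client model weights, weighted by local sample counts.

    client_updates: list of (weights, num_samples) pairs, where
    weights is a list of floats (a flattened model).
    """
    total = sum(n for _, n in client_updates)
    dim = len(client_updates[0][0])
    avg = [0.0] * dim
    for weights, n in client_updates:
        for i, w in enumerate(weights):
            avg[i] += w * n / total
    return avg

# Example: two clients with different data volumes; the client
# with 30 samples pulls the average toward its weights.
global_weights = fed_avg([([1.0, 2.0], 30), ([3.0, 4.0], 10)])
# global_weights == [1.5, 2.5]
```

Note that the raw training data never leaves the clients; only model weights (or updates) are communicated, which is the core privacy property horizontal federated learning relies on.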
PaddleFL framework design
In PaddleFL, the components used to define a federated learning task and to run federated learning training are as follows:
- FL-Strategy: Users can define federated learning strategies with FL-Strategy, such as FedAvg.
- User-Defined-Program: A PaddlePaddle program that defines the machine learning model structure and training strategy, such as multi-task learning.
- Distributed-Config: In federated learning, the system is deployed in a distributed environment. The distributed training configuration defines the distributed training node information.
- FL-Job-Generator: Given an FL-Strategy, a User-Defined-Program, and a Distributed-Config, FL-Jobs for both the server side and the worker side of federated training are generated by the FL-Job-Generator. The FL-Jobs are sent to the participating organizations and the federated parameter servers for federated training.
- FL-Server: A federated parameter server running in the cloud or in a third-party cluster.
- FL-Worker: Each organization participating in federated learning has one or more workers that communicate with the federated parameter server.
- FL-Scheduler: Schedules workers during training and decides which workers may participate in each update round.
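The FL-Scheduler's per-round job, selecting a subset of registered workers before each update, can be sketched as below. The function and variable names here are illustrative assumptions, not the PaddleFL API.

```python
# A hedged sketch of what an FL-Scheduler does each round: pick a
# subset of registered workers to participate in the next update.
# Names (select_workers, etc.) are illustrative, not PaddleFL's API.
import random

def select_workers(registered_workers, sample_num, seed=None):
    """Return the workers allowed to join this training round.

    Sampling a subset per round tolerates slow or offline workers
    and bounds per-round communication with the parameter server.
    """
    rng = random.Random(seed)
    sample_num = min(sample_num, len(registered_workers))
    return rng.sample(registered_workers, sample_num)

workers = ["worker-0", "worker-1", "worker-2", "worker-3"]
round_participants = select_workers(workers, sample_num=2, seed=0)
```

Workers not selected in a round simply wait; the server aggregates updates only from the chosen participants before broadcasting the new global model.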
Installation guide and quick start
Refer to Quick Start.
Easy deployment with Kubernetes
kubectl apply -f ./paddle_fl/examples/k8s_deployment/master.yaml
Refer to the K8S deployment example.
You can also refer to the K8S cluster setup and kubectl installation guides to configure your own K8S cluster.
Performance tests
GRU4Rec introduced a recurrent neural network model for session-based recommendation. The PaddlePaddle implementation of GRU4Rec is available at https://github.com/PaddlePaddle/models/tree/develop/PaddleRec/gru4rec. For an example of training a GRU4Rec model with federated learning, refer to GRU4Rec in Federated Learning.
Project address: https://github.com/PaddlePadd…