This Machine Learning Framework Coordinates Heterogeneous Natural Language Processing Tasks via Federated Learning

Learning from massive amounts of data is one of the key factors behind the success of large machine learning models in Natural Language Processing (NLP) applications. However, growing public privacy concerns and tightening data protection laws create barriers between data owners, making it harder (and often outright forbidden) to gather and store private data for centralized model training. Motivated by these privacy concerns, federated learning (FL) has been proposed to train models collaboratively on decentralized data in a privacy-preserving way, and it has quickly gained traction in academia and industry.

Previous research on federated learning for NLP applications largely follows the methodology outlined by FedAvg: clients train the model separately on their local data and send their model updates to a server for federated aggregation. Such an FL framework has several drawbacks for practical NLP applications. First, participants must agree on the learning objective beforehand, so only participants with the same learning objective can join an FL course to train models collaboratively. Second, the framework may not suit participants who wish to keep their learning objective private because of privacy concerns or conflicts of interest.
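The server-side aggregation step in a FedAvg-style round can be illustrated with a minimal sketch. It assumes each client reports its parameters as flat lists of floats together with its local sample count; the `fedavg` function name, the `Weights` type, and the toy values are illustrative and are not taken from the paper's code.

```python
from typing import Dict, List

# Assumed representation: parameter name -> flat list of values.
Weights = Dict[str, List[float]]


def fedavg(client_weights: List[Weights], num_samples: List[int]) -> Weights:
    """Average client parameters, weighting each client by its local sample count."""
    if not client_weights or len(client_weights) != len(num_samples):
        raise ValueError("need one sample count per client")
    total = sum(num_samples)
    if total <= 0:
        raise ValueError("total sample count must be positive")

    aggregated: Weights = {}
    for name in client_weights[0]:
        size = len(client_weights[0][name])
        # Weighted mean of this parameter across all clients, element by element.
        aggregated[name] = [
            sum(w[name][i] * n for w, n in zip(client_weights, num_samples)) / total
            for i in range(size)
        ]
    return aggregated


# Toy round with two clients that share one parameter tensor.
clients = [
    {"layer.weight": [1.0, 2.0]},
    {"layer.weight": [3.0, 6.0]},
]
global_weights = fedavg(clients, num_samples=[100, 300])
print(global_weights)
```

Because every client contributes to a single shared model, this scheme only makes sense when all clients optimize the same objective, which is the limitation discussed above.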

These limitations greatly hinder the adoption of FL in NLP applications, since federated learning aims to connect disparate data islands rather than merely coordinate participants that share a learning objective. To address them, the researchers propose ASSIGN-THEN-CONTRAST (ATC), an FL framework that enables participants with heterogeneous or private learning objectives to learn from shared knowledge via federated learning.


The proposed framework uses a two-stage training paradigm for the FL courses it builds:

(i) ASSIGN: In this stage, the server broadcasts the most recent global model and assigns clients unified tasks for local training. Clients train locally on the assigned tasks, which lets them learn from their local data without using their own learning objectives.

(ii) CONTRAST: Clients train locally according to their own learning objectives while also optimizing a contrastive loss to exchange useful knowledge. To make effective use of the resulting model updates, the server aggregates them strategically based on the measured distances between clients.

The researchers provide empirical evaluations on six commonly used datasets covering a variety of Natural Language Understanding (NLU) and Natural Language Generation (NLG) tasks, including text classification, question answering, abstractive text summarization, and question generation.
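The two ingredients of the CONTRAST stage, a contrastive loss and distance-aware aggregation, can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' implementation: it assumes an InfoNCE-style loss over cosine similarities, measures the distance between clients by the cosine similarity of their flattened model updates, and lets each client average the updates of its `top_k` most similar clients. The function names, the temperature, and the toy vectors are all hypothetical.

```python
import math
from typing import List

Vector = List[float]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine similarity of two vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def info_nce(anchor: Vector, positive: Vector, negatives: List[Vector],
             temperature: float = 0.5) -> float:
    """InfoNCE-style contrastive loss (assumed form): small when the anchor
    is closer to the positive than to the negatives."""
    pos = math.exp(cosine_similarity(anchor, positive) / temperature)
    neg = sum(math.exp(cosine_similarity(anchor, n) / temperature) for n in negatives)
    return -math.log(pos / (pos + neg))


def aggregate_by_similarity(updates: List[Vector], top_k: int) -> List[Vector]:
    """For each client, average the updates of its top_k most similar clients.

    A client's own update normally ranks first, so it is usually part of
    its own neighborhood.
    """
    personalized = []
    for own in updates:
        ranked = sorted(
            range(len(updates)),
            key=lambda j: cosine_similarity(own, updates[j]),
            reverse=True,
        )
        neighbors = ranked[:top_k]
        personalized.append([
            sum(updates[j][d] for j in neighbors) / len(neighbors)
            for d in range(len(own))
        ])
    return personalized


# Toy example: clients 0 and 1 send similar updates, client 2 points the other way.
updates = [[1.0, 0.0], [0.9, 0.1], [-1.0, 0.0]]
personalized = aggregate_by_similarity(updates, top_k=2)
aligned_loss = info_nce(updates[0], updates[1], [updates[2]])
misaligned_loss = info_nce(updates[0], updates[2], [updates[1]])
print(personalized, aligned_loss, misaligned_loss)
```

In this toy setup, clients 0 and 1 end up with the same averaged update, while client 2 is pulled only slightly toward its nearest neighbor, which shows how similarity-based aggregation can let clients with different objectives share knowledge selectively.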

The experimental results show that ATC effectively helps clients with heterogeneous or private learning objectives participate in and benefit from an FL course. Compared with several baseline methods, building FL courses with ATC yields noticeable improvements for clients with different learning objectives. The platform can be tried on Google Colab, and the code implementation is freely available on GitHub.


Check out the Paper and GitHub. All credit for this research goes to the researchers on this project.


Aneesh Tickoo is a consulting intern at MarktechPost. He is currently pursuing his undergraduate degree in Data Science and Artificial Intelligence at the Indian Institute of Technology (IIT), Bhilai. He spends most of his time working on projects aimed at harnessing the power of machine learning. His research interest is image processing, and he is passionate about building solutions around it. He loves to connect with people and collaborate on interesting projects.

