A New Prompting Method Called SwitchPrompt Retrieves Domain-Specific Knowledge from Pre-Trained Language Models (LMs)
Recent studies on prompting pre-trained language models (LMs) have produced outstanding results in various natural language processing tasks. However, the same cannot be said for low-resource domains. Most publicly accessible LMs have been trained on general-domain data such as Wikipedia or the BooksCorpus, so applying them to domain-specific downstream tasks introduces a domain gap. Large-scale text data is abundant in the general domain; low-resource domains do not share this advantage, which makes developing domain-specific LMs considerably more difficult. Training a separate model for every new domain is also not a computationally efficient course of action. Moreover, even with domain-specific texts and adequate computational resources, domain-specific LMs might not receive enough domain-oriented guidance from conventional prompting strategies, because domain-specific knowledge is frequently expressed through a broad general vocabulary that lacks domain-specific tokens. As a result, prompting LMs from both the general and the specialized domains may be ineffective, especially in low-resource settings.
These difficulties served as the impetus for Bosch researchers to create SwitchPrompt, an innovative and lightweight methodology for domain-specific prompting. SwitchPrompt aims to efficiently extract domain-specific information from LMs pre-trained on general-domain datasets, and it requires neither pre-training domain-specific LMs nor fine-tuning the LM for the downstream task. In experiments the researchers ran on benchmark datasets from several areas, SwitchPrompt beats current state-of-the-art prompting techniques, and it proved especially suitable for low-resource settings where little data and few computational resources are available.
SwitchPrompt provides domain-oriented prompting by adding a sequence of vectors encoding domain-specific keywords to the sequence of soft prompt vectors. The team's proposed prompts are designed to let the model dynamically switch between a general-domain prompt and a domain-specific prompt, depending on the input, in order to retrieve different types of knowledge from the pre-trained LM. The researchers implement this dynamic switching with gates, and they believe it illustrates how their methodology effectively extracts domain-specific information from pre-trained LMs.
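The paper's exact architecture is not reproduced in this article, but the gating idea can be sketched in a few lines of PyTorch. The sketch below is illustrative only, with assumed module names, dimensions, and a mean-pooled gate input (none of which come from the source): a learned gate mixes trainable general-domain soft prompt vectors with vectors that encode domain-specific keywords, and the result is prepended to the input embeddings of a frozen LM.

```python
import torch
import torch.nn as nn

class GatedDomainPrompt(nn.Module):
    """Illustrative sketch (not the authors' code): a gate dynamically mixes
    general-domain soft prompts with domain-keyword prompt vectors."""

    def __init__(self, embed_dim: int, prompt_len: int, keyword_embeds: torch.Tensor):
        super().__init__()
        # Trainable general-domain soft prompt vectors: (prompt_len, embed_dim).
        self.soft_prompt = nn.Parameter(torch.randn(prompt_len, embed_dim) * 0.02)
        # Vectors encoding domain-specific keywords (e.g. embeddings of keyword
        # tokens taken from the frozen LM): (prompt_len, embed_dim).
        self.register_buffer("keyword_prompt", keyword_embeds)
        # Input-conditioned gate deciding how strongly to use the domain prompt.
        self.gate = nn.Sequential(nn.Linear(embed_dim, 1), nn.Sigmoid())

    def forward(self, input_embeds: torch.Tensor) -> torch.Tensor:
        # input_embeds: (batch, seq_len, embed_dim) from the frozen LM's embedding layer.
        pooled = input_embeds.mean(dim=1)            # (batch, embed_dim)
        g = self.gate(pooled).unsqueeze(-1)          # (batch, 1, 1)
        # Dynamic switch per input between domain-specific and general prompts.
        mixed = g * self.keyword_prompt + (1.0 - g) * self.soft_prompt
        # Prepend the mixed prompt sequence to the input embeddings.
        return torch.cat([mixed, input_embeds], dim=1)
```

In this sketch only the soft prompt and the gate are trained while the LM itself stays frozen, and the gate is conditioned on a mean-pooled input representation; the paper's actual gating mechanism may be parameterized differently.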
The team used a number of benchmark classification datasets from various domains, including question classification from the general and clinical domains and experiment classification from the materials science domain. The researchers went a step further and also examined very low-resource settings by creating their own few-shot datasets through random sampling. For the models, they used several open-source HuggingFace language models.
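As an example of this kind of setup, a k-shot training split can be drawn by random sampling with the HuggingFace datasets library. The dataset name and shot count below are placeholders rather than the paper's exact configuration:

```python
from datasets import load_dataset

# Hypothetical k-shot subset drawn by random sampling; "trec" (question
# classification) merely stands in for the benchmarks used in the paper.
k = 16
train = load_dataset("trec", split="train")
few_shot_train = train.shuffle(seed=42).select(range(k))
print(few_shot_train)
```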
SwitchPrompt effectively bridges domain gaps between pre-training and downstream task data, enhancing both in-domain and out-of-domain performance. A few-shot experiment on three text classification benchmarks shows the effectiveness of general-domain pre-trained language models when employed with SwitchPrompt. With a 10.7% improvement in accuracy, general-domain LMs with SwitchPrompt surpass their domain-specific counterparts trained with state-of-the-art prompting techniques. These results clearly indicate that SwitchPrompt effectively reduces the need for domain-specific language model pre-training.
In conclusion, Bosch researchers proposed SwitchPrompt, a novel approach for efficiently prompting pre-trained LMs in low-resource settings. SwitchPrompt greatly narrows the performance gap between general-domain and domain-specific language models. The foundation of the approach is the use of domain-specific keywords and gates, which let the LM dynamically retrieve domain-specific knowledge. For future work, the Bosch researchers plan to investigate the effects on mixed-domain datasets and sequence-labeling tasks.
Check out the Paper. All credit for this research goes to the researchers on this project.
Khushboo Gupta is a consulting intern at MarktechPost. She is currently pursuing her B.Tech from the Indian Institute of Technology (IIT), Goa. She is passionate about the fields of Machine Learning, Natural Language Processing, and Web Development. She enjoys learning more about the technical field by participating in several challenges.