IBM Open-Sources Label Sleuth: Allowing Users Without Machine Learning Knowledge to Create Unique Text Classification Models From Scratch

On Aug 10, 2022

AI-powered text analysis, including phrase completion, web translation, and text summarization, is becoming commonplace. But to adapt these models to new tasks, a domain expert must normally label fresh samples, and a machine learning specialist must train the new model. There aren’t enough machine learning specialists to meet the rising demand for custom models.

To address this, IBM, in collaboration with Notre Dame and the University of Texas, has released an open-source application called Label Sleuth that makes time-saving AI tools available to everyone. With it, individuals without prior experience in machine learning can create a custom text classification model from scratch.

Anyone tasked with digging through reams of text, from a lawyer looking for dangerous language in a contract to a historian looking for trends in a stack of records, can benefit from using Label Sleuth. Users may train a classifier to locate the words they’re looking for in a textual haystack by annotating the data with Label Sleuth.

Label Sleuth is designed to be intuitive and to learn the task at hand quickly. After receiving a few dozen examples of the text users want to isolate, it can begin offering suggestions that speed up the labeling process. Users can have a functional model in a matter of hours.

Once the model is operational, it is refined jointly by the human and the machine. The model suggests which text the user should label next to improve performance the most. It also flags examples that may have been labeled incorrectly so the user can review and possibly correct them. When the model makes a mistake, the user corrects it, and that feedback is needed less and less as time passes.
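The loop described above (suggest the next text to label, flag labels that look wrong) can be illustrated with a small sketch. This is not Label Sleuth's code or API: the `TinyNaiveBayes` class, the `suggest_next` and `flag_suspicious` helpers, the 0.8 threshold, and the sample sentences are all hypothetical, and uncertainty sampling is only one common way to choose the next example.

```python
import math
from collections import Counter


def tokenize(text):
    return text.lower().split()


class TinyNaiveBayes:
    """Hypothetical binary bag-of-words Naive Bayes with add-one smoothing."""

    def fit(self, texts, labels):
        # Assumes both classes (0 and 1) appear at least once in `labels`.
        self.word_counts = {0: Counter(), 1: Counter()}
        self.class_counts = Counter(labels)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = set(self.word_counts[0]) | set(self.word_counts[1])
        return self

    def prob_positive(self, text):
        """Estimated probability that `text` belongs to class 1."""
        log_odds = math.log(self.class_counts[1] / self.class_counts[0])
        totals = {c: sum(self.word_counts[c].values()) + len(self.vocab)
                  for c in (0, 1)}
        for word in tokenize(text):
            if word in self.vocab:  # unseen words carry no evidence
                p1 = (self.word_counts[1][word] + 1) / totals[1]
                p0 = (self.word_counts[0][word] + 1) / totals[0]
                log_odds += math.log(p1 / p0)
        return 1 / (1 + math.exp(-log_odds))


def suggest_next(model, unlabeled):
    """Uncertainty sampling: pick the text whose score is closest to 0.5."""
    return min(unlabeled, key=lambda t: abs(model.prob_positive(t) - 0.5))


def flag_suspicious(model, texts, labels, threshold=0.8):
    """Return texts where the model confidently disagrees with the label."""
    flagged = []
    for text, label in zip(texts, labels):
        p = model.prob_positive(text)
        disagrees = p >= threshold if label == 0 else p <= 1 - threshold
        if disagrees:
            flagged.append(text)
    return flagged


# Made-up example: 1 = indemnification clause, 0 = anything else.
labeled_texts = [
    "party shall indemnify the other party",
    "indemnify against all liability",
    "payment is due in thirty days",
    "notices must be sent by mail",
]
labels = [1, 1, 0, 0]
unlabeled = [
    "liability for payment",
    "sent by mail in thirty days",
    "shall indemnify against liability",
]

model = TinyNaiveBayes().fit(labeled_texts, labels)
next_text = suggest_next(model, unlabeled)
print(next_text)  # → liability for payment

# Simplification: check a label list with one deliberate error against the model.
noisy_labels = [1, 1, 0, 1]
print(flag_suspicious(model, labeled_texts, noisy_labels))
```

The suggested text mixes vocabulary from both classes, so the model is least sure about it and a human label there is the most informative. In a real tool the model would be retrained after each batch of new labels, which is why the need for feedback shrinks over time.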

Academic researchers can use Label Sleuth to manage their unlabeled text data more efficiently. They also intend to incorporate the tool into experiments on improving active learning and classification algorithms and on enhancing human-machine interaction.

IBM has also released the first software library to incorporate algorithms for reading and answering questions in more than 90 languages and to handle question-answering problems involving tables, images, and video.

In addition, IBM updated its top-performing semantic parser for Abstract Meaning Representation (AMR), which converts text into a data structure that captures the text’s meaning. This data structure describes the people, places, and events mentioned in the text and how they relate to one another, so software developers can build applications on top of the parser. Such applications could check the factual accuracy of computer-generated summaries or convert a question into a database query and return the answer.
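To give a feel for the kind of data structure an AMR parser produces, here is a minimal sketch that stores a hand-written AMR-style graph for the sentence "The boy wants to go" as a list of triples. It does not call IBM's parser; the triples, the sense labels such as `want-01`, and the helper names `concept_of` and `who_does` are illustrative assumptions.

```python
# Hand-written AMR-style graph for "The boy wants to go" (illustrative only).
# Each triple is (source variable, relation, target variable or concept).
triples = [
    ("w", "instance", "want-01"),
    ("b", "instance", "boy"),
    ("g", "instance", "go-02"),
    ("w", "ARG0", "b"),  # the boy is the one who wants
    ("w", "ARG1", "g"),  # what is wanted is the going
    ("g", "ARG0", "b"),  # the boy is also the one who goes
]


def concept_of(variable):
    """Look up the concept that a graph variable stands for."""
    for source, relation, target in triples:
        if source == variable and relation == "instance":
            return target
    raise KeyError(variable)


def who_does(concept):
    """Return the concepts filling the ARG0 (agent) role of `concept`."""
    agents = []
    for source, relation, target in triples:
        if relation == "ARG0" and concept_of(source) == concept:
            agents.append(concept_of(target))
    return agents


print(who_does("want-01"))  # → ['boy']
print(who_does("go-02"))    # → ['boy']
```

Because the same variable `b` fills a role in both events, an application can tell that the person who wants and the person who goes are the same entity, which is the kind of relation a fact-checking or question-answering tool would rely on.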

References:

https://research.ibm.com/blog/label-sleuth
https://www.label-sleuth.org/

Tanushree Shenwai is a consulting intern at MarktechPost. She is currently pursuing her B.Tech at the Indian Institute of Technology (IIT), Bhubaneswar. She is a data science enthusiast with a keen interest in the applications of artificial intelligence across various fields, and she is passionate about exploring new advancements in technology and their real-life applications.
