Researchers at MIT, Cornell, and McGill University created a new machine-learning model that, on its own, discovers linguistic rules that often match those created by human experts

The capacity of humans to develop theories about the world is a fundamental feature of intelligence. This ability is most visible in the recorded history of science, yet it also appears in more subtle ways in everyday cognition and during childhood development. Creating techniques to understand, and potentially even automate, the process of theory development is a central objective for both artificial intelligence and computational cognitive science.

For a long time, linguists believed it would be difficult to teach a machine to analyze speech sounds and word structure the way human researchers do. However, researchers from MIT, Cornell University, and McGill University have made progress in this area. They have demonstrated an AI system that can teach itself the grammar and phonological structures of a human language.

Given a set of words and examples of how those words change to express different grammatical functions (like tense, case, or gender) in one language, the machine-learning model comes up with rules that explain why the forms of those words change. To get better results, the model can also automatically learn higher-level linguistic patterns that apply to many languages.
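To make the task concrete, here is a minimal Python sketch of the kind of input and output involved. It is an illustration only, not the researchers' code: the toy data (English plurals in a rough phonemic spelling), the function names, and the single voicing rule are all assumptions chosen for the example.

```python
# Toy illustration (assumption: not the paper's data or representation).
# Stems and their observed plural forms, in a rough phonemic spelling.
OBSERVED = {
    "kat": "kats",  # cat  -> cats
    "dog": "dogz",  # dog  -> dogs
    "bak": "baks",  # back -> backs
    "bed": "bedz",  # bed  -> beds
}

VOICELESS = set("ptkfs")  # hypothetical, simplified class of voiceless sounds


def devoice_after_voiceless(form: str) -> str:
    """Phonological rule: z becomes s right after a voiceless consonant."""
    out = []
    for ch in form:
        if ch == "z" and out and out[-1] in VOICELESS:
            out.append("s")
        else:
            out.append(ch)
    return "".join(out)


def pluralize(stem: str) -> str:
    """Morphology adds the suffix -z; phonology then adjusts how it sounds."""
    return devoice_after_voiceless(stem + "z")


def explains_all(data: dict) -> bool:
    """True if this one-suffix, one-rule grammar predicts every observed form."""
    return all(pluralize(stem) == plural for stem, plural in data.items())
```

Here the "grammar" is one suffix plus one sound rule, and `explains_all(OBSERVED)` checks that it accounts for every word pair. The system described in the article has to discover such rules itself rather than being handed them.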

The researchers trained and evaluated the model on problems from linguistics textbooks that featured 58 different languages. Each problem contained a set of words and the corresponding word-form changes. For 60% of the problems, the model came up with a correct set of rules to describe those word-form changes.

This approach could be used to study linguistic hypotheses and to investigate subtle similarities in the way different languages transform words. It is particularly notable because the system learns models that people can readily understand, and it does so from small amounts of data, such as a few dozen words. Additionally, the system uses many small datasets rather than a single large one. This is closer to how scientists propose hypotheses: they look at multiple related datasets and develop models that explain phenomena across them.

In their effort to create an AI system that could automatically learn a model from multiple related datasets, the researchers chose to explore the interaction of phonology (the study of sound patterns) and morphology (the study of word structure).

Because many languages share core characteristics and textbook exercises are designed to highlight specific linguistic phenomena, data from linguistics textbooks made an excellent testbed. College students can solve textbook problems fairly easily, but they usually draw on prior knowledge of phonology from earlier lessons when reasoning about new problems.

The researchers used a machine-learning technique called Bayesian Program Learning to build a model that could learn a grammar, that is, a set of rules for assembling words. With this technique, the model solves a problem by writing a computer program.

In this case, the program is the grammar that the model considers the most likely explanation of the words and meanings in a linguistics problem. The researchers built the model using Sketch, a popular program synthesizer developed at MIT by Solar-Lezama.
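The idea of picking the "most plausible" grammar can be sketched in a few lines of Python. This is a deliberately simplified stand-in, not the researchers' method: the real system searches a large space of programs with Sketch, whereas this sketch scores four hand-written candidates. The toy data, the candidate list, the size-based prior, and the all-or-nothing likelihood are all assumptions made for illustration.

```python
import math

# Toy data (assumption): stems and plural forms in a rough phonemic spelling.
DATA = [("kat", "kats"), ("dog", "dogz"), ("bak", "baks"), ("bed", "bedz")]
VOICELESS = set("ptkfs")  # hypothetical, simplified sound class


def devoice(form):
    """Rule: z becomes s immediately after a voiceless consonant."""
    chars = list(form)
    for i in range(1, len(chars)):
        if chars[i] == "z" and chars[i - 1] in VOICELESS:
            chars[i] = "s"
    return "".join(chars)


def log_posterior(size, predict, data):
    """Log prior plus log likelihood for one candidate grammar."""
    # Assumed prior: every extra component halves the grammar's probability,
    # so shorter grammars are preferred.
    log_prior = -size * math.log(2)
    # Assumed likelihood: 1 if the grammar reproduces all the data, else 0.
    fits = all(predict(stem) == plural for stem, plural in data)
    log_likelihood = 0.0 if fits else -math.inf
    return log_prior + log_likelihood


# Each candidate: (description, size in components, function stem -> plural).
CANDIDATES = [
    ("suffix -s only", 1, lambda stem: stem + "s"),
    ("suffix -z only", 1, lambda stem: stem + "z"),
    ("suffix -z plus devoicing rule", 2, lambda stem: devoice(stem + "z")),
    ("memorize every plural", len(DATA), dict(DATA).get),
]

# The most plausible grammar is the one with the highest posterior score.
best = max(CANDIDATES, key=lambda c: log_posterior(c[1], c[2], DATA))
print(best[0])
```

The two single-suffix grammars are short but fail to reproduce the data, and memorizing every form fits the data but is penalized for its size. The suffix-plus-rule grammar wins because it is the shortest candidate that explains all the words.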

The researchers also tested whether the model could learn a set of general templates for phonological rules that could be applied across all problems.

In the future, the researchers hope to apply this approach to problems in other fields. They could also use the method in more situations where higher-level knowledge can be applied across related datasets.

This article is written as a research summary by Marktechpost Staff based on the research paper 'Synthesizing theories of human language with Bayesian program induction'. All credit for this research goes to the researchers on this project. Check out the paper and reference article.

Tanushree Shenwai is a consulting intern at MarktechPost. She is currently pursuing her B.Tech at the Indian Institute of Technology (IIT), Bhubaneswar. She is a data science enthusiast with a keen interest in the applications of artificial intelligence in various fields. She is passionate about exploring new advancements in technology and their real-life applications.

