Revolutionizing Speech Restoration: Stanford-Led Research Unveils High-Performance Neuroprosthesis for Unconstrained Communication
Speech brain-computer interfaces (BCIs) are an emerging technology with promising applications for restoring communication to people who have lost the ability to speak because of a disability. Early studies have shown promise, but decoding brain activity to communicate unrestricted sentences from a large vocabulary is still in its infancy.
To address this gap, a team of researchers from Stanford University, Washington University in St. Louis, the VA RR&D Center for Neurorestoration and Neurotechnology, Brown University, and Harvard Medical School recently presented a high-performance speech-to-text BCI that can decode unconstrained sentences from a large vocabulary at 62 words per minute. This rate greatly exceeds the communication rates of conventional technologies for people with paralysis. Using brain activity recordings from the BrainGate2 pilot clinical trial, the team first examined how the motor cortex organizes orofacial movement and speech production. They found that all studied movements were strongly tuned in area 6v.
The researchers then looked at how information about each movement was distributed across area 6v. The dorsal array carried more information about orofacial movements, while the ventral array gave the most accurate speech decoding. Even so, both 6v arrays carried rich information about every type of movement. Finally, they found that all speech articulators were adequately represented within 3.2 mm × 3.2 mm arrays. Next, they examined whether they could neurally decode full sentences in real time. They used custom machine learning techniques inspired by state-of-the-art speech recognition to train a recurrent neural network (RNN) that performs well with a minimal amount of neural data.
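To make the idea of an RNN decoder concrete, here is a minimal sketch of a recurrent network that turns a sequence of neural feature vectors into per-time-step probabilities over 39 phonemes. It is an illustration only, not the researchers' model: the simple Elman-style update, the tiny layer sizes, the random untrained weights, and the synthetic input are all assumptions made for this example.

```python
import math
import random

N_PHONEMES = 39   # phoneme count reported in the article
N_CHANNELS = 8    # hypothetical number of neural feature channels (kept tiny)
N_HIDDEN = 6      # hypothetical hidden-state size


def rand_matrix(rng, rows, cols, scale=0.1):
    """Random (rows x cols) weight matrix; weights are untrained."""
    return [[rng.gauss(0.0, scale) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vec):
    """Multiply a (rows x cols) matrix by a length-cols vector."""
    return [sum(w * v for w, v in zip(row, vec)) for row in matrix]


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def rnn_phoneme_probs(features, w_in, w_rec, w_out):
    """Run an Elman-style RNN over time; return per-step phoneme probabilities."""
    hidden = [0.0] * N_HIDDEN
    outputs = []
    for x_t in features:
        # New hidden state depends on the current input and the previous state.
        drive = [a + b for a, b in zip(matvec(w_in, x_t), matvec(w_rec, hidden))]
        hidden = [math.tanh(d) for d in drive]
        outputs.append(softmax(matvec(w_out, hidden)))
    return outputs


rng = random.Random(0)
w_in = rand_matrix(rng, N_HIDDEN, N_CHANNELS)
w_rec = rand_matrix(rng, N_HIDDEN, N_HIDDEN)
w_out = rand_matrix(rng, N_PHONEMES, N_HIDDEN)

# Synthetic stand-in for binned neural features; not real recordings.
features = [[rng.gauss(0.0, 1.0) for _ in range(N_CHANNELS)] for _ in range(20)]
probs = rnn_phoneme_probs(features, w_in, w_rec, w_out)
print(len(probs), len(probs[0]))  # 20 39
```

Because the weights are random, the probabilities here carry no meaning. A real decoder would be trained on recorded attempts at speech and would still need a further step to turn phoneme probabilities into words.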
On the team's data, the proposed method correctly decodes 92% of 50 words, 62% of 39 phonemes, and 92% of all orofacial movements. Furthermore, the speech-to-text BCI reaches 62 words per minute. In sum, the consistent and spatially intermixed tuning to all examined movements shows that the representation of speech articulation is strong enough to support a speech BCI despite paralysis and limited coverage of the cortical surface. Because area 44 provided little information about speech production, recordings from area 6v were used for further analysis.
The ability to speak and move can be severely compromised, if not lost entirely, in people with neurological conditions such as brainstem stroke or amyotrophic lateral sclerosis. People with paralysis can now type between eight and eighteen words per minute using BCIs based on hand movement activity. Speech BCIs show great promise but have yet to attain high accuracy on large vocabularies, which would greatly improve their ability to restore natural communication. Using microelectrode arrays to record brain activity at single-neuron resolution, the researchers developed a speech BCI that can decode unconstrained sentences from a large vocabulary at 62 words per minute. This is the first time a BCI has been shown to deliver much faster communication rates than other technologies for people with paralysis.
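For a rough sense of scale, the figures above can be compared directly. The short sketch below uses only the rates quoted in this article (62 words per minute for the speech BCI, 8 to 18 for hand-movement BCIs); the variable and function names are made up for illustration.

```python
SPEECH_BCI_WPM = 62            # rate reported for the speech BCI
HAND_BCI_WPM_RANGE = (8, 18)   # range quoted for hand-movement BCIs


def speedup(new_rate, old_rate):
    """How many times faster new_rate is than old_rate."""
    return new_rate / old_rate


low, high = HAND_BCI_WPM_RANGE
vs_fastest = speedup(SPEECH_BCI_WPM, high)
vs_slowest = speedup(SPEECH_BCI_WPM, low)
print(f"{vs_fastest:.2f}x to {vs_slowest:.2f}x")  # 3.44x to 7.75x
```

By this simple ratio, the reported speech BCI is roughly three and a half to nearly eight times faster than the hand-movement range cited here.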
This experiment demonstrates that neural spiking activity can be used to decode attempted speech from a large vocabulary. It should be noted, however, that the system is not yet complete enough for clinical use. More work is needed to make BCIs user-friendly by reducing the time required to train the decoder and by adapting to changes in brain activity over many days. In addition, more evidence of safety and effectiveness is needed before intracortical microelectrode arrays can be widely used in clinical settings. Furthermore, the decoding results demonstrated here need to be replicated in additional participants, and it is unclear whether they would apply to people with more severe orofacial paralysis. A further potential problem is anatomical variation: more research is required to confirm that the regions of the precentral gyrus carrying speech information can be reliably targeted across individuals whose brain anatomy differs.
Dhanshree Shenwai is a computer science engineer with experience at FinTech companies across the financial, cards and payments, and banking domains, and a keen interest in applications of AI. She is enthusiastic about exploring new technologies and advancements that make everyone's life easier.