The auditory system is a complex and fragile network of structures working together to perceive, process, and encode sound. Deciphering different phonetic sounds and classifying those units has been a long-standing research effort. In the article "Phonetic acquisition in cortical dynamics, a computational approach," Dematties and colleagues examine how linguistic units such as phonemes are encoded and classified from the complex acoustic streams found in speech, much as infants learn to pick individual words out of a continuous audio stream by recognizing patterns in speech. The research is carried out by generating 500 words of varying sounds and lengths and feeding them to a computational model, the CSTM, which simulates cortical tissue responding to the respective sounds. The model includes processes that simulate the growth of synapses on distal dendritic branches; these synapses form only on pyramidal cells and bias the activation of each respective neuron. This approach allows the researchers to control the levels of reverberation, noise, and pitch variation in the stimuli. Several algorithms are then used to activate the simulated auditory neurons so that phonemes, words, and sounds are correctly encoded. The study concludes that a computational simulation grounded in neurophysiological and neuroanatomical data from the human auditory pathway can mimic the incidental phonetic acquisition observed in human infants, a key mechanism in early language learning. The authors propose that these algorithms could be used to create more efficient and capable AI speech generators and programs that recognize or translate speech.
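To make the distal-dendrite idea more concrete, the toy Python sketch below illustrates the general principle. It is not the authors' actual CSTM code, and every size, weight, and parameter in it is an illustrative assumption: feed-forward acoustic input drives a sparse population of pyramidal-like cells, while synapses on distal dendrites only bias which cells win based on recent activity, rather than driving them outright.

```python
import numpy as np

rng = np.random.default_rng(0)

N_CELLS = 256      # toy population of pyramidal-like cells
SPARSITY = 0.05    # fraction of cells allowed to fire per time step
N_FEATURES = 64    # size of a hypothetical acoustic feature frame

# Feed-forward (proximal) weights: acoustic features -> cells
W_proximal = rng.random((N_CELLS, N_FEATURES))

# Distal (lateral) weights: previous activity -> current cells.
# These stand in for distal dendritic synapses that bias, but do not
# by themselves drive, each cell's activation.
W_distal = rng.random((N_CELLS, N_CELLS))

def step(acoustic_frame, previous_active, bias_strength=0.3):
    """One toy time step: feed-forward drive plus a distal-dendrite bias."""
    feedforward = W_proximal @ acoustic_frame
    distal_bias = W_distal @ previous_active
    score = feedforward + bias_strength * distal_bias

    # Winner-take-all sparsification: only the top-k cells become active,
    # so the distal bias tips the competition toward contextually
    # "predicted" cells without overriding the acoustic input.
    k = int(SPARSITY * N_CELLS)
    winners = np.argsort(score)[-k:]
    active = np.zeros(N_CELLS)
    active[winners] = 1.0
    return active

# Example: run a short stream of random feature frames through the toy model.
active = np.zeros(N_CELLS)
for _ in range(10):
    frame = rng.random(N_FEATURES)
    active = step(frame, active)
print(int(active.sum()), "cells active")  # about SPARSITY * N_CELLS
```

The point of the sketch is simply that the same acoustic frame can produce different sparse activity patterns depending on what came before, which is one way temporal context can shape how a phoneme is encoded.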
Using new technology and AI algorithms such as the ones Dematties and colleagues produced, neuroscientists could create brain-computer interfaces (BCIs) that translate the neurological and cortical language signals of people who are unable to speak into synthesized speech. Dematties' work could complement this research and potentially extend similar results to deaf patients as well. Toward this goal, Anumanchipalli et al. used a two-stage decoding approach: with an electrocorticography (ECoG) device and recurrent neural networks (RNNs), neural signals are first translated into representations of vocal-tract articulator movements, which are then transformed into spoken sentences. This two-stage approach resulted in less acoustic distortion than directly decoding acoustic features from neural activity. It has been argued that "If massive data sets spanning a wide variety of speech conditions were available, direct synthesis would probably match or outperform a two-stage decoding approach" (Pandarinath et al. 2019). With algorithms like those created by Dematties, direct synthesis becomes a more realistic possibility for AI-driven speech and auditory processing.
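The two-stage pipeline can be pictured with a short, schematic PyTorch sketch. This is not the published model; the layer sizes, feature counts, and class names below are illustrative assumptions only. It shows one recurrent network mapping ECoG signals to articulator kinematics and a second mapping those kinematics to acoustic features that a vocoder would then turn into audible speech.

```python
import torch
import torch.nn as nn

# Hypothetical feature sizes (the real study's dimensions differ).
N_ECOG_CHANNELS = 128   # electrodes on the cortical surface
N_ARTICULATORS = 33     # vocal-tract kinematic features (stage 1 output)
N_ACOUSTIC = 32         # acoustic features for a vocoder (stage 2 output)

class StageOneDecoder(nn.Module):
    """Neural signals -> vocal-tract articulator movements."""
    def __init__(self):
        super().__init__()
        self.rnn = nn.LSTM(N_ECOG_CHANNELS, 100, batch_first=True, bidirectional=True)
        self.out = nn.Linear(200, N_ARTICULATORS)

    def forward(self, ecog):                # ecog: (batch, time, channels)
        h, _ = self.rnn(ecog)
        return self.out(h)                  # (batch, time, articulators)

class StageTwoDecoder(nn.Module):
    """Articulator movements -> acoustic features (later vocoded to audio)."""
    def __init__(self):
        super().__init__()
        self.rnn = nn.LSTM(N_ARTICULATORS, 100, batch_first=True, bidirectional=True)
        self.out = nn.Linear(200, N_ACOUSTIC)

    def forward(self, kinematics):          # kinematics: (batch, time, articulators)
        h, _ = self.rnn(kinematics)
        return self.out(h)                  # (batch, time, acoustic features)

# Example: decode one second of fake ECoG sampled at 200 Hz.
ecog = torch.randn(1, 200, N_ECOG_CHANNELS)
kinematics = StageOneDecoder()(ecog)
acoustics = StageTwoDecoder()(kinematics)
print(acoustics.shape)  # torch.Size([1, 200, 32])
```

The design intuition behind splitting the problem is that articulator movement is an intermediate representation the cortex encodes more directly than raw sound, so each stage has an easier mapping to learn than a single network decoding audio straight from neural activity.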
Furthermore, with the development of BCIs driven by AI and computational analysis, this technology has also been considered for controlling arm and hand movements in humans with paralysis. Trials have successfully demonstrated that rapid communication, control of robotic arms, and restoration of sensation and movement in paralyzed limbs are possible with these BCIs.
References
Dematties D, Rizzi S, Thiruvathukal GK, Wainselboim A, Zanutto BS (2019) Phonetic acquisition in cortical dynamics, a computational approach. PLoS ONE 14(6): e0217966. https://doi.org/10.1371/journal.pone.0217966
Anumanchipalli GK, Chartier J, Chang EF (2019) Speech synthesis from neural decoding of spoken sentences. Nature 568: 493–498. https://doi.org/10.1038/s41586-019-1119-1