Researchers Look to Expand Automatic Speech Recognition to 2,000 Languages

A team of researchers at Carnegie Mellon University is looking to expand automatic speech recognition to 2,000 languages. Right now, only a portion of the estimated 7,000 to 8,000 languages spoken around the world benefit from modern language technologies like voice-to-text transcription or automatic captioning.

Xinjian Li is a Ph.D. student in the School of Computer Science’s Language Technologies Institute (LTI).

“A lot of people in this world speak diverse languages, but language technology tools aren’t being developed for all of them,” he said. “Developing technology and a good language model for all people is one of the goals of this research.”

Li belongs to a team of experts looking to simplify the data requirements a language must meet before a speech recognition model can be developed for it.

The team also includes LTI faculty members Shinji Watanabe, Florian Metze, David Mortensen and Alan Black.

The research, titled “ASR2K: Speech Recognition for Around 2,000 Languages Without Audio,” was presented at Interspeech 2022 in South Korea.

Most existing speech recognition models require both text and audio data sets. While text data exists for thousands of languages, the same is not true for audio. The team wants to eliminate the need for audio data by focusing on linguistic elements that are common across many languages.

Speech recognition technologies normally focus on a language’s phonemes, which are distinct sounds that distinguish it from other languages. These are unique to each language. At the same time, languages have phones that describe how a word sounds physically, and multiple phones can correspond to a single phoneme. While separate languages can have different phonemes, the underlying phones could be the same.
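The many-to-one relationship between phones and phonemes can be illustrated with a small Python sketch. The tables below are hand-written, heavily simplified assumptions for illustration only (they are not taken from the ASR2K paper or its data): in English, aspirated [pʰ] and plain [p] are variants of one phoneme, while Hindi treats them as two separate phonemes. The function names are hypothetical.

```python
# Toy phone-to-phoneme tables. These are simplified, illustrative
# assumptions, not data from the ASR2K research.
PHONE_TO_PHONEME = {
    # English: aspirated and plain stops belong to the same phoneme.
    "english": {"p": "p", "pʰ": "p", "k": "k", "kʰ": "k"},
    # Hindi: aspiration is contrastive, so each phone is its own phoneme.
    "hindi": {"p": "p", "pʰ": "pʰ", "k": "k", "kʰ": "kʰ"},
}


def phones_to_phonemes(phones, language):
    """Map a sequence of phones to the phonemes of one language."""
    table = PHONE_TO_PHONEME[language]
    return [table[phone] for phone in phones]


def shared_phones(lang_a, lang_b):
    """Return the phones that appear in both languages' tables."""
    return set(PHONE_TO_PHONEME[lang_a]) & set(PHONE_TO_PHONEME[lang_b])


if __name__ == "__main__":
    print(phones_to_phonemes(["pʰ", "p"], "english"))
    print(phones_to_phonemes(["pʰ", "p"], "hindi"))
    print(sorted(shared_phones("english", "hindi")))
```

In this toy setup the two languages share all four phones even though their phoneme inventories differ, which is the kind of overlap a phone-based model can take advantage of.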

The team is working on a speech recognition model that relies less on phonemes and more on information about how phones are shared between languages. This reduces the effort needed to build a separate model for each individual language. The model is paired with a phylogenetic tree, a diagram that maps the relationships between languages, which helps with pronunciation rules. Together, the model and the tree structure have enabled the team to approximate the speech model for thousands of languages even without audio data.
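One way to picture how a language tree could help is a nearest-relative lookup: when a language has no data of its own, borrow information from the closest language in the tree that does. The Python sketch below is only an illustration of that general idea under stated assumptions, not the team’s actual method. The tiny tree, the partial phone inventories, and all function names are hypothetical.

```python
# Toy language tree: each node points to its parent (None marks the root).
# The tree is deliberately tiny and only illustrative.
PARENT = {
    "spanish": "romance",
    "catalan": "romance",
    "romance": "indo-european",
    "german": "germanic",
    "germanic": "indo-european",
    "indo-european": None,
}

# Partial, toy phone inventories for languages assumed to have data.
KNOWN_INVENTORY = {
    "spanish": {"a", "e", "i", "o", "u"},
    "german": {"a", "e", "i", "o", "u", "y", "ø"},
}


def path_to_root(language):
    """Return the list of nodes from a language up to the root."""
    path = []
    while language is not None:
        path.append(language)
        language = PARENT[language]
    return path


def tree_distance(lang_a, lang_b):
    """Count the edges between two nodes via their closest common ancestor."""
    depth_in_b = {node: i for i, node in enumerate(path_to_root(lang_b))}
    for i, node in enumerate(path_to_root(lang_a)):
        if node in depth_in_b:
            return i + depth_in_b[node]
    raise ValueError("languages are not in the same tree")


def borrow_inventory(target):
    """Pick the closest language with known data and reuse its inventory."""
    donor = min(KNOWN_INVENTORY, key=lambda lang: tree_distance(target, lang))
    return donor, KNOWN_INVENTORY[donor]


if __name__ == "__main__":
    donor, inventory = borrow_inventory("catalan")
    print(donor, sorted(inventory))
```

Here Catalan has no inventory of its own, so the lookup falls back to Spanish, its sibling in the toy tree, rather than the more distant German.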

“We are trying to remove this audio data requirement, which helps us move from 100 to 200 languages to 2,000,” Li said. “This is the first research to target such a large number of languages, and we’re the first team aiming to expand language tools to this scope.”

The research, while still in an early stage, has improved existing language approximation tools by 5%.

“Each language is a very important factor in its culture. Each language has its own story, and if you don’t try to preserve languages, those stories might be lost,” Li said. “Developing this kind of speech recognition system and this tool is a step to try to preserve those languages.”
