Research could bring automatic speech recognition to 2,000 languages | ScienceDaily

Only a fraction of the 7,000 to 8,000 languages spoken around the world benefit from modern language technologies like voice-to-text transcription, automatic captioning, instantaneous translation and voice recognition. Carnegie Mellon University researchers want to expand the number of languages with automatic speech recognition tools available to them from around 200 to potentially 2,000.

“A lot of people in this world speak diverse languages, but language technology tools aren’t being developed for all of them,” said Xinjian Li, a Ph.D. student in the School of Computer Science’s Language Technologies Institute (LTI). “Developing technology and a good language model for all people is one of the goals of this research.”

Li is part of a research team aiming to simplify the data requirements for building a speech recognition model for a language. The team, which also includes LTI faculty members Shinji Watanabe, Florian Metze, David Mortensen and Alan Black, presented their most recent work, “ASR2K: Speech Recognition for Around 2,000 Languages Without Audio,” at Interspeech 2022 in South Korea.

Most speech recognition models require two data sets: text and audio. Text data exists for thousands of languages. Audio data does not. The team hopes to eliminate the need for audio data by focusing on linguistic elements common across many languages.

Historically, speech recognition technologies have focused on a language’s phonemes. These distinct sounds that distinguish one word from another, like the “d” that differentiates “dog” from “log” and “cog,” are unique to each language. But languages also have phones, which describe how a word sounds physically. Multiple phones might correspond to a single phoneme. So even though separate languages may have different phonemes, their underlying phones could be the same.
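The many-to-one relationship between phones and phonemes can be illustrated with a small Python sketch. The tables below are hand-written for illustration and are not taken from the ASR2K system; the function names are hypothetical. They rely on one well-known contrast: English treats aspirated [pʰ] and plain [p] as the same phoneme, while Hindi treats them as two different phonemes.

```python
# Illustrative, hand-written tables; not data from the ASR2K system.
# Keys are phones (physical sounds), values are the phoneme each phone
# belongs to in that language.
PHONE_TO_PHONEME = {
    "eng": {"p": "p", "pʰ": "p"},    # English: aspiration is not contrastive
    "hin": {"p": "p", "pʰ": "pʰ"},   # Hindi: aspiration is contrastive
}


def phones_to_phonemes(lang, phones):
    """Map a sequence of phones to the phonemes of one language."""
    table = PHONE_TO_PHONEME[lang]
    return [table[phone] for phone in phones]


def shared_phones(lang_a, lang_b):
    """Return the phones that occur in both languages' tables."""
    return set(PHONE_TO_PHONEME[lang_a]) & set(PHONE_TO_PHONEME[lang_b])


if __name__ == "__main__":
    print(phones_to_phonemes("eng", ["pʰ", "p"]))
    print(phones_to_phonemes("hin", ["pʰ", "p"]))
    print(shared_phones("eng", "hin"))
```

The two languages share the same phones in this toy example even though their phoneme systems differ, which is the property a phone-based recognizer could exploit.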

The LTI team is developing a speech recognition model that moves away from phonemes and instead relies on information about how phones are shared between languages, thereby reducing the effort to build separate models for each language. Specifically, it pairs the model with a phylogenetic tree, a diagram that maps the relationships between languages, to help with pronunciation rules. Through their model and the tree structure, the team can approximate the speech model for thousands of languages without audio data.
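The article does not describe how the tree is actually used, so the following Python sketch shows only one plausible reading: estimate the phone inventory of a language that has no audio by voting among its closest relatives in a language tree. Every name, inventory and parameter here (`PARENT`, `KNOWN_INVENTORY`, `k`, `min_votes`) is hypothetical, and the voting rule is an assumption rather than the team's method.

```python
from collections import Counter

# Toy tree stored as child -> parent. All names are hypothetical.
PARENT = {
    "family": None,
    "branch_1": "family",
    "branch_2": "family",
    "lang_a": "branch_1",
    "lang_b": "branch_1",
    "lang_c": "branch_2",
    "lang_x": "branch_1",  # target language: no audio, inventory unknown
}

# Made-up phone inventories for the languages assumed to be documented.
KNOWN_INVENTORY = {
    "lang_a": {"a", "i", "k", "t", "s"},
    "lang_b": {"a", "i", "k", "t", "m"},
    "lang_c": {"a", "u", "p", "t", "n"},
}


def path_to_root(node):
    """Return the list of nodes from `node` up to the root, inclusive."""
    path = []
    while node is not None:
        path.append(node)
        node = PARENT[node]
    return path


def tree_distance(a, b):
    """Count the edges on the path between two nodes of the tree."""
    depth_from_b = {node: depth for depth, node in enumerate(path_to_root(b))}
    for depth_from_a, node in enumerate(path_to_root(a)):
        if node in depth_from_b:  # first common ancestor
            return depth_from_a + depth_from_b[node]
    raise ValueError("nodes are not in the same tree")


def approximate_inventory(target, k=2, min_votes=2):
    """Keep phones found in at least `min_votes` of the k nearest relatives."""
    nearest = sorted(
        KNOWN_INVENTORY,
        key=lambda lang: (tree_distance(target, lang), lang),
    )[:k]
    votes = Counter(phone for lang in nearest for phone in KNOWN_INVENTORY[lang])
    return {phone for phone, count in votes.items() if count >= min_votes}


if __name__ == "__main__":
    print(sorted(approximate_inventory("lang_x")))
```

With the defaults, the two nearest relatives of `lang_x` are `lang_a` and `lang_b`, so the estimate keeps only the phones those two share. Loosening `min_votes` or raising `k` trades precision for coverage.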

“We are trying to remove this audio data requirement, which helps us move from 100 or 200 languages to 2,000,” Li said. “This is the first research to target such a large number of languages, and we are the first team aiming to expand language tools to this scope.”

Still in an early stage, the research has improved existing language approximation tools by a modest 5%, but the team hopes it will serve as inspiration not only for their future work but also for that of other researchers.

For Li, the work means more than making language technologies available to all. It's about cultural preservation.

“Each language is a very important factor in its culture. Each language has its own story, and if you do not try to preserve languages, those stories might be lost,” Li said. “Developing this kind of speech recognition system and this tool is a step to try to preserve those languages.”

Story Source:

Materials provided by Carnegie Mellon University. Original written by Aaron Aupperlee. Note: Content may be edited for style and length.
