Tuesday, October 20, 2020

Google develops human-like text-to-speech AI

Google's engineers did not reveal much information, but they left a big clue for developers to figure out how far the company has come in developing this system.

  • Google is developing text-to-speech AI as part of its “AI first” push.
  • It will also be able to mimic human voices.
  • Not much has been revealed, but it is safe to say that this could be a big success for Google.

In a major step towards its “AI first” dream, Google has developed a text-to-speech artificial intelligence (AI) system that will confuse you with its human-like articulation.

The tech giant’s text-to-speech system, called “Tacotron 2,” delivers AI-generated speech that almost matches the human voice, technology news website Inc.com reported.

At the Google I/O 2017 developers conference, the company’s Indian-origin CEO Sundar Pichai announced that the internet giant was shifting its focus from mobile-first to “AI first” and launched several products and features, including Google Lens, Smart Reply for Gmail and Google Assistant for iPhone.

Google's CEO, Sundar Pichai.

According to a paper published on arXiv.org, the system first creates a spectrogram from the text, a visual representation of how the speech should sound.

That image is put through Google’s existing WaveNet algorithm, which turns it into audio and brings AI closer than ever to mimicking human speech indistinguishably. The algorithm can easily learn different voices and even generates artificial breaths.

“Our model achieves a mean opinion score (MOS) of 4.53 comparable to a MOS of 4.58 for professionally recorded speech,” the researchers were quoted as saying.
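For readers unfamiliar with the metric, a mean opinion score is conventionally the arithmetic mean of listener ratings on a 1-to-5 scale. The sketch below shows that arithmetic only. The function name `mean_opinion_score` and the ratings are hypothetical, invented for illustration; they are not data from the paper, and the researchers' actual evaluation procedure may differ.

```python
from statistics import mean


def mean_opinion_score(ratings):
    """Return the arithmetic mean of listener ratings on a 1-5 scale."""
    if not ratings:
        raise ValueError("at least one rating is required")
    for rating in ratings:
        if not 1 <= rating <= 5:
            raise ValueError(f"rating out of range: {rating}")
    return mean(ratings)


# Hypothetical ratings, made up for this example (not from the paper).
synthetic_ratings = [5, 4, 5, 4.5, 4]
recorded_ratings = [5, 5, 4, 4.5, 4.5]

print("synthetic MOS:", round(mean_opinion_score(synthetic_ratings), 2))
print("recorded MOS:", round(mean_opinion_score(recorded_ratings), 2))
```

The smaller the gap between the two averages, the harder listeners found it to tell synthetic speech from a recording.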

On the basis of its audio samples, Google claimed that “Tacotron 2” can detect from context the difference between the noun “desert” and the verb “desert,” as well as the noun “present” and the verb “present,” and alter its pronunciation accordingly.

It can place emphasis on capitalised words and apply the proper inflection when asking a question rather than making a statement, the company said in the paper.

Meanwhile, Google’s engineers did not reveal much information, but they left a big clue for developers to figure out how far the company has come in developing this system.

According to the report, each of the ‘.wav’ file samples has a filename containing either the term “gen” or “gt.”

Based on the paper, it’s highly probable that “gen” indicates speech generated by Tacotron 2 and “gt” is real human speech. (“GT” likely stands for “ground truth,” a machine learning term that basically means “the real deal”.) IANS
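A curious reader could sort the samples by that naming clue with a few lines of code. The sketch below assumes the reading given above (“gen” for generated, “gt” for ground truth), which the report calls probable rather than confirmed. The function name `classify_sample` and the filenames are hypothetical, chosen only for illustration.

```python
import re


def classify_sample(filename):
    """Guess a sample's origin from a 'gen' or 'gt' token in its filename."""
    # Drop the extension, then split the rest on any non-alphanumeric run.
    stem = filename.rsplit(".", 1)[0].lower()
    tokens = re.split(r"[^a-z0-9]+", stem)
    # Match whole tokens so that a word like "general" is not mistaken for "gen".
    if "gen" in tokens:
        return "generated"
    if "gt" in tokens:
        return "ground truth"
    return "unknown"


# Hypothetical filenames, not the actual sample names.
for name in ["sample_01_gen.wav", "sample_01_gt.wav", "general_notes.wav"]:
    print(name, "->", classify_sample(name))
```

Matching whole tokens rather than substrings is a design choice to avoid false positives; it would need adjusting if the real filenames use a different separator scheme.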
