Saturday January 19, 2019

Google AI can focus on individual speakers in a crowd

The visual signal not only improves the speech separation quality significantly in cases of mixed speech, but, importantly, it also associates the separated, clean speech tracks with the visible speakers in the video

Google AI to identify speakers from crowd. Wikimedia Commons

Just as most smartphone cameras now allow users to focus on a single object among many, it may soon be possible to pick out individual voices in a crowd by suppressing all other sounds, thanks to a new Artificial Intelligence (AI) system developed by Google researchers.

This is an important development, as computers are not as good as humans at focusing their attention on a particular person in a noisy environment. Known as the cocktail party effect, the capability to mentally “mute” all other voices and sounds comes naturally to us humans.

Google AI will identify individual speakers now. Wikimedia Commons

However, automatic speech separation — separating an audio signal into its individual speech sources — remains a significant challenge for computers, Inbar Mosseri and Oran Lang, software engineers at Google Research, wrote in a blog post this week. In a new paper, the researchers presented a deep learning audio-visual model for isolating a single speech signal from a mixture of sounds such as other voices and background noise.

“In this work, we are able to computationally produce videos in which speech of specific people is enhanced while all other sounds are suppressed,” Mosseri and Lang said. The method works on ordinary videos with a single audio track, and all that is required from the user is to select the face of the person in the video they want to hear, or to have such a person be selected algorithmically based on context.


The researchers believe this capability can have a wide range of applications, from speech enhancement and recognition in videos, through video conferencing, to improved hearing aids, especially in situations where there are multiple people speaking. “A unique aspect of our technique is in combining both the auditory and visual signals of an input video to separate the speech,” the researchers said.

This will also help in speech enhancement. VOA

“Intuitively, movements of a person’s mouth, for example, should correlate with the sounds produced as that person is speaking, which in turn can help identify which parts of the audio correspond to that person,” they explained.
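The intuition in that quote can be shown with a toy example. The sketch below is not the researchers' deep learning model; it is a minimal illustration, assuming we already have a per-frame measurement of how open a speaker's mouth is and a per-frame loudness value for each candidate audio track. The function names and all sample values are hypothetical, made up for illustration only.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 samples")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant signal carries no correlation information
    return cov / sqrt(var_x * var_y)


def match_track_to_face(mouth_opening, track_energies):
    """Return the name of the track whose loudness best follows the mouth movement."""
    return max(
        track_energies,
        key=lambda name: pearson(mouth_opening, track_energies[name]),
    )


# Hypothetical per-frame values (assumed, not taken from the research).
mouth_opening = [0.1, 0.8, 0.9, 0.2, 0.1, 0.7, 0.8, 0.1]
track_energies = {
    "track_a": [0.2, 0.9, 1.0, 0.3, 0.2, 0.8, 0.9, 0.2],  # rises and falls with the mouth
    "track_b": [0.9, 0.2, 0.1, 0.8, 0.9, 0.3, 0.1, 0.9],  # loud while the mouth is closed
}

best = match_track_to_face(mouth_opening, track_energies)
print(best)
```

In this made-up data the first track is loud exactly when the mouth is open, so it is the one matched to the face. A real system has to learn far subtler cues from raw video and audio, which is where the deep learning model comes in.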

The visual signal not only improves the speech separation quality significantly in cases of mixed speech, but, importantly, it also associates the separated, clean speech tracks with the visible speakers in the video, the researchers said. (IANS)

Next Story

Google’s Censored China Search Engine Project Triggers Protests

Several Google employees, including former Senior Scientist Jack Poulson, resigned in September, citing lack of corporate transparency in the wake of the censored search engine project

The Google name is displayed outside the company's office in London, Britain. VOA

Google’s offices in the US, UK, Canada, India, Mexico, Chile, Argentina, Sweden, Switzerland, and Denmark witnessed renewed protests by human rights groups over its plan to re-enter China through a censored search application code-named “Project Dragonfly”.

The demonstrations were organised by a coalition of Chinese, Tibetan, Uighur, and human rights groups outside the tech giant’s offices. The Tibetan advocacy groups that were protesting included Free Tibet and the International Tibet Network.

“They fear that a censored search engine would lead to further oppression of the Tibetans, as filtered searches would erase terms such as ‘Tibet’ and ‘Tiananmen Square’ in line with the official narrative of the Chinese Communist Party,” Business Insider reported late on Friday.

The same concerns apply to Chinese citizens, including other oppressed minorities such as Uighur Muslims and Southern Mongolian people, the report added.

A Google logo is seen at the company’s headquarters in Mountain View, California. VOA

The Internet giant designed a censored version of its search engine for China to blacklist information about human rights, democracy, peaceful protest, and religion, in accordance with the country's strict censorship rules, which are enforced by its Communist Party government.

The dispute began in August 2018 when reports surfaced that Google staffers working on “Project Dragonfly” had been using a Beijing-based website to help develop blacklists for the censored search engine, which was designed to block out broad categories of information related to democracy, human rights, and peaceful protest, according to The Intercept.


Several Google employees, including former Senior Scientist Jack Poulson, resigned in September, citing lack of corporate transparency in the wake of the censored search engine project.

In December, Google was forced to shut down a data analysis system that it was using to develop the search engine, and the teams working on “Project Dragonfly” stopped gathering search queries from mainland China. (IANS)