Google research lets sign language switch active speaker in video calls

An aspect of video calls that many of us take for granted is the way they can switch between feeds to highlight whoever's speaking.

Silent speech like sign language doesn't trigger those algorithms, unfortunately, but this research from Google might change that.

It's a real-time sign language detection engine that can tell when someone is signing and when they're done.

Of course it's trivial for humans to tell this sort of thing, but it's harder for a video call system that's used to just pushing pixels.

A new paper from Google researchers, presented at ECCV, shows how it can be done efficiently and with very little latency. It would defeat the point if the sign language detection worked but resulted in delayed or degraded video, so their goal was to make sure the model was both lightweight and reliable.

The system first runs the video through a model called PoseNet, which estimates the positions of the body and limbs in each frame. This simplified visual information is sent to a model trained on pose data from video of people using German Sign Language, and it compares the live image to what it thinks signing looks like.
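
To make the two-stage idea concrete, here is a minimal sketch in Python. It is not the researchers' code: it assumes the pose estimator has already produced a list of (x, y) keypoints for each frame, and it swaps the trained model for a simple motion threshold. The function names and the threshold value are hypothetical, chosen only for illustration.

```python
from math import hypot

def frame_motion(prev, curr):
    """Mean distance each keypoint moved between two consecutive frames."""
    total = sum(hypot(cx - px, cy - py)
                for (px, py), (cx, cy) in zip(prev, curr))
    return total / len(curr)

def detect_signing(frames, threshold=0.05):
    """Return one flag per frame transition: True when the pose moved
    more than `threshold`. This stands in for the trained classifier
    described in the article; the threshold is an assumed value."""
    return [frame_motion(a, b) > threshold
            for a, b in zip(frames, frames[1:])]

# Two keypoints per frame, in normalized image coordinates (assumed).
still = [(0.5, 0.5), (0.4, 0.6)]
moved = [(0.5, 0.5), (0.6, 0.4)]
flags = detect_signing([still, still, moved])
```

The point of the sketch is that the second stage only ever sees a handful of coordinates per frame rather than full images, which is what keeps the approach lightweight.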

This simple process already produces 80 percent accuracy in predicting whether a person is signing or not, and with some additional optimizing gets up to 91.5 percent accuracy.

Right now it's just a demo, which you can try here, but there doesn't seem to be any reason why it couldn't be built right into existing video call systems or even as an app that piggybacks on them.

Author: Devin Coldewey
