Tokyo 2020: New Speech Translator

by Brian Beary

December 7, 2019 · 3 mins · Updated on July 7, 2021

As Japan gears up to host the 2020 Summer Olympics, the country expects a surge in visitors, from 28 million in 2017 to more than 40 million in the year of the Games. Some of them will inevitably fall ill or become injured and will seek medical treatment. Wherever they come from, most will likely be unable to converse in Japanese. This was a consideration for Fujitsu as it developed a hands-free, wearable speech translation device designed for use in hospital settings.

The Japanese information technology giant designed the product in partnership with Japan’s National Institute of Information and Communications Technology (NICT) and the University of Tokyo Hospital. The device is currently on trial at a medical institution in Japan, where it is used for conversations between patients and doctors, explained Masanao Suzuki, director of the Digital Transformation Group at Fujitsu Laboratories. The company aims to move to full commercialization in the coming months.

When asked what the biggest nuts to crack during the R&D phase were, Suzuki said:

“It was challenging to develop the technology to detect speech as well as the technology able to identify the speaker’s direction, based on the voice of the user. It was especially challenging to achieve high detection accuracy in noisy environments such as hospital reception areas and laboratories.”

Filtering out Background Noise

While the human brain is remarkably deft at filtering out background noise, anyone who has ever used a voice recorder knows that electronic devices often struggle to do so. And in a hospital setting, background noise is ever-present: chatter in reception areas, the whirring of fans, the beep of heart monitors, the click-clack of shoes down a corridor, the ping of an elevator. The list is endless.

Speech translation device (Courtesy of Fujitsu)

To overcome this challenge, the hands-free wearable device uses an L-shaped sound channel that dampens sounds arriving from directions other than the target direction. Fujitsu says the device has attained a 95 percent speech detection accuracy rate for conversations between healthcare providers and patients at a distance of about 80 cm.
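For readers curious how a device might tell who is speaking once each microphone favors one direction, here is a minimal sketch in Python. It is not Fujitsu's method, which the company has not detailed here. It assumes two hypothetical microphone channels, one favoring the wearer and one favoring the person in front, and simply compares their loudness frame by frame. The function names, the -40 dB speech floor, and the 6 dB margin are illustrative assumptions.

```python
import math


def rms_db(samples):
    """Root-mean-square level of a frame in decibels, relative to a full scale of 1.0."""
    if not samples:
        return float("-inf")
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square == 0.0:
        return float("-inf")
    return 10.0 * math.log10(mean_square)


def classify_frame(wearer_mic, front_mic, speech_floor_db=-40.0, margin_db=6.0):
    """Guess who is speaking in one audio frame.

    wearer_mic and front_mic are hypothetical channels: lists of samples in
    the range [-1.0, 1.0] from microphones that each favor one direction.
    The thresholds are assumed values, not figures from the article.
    """
    wearer_level = rms_db(wearer_mic)
    front_level = rms_db(front_mic)

    # Both channels quieter than the assumed speech floor: treat as silence.
    if max(wearer_level, front_level) < speech_floor_db:
        return "silence"

    # Attribute the frame to whichever channel is clearly louder.
    difference = wearer_level - front_level
    if difference >= margin_db:
        return "wearer"
    if difference <= -margin_db:
        return "front"
    return "ambiguous"
```

A frame that is clearly louder on the wearer's channel would be routed to translation from the wearer's language, and vice versa, which is one way a device could avoid button presses. A real product would also need to distinguish speech from loud non-speech noise, which this sketch does not attempt.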

Asked what makes their product different from others on the market, Suzuki said:

“Although previous devices required users to push buttons to begin and end translations and switch languages, our device features a hands-free function that does not require button operations.”

For now, the product works with three languages—English, Japanese, and Chinese. As for the addition of other languages, Suzuki said:

“This depends on the NICT as they developed the speech translation engine but I have heard that they plan to expand it to other Asian languages, as many Asians are expected to visit Japan for the Olympics.”