A Cornell University researcher has developed sonar glasses that “hear” you even when you don’t speak aloud. The eyeglass attachment uses tiny microphones and speakers to read the words you silently mouth, letting you pause or skip a music track, enter a passcode without touching your phone, or work on CAD models without a keyboard.
Cornell Ph.D. student Ruidong Zhang developed the system, which builds on a similar project the team created using a wireless earbud and, before that, on models that relied on cameras. The glasses form factor removes the need to face a camera or put something in your ear. “Most technology in silent-speech recognition is limited to a select set of predetermined commands and requires the user to face or wear a camera, which is neither practical nor feasible,” said Cheng Zhang, Cornell assistant professor of information science. “We’re moving sonar onto the body.”
The researchers say the system needs only a few minutes of training data (for example, reading a series of numbers) to learn a user’s speech patterns. Once trained, it sends sound waves across your face and picks up their echoes, sensing mouth movements and using a deep learning algorithm to analyze the echo profiles in real time “with about 95 percent accuracy.”
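For readers curious how a pipeline like that fits together, here is a minimal sketch of active acoustic sensing in this general style: emit a near-ultrasonic chirp, cross-correlate the microphone signal against it to produce an echo profile, then classify a short window of profiles with a small neural network. Every parameter and name below (the sample rate, chirp band, window size, and the `EchoClassifier` model) is an illustrative assumption, not a detail taken from the Cornell paper.

```python
# Illustrative sketch of an active acoustic sensing loop: emit an inaudible
# chirp, cross-correlate the received audio with it to get an "echo profile"
# (reflection intensity vs. round-trip delay), then classify a window of
# profiles with a tiny CNN. All values here are assumptions for illustration.
import numpy as np
import torch
import torch.nn as nn

SAMPLE_RATE = 48_000   # Hz; a common audio hardware rate (assumed)
CHIRP_LEN = 512        # samples per transmitted chirp (~10.7 ms, assumed)
PROFILE_LEN = 256      # delay bins kept per echo profile (assumed)
WINDOW = 32            # consecutive profiles fed to the classifier (assumed)
NUM_COMMANDS = 8       # e.g. play, pause, skip, digits... (assumed)

def make_chirp(f0=18_000.0, f1=21_000.0):
    """Linear frequency sweep in the near-ultrasonic band (hypothetical)."""
    t = np.arange(CHIRP_LEN) / SAMPLE_RATE
    k = (f1 - f0) / (CHIRP_LEN / SAMPLE_RATE)
    return np.sin(2 * np.pi * (f0 * t + 0.5 * k * t**2)).astype(np.float32)

def echo_profile(mic_frame, chirp):
    """Matched filter: correlation peaks mark reflection delays off the face;
    their pattern shifts as the mouth moves."""
    corr = np.correlate(mic_frame, chirp, mode="valid")
    return np.abs(corr[:PROFILE_LEN])

class EchoClassifier(nn.Module):
    """Tiny CNN over a (1, WINDOW, PROFILE_LEN) stack of echo profiles."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(32, NUM_COMMANDS),
        )

    def forward(self, x):
        return self.net(x)

if __name__ == "__main__":
    chirp = make_chirp()
    # Random noise stands in for real microphone capture on the glasses.
    frames = [np.random.randn(CHIRP_LEN + PROFILE_LEN - 1).astype(np.float32)
              for _ in range(WINDOW)]
    profiles = np.stack([echo_profile(f, chirp) for f in frames])
    x = torch.from_numpy(profiles)[None, None]  # (batch, chan, WINDOW, PROFILE_LEN)
    logits = EchoClassifier()(x)
    print("predicted command id:", int(logits.argmax()))
```

The cross-correlation step is the standard matched-filter trick for sonar-style ranging; the short calibration session the researchers describe would correspond to fine-tuning a classifier like this on a new user’s echo profiles.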
The system does this while offloading data processing wirelessly to your smartphone, allowing the accessory to remain small and unobtrusive. The current version offers around 10 hours of battery life for acoustic sensing. And since no data leaves your phone, the design sidesteps many privacy concerns. “We’re very excited about this system because it really pushes the field forward on performance and privacy,” said Cheng Zhang. “It’s small, low-power and privacy-sensitive, which are all important features for deploying new, wearable technologies in the real world.”
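As a rough illustration of that split, a sketch like the one below keeps the wearable side to capture-and-transmit only, with all decoding done locally on the phone and nothing uploaded to a server. The queue standing in for the wireless link, the frame sizes, and the function names are all hypothetical.

```python
# Minimal sketch of the device/phone split: the glasses only capture and
# transmit raw echo frames; the phone does all the heavy processing locally.
# The queue is a stand-in for a wireless (e.g. Bluetooth) link; every name
# and number here is an assumption for illustration.
import queue
import threading
import numpy as np

link = queue.Queue(maxsize=64)  # glasses -> phone radio link (stand-in)

def glasses_side(n_frames=100):
    """Lightweight capture loop: record a frame, send it, repeat."""
    for _ in range(n_frames):
        frame = np.random.randn(767).astype(np.float32)  # fake mic frame
        link.put(frame)
    link.put(None)  # end-of-stream marker

def phone_side():
    """All decoding stays on the phone; no frame is sent anywhere else."""
    while (frame := link.get()) is not None:
        profile = np.abs(frame[:256])  # placeholder for echo-profile + model
        # ...run the classifier locally and dispatch the recognized command...

t = threading.Thread(target=glasses_side)
t.start()
phone_side()
t.join()
```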
Privacy also comes into play in potential real-world uses. Ruidong Zhang suggests, for example, using it to control music playback (hands- and eyes-free) in a quiet library, or to dictate a message at a loud concert where standard voice input would fail. Perhaps the most exciting prospect is for people with some types of speech disabilities, who could use it to silently feed dialogue into a voice synthesizer that would then speak the words aloud.
If things go as planned, you’ll be able to get your hands on one someday. The team at Cornell’s Smart Computer Interfaces for Future Interactions (SciFi) Lab is exploring commercializing the technology through a Cornell funding program. It’s also looking into smart-glasses applications that track facial, eye and upper-body movements. “We think glass will be an important personal computing platform to understand human activities in everyday settings,” said Cheng Zhang.