You’re watching a show on Netflix late at night when you realize you want to grab a snack. The room is dark, so you feel around for the remote. Maybe you even press the wrong button while trying to pause and end up fast-forwarding instead.
Now imagine another way. As soon as you get up from the couch, the movie simply pauses. And once you sit back down, it resumes.
This is how seamlessly the world would work if only the computers all around us could automatically understand our implicit intent, rather than our explicit mouse clicks, screen taps, and even voice commands. It’s a vision technologists have dreamed of for decades, sometimes called “ubiquitous computing” or “ambient computing” or even “silent computing.” And now Google is trying to make it happen by building computers that understand social norms around personal space.
“We’re really inspired by how people understand each other,” says Leonardo Giusti, design lead at Google’s ATAP (Advanced Technology & Projects) lab. “When you walk behind someone, they hold the door open for you. When you reach for something, it’s handed to you. As human beings, we often intuitively understand each other without saying a word.”
At ATAP, design researchers have supercharged their Soli radar sensor to understand the social nuance embedded in our daily movements. Soli sees your body as nothing more than a blob. But through the right lens, this blob has inertia, posture, and gaze, all things we constantly assess when interacting with other people.
Soli first debuted as a way to track gestures in the air, and landed in Pixel phones to let you do things like scroll through songs with a wave of the hand: technologically intriguing, but practically useless. Then Google integrated Soli into its Nest Hub, where it could track your sleep by detecting when you lay down, how much you tossed and turned, and even your breathing rate. It was a promising use case. But it was also exactly one use case for Soli, and a highly situational, context-dependent one.
ATAP’s new demos take Soli’s capabilities to new extremes. Early iterations of Soli swept a few feet. Now it can scan a small room. Then, using a stack of algorithms, it can let a phone, thermostat, or Google smart display read your body language like another person would, to anticipate what you might like next.
“As this technology becomes more and more present in our lives, it’s only fair to ask technology to take inspiration from us a little more,” says Giusti. “The same way a partner would tell you to take an umbrella on the way out [the door on a rainy day], a thermostat near the door could do the same thing.”
Google isn’t quite there yet, but it has trained Soli to understand something key: “We imagined that people and devices can have personal space,” says Giusti, “and the overlap between those spaces can give a good understanding of the type of engagement and the social relationship between them, at a given time.”
If I walked up to you, stood a yard away, and made eye contact, you would know I wanted to talk to you, so you would straighten up and acknowledge my presence. Soli can understand this, and the ATAP team proposes that a Google hub of the future could show the time while I was far away, but wait to show my email until I got closer and gave it my attention.
How does it know how close is close enough to matter? For this, the team draws on the social research of Edward T. Hall, who proposed the concept of proxemics in his book The Hidden Dimension. With proxemics, he was the first to describe the social context of the space around our bodies: that we view anything within 18 inches as personal space, anything within 12 feet as social space for chatting, and anything beyond that as simply public space, to be shared by all without any expectations.
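To make those thresholds concrete, here is a minimal sketch of how distance bands like these could be turned into a display decision. It is an illustration only, not ATAP’s software: the names `Zone`, `classify_zone`, and `choose_display` are hypothetical, the 18-inch and 12-foot cutoffs are the figures quoted above, and the rule of showing email only to someone who is both nearby and paying attention is an assumption based on the hub example described earlier.

```python
from enum import Enum

# Cutoffs as quoted in the article (Hall's proxemics, simplified).
PERSONAL_LIMIT_INCHES = 18        # within 18 inches: personal space
SOCIAL_LIMIT_INCHES = 12 * 12     # within 12 feet: social space


class Zone(Enum):
    PERSONAL = "personal"
    SOCIAL = "social"
    PUBLIC = "public"


def classify_zone(distance_inches: float) -> Zone:
    """Map a distance from the device to one of the three zones."""
    if distance_inches < 0:
        raise ValueError("distance must be non-negative")
    if distance_inches <= PERSONAL_LIMIT_INCHES:
        return Zone.PERSONAL
    if distance_inches <= SOCIAL_LIMIT_INCHES:
        return Zone.SOCIAL
    return Zone.PUBLIC


def choose_display(distance_inches: float, is_attending: bool) -> str:
    """Hypothetical policy: show email only to a nearby, attentive person.

    Anyone in the public zone, or anyone not looking at the device,
    just sees the time.
    """
    zone = classify_zone(distance_inches)
    if zone is not Zone.PUBLIC and is_attending:
        return "email"
    return "time"


if __name__ == "__main__":
    # Someone a yard away and looking at the hub.
    print(classify_zone(36).value, choose_display(36, True))
    # Someone across a large room, also looking.
    print(classify_zone(200).value, choose_display(200, True))
```

A real sensor would report noisy distances, so a shipping system would presumably smooth its readings and avoid flickering between zones at the boundaries; the sketch ignores that.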
ATAP codifies this research in its own software with a few specific movements. It can understand that you are “approaching,” and pull up UI elements that may have been off-screen earlier. But walking up to a device, and getting the device to recognize you, is a relatively simple task.
So the team also trained the system to understand that if you walk near a screen without looking in its direction, you aren’t interested in seeing more of it. They call it “the pass.” And that would mean my Nest thermostat wouldn’t turn on every time I walked past it.
They also trained Soli to read which direction you’re facing, even when you’re standing nearby. They call it “turning,” and it already has a killer use case for anyone who’s tried cooking a recipe from a YouTube clip.
“Let’s say you’re cooking in the kitchen, watching a tutorial for a new recipe. In this case, [turning] can be taken into account, maybe pausing a video when you go to pick up an ingredient and then resuming when you’re back,” says Lauren Bedal, Head of Design at Google ATAP. “The kitchen is a great example, as your hands can get wet or messy.”
Finally, the system can tell if you are “glancing.” The iPhone already tracks your face to see whether you’re looking at it, unlocking the screen or automatically showing messages in response. That absolutely counts as a glance, but ATAP is considering the gesture on a grander scale, like glancing at someone across the room. Bedal suggests that if you were talking on the phone and glanced across the room at your tablet, the tablet might automatically pop up the option to make it a video call. But I imagine the possibilities are much wider. For example, if your oven knew you were glancing at it, it might also know when you were talking to it. The glance could work much as it does between humans, signaling that someone suddenly has your attention.
Breaking down Google ATAP’s approach to the future of computing, I find myself with mixed feelings. On the one hand, I have interviewed many of the researchers at Xerox PARC and associated technology labs who championed pervasive computing in the late ’80s. Back then, integrating technology more naturally into our environments was a humanistic path forward for a world that had already been sucked into tiny screens and command lines. And I still want to see that future play out. On the other hand, Google and its ad tracking products are largely responsible for our modern, always-on state of surveillance capitalism. Any quieter, more integrated computing age would only embed this state more deeply into our lives. Google, like Apple, is imposing more restrictions on this front. But 82% of Google’s revenue came from ads in 2021, and to expect that to change overnight would be naïve.
As for when we might see these features in the devices Google actually ships, ATAP makes no promises. But I wouldn’t dismiss the work as a purely academic exercise. Under VP of Design Ivy Ross, Google’s hardware division has been weaving hard tech into our plush home environments for years. We’ve seen Google build charming fabric-covered smart speakers and measure the impact of different living rooms on a person’s stress level. These aren’t disparate products but parts of a more unified vision that we’re only beginning to glimpse: while most tech companies try to lay claim to the metaverse, Google seems content to take over our universe.