Don’t leave the ecosystem

Upgraded data protection and reduced reliance on the cloud can lock users in.

Since the iPhone’s debut, much of the intelligence in smartphones has come from somewhere else: the corporate computers known as the cloud. Mobile apps send user data to the cloud for useful tasks, such as transcribing speech or suggesting message replies. Now Apple and Google say smartphones are smart enough to perform some critical and sensitive machine learning tasks on their own.

At Apple’s WWDC conference this month, the company said Siri, its virtual assistant, will transcribe speech in some languages on recent and future iPhones and iPads without using the cloud. At its I/O developer conference last month, Google said the latest version of its Android operating system includes a feature designed specifically for handling sensitive data on the device, called the Private Compute Core. Its initial uses include powering a version of the Smart Reply feature in the company’s mobile keyboard, which suggests responses to incoming messages.

Both Apple and Google say on-device machine learning offers more privacy and faster apps. Not transmitting personal data reduces the risk of exposure and saves the time spent waiting for data to traverse the internet. At the same time, keeping data on devices serves the tech giants’ long-term interest in tethering consumers to their ecosystems: people may be more willing to share data once they hear it can be processed more privately.

The companies’ latest push toward machine learning on devices follows years of technical work to limit how much data their own cloud systems can “see.”

In 2014, Google began collecting some data on Chrome browser usage through a technique called “differential privacy,” which adds noise to the data collected to limit what those samples can reveal about any individual. Apple uses the technique on data gathered from phones to inform emoji and typing predictions, and for web browsing data.
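To make the idea concrete, here is a minimal sketch of randomized response, one of the simplest local differential privacy mechanisms; it is an illustration of the general technique, not the specific algorithm Google or Apple ships. Each device adds coin-flip noise to a single yes/no report before it leaves the phone, so any individual answer is deniable while the population-level rate stays estimable.

```python
import random

def randomized_response(truth: bool, p_truth: float = 0.75) -> bool:
    """Report the true bit with probability p_truth; otherwise report a fair coin flip."""
    if random.random() < p_truth:
        return truth
    return random.random() < 0.5

def estimate_rate(reports: list, p_truth: float = 0.75) -> float:
    """Undo the noise in aggregate: E[observed] = p_truth * rate + (1 - p_truth) * 0.5."""
    observed = sum(reports) / len(reports)
    return (observed - (1 - p_truth) * 0.5) / p_truth

# Simulate 100,000 users, 30% of whom have the sensitive attribute.
reports = [randomized_response(random.random() < 0.3) for _ in range(100_000)]
print(estimate_rate(reports))  # ~0.3, even though no single report can be trusted
```

Production systems such as the one Chrome introduced in 2014 use more elaborate encodings, but the trade-off is the same: more noise means stronger deniability for individuals and blurrier aggregate statistics.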

More recently, both companies have adopted a technique called federated learning. It allows a cloud-based machine learning system to be updated without scooping up raw data; instead, individual devices process data locally and share only digested updates. As with differential privacy, the companies have described using federated learning only in limited circumstances. Google has used the technique to keep its mobile typing predictions abreast of language trends; Apple has published research on using it to update speech recognition models.
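The core loop is easy to sketch. Below is a toy version of the federated averaging idea for a linear model, with hypothetical function names; real deployments add client sampling, secure aggregation, and update compression, and typically weight clients by how much data they hold.

```python
import numpy as np

def local_update(weights, X, y, lr=0.1, epochs=5):
    """One device's side: train on local data, return only the weight delta.
    The raw examples (X, y) never leave the device."""
    w = weights.copy()
    for _ in range(epochs):
        grad = X.T @ (X @ w - y) / len(y)  # gradient of mean squared error
        w -= lr * grad
    return w - weights

def federated_round(weights, clients):
    """The server's side: average the deltas from participating devices.
    (Equal weighting across clients, for simplicity.)"""
    deltas = [local_update(weights, X, y) for X, y in clients]
    return weights + np.mean(deltas, axis=0)

# Toy run: three clients, each holding private data drawn from y = 2x.
rng = np.random.default_rng(0)
clients = []
for _ in range(3):
    X = rng.normal(size=(50, 1))
    clients.append((X, 2.0 * X[:, 0]))

w = np.zeros(1)
for _ in range(20):
    w = federated_round(w, clients)
print(w)  # approaches [2.0] without the server ever seeing any X or y
```

The privacy claim rests on the fact that only the averaged model deltas cross the network; research has shown deltas can still leak information, which is why deployed systems layer secure aggregation on top.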

Rachel Cummings, an assistant professor at Columbia University who has advised Apple on privacy, said the rapid shift to doing some machine learning on the phone is striking. “It’s really rare to see something go from conception to large-scale deployment in such a short period of time,” she said.

This progress has required not only advances in computer science but also companies taking on the practical challenges of processing data on consumer-owned devices. Google says its federated learning system only taps a user’s device when it is plugged in, idle, and on a free internet connection. The approach has been made possible in part by improvements in the power of mobile processors.

More powerful mobile hardware also underpinned Google’s 2019 announcement that speech recognition for its virtual assistant on Pixel devices would run entirely on the device, untethered from the cloud. Siri’s new on-device speech recognition, announced by Apple at WWDC this month, will use the “neural engine” the company added to its mobile processors to power machine learning algorithms.

These technical achievements are impressive. But the extent to which they will meaningfully change users’ relationship with the tech giants is debatable.

Speakers at Apple’s WWDC described Siri’s new design as a “major privacy update” that addresses the risk of accidentally transmitting audio to the cloud, calling that the biggest privacy concern users have about voice assistants. Some Siri commands, such as setting a timer, can now be recognized entirely locally, making for speedy responses. But in many cases, commands dictated to Siri, presumably including ones from accidental recordings, will still be sent to Apple’s servers, where software will decode and respond. Siri voice transcription will remain cloud-based for HomePod smart speakers, which are commonly installed in bedrooms and kitchens, where accidental recordings can be more concerning.

Google has also promoted on-device data processing as a privacy win and has signaled that it will expand the practice. The company wants partners such as Samsung that use its Android operating system to adopt the new Private Compute Core and use it for features that rely on sensitive data.

Google has also made local analysis of browsing data a feature of its revamped proposal for targeting online ads, known as FLoC, which it claims is more private. Academics and some rival tech companies say the design is likely to help Google cement its dominance of online advertising by making targeting harder for other companies.

Michael Veale, a lecturer in digital rights at University College London, says on-device data processing can be a good thing but adds that the way tech companies are pitching it suggests their main motivation is a desire to keep people tied into lucrative digital ecosystems.

“Privacy gets confused with data privacy, but it’s also about limiting power,” Veale said. “If you’re a big tech company and you manage to recast privacy as just being about data, that allows you to continue business as usual and gives you permission to operate.”

A Google spokesman said the company “establishes privacy everywhere computing takes place” and that data sent to a private computing core for processing “needs to be tied to user value.” Apple did not respond to a request for comment.

Columbia’s Cummings said new privacy technologies, and the way companies market them, add complexity to the trade-offs of digital life. In recent years, as machine learning has been deployed more widely, tech companies have steadily expanded the range of data they collect and analyze. There is evidence that some consumers misunderstand the privacy protections the tech giants tout.

A forthcoming study by Cummings and collaborators at Boston University and the Max Planck Institute presented 675 Americans with descriptions of differential privacy drawn from tech companies, the media, and academia. After hearing about the technique, people were roughly twice as likely to say they were willing to share data. But there is evidence that descriptions of differential privacy also encourage unrealistic expectations: one in five respondents expected their data to be protected from law enforcement searches, something differential privacy does not do. The latest announcements from Apple and Google about on-device data processing could create new opportunities for misunderstanding.

This story originally appeared on Wired.com.