Voice assistants, AI and the cloud: An illusion of performance and privacy

Snips

With AI, the choice between cloud performance and respect for consumer privacy is a false one. Not only can AI perform well locally, but through embedded technologies, it can also personalize understanding and thereby refine it.

The voice-activated personal assistants developed by U.S. tech giants currently flooding the market work mainly in the cloud. The big players want to shape the world according to this 100% cloud model, but it deprives users of the ability to control their own data. The only truly private solution is the use of embedded technologies, which do not require any third-party server to access user interactions.

The superiority of the cloud: A myth

A misconception, knowingly maintained by the Big Four tech companies, is that technologies hosted in the cloud perform better than those embedded locally. Although the cloud offers effectively unlimited computing capacity, those extra resources do not necessarily make a significant difference to performance. For the everyday use of voice technologies, for example, having enormous computing power is simply unimportant.

The most advanced AI technologies, especially machine learning technologies, often have no need of the cloud. Embedded machine learning has become commonplace. Major smartphones and computers on the market use it for common tasks, like face identification, and in a range of other applications.

The arrival of neuromorphic chips: An impending technological leap

While it is often possible to have the same performance locally as in the cloud, the arrival of neuromorphic chips further erodes any perceived advantage of relying on the cloud. These new chips are designed specifically for running neural networks, the brain-inspired models that have revolutionized artificial intelligence. Simply put, this new generation of chips makes it possible to embed fairly complicated AI without the need for the cloud’s compute power. The imminent arrival of these chips on the market will mean a technological leap in everyday terms. High-end phones are already equipped with neuromorphic chips, and the coming year will see their foray into everyday objects, including speakers, televisions and home appliances.

Continuing improvement in AI on the cloud: Another myth

A fantasy for major voice-activated assistants, such as Google Assistant or Amazon Alexa, is the ability to boil all their users’ data in the same cauldron. According to their makers, passing data through the cloud should improve AI continuously and thereby perfect it without limits. This mindset leads users to share their data without necessarily understanding the reasons or the implications.

For example, when granting Alexa this kind of access, few users imagined that recordings made in their homes might be shared with thousands of Amazon employees or subcontractors in the United States, India, Costa Rica, Canada or Romania to be manually categorized with the goal of enhancing the voice assistant’s performance. The need for such collection and manual labeling efforts is disputed, as competing technologies reach the same levels of embedded performance, without the need of user data for training.

Embedded technology: Toward customizable AIs

Besides the total lack of respect for user privacy, this mixing of data in the cloud has the further disadvantage of giving rise to a very generic intelligence, lacking in any sort of precision or specificity. In a generic speech comprehension model used by everyone regardless of the queries, the words are weighted according to their general probability of occurrence. For example, “Barack Obama” will carry more weight than a less popular name. So, when you say something phonetically similar like “Barbara” and the voice assistant does not hear you clearly, it is likely to assume you meant to say the more popular phrase. This generic approach is therefore limited in that it does not take into account the context of the query itself.

By comparison, embedded speech recognition becomes inherently contextualized, meaning that if you are talking to a smart speaker, it will know that you are working within the musical domain and will not search for “Barack Obama” as an artist when queried with “Barbara.” Thanks to embedded machine learning, this tool will even be able to learn users’ personal tastes, creating an AI customized to a person’s specific needs.
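The contrast between a generic model and a contextualized one can be illustrated with a minimal Python sketch. It is not how any particular assistant works: the candidate phrases come from the example above, but every score, every weight and every function name (`decode`, `personalize`) is invented purely for illustration. The sketch assumes a recognizer that picks the candidate with the best combination of an acoustic score (how well it matches the audio) and a prior weight (how likely the phrase is in general, or in a given domain).

```python
import math

# Hypothetical acoustic scores for one noisy utterance: the audio fits
# "Barbara" slightly better, but not decisively. Values are invented.
ACOUSTIC = {"Barack Obama": 0.45, "Barbara": 0.55}

# Hypothetical prior weights. A generic model pooled over all users and
# all kinds of queries favors the globally popular phrase.
GENERIC_PRIOR = {"Barack Obama": 0.90, "Barbara": 0.10}

# A prior restricted to the music domain of a smart speaker.
MUSIC_PRIOR = {"Barack Obama": 0.01, "Barbara": 0.99}


def decode(acoustic, prior):
    """Return the candidate with the highest log(acoustic) + log(prior)."""
    return max(acoustic, key=lambda c: math.log(acoustic[c]) + math.log(prior[c]))


def personalize(prior, play_counts, strength=1.0):
    """Boost a prior with on-device usage counts, then renormalize.

    An assumed, deliberately simple personalization rule: each past play
    adds `strength` to the candidate's weight.
    """
    boosted = {c: p + strength * play_counts.get(c, 0) for c, p in prior.items()}
    total = sum(boosted.values())
    return {c: v / total for c, v in boosted.items()}


generic_guess = decode(ACOUSTIC, GENERIC_PRIOR)   # popularity outweighs the audio
music_guess = decode(ACOUSTIC, MUSIC_PRIOR)       # domain context flips the result
personal_prior = personalize(GENERIC_PRIOR, {"Barbara": 5})
personal_guess = decode(ACOUSTIC, personal_prior)

print(generic_guess, "|", music_guess, "|", personal_guess)
```

With these made-up numbers, the generic prior overrides the slightly better acoustic match, while either the domain prior or a handful of locally stored plays is enough to recover the intended artist, and none of the usage data has to leave the device.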

As technology moves toward an “everything connected” approach, including mail, contacts, files, chat history and so forth, the risks of going through the cloud only become more concerning. Data breaches and privacy mishaps no longer affect just a specific application or tool, but now encompass a user’s entire life. As embedded machine learning continues to make strides, it is likely only a matter of time before users no longer see any reason to risk their privacy when using their AI voice assistants.

All IoT Agenda network contributors are responsible for the content and accuracy of their posts. Opinions are of the writers and do not necessarily convey the thoughts of IoT Agenda.