Don’t overestimate AI’s understanding of human language

This article is part of Demystifying AI, a series of posts that (try to) disambiguate the jargon and myths surrounding AI.

It is very easy to misinterpret and overestimate achievements in artificial intelligence. Nowhere is this more evident than in the realm of human language, where appearances can falsely hint at in-depth capabilities. Over the past year, many companies have given the impression that their chatbots, robots and other applications can engage in meaningful conversations as a human would.

Just look at Google Duplex, Hanson Robotics' Sophia and many other stories, and you could become convinced that we've reached a stage where artificial intelligence can manifest human behavior.

But mastering human language requires much more than reproducing human-like voices or producing well-formed sentences. It requires common sense, an understanding of context, and creativity, none of which current trends in artificial intelligence possess.

To be fair, deep learning and other AI techniques have brought people and computers closer together. But there is still a huge gap between the world of circuits and binary data and the mysteries of the human brain. And unless we understand and recognize the differences between AI and human intelligence, we will be disappointed by unmet expectations and will miss the real opportunities that advances in artificial intelligence provide.

To understand the relationship between artificial intelligence and human language, we have broken the field down into different subdomains, going from the surface to the depths.

Speech to text

Voice transcription is one of the areas where AI algorithms have made the most progress. In all fairness, this should not even be considered artificial intelligence, but the very definition of AI is a bit vague, and since many people might mistake automatic transcription for a manifestation of intelligence, we decided to examine it here.

Older versions of the technology forced programmers to go through the tedious process of discovering and coding rules for classifying and converting voice samples into text. Thanks to advances in deep learning and deep neural networks, speech-to-text has made tremendous progress and has become both easier and more accurate.

With neural networks, instead of coding the rules, you provide many voice samples and their corresponding text. The neural network finds common patterns in the pronunciation of words, and then "learns" to map new voice recordings to the corresponding texts.
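To make the idea of learning from labeled samples concrete, here is a minimal sketch. It is not a neural network: a simple nearest-centroid classifier stands in for one, and the two-number "features" are made-up stand-ins for real audio measurements. The names `train` and `transcribe` are illustrative assumptions, not part of any real speech library.

```python
import math

def train(examples):
    """Average the feature vectors seen for each word (hypothetical toy data)."""
    sums, counts = {}, {}
    for features, word in examples:
        totals = sums.setdefault(word, [0.0] * len(features))
        for i, value in enumerate(features):
            totals[i] += value
        counts[word] = counts.get(word, 0) + 1
    return {word: [t / counts[word] for t in totals] for word, totals in sums.items()}

def transcribe(centroids, features):
    """Return the word whose average features lie closest to the new sample."""
    return min(centroids, key=lambda word: math.dist(centroids[word], features))

# Made-up "recordings": each pair of numbers stands in for audio features.
examples = [
    ([1.0, 0.2], "yes"), ([0.9, 0.3], "yes"),
    ([0.1, 0.8], "no"), ([0.2, 0.9], "no"),
]
centroids = train(examples)
print(transcribe(centroids, [0.8, 0.1]))  # prints "yes"
```

Nothing in this sketch knows what "yes" or "no" mean. It only matches numbers to labels, which is the point made here about transcription.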

These advances have allowed many services to offer real-time transcription to their users.

There are many uses for AI-based speech-to-text. Google recently introduced Call Screen, a feature on Pixel phones that handles scam calls and shows you the text of what the caller is saying in real time. YouTube uses deep learning to provide automated captioning.

But the fact that an AI algorithm can turn voice into text does not mean that it understands what it is processing.

Voice synthesis

The flip side of speech-to-text is speech synthesis. Again, this really isn't intelligence, because it has nothing to do with understanding the meaning and context of human language. But it is nevertheless an integral part of many applications that interact with humans in their own language.

Like speech-to-text, speech synthesis has been around for quite some time. I remember seeing computerized speech synthesis for the first time in a laboratory in the '90s.

ALS patients who have lost their voice have been using the technology for decades, communicating by typing sentences and having a computer read them aloud. Blind people also use the technology to read text they cannot see.

However, back then, computer-generated voices did not sound human, and creating a voice model required hundreds of hours of coding and tuning. Now, with the help of neural networks, synthesizing the human voice has become far less cumbersome.

The process involves the use of generative adversarial networks (GANs), an AI technique that pits neural networks against each other to create new data. First, a neural network ingests many samples of a person's voice until it can tell whether a new voice sample belongs to the same person.

Next, a second neural network generates audio data and runs it through the first to see whether it is validated as belonging to the subject. If it is not, the generator corrects its sample and reruns it through the classifier. The two networks repeat the process until they can generate samples that sound natural.
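The generate, check and correct loop described above can be sketched in a few lines. This is a toy illustration, not a real GAN: the "classifier" is a fixed distance check rather than a trained network, the "generator" makes random corrections instead of following gradients, and a single number stands in for an audio sample. All names and values are hypothetical.

```python
import random

def make_classifier(real_samples):
    """Stand-in classifier: accepts values close to the real samples' average."""
    mean = sum(real_samples) / len(real_samples)
    tolerance = max(abs(s - mean) for s in real_samples)

    def score(candidate):
        # Lower is better: distance from the average of the real samples.
        return abs(candidate - mean)

    def accepts(candidate):
        return score(candidate) <= tolerance

    return score, accepts

def run_generator(score, accepts, start=0.0, step=2.0, max_rounds=10_000, seed=0):
    """Propose random corrections until the classifier accepts the sample."""
    rng = random.Random(seed)
    sample = start
    for rounds in range(1, max_rounds + 1):
        if accepts(sample):
            return sample, rounds
        candidate = sample + rng.uniform(-step, step)
        if score(candidate) < score(sample):  # keep only corrections that help
            sample = candidate
    return sample, max_rounds

# Hypothetical "voice" measurements (for example, pitch values) for one speaker.
score, accepts = make_classifier([118.0, 121.5, 119.2, 122.3, 120.0])
sample, rounds = run_generator(score, accepts)
print(f"accepted {sample:.1f} after {rounds} rounds")
```

In an actual GAN both networks are trained, and the samples are high-dimensional audio rather than one number. The sketch only shows the shape of the back-and-forth.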

Several websites allow you to synthesize your own voice with the help of neural networks. The process is as simple as providing enough samples of your voice, far fewer than previous generations of the technology required.

This technology has many interesting uses. For example, companies use AI-based speech synthesis to improve their customers' experience and give their brand its own distinct voice.

In the field of medicine, AI is helping ALS patients regain their true voice instead of using a computerized voice. And of course, Google uses the technology in Duplex to place calls on behalf of users with their own voice.

AI speech synthesis also has malicious uses. Namely, it can be used for forgery, to place calls in the voice of a targeted person, or to spread false information by imitating the voice of a head of state or a prominent politician.

I guess I don't need to remind you that just because a computer can sound like a human, it doesn't mean it understands what it is saying.

Processing human language commands

This is where we break through the surface and move into the depths of AI's relationship with human language. In recent years, great progress has been made in the field of natural language processing (NLP), again thanks to advances in deep learning.

NLP is a subset of artificial intelligence that enables computers to discern the meaning of written words, whether they convert speech into text, receive the words through a text interface such as a chatbot, or read them from a file. They can then use the meaning behind those words to perform a given action.

But NLP is a very broad field and can involve many different skills. In its simplest form, NLP helps computers carry out commands given to them as text.

AI assistants on smart speakers and smartphones use NLP to process user commands. Basically, this means the user does not have to stick to a strict sequence of words to trigger a command and can use different variations of the same sentence.
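As a rough illustration of how different phrasings can trigger the same command, here is a minimal keyword-overlap sketch. It is an assumption-laden toy, not how any real assistant works: production systems use trained models, and the intent names and keyword sets below are invented for the example.

```python
import re

# Hypothetical commands and the words that hint at them.
INTENT_KEYWORDS = {
    "get_weather": {"weather", "forecast", "rain", "temperature"},
    "set_alarm": {"alarm", "wake"},
}

def tokenize(sentence):
    """Lowercase the sentence and split it into a set of words."""
    return set(re.findall(r"[a-z']+", sentence.lower()))

def detect_intent(sentence):
    """Pick the command whose keywords overlap most with the sentence."""
    words = tokenize(sentence)
    best_intent, best_overlap = None, 0
    for intent, keywords in INTENT_KEYWORDS.items():
        overlap = len(words & keywords)
        if overlap > best_overlap:
            best_intent, best_overlap = intent, overlap
    return best_intent

print(detect_intent("What's the weather like today?"))        # get_weather
print(detect_intent("Will it rain tomorrow?"))                # get_weather
print(detect_intent("I hate this weather, play some music"))  # get_weather
```

The third call shows the limitation: the sketch reacts to the word "weather" even though the speaker wants music, because it matches words rather than understanding them.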

Elsewhere, NLP is one of the technologies used by Google's search engine to understand the broader meaning of user queries and to return the results relevant to the query.

NLP is also very useful in analytics tools such as Google Analytics and IBM Watson, where users can query their data with natural language sentences instead of writing complicated queries.

Gmail's Smart Reply feature is an interesting use of NLP. Google examines the content of an email and suggests replies.

The feature has a limited scope and only works for emails where short replies make sense, for example when Google's AI algorithms detect a scheduled meeting or when the sender expects a simple "Thank you" or "I'll take a look." But sometimes it offers rather neat replies that can save you a few seconds of typing, especially if you're on a mobile device.

But just because a smart speaker or an AI assistant can respond to different ways of asking about the weather, it doesn't mean it fully understands human language.

Current NLP is really only good at understanding sentences whose meaning is very clear. AI assistants are getting better at carrying out basic commands, but if you think you can engage in meaningful conversations and discuss abstract topics with them, you are in for a big disappointment.

Speaking in human language

The counterpart of NLP is natural language generation (NLG), the AI discipline that enables computers to generate text that makes sense to humans.

This area has also benefited from progress in AI, especially deep learning. The output of NLG algorithms can be displayed as text, as in a chatbot, or converted to speech through speech synthesis and played for the user, as smart speakers and AI assistants do.

In many cases, NLG is closely tied to NLP and, like NLP, it is a very broad field that can involve different levels of complexity. The basic levels of NLG have some very interesting uses. For example, NLG can turn tables and spreadsheets into textual descriptions. AI assistants such as Siri and Alexa also use NLG to generate responses to queries.
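At its most basic, turning a table row into a sentence can be done with a fixed template. The sketch below assumes a hypothetical revenue table with `quarter`, `revenue` and `previous` columns. It illustrates the idea only and does not describe any particular product.

```python
def describe_quarter(row):
    """Turn one table row (a dict) into an English sentence using a template."""
    quarter = row["quarter"]
    revenue = row["revenue"]
    change = revenue - row["previous"]
    if change > 0:
        trend = f"rose by {change} to {revenue}"
    elif change < 0:
        trend = f"fell by {-change} to {revenue}"
    else:
        trend = f"held steady at {revenue}"
    return f"In {quarter}, revenue {trend}."

# Made-up figures standing in for spreadsheet rows.
table = [
    {"quarter": "Q1", "revenue": 100, "previous": 100},
    {"quarter": "Q2", "revenue": 120, "previous": 100},
    {"quarter": "Q3", "revenue": 90, "previous": 120},
]
for row in table:
    print(describe_quarter(row))
```

The sentences read naturally, yet the function only fills in blanks. It has no idea what revenue is or why a drop might matter.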

Gmail's autocomplete feature uses NLG in a very interesting way. As you type a sentence, Gmail suggests a way to complete it, which you can accept by pressing Tab or tapping it. The suggestion takes into account the general topic of your message, which means NLP is involved as well.
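Sentence completion can be approximated, very crudely, by counting which word most often follows another in past text. The bigram sketch below is an assumption for illustration only. Gmail's actual feature is not described in this article and is certainly more sophisticated. The function names and training sentences are made up.

```python
from collections import Counter, defaultdict

def train_bigrams(sentences):
    """Count, for every word, which words have followed it."""
    following = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.lower().split()
        for current, nxt in zip(words, words[1:]):
            following[current][nxt] += 1
    return following

def suggest_next(following, word):
    """Suggest the word seen most often after `word`, or None if unseen."""
    counts = following.get(word.lower())
    if not counts:
        return None
    return counts.most_common(1)[0][0]

# Hypothetical past messages used as training text.
model = train_bigrams([
    "thank you very much",
    "thank you for the update",
    "see you soon",
])
print(suggest_next(model, "thank"))  # prints "you"
```

The suggestion comes purely from counts of what was typed before, with no grasp of what the message is about.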

Some publications use AI to write basic news reports. Some reporters have spun stories about how artificial intelligence will soon replace human writers, but nothing could be further from the truth.

The technology behind these news-writing bots is NLG, which turns facts and figures into stories by analyzing the style human journalists use to write reports. It cannot come up with new ideas, write features that recount personal experiences and stories, or write op-eds that introduce and develop an opinion.

Another interesting case study is Google Duplex. Google's AI assistant shows both the capabilities and the limits of artificial intelligence's grasp of human language. Duplex brilliantly combines speech-to-text, NLP, NLG and speech synthesis, leading many people to believe that it can interact like a human caller.

But Google Duplex is narrow artificial intelligence, which means it is only good at the kind of tasks the company demonstrated, such as booking a restaurant table or making an appointment at a salon. These are domains where the problem space is limited and predictable. There is only so much you can say when you are booking a table at a restaurant.

But Duplex does not understand the context of its conversations. It simply converts human language into computer commands and computer output into human language. It will not be able to hold meaningful conversations about abstract topics, which can take unpredictable directions.

Some companies that exaggerated the language processing and generation capabilities of their AI ended up hiring humans to fill the gap.

Machine translation

Source: Jon Russell / Flickr

In 2016, The New York Times Magazine ran a long feature explaining how AI, or more specifically deep learning, had enabled Google's popular translation engine to take leaps in accuracy. To be fair, Google Translate has improved immensely.

But AI-based translation has its own limits, which I also run into regularly. Neural networks translate between languages using a mechanical, statistical process. They map the different patterns in which words and phrases appear in the target languages and try to choose the most suitable one when translating. In other words, they are matching mathematical values, not translating the meaning of the words.
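A deliberately crude sketch shows what choosing the statistically most suitable match can look like, and how it goes wrong. This is a word-level frequency table, far simpler than a neural translation model, and the aligned word pairs are invented for the example. The French word "avocat" can mean either "lawyer" or "avocado".

```python
from collections import Counter, defaultdict

def learn_table(aligned_pairs):
    """Count how often each source word was paired with each target word."""
    table = defaultdict(Counter)
    for source_word, target_word in aligned_pairs:
        table[source_word][target_word] += 1
    return table

def translate(table, sentence):
    """Replace every word with its most frequent pairing, ignoring context."""
    output = []
    for word in sentence.lower().split():
        if word in table:
            output.append(table[word].most_common(1)[0][0])
        else:
            output.append(word)  # unknown words pass through unchanged
    return " ".join(output)

# Hypothetical training data: "avocat" was seen mostly in legal texts.
pairs = [
    ("je", "i"), ("mange", "eat"), ("un", "a"),
    ("avocat", "lawyer"), ("avocat", "lawyer"), ("avocat", "lawyer"),
    ("avocat", "avocado"),
]
table = learn_table(pairs)
print(translate(table, "je mange un avocat"))  # prints "i eat a lawyer"
```

The most frequent pairing wins regardless of the meal being described, which is the kind of mistake a human translator would not make.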

By contrast, when humans translate, they take into account the culture and context of the languages and the history of words and proverbs. They research the background of the subject before deciding on words. It is a very complex process that involves a great deal of common sense and abstract understanding, none of which contemporary artificial intelligence has.

Douglas Hofstadter, professor of cognitive science and comparative literature at Indiana University in Bloomington, lays out the limits of AI translation in this excellent piece in The Atlantic.

To be clear, AI translation has plenty of very practical uses. I frequently use it to speed up my work when translating from French to English. It's almost perfect for translating simple, factual sentences.

For example, if you are communicating with people who do not speak your language and you care more about grasping the meaning of a sentence than about the quality of the translation, AI applications such as Google Translate can be a very useful tool.

But do not expect AI to replace professional translators in the near future.

What we need to know about AI's understanding of human language

First of all, we must recognize the limits of deep learning, which is for the moment the cutting edge of artificial intelligence. Deep learning does not understand human language. Period. Things might change when someone cracks the code to create an artificial intelligence that can make sense of the world the way the human mind does, that is, general AI. But that is not coming anytime soon.

As most of the examples show, artificial intelligence is a technology for augmenting humans and can help speed up or ease tasks that involve the use of human language. But it still lacks the common sense and abstract problem-solving abilities that would allow it to fully automate disciplines that require mastery of human language.

So, the next time you see an AI technology that sounds, looks and acts a lot like a human, look into the depth of its understanding of human language. You will be better placed to understand its capabilities and limitations. Appearances can be deceiving.

This story is republished from TechTalks, the blog that explores the role of technology in solving problems … and creating new problems.