Natural Language Processing

What is natural language processing and what are its applications?

Natural language processing (NLP) brings together two seemingly distant disciplines: linguistics and artificial intelligence. This field of computer science, which transforms natural language into a formal language, such as a programming language, that computers can process, is constantly evolving, and its applications keep growing.

NLP allows a machine to process natural language and generate answers automatically.

If you have ever asked Alexa or Siri for the time, you will have realised that you do not always have to ask the question in the same way. You can ask "what time is it?" or "can you tell me the time?" and in both cases receive an appropriate response. The same is true of Google Translate, which distinguishes between the meanings of a word depending on its context. These examples, and many more, are powered by natural language processing (NLP).

What is natural language processing (NLP)?

According to IBM's definition, natural language processing (NLP) is the branch of computer science, and more specifically of artificial intelligence, concerned with giving computers the ability to understand text and spoken words in much the same way human beings can. The technology has become highly advanced thanks to related technologies such as machine learning, big data, the internet of things and neural networks.

Some of the most important applications are in business intelligence, where NLP automatically analyses how customers react, based on the comments they post online or the questions they ask when looking for information. Chatbots are another application: although there is still much room for improvement, they streamline interaction with customers through chats or telephone answering services by offering quick, automatic answers generated with natural language processing.

Natural language processing has its roots in the 1950s, when Alan Turing published a paper (Computing Machinery and Intelligence) in which he proposed what is now known as the Turing Test. The test examines the ability of a machine to exhibit intelligent behaviour similar to that of a human being. Since then, the evolution of the algorithms associated with this technology has made the current progress possible.

The evolution of natural language processing and its algorithms

1949: IBM sponsors the Index Thomisticus, a compilation of the works of St. Thomas Aquinas created by the Italian Jesuit Roberto Busa (regarded as a pioneer of computational linguistics).
1950: Alan Turing publishes the article Computing Machinery and Intelligence, in which he proposes the Turing Test to determine whether or not a machine can think.
1954: The Georgetown-IBM experiment achieves the automatic translation of more than sixty sentences from Russian into English, giving a boost to computational linguistics.
1956: John McCarthy, Marvin Minsky and Claude Shannon coin the term "artificial intelligence" at the Dartmouth Conference.
1960s: Pattern recognition and "nearest neighbour" algorithms are introduced.
1980s: Machine learning algorithms are introduced and natural language generation takes off.
1990s: Advanced speech recognition and topic modelling technologies are introduced.
2000s: More advanced statistical and topic models, such as LDA, are introduced. The term "deep learning" also emerges.
2010s: Neural machine translation is implemented and conversational artificial intelligence takes a leap forward.
2020s: More and more business sectors will apply this technology and, together with machine vision, it will enable the new challenges of Industry 4.0 to be met.

Source: Deloitte.

How does natural language processing work?

The first models of natural language analysis were symbolic: they relied on manually encoded rules of the language. This made it possible to distinguish, for example, the tenses and conjugations of verbs and to extract the meaning of the root. The 1980s and 1990s saw the statistical revolution. Instead of writing sets of rules (and exceptions), NLP systems began to use statistical inference algorithms to analyse large collections of existing text and look for patterns in them.
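To make the statistical idea concrete, here is a minimal sketch that counts which word tends to follow which in a handful of sentences and uses those counts to guess the next word. The corpus and the function names (train_bigrams, most_likely_next) are invented for illustration; real systems learn from far larger collections of text and use much richer models.

```python
from collections import Counter, defaultdict

# Hypothetical miniature corpus, invented for this example.
CORPUS = [
    "what time is it",
    "can you tell me the time",
    "what time is the meeting",
    "tell me the news",
]

def train_bigrams(sentences):
    """Count how often each word is followed by each other word."""
    follows = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.split()
        for current, nxt in zip(words, words[1:]):
            follows[current][nxt] += 1
    return follows

def most_likely_next(follows, word):
    """Return the most frequent follower of `word`, or None if unseen."""
    if word not in follows:
        return None
    return follows[word].most_common(1)[0][0]

model = train_bigrams(CORPUS)
print(most_likely_next(model, "what"))  # time
print(most_likely_next(model, "me"))    # the
```

No grammar rule was written by hand here: the pattern "what" followed by "time" comes entirely from the counts.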

The advantage of statistical models is that they are more robust when they encounter unfamiliar words or errors, such as misspelled or accidentally omitted words. Most current systems use a combination of symbolic and statistical models. In particular, natural language processing systems perform several types of analysis:

Morphological: focuses on distinguishing the different types of words (verbs, nouns, prepositions, etc.) and their variations (gender, number, tense, etc.).
Syntactical: separates sentences from each other and analyses their constituent parts (subject, verb, predicate) in order to extract their meaning.
Semantic: analyses the meaning, not only of individual words, but also of the sentences of which they are part and of the discourse as a whole.
Pragmatic: extracts the intention of the text from its context, making it possible to recognise factors such as irony, ambiguity or mood.
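The first two levels of analysis can be sketched in a few lines of code. The example below is a deliberately simplified, rule-based illustration: the lexicon, the suffix rules and the function names are assumptions made for this sketch, and it does not attempt the semantic or pragmatic levels, which require far more than a word list.

```python
import re

# Hypothetical toy lexicon mapping base forms to word classes.
LEXICON = {"the": "DET", "a": "DET", "dog": "NOUN", "cat": "NOUN",
           "chase": "VERB", "bark": "VERB", "loudly": "ADV"}

def split_sentences(text):
    """Syntactic step (simplified): separate sentences at ., ! or ?."""
    return [s.strip() for s in re.split(r"[.!?]+", text) if s.strip()]

def analyse_word(word):
    """Morphological step (simplified): word class plus variation."""
    word = word.lower()
    if word in LEXICON:
        return (word, LEXICON[word], "base form")
    # Strip common English suffixes to reach a known base form.
    for suffix in ("ed", "d", "s"):
        base = word[: -len(suffix)]
        if word.endswith(suffix) and base in LEXICON:
            tag = LEXICON[base]
            if suffix == "s":
                feature = "plural" if tag == "NOUN" else "third person singular"
            else:
                feature = "past tense"
            return (base, tag, feature)
    return (word, "UNKNOWN", "unknown")

def analyse(text):
    """Analyse every word of every sentence in `text`."""
    return [[analyse_word(w) for w in re.findall(r"[A-Za-z']+", s)]
            for s in split_sentences(text)]

for sentence in analyse("The dogs chased a cat. The cat barked loudly!"):
    print(sentence)
```

Words missing from the lexicon are tagged UNKNOWN, which illustrates why purely rule-based systems struggle with new vocabulary.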

Applications of natural language processing (examples)

Your word processor's spellchecker and your phone's autocorrect both use natural language processing techniques, but the applications go much further:

Virtual assistants and intelligent chatbots

Virtual assistants, such as Siri, Alexa and Google Assistant, use natural language processing to interpret users' questions and commands and to provide accurate and consistent responses. Chatbots built on the same techniques are increasingly used on business websites to guide the user.
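As a rough illustration of how differently worded questions can lead to the same response, the sketch below maps an utterance to an "intent" by keyword overlap. The intents, keywords and the detect_intent function are hypothetical; commercial assistants rely on trained language models rather than keyword lists.

```python
import re

# Hypothetical intents and trigger keywords, invented for this sketch.
INTENTS = {
    "get_time": {"time", "clock", "hour"},
    "get_weather": {"weather", "rain", "temperature"},
}

def detect_intent(utterance):
    """Return the intent sharing the most keywords with the utterance."""
    words = set(re.findall(r"[a-z]+", utterance.lower()))
    best_intent, best_overlap = None, 0
    for intent, keywords in INTENTS.items():
        overlap = len(words & keywords)
        if overlap > best_overlap:
            best_intent, best_overlap = intent, overlap
    return best_intent

print(detect_intent("What time is it?"))           # get_time
print(detect_intent("Can you tell me the time?"))  # get_time
```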

Document classification

The task of classifying large numbers of documents according to subject matter or style can be streamlined with NLP systems.
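One classic statistical technique for this task is the naive Bayes classifier. The sketch below is a minimal version with add-one smoothing, trained on four invented sentences; the labels, training data and function names are assumptions made for illustration only.

```python
import math
from collections import Counter, defaultdict

# Hypothetical labelled training documents, invented for this sketch.
TRAINING = [
    ("sports", "the team won the match with a late goal"),
    ("sports", "the player scored in the final game"),
    ("finance", "the bank raised interest rates again"),
    ("finance", "shares fell as the market reacted to rates"),
]

def train(examples):
    """Collect per-label word counts, document counts and the vocabulary."""
    word_counts = defaultdict(Counter)
    doc_counts = Counter()
    vocabulary = set()
    for label, text in examples:
        words = text.lower().split()
        word_counts[label].update(words)
        doc_counts[label] += 1
        vocabulary.update(words)
    return word_counts, doc_counts, vocabulary

def classify(text, word_counts, doc_counts, vocabulary):
    """Return the label with the highest naive Bayes log-probability."""
    total_docs = sum(doc_counts.values())
    scores = {}
    for label in doc_counts:
        score = math.log(doc_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for word in text.lower().split():
            # Add-one (Laplace) smoothing so unseen words do not zero the score.
            count = word_counts[label][word] + 1
            score += math.log(count / (total_words + len(vocabulary)))
        scores[label] = score
    return max(scores, key=scores.get)

model = train(TRAINING)
print(classify("the player scored a goal", *model))  # sports
print(classify("the bank raised rates", *model))     # finance
```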

Sentiment and opinion analysis

Comments on social networks about products and services are extremely important to companies, and NLP systems can extract relevant information from them, such as whether the opinions expressed are positive or negative.
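The simplest form of sentiment analysis scores a comment against lists of positive and negative words. The word lists and the sentiment_score function below are invented for this sketch, and negation is handled only for the word immediately after "not", "never" or "no"; production systems typically use trained models instead.

```python
import re

# Hypothetical word lists, invented for this sketch.
POSITIVE = {"good", "great", "excellent", "love", "fast"}
NEGATIVE = {"bad", "poor", "terrible", "hate", "slow"}
NEGATIONS = {"not", "never", "no"}

def sentiment_score(comment):
    """Sum +1 per positive and -1 per negative word, flipping after a negation."""
    score = 0
    negate = False
    for word in re.findall(r"[a-z']+", comment.lower()):
        if word in NEGATIONS:
            negate = True
            continue
        polarity = (word in POSITIVE) - (word in NEGATIVE)
        if polarity:
            score += -polarity if negate else polarity
        # A negation only affects the word that directly follows it.
        negate = False
    return score

print(sentiment_score("Great product, I love it"))     # 2
print(sentiment_score("Not good, delivery was slow"))  # -2
```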

Text comparison

NLP systems make it possible to find patterns in texts and detect matches between them, which facilitates plagiarism detection and quality control.
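One simple way to measure how much two texts overlap is to break each into short word sequences ("shingles") and compute the Jaccard similarity of the two sets. The function names and the shingle size of three words are choices made for this sketch, not a description of any particular plagiarism detector.

```python
import re

def shingles(text, size=3):
    """Return the set of consecutive word sequences of length `size`."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}

def jaccard_similarity(text_a, text_b, size=3):
    """Shared shingles divided by all distinct shingles (0.0 to 1.0)."""
    a, b = shingles(text_a, size), shingles(text_b, size)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

print(jaccard_similarity("the quick brown fox jumps",
                         "the quick brown fox sleeps"))  # 0.5
```

A score close to 1.0 suggests near-identical wording, while a score close to 0.0 suggests the texts share few phrases.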

Document anonymisation

Through NLP systems, documents can be processed to identify and remove mentions of personal data, thus ensuring the privacy of individuals and institutions.
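For personal data that follows a predictable pattern, a first pass can be made with regular expressions. The sketch below masks email addresses and phone-like digit sequences; the patterns and the anonymise function are simplified assumptions, and they do not detect names or addresses, which generally require a trained named-entity recognition model.

```python
import re

# Simplified patterns, assumed for this sketch; real data needs stricter rules.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE = re.compile(r"\+?\d[\d\s-]{7,}\d")

def anonymise(text):
    """Replace email addresses and phone-like numbers with placeholders."""
    text = EMAIL.sub("[EMAIL]", text)
    text = PHONE.sub("[PHONE]", text)
    return text

print(anonymise("Contact ana@example.com or call +34 600 123 456."))
# Contact [EMAIL] or call [PHONE].
```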

Machine translation

Instant machine translation applications use natural language processing techniques to produce translations that aim to be both semantically and grammatically correct in the target language.

Content recommendation

Content platforms analyse the language of what users read, watch, listen to and search for in order to infer their preferences and suggest relevant books, movies or songs.
