Natural language processing is a discipline that combines linguistics and data processing. The earliest research in this field, conducted between 1940 and 1960, focused on machine translation (MT). Unfortunately, the progress made during this phase did not meet the very high and enthusiastic expectations of the community.
It therefore became necessary to widen the research area. Between 1960 and 1970, the concept of artificial intelligence made it possible to build semantic representations of language based on the available knowledge. This phase saw the first attempts to build natural language understanding systems.
Among the systems that were successful within their scope of application is SHRDLU, a program that allows an interactive textual exchange with the user in English. The system lets the user move blocks of various geometric shapes (cubes, cones, etc.) in a "world of blocks" simply by conversing with the system. The success of SHRDLU was due to the combination of simple ideas that made the simulation of "comprehension" much more convincing.
ELIZA is a computer program written by Joseph Weizenbaum that simulates a Rogerian psychotherapist. ELIZA interprets the patient's statements and reformulates them as questions. It was the first program to introduce the concept of a chatbot, or conversational agent.
The term chatbot, or virtual agent, refers to a computer program able to hold intelligent conversations with one or more human beings in order to provide answers or make appropriate recommendations.
However, these first systems for interpreting human language were very elementary and unable to understand user requests or provide suitable answers. Conducting advanced research in artificial intelligence is therefore an essential step toward building stronger "intelligent agents". The task is far from obvious, though. To this day, natural language processing remains a very active research area of AI.
Since the 1980s, NLP approaches have been based on supervised, semi-supervised, or unsupervised machine learning methods. Although training machines on data produced encouraging results, thanks to hand-designed representations of statements and input features whose weights are optimized for prediction, the research revealed weaknesses when faced with complex interpretation. This called for a more advanced form of training, "deep learning", in which stronger algorithms try to learn several levels of representation of increasing complexity and abstraction.
The problem of translating a human statement into a structured logical request constitutes the heart of the NLP task. Machine understanding of written and spoken statements goes through a series of lexical and syntactic analyses, such as sequence tagging, POS tagging, stemming, and tokenizing, in order to classify the content into semantic categories (a short sketch of these preprocessing steps follows the list below). During this process, NLP faces many difficulties related to the nature of language. Three levels of linguistic ambiguity exist:
- Lexical ambiguity (polysemy): a word or expression that has several different meanings in its language, like the word "match", which can mean both a sports competition and the act of joining or pairing things. Different ways of applying prefixes, suffixes, and conjugation endings can also create ambiguity.
- Syntactic ambiguity: sentences that admit several interpretations because of their structure, in other words their syntax. For example, "I bought fruit from Africa" could mean that the fruit is of African origin (without the buyer necessarily travelling to Africa) or that it was bought in Africa directly.
- Semantic ambiguity: refers to ambiguity in the attribution of meaning to an utterance. The difficulty lies in the multitude of possible interpretations and in the logical relations between words and groups of words, generally between two propositions. For example, "like so many others, he left her" can mean "he left her as so many other boys did" or "he left her as he left so many other girls."
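To make the preprocessing steps mentioned above concrete, here is a minimal sketch, assuming the Python NLTK library is installed; the sentence and the printed outputs are purely illustrative.

```python
# Minimal lexical/syntactic preprocessing sketch with NLTK (assumes `pip install nltk`).
import nltk
from nltk.stem import PorterStemmer

# One-time downloads of the tokenizer and POS tagger models.
nltk.download("punkt")
nltk.download("averaged_perceptron_tagger")

sentence = "I bought fruit from Africa"  # illustrative example sentence

tokens = nltk.word_tokenize(sentence)                 # tokenization
tags = nltk.pos_tag(tokens)                           # POS tagging
stems = [PorterStemmer().stem(t) for t in tokens]     # stemming

print(tokens)  # ['I', 'bought', 'fruit', 'from', 'Africa']
print(tags)    # e.g. [('I', 'PRP'), ('bought', 'VBD'), ('fruit', 'NN'), ...]
print(stems)   # ['i', 'bought', 'fruit', 'from', 'africa']
```

These surface-level analyses feed the later semantic classification steps, but, as the list above shows, they cannot by themselves resolve lexical, syntactic, or semantic ambiguity.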
Nowadays, research is oriented toward deep learning for the semantic classification of statements. For example, Y. Chen, D. Hakkani-Tür, and X. He proposed a convolutional deep structured semantic model (CDSSM) applied to jointly learn representations for human intents and the associated statements. The model is designed for "zero-shot" learning. The discussion and analysis of their experiments point to a future direction for reducing the human effort spent on data annotation and relaxing the constraints of spoken conversational systems. Other similar approaches rely on techniques for the semantic classification of statements into categories with no prior training data. In this case, training is performed using deep neural networks, convolutional neural networks, or recurrent neural networks for slot filling.
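To illustrate the general idea of classifying a statement into a semantic category (an intent), here is a minimal, hypothetical sketch in PyTorch. The vocabulary, intent labels, and architecture are toy assumptions for illustration, not the CDSSM of the cited work: the utterance's word embeddings are pooled into a single vector, which is then projected onto the intent classes.

```python
# Toy sketch of neural intent classification (hypothetical data and labels).
import torch
import torch.nn as nn

# Illustrative vocabulary and intent labels (not taken from any real dataset).
vocab = {"<pad>": 0, "book": 1, "a": 2, "flight": 3, "play": 4, "some": 5, "music": 6}
intents = ["book_flight", "play_music"]

class IntentClassifier(nn.Module):
    def __init__(self, vocab_size, embed_dim, num_intents):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        self.fc = nn.Linear(embed_dim, num_intents)

    def forward(self, token_ids):
        # Average the word embeddings to obtain one vector per utterance,
        # then project it onto the intent classes.
        vectors = self.embed(token_ids)     # (batch, seq_len, embed_dim)
        utterance = vectors.mean(dim=1)     # (batch, embed_dim)
        return self.fc(utterance)           # (batch, num_intents)

model = IntentClassifier(len(vocab), embed_dim=16, num_intents=len(intents))

# Encode "book a flight" and predict its intent (the model is untrained here,
# so the prediction is arbitrary; real systems learn the weights from data).
tokens = torch.tensor([[vocab["book"], vocab["a"], vocab["flight"]]])
logits = model(tokens)
print(intents[logits.argmax(dim=1).item()])
```

In real systems, the mean pooling would typically be replaced by a convolutional or recurrent encoder, and a second output layer would label each token for slot filling.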
The recent progress brought by deep learning, transfer learning, and recurrent neural networks has made it possible to take a significant leap in natural language understanding. However, this work is still in its early experimental phases, and much ongoing research suggests that major advances are still to come.
Ghita, Consultant, Leyton France