Open-Domain Conversational AI: History

There are different opinions as to the definition of AI, but one broad definition is any computerised system exhibiting behaviour commonly regarded as requiring intelligence. Conversational AI, therefore, is any system with the ability to mimic human–human intelligent conversations by communicating in natural language with users. Conversational AI, sometimes called chatbots, may be designed for different purposes. Open-domain conversational AI models are known to have several challenges, including bland, repetitive responses and performance degradation when prompted with figurative language, among others.

  • conversational systems
  • SoTA

1. Introduction

Conversational AI, sometimes called chatbots, may be designed for different purposes. These purposes could be entertainment or solving specific tasks, such as plane ticket booking (task-based). When the purpose is to have unrestrained conversations about, possibly, many topics, then such AI is called open-domain conversational AI. ELIZA, by [1], is acclaimed as the first conversational AI (or system). Human interaction with the system demonstrated how engaging its responses could be [2]. Weizenbaum's staff reportedly became engrossed with the program during interactions and possibly held private conversations with it [2].
Modern state-of-the-art (SoTA) open-domain conversational AI aims to achieve better performance than what was experienced with ELIZA. There are many aspects and challenges to building such SoTA systems. Therefore, the primary objective of this survey is to investigate some of the recent SoTA open-domain conversational systems and identify specific challenges that still exist and should be surmounted to achieve “human” performance in the “imitation game”, as described by [3]. As a result of this objective, this survey will identify some of the ways of evaluating open-domain conversational AI, including the use of automatic metrics and human evaluation.
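One simple family of automatic metrics relevant to the blandness problem mentioned above is distinct-n, which measures lexical diversity as the ratio of unique n-grams to all n-grams in a set of generated responses. The sketch below is a minimal, hypothetical Python implementation (the function name and the sample responses are invented for illustration and are not taken from the systems surveyed here).

```python
from typing import Iterable, List

def distinct_n(responses: Iterable[str], n: int = 2) -> float:
    """Ratio of unique n-grams to total n-grams across all responses."""
    all_ngrams: List[tuple] = []
    for response in responses:
        tokens = response.lower().split()
        all_ngrams.extend(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
    return len(set(all_ngrams)) / len(all_ngrams) if all_ngrams else 0.0

# A bland model that repeats itself scores low; a more varied one scores higher.
bland = ["i do not know", "i do not know", "i do not know"]
varied = ["that sounds exciting", "which city are you visiting", "tell me more about it"]
print(distinct_n(bland, n=2), distinct_n(varied, n=2))  # roughly 0.33 vs 1.0
```

Automatic scores such as this are cheap to compute but correlate only loosely with human judgements, which is why human evaluation remains important.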

2. Related Simple Rule-Based Template System

Open-domain conversational AI may be designed as a simple rule-based template system or may involve complex Artificial Neural Network (ANN) architectures. Indeed, six approaches are possible: (1) the rule-based method, (2) reinforcement learning (RL), which uses rewards to train a policy, (3) adversarial networks, which utilise a discriminator and a generator, (4) the retrieval-based method, which searches a candidate pool and selects a suitable candidate, (5) the generation-based method, which generates a response word by word based on conditioning, and (6) the hybrid method, which combines two or more of the earlier methods [2][4][5]. Certain modern systems are still designed in the rule-based style that was used for ELIZA [2]. The ANN models are usually trained on large datasets to generate responses; hence, they are data-intensive. The data-driven approach is more suitable for open-domain conversational AI [2]. Such systems learn inductively from large datasets involving many turns in conversations, such as Topical-Chat [6][7]. A turn (or utterance) in a conversation is each single contribution from a speaker [2][8]. The data may come from written conversations, such as MultiWOZ [9]; transcripts of human–human spoken conversations, such as the Gothenburg Dialogue Corpus (GDC) [10]; crowdsourced conversations, such as EmpatheticDialogues [11]; and social media conversations, such as Familjeliv (familjeliv.se) or Reddit (reddit.com) [12][13]. Since the amount of data needed to train deep ML models is usually large, such models are normally first pretrained on large volumes of unstructured text or conversations before being fine-tuned on specific conversational data.
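To illustrate the contrast between approaches (1) and (4) above, the following is a minimal, hypothetical Python sketch: an ELIZA-style pattern-to-template rule next to a simple bag-of-words retrieval step. The patterns, the candidate pool, and the helper names are invented for illustration and are not taken from the cited systems.

```python
import re
from collections import Counter
from math import sqrt

# (1) Rule-based: ELIZA-style pattern -> template substitution.
RULES = [
    (re.compile(r"\bI need (.*)", re.I), "Why do you need {0}?"),
    (re.compile(r"\bI am (.*)", re.I),   "How long have you been {0}?"),
]

def respond_rule_based(utterance: str) -> str:
    for pattern, template in RULES:
        match = pattern.search(utterance)
        if match:
            return template.format(match.group(1))
    return "Please tell me more."

# (4) Retrieval-based: pick the candidate closest to the user turn using
# bag-of-words cosine similarity. A real system would rank responses given
# the full dialogue context, typically with a trained encoder.
CANDIDATES = [
    "I enjoy talking about football.",
    "Booking a ticket is easy online.",
    "Tell me more about your day.",
]

def _bow(text: str) -> Counter:
    return Counter(re.findall(r"\w+", text.lower()))

def _cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in set(a) & set(b))
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def respond_retrieval(utterance: str) -> str:
    query = _bow(utterance)
    return max(CANDIDATES, key=lambda c: _cosine(query, _bow(c)))

if __name__ == "__main__":
    print(respond_rule_based("I need a holiday"))               # rule-based reply
    print(respond_retrieval("Did you book the plane ticket?"))  # retrieved reply
```

The generation-based and hybrid methods replace the fixed rules or candidate pool with a trained model that conditions on the conversation history, which is why they require the large datasets described above.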

3. Characteristics of Human Conversations

Humans converse using speech and other gestures that may include facial expressions, usually called body language, thereby making human conversations complex [2]. Analogous devices may be employed in written conversations, such as clarification questions or the mimicking of sounds (onomatopoeia). In human conversations, one speaker may hold the conversational initiative, i.e., that speaker directs the conversation. This is typical of an interview, where the interviewer asking the questions directs the conversation, and it is the style of Question Answering (QA) conversational AI. In typical human–human conversations, the initiative shifts to and from different speakers. This kind of mixed (or rotating) initiative is harder to achieve in conversational systems [2]. Beyond conversational initiative, the following are further characteristics of human conversations, according to [14].
  • Usually, one speaker talks at a time.
  • The turn order varies.
  • The turn size varies.
  • The length of a conversation is not known in advance.
  • The number of speakers/parties may vary.
  • Techniques for allocating turns may be used.
  • Content of the conversation is not known in advance.
  • The relative distribution of turns is unknown in advance.
  • Different turn-constructional units may be used, e.g., words or sentences.
  • Repair mechanisms for correcting turn-taking errors exist.

4. Ethics

Ethical issues are important in open-domain conversational AI. From the perspective of deontological ethics, objectivity is equally important [15][16][17]. Deontological ethics is a philosophy that emphasises duty or responsibility over the outcome achieved in decision making [18][19]. Responsible research in conversational AI requires compliance with ethical guidelines or regulations, such as the General Data Protection Regulation (GDPR), a regulation protecting persons with regard to their personal data [20]. Some of the ethical issues of concern in conversational AI are privacy, owing to personally identifiable information (PII), toxic/hateful messages resulting from the training data, and unwanted bias (racial, gender, or other forms) [2][21].
Some systems have been known to demean or abuse their users. It is also well known that machine learning systems reflect the biases and toxic content of the data they are trained on [2][22]. Privacy is another crucial ethical issue. Data containing PII may fall into the wrong hands and pose a security threat to those concerned. It is therefore important to design systems that are robust to such unsafe or harmful attacks. Attempts are being made with debiasing techniques to address some of these challenges [23]. Privacy concerns are also being addressed through anonymisation techniques [2][24]. Balancing the features of chatbots with ethical considerations can be delicate and challenging work. For example, there is contention in some quarters over whether using female voices in some technologies/devices is appropriate. Then again, one may wonder whether there is anything harmful about that, since it seems widely accepted that the proportion of chatbots designed as “female” is larger than the proportion designed as “male”: in a survey of 1375 chatbots obtained by automatically crawling chatbots.org, [25] found that most were female.
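As a minimal, hypothetical illustration of the kind of anonymisation mentioned above (the patterns and tag names below are simplistic placeholders, not the techniques of [2][24]), PII such as e-mail addresses and phone numbers can be replaced with neutral tags before conversational data are stored or released:

```python
import re

# Hypothetical, simplistic masking rules; real anonymisation pipelines use far
# more robust detectors (e.g., named-entity recognition) and broader PII classes.
PII_PATTERNS = {
    "[EMAIL]": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
    "[PHONE]": re.compile(r"\+?\d[\d\s()-]{7,}\d"),
}

def anonymise(utterance: str) -> str:
    for tag, pattern in PII_PATTERNS.items():
        utterance = pattern.sub(tag, utterance)
    return utterance

print(anonymise("Call me on +46 70 123 4567 or mail jane.doe@example.com"))
# -> "Call me on [PHONE] or mail [EMAIL]"
```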

5. Benefits of Conversational AI

The apparent benefits inherent in open-domain conversational AI have spurred research in the field. These benefits have led to multi-million-dollar investments in conversational AI by many organisations, including Apple [2] and Amazon [26]. Some of the benefits include:
  • Provision of “friendly” company, as was probably experienced with Kuki (formerly Mitsuku) [27] and ELIZA (though the latter was not intended to provide such company). Some of Weizenbaum's staff [1] reportedly found comfort in holding private conversations with the conversational agent [2].
  • Provide support for users with disabilities, such as blindness [28]. Speech-to-text (STT) and text-to-speech (TTS) technologies combined with conversational AI can make life easier for people with disabilities.
  • A channel for providing domain/world knowledge [28]. The information retrieval (IR) approach discussed earlier can make it possible to have up-to-date information on specific domains or topics through conversational AI.
  • The provision of educational content or information in a concise fashion [29]. As mentioned earlier, the content and length of a conversation are not known in advance, so it is possible to construct utterances that are relatively concise and to the point.
  • Automated machine–machine generation of quality data for low-resource languages [13]. The challenge of data scarcity for low-resource languages may be mitigated through quality data generated from autonomous machine–machine conversations on various topics and about different entities.
  • The possibility of modelling human psychiatric/psychological treatment [2], on the basis of favourable behaviour observed in experiments designed to modify input–output behaviour.

This entry is adapted from the peer-reviewed paper 10.3390/info13060298

References

  1. Weizenbaum, J. A Computer Program for the Study of Natural Language. Source: Stanford. 1969. Available online: http://web.stanford.edu/class/linguist238/p36 (accessed on 25 May 2022).
  2. Jurafsky, D.; Martin, J. Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition; Dorling Kindersley Pvt, Ltd.: London, UK, 2020.
  3. Turing, A.M. Computing machinery and intelligence. Mind 1950, 59, 433–460.
  4. Adiwardana, D.; Luong, M.T.; So, D.R.; Hall, J.; Fiedel, N.; Thoppilan, R.; Yang, Z.; Kulshreshtha, A.; Nemade, G.; Lu, Y.; et al. Towards a human-like open-domain chatbot. arXiv 2020, arXiv:2001.09977.
  5. Chowdhary, K. Natural Language Processing for Word Sense Disambiguation and Information Extraction. 2020, pp. 603–649. Available online: https://arxiv.org/ftp/arxiv/papers/2004/2004.02256.pdf (accessed on 25 May 2022).
  6. Gabriel, R.; Liu, Y.; Gottardi, A.; Eric, M.; Khatri, A.; Chadha, A.; Chen, Q.; Hedayatnia, B.; Rajan, P.; Binici, A.; et al. Further advances in open domain dialog systems in the third alexa prize socialbot grand challenge. Alexa Prize Proc. 2020, 3. Available online: https://assets.amazon.science/0e/e6/2cff166647bfb951b3ccc67c1d06/further-advances-in-open-domain-dialog-systems-in-the-third-alexa-prize-socialbot-grand-challenge.pdf (accessed on 25 May 2022).
  7. Gunasekara, C.; Kim, S.; D’Haro, L.F.; Rastogi, A.; Chen, Y.N.; Eric, M.; Hedayatnia, B.; Gopalakrishnan, K.; Liu, Y.; Huang, C.W.; et al. Overview of the ninth dialog system technology challenge: Dstc9. arXiv 2020, arXiv:2011.06486.
  8. Schegloff, E.A. Sequencing in conversational openings 1. Am. Anthropol. 1968, 70, 1075–1095.
  9. Eric, M.; Goel, R.; Paul, S.; Sethi, A.; Agarwal, S.; Gao, S.; Kumar, A.; Goyal, A.; Ku, P.; Hakkani-Tur, D. MultiWOZ 2.1: A Consolidated Multi-Domain Dialogue Dataset with State Corrections and State Tracking Baselines. In Proceedings of the 12th Language Resources and Evaluation Conference, Marseille, France, 20–25 June 2020; pp. 422–428.
  10. Allwood, J.; Grönqvist, L.; Ahlsén, E.; Gunnarsson, M. Annotations and tools for an activity based spoken language corpus. In Current and New Directions in Discourse and Dialogue; Springer: Berlin/Heidelberg, Germany, 2003; pp. 1–18.
  11. Rashkin, H.; Smith, E.M.; Li, M.; Boureau, Y.L. Towards Empathetic Open-domain Conversation Models: A New Benchmark and Dataset. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, Florence, Italy, 28 July–2 August 2019; pp. 5370–5381.
  12. Adewumi, T.; Brännvall, R.; Abid, N.; Pahlavan, M.; Sabry, S.S.; Liwicki, F.; Liwicki, M. Småprat: DialoGPT for Natural Language Generation of Swedish Dialogue by Transfer Learning. In Proceedings of the 5th Northern Lights Deep Learning Workshop, Tromsø, Norway, 10–12 January 2022; Volume 3.
  13. Adewumi, T.; Adeyemi, M.; Anuoluwapo, A.; Peters, B.; Buzaaba, H.; Samuel, O.; Rufai, A.M.; Ajibade, B.; Gwadabe, T.; Traore, M.M.K.; et al. Ìtàkúròso: Exploiting Cross-Lingual Transferability for Natural Language Generation of Dialogues in Low-Resource, African Languages. arXiv 2022, arXiv:2204.08083.
  14. Sacks, H.; Schegloff, E.A.; Jefferson, G. A simplest systematics for the organization of turn taking for conversation. In Studies in the Organization of Conversational Interaction; Elsevier: Amsterdam, The Netherlands, 1978; pp. 7–55.
  15. Adewumi, T.P.; Liwicki, F.; Liwicki, M. Conversational Systems in Machine Learning from the Point of View of the Philosophy of Science—Using Alime Chat and Related Studies. Philosophies 2019, 4, 41.
  16. Javed, S.; Adewumi, T.P.; Liwicki, F.S.; Liwicki, M. Understanding the Role of Objectivity in Machine Learning and Research Evaluation. Philosophies 2021, 6, 22.
  17. White, M.D. Immanuel kant. In Handbook of Economics and Ethics; Edward Elgar Publishing: Cheltenham, UK, 2009.
  18. Alexander, L.; Moore, M. Deontological Ethics. 2007. Available online: https://plato.stanford.edu/entries/ethics-deontological/ (accessed on 25 May 2022).
  19. Paquette, M.; Sommerfeldt, E.J.; Kent, M.L. Do the ends justify the means? Dialogue, development communication, and deontological ethics. Public Relat. Rev. 2015, 41, 30–39.
  20. Voigt, P.; Von dem Bussche, A. The EU General Data Protection Regulation (GDPR): A Practical Guide; Springer: Berlin/Heidelberg, Germany, 2017; Volume 10, pp. 10–5555.
  21. Zhang, Y.; Sun, S.; Galley, M.; Chen, Y.C.; Brockett, C.; Gao, X.; Gao, J.; Liu, J.; Dolan, B. DialoGPT: Large-Scale Generative Pre-training for Conversational Response Generation. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics: System Demonstrations, Online, 5–10 July 2020; pp. 270–278.
  22. Neff, G.; Nagy, P. Automation, algorithms, and politics| talking to Bots: Symbiotic agency and the case of Tay. Int. J. Commun. 2016, 10, 17.
  23. Dinan, E.; Fan, A.; Williams, A.; Urbanek, J.; Kiela, D.; Weston, J. Queens are Powerful too: Mitigating Gender Bias in Dialogue Generation. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), Online, 16–20 November 2020; pp. 8173–8188.
  24. Henderson, P.; Sinha, K.; Angelard-Gontier, N.; Ke, N.R.; Fried, G.; Lowe, R.; Pineau, J. Ethical challenges in data-driven dialogue systems. In Proceedings of the 2018 AAAI/ACM Conference on AI, Ethics, and Society, New Orleans, LA, USA, 2–3 February 2018; pp. 123–129. Available online: https://arxiv.org/pdf/1711.09050.pdf (accessed on 25 May 2022).
  25. Maedche, A. Gender Bias in Chatbot Design. In Chatbot Research and Design; Springer: Heidelberg, Germany, 2020; p. 79.
  26. Venkatesh, A.; Khatri, C.; Ram, A.; Guo, F.; Gabriel, R.; Nagar, A.; Prasad, R.; Cheng, M.; Hedayatnia, B.; Metallinou, A.; et al. On evaluating and comparing conversational agents. arXiv 2018, arXiv:1801.03625.
  27. Ruane, E.; Birhane, A.; Ventresque, A. Conversational AI: Social and Ethical Considerations. In Proceedings of the AICS—27th AIAI Irish Conference on Artificial Intelligence and Cognitive Science, Galway, Ireland, 5–6 December 2019; pp. 104–115.
  28. Reiter, E. 20 Natural Language Generation. In The Handbook of Computational Linguistics and Natural Language Processing; 2010; p. 574. Available online: https://onlinelibrary.wiley.com/doi/10.1002/9781444324044.ch20 (accessed on 25 May 2022).
  29. Kerry, A.; Ellis, R.; Bull, S. Conversational agents in E-Learning. In Proceedings of the International Conference on Innovative Techniques and Applications of Artificial Intelligence, Cambridge, UK, 9–11 December 2008; pp. 169–182. Available online: https://link.springer.com/chapter/10.1007/978-1-84882-215-3_13 (accessed on 25 May 2022).