The bank agent is often unaware of the precise issues the ATM could encounter. However, it is possible to describe the abnormalities which shape the conversation’s context. Given these circumstances, determining an approach to address this barrier is necessary.
2. Background
2.1. Natural Language Understanding
Natural language understanding (NLU) is a subfield of artificial intelligence (AI) that aims to enable machines to comprehend and analyze natural language based on concepts, entities, sentiments, and keywords
[17] using machine learning and natural language processing (NLP) techniques. At the core of most chatbot applications lies NLU
[18]. Chatbots need to extract structured information from unstructured natural language inputs
[19]. Nevertheless, developing an entire NLU system is challenging and expensive; therefore, the standard market practice consists of employing existing NLU solutions to create chatbots
[20][21].
In the context of chatbots, NLU is utilized to analyze the question asked by the user and complete their requests
[11]. To achieve this, NLU extracts domain-specific entities and intents from a text. Intents are the intentions or purposes with which the user asks a question. Intent recognition (IR) is a crucial component of NLU that involves identifying the intent behind a user’s query
[22]. Named entity recognition (NER) is another subtask of NLU that includes identifying and categorizing named entities in text or speech. Named entities are objects, people, locations, organizations, and other entity names in natural language
[23][24].
To illustrate this with an example, consider the ATM maintenance process. In this context, the words “on” and “off” are classified as entities representing the status of a red LED on the receipt printer. These entities might carry different implications in different scenarios, highlighting the importance of context in NLU. When a user inputs “The LED is on”, the intent recognition (IR) component identifies the intent of the user’s message, in this case, to report the status of the LED. Concurrently, the named entity recognition (NER) component identifies the word “on” as an entity specifying the LED’s current status.
By utilizing these entities and intents, an NLU engine can generate a targeted response that can efficiently assist users with their queries
[25]. Thus, IR and NER form the twin pillars of NLU. Their applications are extensive, ranging from information extraction and text understanding to the creation of knowledge bases
[24].
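The LED example above can be sketched in a few lines of Python. This is only a toy illustration under stated assumptions: the intent name `report_led_status`, the entity name `led_status`, and the keyword rules are hypothetical, and a production NLU engine would use trained models rather than regular expressions.

```python
import re
from typing import Dict, Optional

# Hypothetical intent definition for the receipt-printer LED example.
INTENT_PATTERNS = {
    "report_led_status": re.compile(r"\bled\b", re.IGNORECASE),
}
# Hypothetical entity values: the two LED states mentioned in the text.
LED_STATUS_VALUES = ("on", "off")


def parse(utterance: str) -> Dict[str, Optional[str]]:
    """Return the detected intent and LED status entity (None when absent)."""
    # Intent recognition: the first intent whose pattern occurs in the utterance.
    intent = next(
        (name for name, pattern in INTENT_PATTERNS.items() if pattern.search(utterance)),
        None,
    )
    # Entity recognition: the first token that is a known LED status value.
    tokens = re.findall(r"[a-z]+", utterance.lower())
    status = next((token for token in tokens if token in LED_STATUS_VALUES), None)
    return {"intent": intent, "led_status": status}


print(parse("The LED is on"))
```

The structured result (an intent plus an entity value) is what a dialog manager would consume to pick the next response.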
2.2. Virtual Assistant Types
When developing chatbots, it is crucial to consider the criteria for classification. The classification standards include the objectives, development techniques, communication style, and knowledge domain
[26]. Additionally, Adamopoulou and Moussiades
[10] define criteria based on the platform used to build a chatbot, which can be either closed or open. One of the most popular closed-platform frameworks for developing conversational agents is Google Dialogflow CX, which offers a range of features for designing, building, and deploying conversational experiences.
Regarding the development method, the authors divide the types of chatbots between rule-based, also known as template-based
[27], and AI-based
[12][28]. AI-based chatbots can be further categorized into retrieval-based and generative chatbots
[10][12].
Rule-based chatbots, also known as pattern-matching chatbots, are a type of chatbot architecture that utilizes a set of predefined rules and patterns to generate responses to user input
[29]. They are simple and easy to design and implement and perform effectively for structured tasks with a limited range of possible inputs
[12], providing prompt and precise responses to user queries and requests; moreover, they do not require abundant training data or machine learning algorithms. The predefined rules and patterns that generate responses are typically created by experts and cover a range of possible user inputs and scenarios. When a user enters a query or request, the chatbot matches the input against the predefined rules and patterns to determine the appropriate response. Because responses depend on matching these patterns, this model is not robust against spelling or grammatical errors from the user
[10], and it may struggle with more complex tasks that require understanding context, nuances, and natural language, as it is unable to learn from new inputs or adapt to new situations. Rule-based chatbots are best suited for specific and repetitive tasks, such as customer service inquiries or FAQ support, that employ a question-answering chatbot
[13]. Singh et al.
[30] and Vishwakarma
[31] present examples of these types of chatbots.
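A minimal sketch of the pattern-matching behavior described above is given below. The rules, responses, and function name are hypothetical and serve only to show why a misspelled input falls through to a fallback answer.

```python
import re

# Hypothetical expert-written rules: each pattern maps to a canned response.
RULES = [
    (re.compile(r"\b(hi|hello)\b", re.IGNORECASE), "Hello! How can I help you?"),
    (re.compile(r"\bopening hours\b", re.IGNORECASE), "Please see the opening hours page."),
]
FALLBACK = "Sorry, I did not understand that."


def reply(message: str) -> str:
    """Return the response of the first rule whose pattern matches the message."""
    for pattern, response in RULES:
        if pattern.search(message):
            return response
    return FALLBACK


print(reply("What are your opening hours?"))
# The misspelling "openning" matches no rule, so the fallback is returned.
print(reply("What are your openning hours?"))
```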
AI-based chatbots are categorized by their information processing and response generation techniques into two types: (i) retrieval-based and (ii) generative models. This distinction is relevant for selecting the appropriate virtual assistant architecture for a specific task, thus enhancing performance. AI models are based on machine learning algorithms, allowing them to learn from an existing database of human conversations
[12].
Retrieval-based chatbots are the first type of AI-based chatbots, with an architecture that selects a response to user input from a predefined set of responses. They typically employ keyword matching and similarity measures to select the best answer from that set. As a result, they need only a small amount of training data and limited use of complex machine learning algorithms, and they can handle a wide range of user inputs. However, retrieval-based chatbots may struggle with more complex tasks that require context, nuance, and natural language understanding. For example, if they cannot find a suitable answer in their predefined set, they may respond with generic or unhelpful responses.
Retrieval-based chatbots are primarily designed to handle operational problems, particularly those related to troubleshooting, by efficiently retrieving and providing relevant information for customer support and other service-oriented tasks that require a prompt and accurate response
[10][28][32][33].
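The retrieval step can be sketched as follows, assuming a tiny hypothetical knowledge base and Jaccard token overlap as the similarity measure. Both are illustrative choices, not the method of any cited work; real systems often use TF-IDF or embedding similarity instead.

```python
import re
from typing import List, Set, Tuple

# Hypothetical knowledge base of (stored question, predefined answer) pairs.
KNOWLEDGE_BASE: List[Tuple[str, str]] = [
    ("the receipt printer led is on", "Follow the receipt printer procedure."),
    ("the card reader does not accept cards", "Follow the card reader procedure."),
]
FALLBACK = "Could you describe the problem in more detail?"


def tokenize(text: str) -> Set[str]:
    """Lowercase the text and return its set of alphabetic tokens."""
    return set(re.findall(r"[a-z]+", text.lower()))


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity: size of the intersection over size of the union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def retrieve(query: str, threshold: float = 0.3) -> str:
    """Return the answer of the most similar stored question, or a generic fallback."""
    query_tokens = tokenize(query)
    best_score, best_answer = max(
        (jaccard(query_tokens, tokenize(question)), answer)
        for question, answer in KNOWLEDGE_BASE
    )
    return best_answer if best_score >= threshold else FALLBACK


print(retrieve("The LED on the receipt printer is on"))
print(retrieve("How is the weather today?"))
```

The second call illustrates the limitation noted above: when no stored question is similar enough, the chatbot can only return a generic response.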
The second type of AI-based chatbot, the generative chatbot, utilizes natural language generation (NLG) techniques to generate responses to user inputs. In contrast to rule-based and retrieval-based chatbots, generative chatbots do not rely on predefined responses or rules. Instead, they produce responses based on their understanding of the input and the context. They use machine learning algorithms, such as deep learning, to analyze large amounts of data and learn how to generate human-like responses. They can understand natural language queries, generate natural language responses, and handle a variety of inputs, including ambiguous or incomplete entries. Generative chatbots often tackle more complex tasks, such as providing recommendations or advice, engaging in conversation, and entertainment. In Kapočiūtė-Dzikienė
[34], the applications of these chatbots are illustrated.
2.3. Dialogflow
Currently, Google provides two closed platforms for chatbot development: Dialogflow ES and Dialogflow CX. “CX” stands for Customer Experience, while “ES” stands for Essentials. Although both versions provide tools to design virtual assistants (VAs), Dialogflow CX has functionalities to construct more complex conversational systems.
Google Dialogflow CX allows the creation, management, and development of conversational applications, such as chatbots and virtual assistants. It is easy to use and enables the construction of complex interactions with the user. Dialogflow CX presents an innovative approach to agent creation using state machine-based design. The tool gives precise and defined control over the conversation, improving the end user experience and simplifying the development process
[15]. The following topics describe the components of Dialogflow CX
[15]:
- Agents: These are responsible for processing the simultaneous conversations with the end user. An agent performs the natural language processing and understanding of the varieties of human language. It is based on a state machine architecture, which allows developers to explicitly and thoroughly control the conversation to create personalized and enhanced conversation experiences for their users. In addition, the agent allows adding context to the conversation, which can help improve the accuracy of the agent’s responses.
- Flows: These define conversation topics and the associated conversation paths. Each agent has an initial default flow. A dialog flow represents a sequence of interactions between the user and the virtual assistant. These flows allow you to define a series of questions and answers to guide the conversation with the user and conditions to direct the conversation to different paths according to the user’s responses.
- Pages: These group intents into a more organized and manageable structure. They act as containers for related intents, allowing developers to manage and edit intents more efficiently. Pages are beneficial in more complex conversations with many interconnected intents and dialog flows, keeping the conversation model organized and easier to edit. Pages also allow intents and dialog flows to be shared between multiple virtual assistants or applications, which is especially useful when consistency across numerous user communication channels is desired.
- Event handlers: In Dialogflow, event handlers are actions triggered by the user’s input or by an event in the application’s backend. For example, an event handler can give the user a response whenever a piece of information, such as an intent or entity, is not matched.
- Intents: These represent the user’s intent in a conversation. They are defined based on sample input sentences that the user may submit and help Dialogflow understand what the user is trying to achieve or ask. Each intent represents a specific action the user wants to perform, such as making a reservation at a restaurant, getting weather information, or asking for help with a task. Dialogflow uses these intents to steer the conversation with the user toward the correct answer. In addition, intents can include variables, known as parameters, that help capture specific information from the user, such as the date or location of a reservation. These parameters allow Dialogflow to personalize its response to the user, making the conversation more effective and relevant.
- Entities: These represent specific and relevant information for the conversation, such as names of people, places, dates, and times. They extract relevant information from the user’s input sentences and store it as variables, which helps customize the virtual assistant’s responses. Dialogflow offers many predefined entity types, such as date, time, number, and address, as well as custom entities that represent application-specific information. Entities can also be used to define patterns of phrases and expected behaviors for different types of information, such as names of people or addresses. This helps Dialogflow understand user input sentences more accurately and efficiently.
- Contexts: These help to maintain the conversation context between the user and the chatbot. They are utilized to remember relevant information and make it available throughout the conversation. For example, if the user is having issues with an ATM, the conversation context can include information about the type of transaction they are trying to perform and the date and time of the issue. This information can be employed to provide a more precise and relevant solution to the user. Contexts are also fundamental for controlling the flow of the conversation and preventing the virtual assistant from getting stuck on a specific task or question. For instance, if the user changes the subject, the context can be altered to reflect the new topic of the conversation.
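To make the state machine idea behind these components more concrete, the sketch below models pages as states, intents as transitions, parameters as stored session data, and a no-match event as the case an event handler would cover. It is a plain Python illustration and does not use the Dialogflow CX API; the page names, intent names, and the `Session` class are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

# Hypothetical pages (states) and the intents that lead from one page to another.
ROUTES: Dict[str, Dict[str, str]] = {
    "Start": {"report_problem": "Ask LED Status"},
    "Ask LED Status": {"report_led_status": "Show Procedure"},
    "Show Procedure": {},
}


@dataclass
class Session:
    """Tracks the active page and the parameters collected so far."""

    page: str = "Start"
    parameters: Dict[str, str] = field(default_factory=dict)

    def handle(self, intent: Optional[str], parameters: Optional[Dict[str, str]] = None) -> str:
        """Follow the route for the matched intent, or report a no-match event."""
        target = ROUTES[self.page].get(intent) if intent else None
        if target is None:
            # An event handler would reply here; the active page stays unchanged.
            return "no-match"
        self.parameters.update(parameters or {})
        self.page = target
        return "transition"


session = Session()
print(session.handle("report_problem"), session.page)
print(session.handle("greeting"), session.page)
print(session.handle("report_led_status", {"led_status": "on"}), session.page)
print(session.parameters)
```

Keeping the routes in an explicit table mirrors the explicit control over the conversation that the state machine design is meant to provide.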
2.4. Evaluation of Chatbots
There is no established standard evaluation protocol for chatbots, so a brief investigation was conducted to define the evaluation strategies. The strategies for measuring chatbot performance fall into two categories: human evaluation, which involves analyzing user feedback, and automated evaluation, which uses metrics such as precision, recall, and F1-score to evaluate the chatbot’s classification abilities. Even though human evaluation is valuable, it can be expensive and time-consuming
[12].
Another metric is the exact match accuracy, which evaluates the IR and NER together
[35]. This metric is the ratio of the number of inputs for which the chatbot correctly identifies both the user’s intent and any relevant entities to the total number of inputs.
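As a minimal sketch of this metric, the function below counts an input as correct only when both the predicted intent and the predicted entities equal the expected ones. The data layout (an intent string paired with an entity dictionary) and the example labels are assumptions made for illustration.

```python
from typing import Dict, List, Tuple

# Assumed representation of one annotated or predicted input: (intent, entities).
Annotation = Tuple[str, Dict[str, str]]


def exact_match_accuracy(expected: List[Annotation], predicted: List[Annotation]) -> float:
    """Fraction of inputs whose intent and entities are both predicted correctly."""
    if len(expected) != len(predicted):
        raise ValueError("expected and predicted must have the same length")
    if not expected:
        return 0.0
    matches = sum(1 for exp, pred in zip(expected, predicted) if exp == pred)
    return matches / len(expected)


expected = [
    ("report_led_status", {"led_status": "on"}),
    ("report_led_status", {"led_status": "off"}),
    ("report_problem", {}),
]
predicted = [
    ("report_led_status", {"led_status": "on"}),
    ("report_led_status", {"led_status": "on"}),  # right intent, wrong entity
    ("report_problem", {}),
]
print(exact_match_accuracy(expected, predicted))
```

The second input has the right intent but the wrong entity value, so it does not count as an exact match.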
Behind the chatbot are the IR and NER algorithms, which can be evaluated like any other machine learning classifier. The metrics for automated evaluation use the counts of true positives (𝑇𝑃𝑠), true negatives (𝑇𝑁𝑠), false positives (𝐹𝑃𝑠), and false negatives (𝐹𝑁𝑠), which describe the correct (𝑇𝑃, 𝑇𝑁) and incorrect (𝐹𝑃, 𝐹𝑁) predictions for a given class.
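The sketch below shows how these counts can be obtained for one class and turned into precision, recall, and F1-score using their standard definitions: precision = TP / (TP + FP), recall = TP / (TP + FN), and F1 as the harmonic mean of the two. The class labels in the example are hypothetical.

```python
from typing import List, Tuple


def confusion_counts(expected: List[str], predicted: List[str], target: str) -> Tuple[int, int, int, int]:
    """Return (TP, FP, FN, TN) for the class `target`."""
    tp = fp = fn = tn = 0
    for exp, pred in zip(expected, predicted):
        if pred == target and exp == target:
            tp += 1
        elif pred == target:
            fp += 1
        elif exp == target:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def precision_recall_f1(tp: int, fp: int, fn: int) -> Tuple[float, float, float]:
    """Standard definitions; each metric is 0.0 when its denominator is zero."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Hypothetical intent labels for five user inputs.
expected = ["led", "led", "card", "card", "led"]
predicted = ["led", "card", "card", "led", "led"]
tp, fp, fn, tn = confusion_counts(expected, predicted, target="led")
print(tp, fp, fn, tn)
print(precision_recall_f1(tp, fp, fn))
```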
3.1. General Chatbot Applications
One of the most significant contributions to the development of chatbots with NLP features was the Eliza program, created by Weizenbaum in 1966
[36]. The operation of this chatbot involves analyzing keywords within the user’s input and matching them against predefined rules. While this approach made a notable contribution for its time, it had limitations in understanding contextual information. Other programs that became part of the first generation of chatbots include Parry
[37] and A.L.I.C.E
[38]. Singh and Thakur
[29] and Shum et al.
[39] demonstrate the increasing popularity of this technology, which has been seen in multiple applications created since then. The rising generation of digital assistants has been applied in distinct scenarios, such as e-commerce, e-learning, healthcare, and the industrial sector, to improve efficiency and enhance user experience.
In the e-commerce industry, brands are adopting chatbots to attract and engage customers, as evidenced in
[40][41]. In the research conducted by Khan
[42], an e-commerce sales chatbot platform is proposed to provide customer support. The project comprises five distinct modules, among which the NLU engine plays a particularly critical role. While this work provides a comprehensive overview of the system architecture, it lacks detailed information regarding the performance evaluation of the NLU engine utilized in the chatbot. An evaluation encompassing the entire system would be highly valuable for gaining deeper insights into its overall effectiveness and capabilities.
Chatbots also reduced the need for human presence during the COVID-19 pandemic. As shown by Amiri and Karahanna
[43], their assistance capacity spans use cases ranging from scheduling vaccines to disseminating information about the coronavirus.
E-learning
[44] is a promising area where a chatbot can be applied. Huang et al.
[45] extensively reviewed the impact and relevance of integrating chatbots in education, highlighting the technological affordances that make them an attractive tool for the field. Instances of this application can be seen in the industrial sector, where chatbots have been acting as instructors for workers. Casillo et al.
[46] propose an approach in which a virtual assistant prototype trains new workers and report results demonstrating that the system’s advantage is a shorter learning process. Similarly, Colabianchi et al.
[47] also present a point of view related to the role of chatbots as mentors with Popeye, a chatbot that trains new employees in container inspection.
3.2. Troubleshooting Chatbots
The domain of maintenance represents a field where chatbots have been employed. Troubleshooting chatbots are designed to identify and solve technical issues. They aim to provide efficient and effective technical support by identifying and addressing common problems.
Chatbots as technical assistants is a topic explored by Alhassan et al.
[48], in which the authors proposed a methodology for an IT chatbot framework to assist with common IT problems. Despite the demonstrated merits of incorporating chatbots in the troubleshooting domain, this study lacks empirical data on how effectively the chatbot understands user descriptions and returns the appropriate procedure.
Another application of chatbots regarding the maintenance process can be seen in
[49], where the chatbot Telmi is responsible for providing customer support for troubleshooting tasks. This work shows the number of issues resolved by the chatbot and the characteristics of the problems that were not. One trait of the issues that the chatbot could not address was the difficulty in identifying the intent, resulting in a higher number of false negatives (FNs). This result highlights the importance of creating a well-constructed training dataset to maximize evaluation metrics effectively.
Previous works only presented the accomplished results without explaining the methodology of transforming the knowledge database into a chatbot with artificial intelligence
[42][46][47][48][49]. Additionally, these works report few evaluation metrics. Finally, prior research does not cover the experimental protocol to assess chatbots
[23][50].
To the best of the authors’ knowledge of the methodologies for developing chatbots, transforming an automated teller machine (ATM) maintenance knowledge base into a chatbot remains largely uncharted territory
[12][51][52]. This presents numerous opportunities for further exploration and innovation in this sphere
[46]. Moreover, existing methodologies have been predominantly tailored for different application contexts
[47], resulting in a noticeable scarcity of direct comparisons in the context of ATM maintenance
[48][49]. Consequently, it is imperative to conduct a thorough examination of the available methodologies
[46][47] and to evaluate the ramifications of adapting these methodologies within the specific milieu of ATM maintenance
[42][50].
The research work of
[53] focuses on meticulously modeling a specific process within a business process model and notation (BPMN) framework and subsequently transforming it comprehensively into a functional chatbot. The primary transformation pipeline encompasses several key components. Initially, a graph normalization block is employed to load the BPMN diagram and ensure the structural integrity of the graph. Subsequently, a label processing block gathers the linguistic information inherent to the model. Furthermore, a dialog graph construction block generates finite state automata (FSA) based on the BPMN model governing conversational transitions. A natural language generator (NLG) produces chatbot responses by generating text based on each node of the BPMN model. Finally, the encoded FSA is integrated into an AIML engine for subsequent processing. While the methodology for chatbot development is commendable, its application to an ATM maintenance knowledge base reveals significant drawbacks.
The current structure of a bank agency knowledge base, primarily comprising text-based procedures and specific information, does not align well with the node-based representation of the BPMN model. This misalignment poses challenges in accurately mapping content, resulting in incomplete or oversimplified representations within the chatbot. Additionally, the BPMN model’s lack of granularity limits its ability to capture intricate troubleshooting steps, diminishing the effectiveness of the generated chatbot. Another noteworthy drawback is the limited ability to capture and preserve critical information from the conversation efficiently. Registering this content is essential, as it plays a pivotal role in the maintenance process by determining which procedures should be executed based on the key elements discussed within the conversation.
Although the application does not primarily focus on the educational field, it is pertinent to acknowledge that existing methodologies employed in developing educational chatbots incorporate the concept of a virtual assistant. A notable example can be found in the work of
[54], where a methodology for constructing a conversational chatbot tutor is proposed. This approach leverages first-order logic, a formal language specialized in inference and symbolic representation of knowledge, to delineate the chatbot’s knowledge base. Nonetheless, using this methodology to implement troubleshooting chatbots in the ATM maintenance context would present challenges, because first-order logic cannot comprehensively capture the intricacies of natural language. Given that the maintenance process involves interactions with users who may articulate their issues or queries differently, first-order logic, although proficient in representing rudimentary relationships and logical operations, may encounter difficulties when faced with more intricate or ambiguous scenarios. Consequently, this limitation could impede the chatbot’s ability to handle diverse user queries and provide appropriate responses.
The third analyzed methodology was the work of
[55], which demonstrates the integration of an ontology and a knowledge base to develop an effective programming assistant chatbot. First, the authors introduce the Rela-Model, an ontology that organizes the knowledge base by delineating the interrelations between programming concepts and the pre-established rules governing inference. Then, to support its functionality, this model is combined with a structured collection of scripts, each encompassing five key components: the script name, related content, a set of questions, their corresponding answers, and a set of rules for selecting questions within the script. This systematic procedure forms the foundation of knowledge base construction for a question-and-answer-oriented chatbot. It should be noted that chatbots reliant on ontology-based systems often require users to articulate queries using terminology or expressions that align with the ontology’s vocabulary.
3.3. Banking Assistants
Virtual assistants are increasingly being utilized to provide customers with prompt and personalized responses to their queries, enhancing the overall customer experience
[56]. Therefore, recent research has focused on developing and improving virtual banking assistants to serve customers better.
Several scholars have proposed and evaluated dialogue systems in the banking industry to improve customer service and satisfaction. Rustamov et al.
[57] developed and evaluated several dialog management pipelines to verify the most suitable one for application in a banking context. The primary objective of this research was to investigate the dialogue manager and NLU components. A comprehensive experiment was conducted involving the utilization of fastText, an open-source library widely employed for text classification, and custom machine learning models trained on a specialized banking dataset. The experimental findings clearly indicated that the custom machine learning models outperformed fastText in terms of accuracy.
Another field of research concerns investigating the impact of virtual banking assistants on customer engagement and satisfaction to comprehend the significant role that chatbots play in the banking industry, as evidenced in
[58]. Furthermore, customer acceptance is a crucial factor to consider when implementing chatbots in the banking industry. Alt et al.
[59] have identified several key factors that can increase chatbots’ appeal to customers and enhance their usage. While their work effectively highlights the substantial presence of virtual banking assistants catering to customer needs, it does not mention any applications of troubleshooting chatbots for agent banks. This absence suggests that this category of chatbots remains largely unexplored and holds promise for future research and development.
The applications mentioned above demonstrate the indispensability of chatbots in the banking industry. With their ability to provide personalized and efficient customer service, chatbots are valuable assets in improving customer satisfaction and loyalty.
3.4. Virtual Assistants Using Dialogflow
Dialogflow has been widely applied in various scenarios due to its diverse range of tools that enable developers to construct complex conversational agents.
Dhavan
[21] developed a chatbot system with Dialogflow ES to detect possible heart attack symptoms through the user-provided description, which was collected using entity parameters. The user provides inputs regarding factors such as the severity of chest pain or breathing difficulties, among other pertinent details. After the patient submits the descriptions of their symptoms, the chatbot prompts the user to complete a questionnaire. Subsequently, these data are forwarded to a support vector machine model, which carries out the prediction.
The evaluation metrics employed were accuracy (0.824), precision (0.843), and recall (0.843). Although the results appear promising, a medical tool must provide accurate responses; therefore, the metric results should exceed a threshold of 0.9. Furthermore, the intents and entities used to retain the conversation context could also have been evaluated, as the chatbot’s ability to deliver appropriate responses to the user depends on the effective performance of these components.
Muhammad et al. [63] demonstrate a conversational tool that facilitates English language learning for students, utilizing Dialogflow ES and incorporating an English conversation book as its knowledge base. Additionally, entity extraction techniques were utilized to retain essential information from user responses. Dall’Acqua and Tamburini
[60] employed the latest Dialogflow version (Dialogflow CX) to develop a conversational agent capable of delivering standard customer services, such as subscriptions, download instructions, and discounts, after receiving a user query. To recognize user input, the authors used intent recognition and NER to collect relevant information and define the conversational context.