Human-centered Machine Learning (HCML) is about developing adaptable and usable Machine Learning systems for human needs while keeping the human/user at the center of the entire product/service development cycle.
Human-Centered Machine Learning, also referred to as Human-Centered Artificial Intelligence (HCAI or HAI), has been gaining popularity as influential technology firms and research labs have raised concerns about the human context of ML systems. A 2016 workshop held in conjunction with the Conference on Human Factors in Computing Systems [1] argued that HCML should explicitly recognize the human aspect when developing ML models, re-frame machine learning workflows based on situated human working practices, and explore the co-adaptation of humans and systems. In early 2018, Google Design (https://design.google/library/ux-ai/, accessed on 1 April 2021) published an article noting that HCML is the User Experience (UX) of AI. Referring to a real consumer ML product, Google highlighted how ML could focus on human needs while solving them in unique ways that are only possible through ML. Several research projects (https://hcai.mit.edu/, accessed on 1 April 2021) by the Massachusetts Institute of Technology (MIT) on self-driving technologies called their approach Human-Centered Artificial Intelligence. The MIT team recognized both the development of AI systems that continuously learn from humans and the parallel creation of a fulfilling human-robot interaction experience. In 2019, the Stanford Institute for Human-Centered Artificial Intelligence (https://hai.stanford.edu/, accessed on 1 April 2021) was launched with the goal of improving AI research, education, policy, and practice. It recognized the significance of developing AI technologies and applications that are collaborative, augmentative, and enhance human productivity and quality of life. A Human-Centered Machine Learning workshop (https://sites.google.com/view/hcml-2019, accessed on 1 April 2021) held in 2019 with the Conference on Neural Information Processing Systems focused on the interpretability, fairness, privacy, security, transparency, and accountability of AI technologies and on multi-disciplinary approaches to them. Started in 2017, the Google People + AI Research initiative (https://pair.withgoogle.com/, accessed on 1 April 2021) published a book in 2019 presenting guidelines for building human-centered ML systems; this team researches the full spectrum of human interactions with machine intelligence to build better AI systems with people.
Considering the scope of prior HCML/HCAI work and the publications of leading industry and academic institutions, we derived a definition of HCML that covers the breadth of this existing work. We validated the definition with feedback from several researchers working in the domain and refined it further with influential researchers on Human-Centered AI research teams at leading academic and industrial institutions.
Human-centered Machine Learning (HCML): Developing adaptable and usable Machine Learning systems for human needs while keeping the human/user at the center of the entire product/service development cycle.
There is a natural incentive to research all the principles mentioned previously; however, this is seldom achieved in practice. Individual studies typically detail only part of the development life-cycle, possibly because they emphasize the specific technicalities of the research. Therefore, we selected research that demonstrated one or more design elements matching the above definition of HCML.
As shown in Figure 1, HCML work lies across many aspects of Machine Learning. We define algorithmic work related to HCML as Back-End HCML and work on interactions with humans as Front-End HCML. We excluded algorithm-centric Back-End HCML papers, as they would divert our focus away from the baseline HCML concepts. For instance, analyzing and classifying explainability algorithms is beyond this paper’s scope and may be reviewed in separate works, such as [2][3]. However, algorithmic contributions with Front-End HCML practices, such as user evaluations, were included.
Figure 1. Human-Centered Machine Learning research (marked with dashed lines) spans a broad spectrum, as shown here. The intersection of Machine Learning Research and Human-Centered Design is the domain we identify as Human-Centered Machine Learning.
Within the HCML literature, this category pertains to research that compiles design guidelines and principles for HCML or assists in building HCML products and services. These works stem from different intentions, such as guidelines for developing intelligent user interfaces, visualization, prototyping, and human concerns in general. Amershi et al. [4] present a set of guidelines resulting from a comprehensive study conducted with many industry practitioners who worked on 20 popular AI products. Some approaches focus on deriving requirements and guidelines for planned sandbox visualization tools [5]. One article highlights guidelines related to three areas of HCML: ethically aligned design, technology that reflects human intelligence, and human factors design [6]. Browne et al. [7] proposed a Wizard-of-Oz approach that bridges designers and engineers to build human-considerate machine learning systems targeting explainability, usability, and understandability. Some papers attempt to identify what HCML is [8] and discuss how AI systems should understand the human and vice versa. Apart from general perspectives, Chancellor et al. [9] analyzed the literature in the mental health-AI domain to understand which humans such work focuses on and compiled guidelines for keeping humans a priority. From a slightly different angle, Ehsan et al. [10] examined how to characterize human-centered explainable AI in terms of prioritizing the human. Wang et al. [11] designed a theory-driven, user-centered explainable AI framework and evaluated a tool developed with actual clinicians. Schlesinger et al. [12] explored ways to build chatbots that can handle ‘race-talk’. Long et al. [13] attempt to define learner-centered AI and identify its design considerations. Yang et al. [14] explore insights for designers and researchers to address challenges in human–AI interaction.
The main component of HCML is the Human, and HCML thus elevates the significance of the human. The ‘Human’ in HCML is defined across varying ML expertise levels, ranging from no ML background to expert ML scientists. The Human in HCML can also be involved in various stages of the ML system development process in different capacities. For instance, the focus may be on the end-user, the developer, or the investor. One work could focus on a certain user aspect when developing a product or service [15][16]; another could determine design principles for a particular ML system, optimizing usability and adoptability [17][18][4]. The multidimensionality of what is considered Human within HCML contributes to the complexities within the field.
Considering the works that focused on the user side, some researchers catered to general end-users or consumers [15][19][8][20], while others catered to specific end-users. Examples of the latter include people who need assistance [21][22][23][24][25][26][27][28][29], medical professionals [30][31][32][33][34], international travelers [35], Amazon Mechanical Turk workers [36][37], drivers [38][39], musicians [40], teachers [41], students [42], children [43][44], UX designers [18][45][14][46], UI designers [47][48][49], data analysts [50], video creators [51], and game designers [52][53][54][55]. Apart from focusing on a specific user group, some have tried to understand multiple user perspectives, from ML engineers to the end-user [17]. Some of the prior works that target the developer as the human focus on novice ML engineers, helping them develop ML systems faster [56][5]. Notably, the majority of works that target the developer side focused on ML engineers [57][18][58][59][60][61][62][16][63][64][4][6][7][9][65].
Machine Learning works well in many scenarios, provided that a relationship exists between the task at hand and the available data. This power to make decisions or predictions based on data has enabled ML to spread into many other domains, such as medicine, pharmacy, law, business, finance, art, agriculture, photography, sports, education, media, military, and politics. Given that the majority of people in those sectors are not AI experts, developing AI systems for them requires investigating the human aspect of such systems. Our analysis of the selected literature shows that applications have specifically targeted gaming [66][52][53][54][55], interactive technologies [67][68][69][70][71][72][73][74][75][76][77][78][79][80][81][82][83], medicine [84][30][31][32][85][86][33][34][11][87][88], psychiatry [89], music [18][15][90][40][80], sports [91], dating [36], video production [51], assistive technologies [21][22][23][25][26][27][28][92][29][93][94], education [41][44][42][95][96], and, above all, software and ML engineering [17][57][56][59][60][48][61][62][97][63][64][5][65][98].
Features of AI models that address the concerns of users and improve the usability and adoptability of AI systems, such as explainability, interpretability, privacy, and fairness, have been the focus of much HCML-related work [99][100][101][102][103][104][105][106][107][108][109]. This is not surprising, given that the history of the Explainable AI (XAI) research area dates back to the 1980s [110][111]. In a comprehensive study, Bhatt et al. [17] investigated how explainability is practiced in real-world industrial AI products and present how to focus explainability research on the end-user. Focusing on game designers, Zhu et al. [55] discuss how explainable AI should work for designers. A study [20] with 1150 users of an online drawing platform compared two explanation approaches to determine which was better. Ashktorab et al. [112] explored explanations of machine learning algorithms in the context of chatbots. Although explainability was not their main focus, some studies [84][7] investigated the explainability aspect when developing ML systems. Another work [10] investigated who the human at the center of human-centered explainable AI is. In addition, some work has tried to bring a user-centered approach to XAI research [113][11]. CheXplain [32] provides physicians with an explainable analysis of chest X-rays. Das et al. [114] attempted to improve humans’ performance by leveraging XAI techniques.
While explainability tries to untangle what is happening inside Deep Learning black boxes, interpretability investigates how to make AI systems predictable. For instance, if a certain neural network classifies an MRI image as cancer, figuring out how the network makes that decision falls into explainability research. However, an attempt to build a predictable MRI classification network, where a change in the network’s parameters results in an expected outcome, falls into interpretability research. There have been attempts [115][116][117] to develop novel interpretability algorithms and validate, through human studies, whether those algorithms achieve the expected results. Isaac et al. [118] used a human study to examine what matters for the interpretability of an ML system. Another study [59] found that ML practitioners often over-trust or misuse interpretability tools.
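To make this distinction concrete, the following minimal Python sketch contrasts the two notions; the dataset and models are our own illustrative assumptions, not code from the cited works. A random forest plays the black box that must be explained post hoc (here via permutation importance), while a logistic regression plays the interpretable model whose coefficients map directly, and predictably, to its outputs.

```python
# Illustrative sketch: post-hoc explainability vs. inherent interpretability.
# Dataset and model choices are assumptions for illustration only.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True, as_frame=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Explainability: the random forest is a black box, so we probe it *after*
# training to explain which features drive its decisions.
black_box = RandomForestClassifier(random_state=0).fit(X_train, y_train)
perm = permutation_importance(black_box, X_test, y_test,
                              n_repeats=10, random_state=0)
top = perm.importances_mean.argsort()[::-1][:5]
print("Post-hoc explanation (top features):", list(X.columns[top]))

# Interpretability: the logistic regression's coefficients *are* the model;
# changing a coefficient changes the prediction in a predictable way.
interpretable = make_pipeline(StandardScaler(),
                              LogisticRegression(max_iter=1000))
interpretable.fit(X_train, y_train)
for name, coef in sorted(zip(X.columns, interpretable[-1].coef_[0]),
                         key=lambda t: abs(t[1]), reverse=True)[:5]:
    print(f"  {name}: {coef:+.3f}")
```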
Apart from these two common DL features, other work has considered the aspects of fairness [119][57][58][120][8][9][121], understandability [33][5][7][8], and trust [36][60][122][9]. Fairness represents the degree of bias, such as gender and ethnic skews, in the decisions of a predictive model; such biases can have serious impacts on certain tasks. Understandability is a slightly different feature from explainability: while explainability shows how a model makes a certain decision, understandability tries to show how a neural network works to achieve a task. Trust refers to a subjective concern, namely the user’s trust in the decisions made by a given model.
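As a concrete illustration of the fairness concern, the short sketch below computes the demographic parity difference, one common measure of the kind of gender or ethnic skew described above. The synthetic predictions and group attribute are hypothetical assumptions for illustration; they do not come from any of the cited studies.

```python
# Hypothetical sketch of one common fairness measure: the demographic
# parity difference, i.e., the gap in favorable-outcome rates between
# two groups. All data below is synthetic, for illustration only.
import numpy as np

rng = np.random.default_rng(seed=42)
group = rng.integers(0, 2, size=1000)  # protected attribute (e.g., 0/1 gender)
# Simulate a skewed model: group 1 receives favorable decisions less often.
y_pred = rng.binomial(1, np.where(group == 1, 0.55, 0.70))

rate_0 = y_pred[group == 0].mean()     # favorable-outcome rate, group 0
rate_1 = y_pred[group == 1].mean()     # favorable-outcome rate, group 1
print(f"Group 0 positive rate: {rate_0:.2f}")
print(f"Group 1 positive rate: {rate_1:.2f}")
print(f"Demographic parity difference: {abs(rate_0 - rate_1):.2f}")  # 0 = parity
```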