1. The Distinctive Nature of Generative AIs
GAIs, notably but not exclusively in the form of large language models (LLMs), have now developed to the point that their output closely resembles, and often exceeds, what humans could do unaided, performing tasks that appear to be the result of similar soft cognitive processes. This is largely because that is almost exactly what they are. The “intelligence” of LLMs is almost entirely composed of the reified soft creations of the (sometimes) hundreds of millions of humans whose data made up their training sets, albeit averaged out, mashed up, and remixed. LLMs are essentially a technological means of mining and connecting humans' collective intelligence
[1].
For more than a decade, conversational agents have been available that, within a constrained context, have regularly fooled students into believing they are human, albeit sometimes making embarrassing or harmful mistakes due to their hitherto relatively limited training sets
[2] and seldom fooling the students for very long. The main change within the past few years lies not so much in the underlying algorithms or machinery, though there have been substantial advances (such as transformers and GPU improvements), as in the exponentially increasing size of the language models. The larger the training set, the greater the number of layers and vectors, and the larger the number of parameters, the more probable it is that the model will not only be able to answer questions but do so accurately and in a human-like way. Their parameters (directly related to the number of vectors and layers) provide an approximate measure of this. OpenAI’s GPT-3, released in 2020, has around 175 billion parameters, while Google’s earlier BERT has “only” 340 million. However, both are dwarfed by GPT-4, released in 2023, which some estimates place at as many as 100 trillion parameters, trained on a data set representing a non-trivial proportion of all recorded human knowledge
[3]. It is because of this that modern LLMs appear to be capable of mimicking, and in many cases exceeding in quality, all but the highest achievements of human cognition, including inference
[4] and creativity
[5][6].
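For readers who want a sense of how layer counts and vector dimensions translate into parameter counts, the following is a minimal back-of-the-envelope sketch in Python. It assumes a standard transformer block (roughly twelve times the square of the vector dimension in weights per layer, plus a token-embedding table) and plugs in the publicly reported configurations of GPT-3 (96 layers, 12,288-dimensional vectors) and BERT-large (24 layers, 1,024 dimensions). GPT-4's configuration has not been published, so it is omitted; the estimate also deliberately ignores biases, layer norms, and positional embeddings, which is why the BERT figure comes out slightly below the reported 340 million.

```python
# Rough decoder-style transformer parameter count: each layer contributes
# about 4*d^2 weights for the attention projections plus 8*d^2 for a
# feed-forward block with a 4x expansion (~12*d^2 in total), and the token
# embedding table adds vocab_size * d_model. Biases, layer norms, and
# positional embeddings are ignored as negligible at this scale.
def transformer_params(n_layers: int, d_model: int, vocab_size: int) -> int:
    per_layer = 12 * d_model ** 2
    embeddings = vocab_size * d_model
    return n_layers * per_layer + embeddings

# Publicly reported configurations of GPT-3 (175B) and BERT-large (~340M):
print(f"GPT-3-like: {transformer_params(96, 12288, 50257) / 1e9:.0f}B parameters")
print(f"BERT-large: {transformer_params(24, 1024, 30522) / 1e6:.0f}M parameters")
```

Run as-is, this prints roughly 175 billion and 333 million respectively, illustrating why parameter count serves as a convenient, if crude, proxy for model scale.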
Some (e.g.,
[7][8]) have even tried to make the case that a GAI such as ChatGPT-4 is now at least close to being an AGI (artificial general intelligence), using measures of human intelligence and creativity as evidence. This is false, for reasons that will matter in the discussion that follows. These measures were chosen by researchers to determine the extent to which a
human is intelligent or creative; they rely on indicators that usually correlate with what people normally recognize as intelligent, creative behaviour in a human being. In so doing they assume, as a baseline, that the agents they are testing
are both creative and intelligent, so the tests are a means to compare one human with another on a scale, and are not absolute standards and certainly not a proxy for the cognitive skills themselves.
To measure something requires there to be attributes that can be defined precisely enough to measure. Unfortunately, both intelligence and creativity are extremely fuzzy, culturally embedded concepts with meanings that shift according to context and that drift over time
[9]. People know them when they see them but, if called upon to define them, they invariably come up with definitions that are too narrow or too broad, and that admit exceptions or that include things they would not see as anything similar to their own. This is inevitable because intelligence and creativity are identified by family resemblances
[10], not a fixed set of defining characteristics. People see in others signals of aspects they see in themselves, recognizing shared physical and behavioural characteristics, and then extrapolate from these observations that they emerge from the same kind of entity. The signals are, however, not the signified. The meanings people give to “intelligence” or “creativity” are social constructions representing dynamic and contextually shifting values, not fixed natural phenomena like the boiling point of water or gravity. In them people find reflections of their own ever-evolving and socially constructed identities, not laws of nature. While general inferences can be made from correlational data, they cannot reliably predict behaviour in any single instance
[11]. Tests of intelligence or creativity are broadly predictive of what humans recognize as intelligent or creative behaviour, but they are highly susceptible to wide fluctuations at different times that depend on many factors such as motivation, emotion, and situation
[12].
Just because the output of an LLM closely resembles that of a human does not mean it results from the same underlying mechanisms. For instance, some of an LLM’s apparent creative ability is inherent in the algorithms and data sets it uses; LLMs have vastly greater amounts of reified knowledge to draw from than any individual human, and the fact that they can operate at all depends on their capacity to connect and freely associate information from virtually any digital source, including examples of creativity. If this is how "creativity" is defined then, of course, they can be very creative. It is, though, inappropriate to directly compare the intelligence, wisdom, or creativity of AIs and humans, at least in their current forms, because, even if some of the underlying neural nets are analogous to those of humans, AIs are not like people, in ways that matter when they are a part of the fabric of the cognitive, social, and emotional development of human beings.
Unlike humans, the current generation of LLMs has not learned about the world through interactions with it, as independent and purposeful agents interacting with other independent and purposeful agents. Their pasts are invented for them, by humans, and their purposes are the purposes of their users, not their own. Although people might metaphorically describe their behaviours as goal-seeking, this is because that is how they are programmed, not because they possess goals themselves. LLMs have no intentions, nothing resembling consciousness, no agency, and no life history. They have no meaningful relationships with people, with one another, or with the tokens they unknowingly assemble into vectors. Though there may be much sophistication in the algorithms surrounding them, and impenetrable complexity in the neural networks that drive them, at their heart they simply churn out whatever token (a word, a phrase, musical notes, etc.) is most likely to occur next (or, in some systems, whatever is most likely to have come before, or both), given the prompt they receive.
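To make the preceding point concrete, the following toy sketch in Python shows next-token generation in its barest form: a table of conditional frequencies learned from a tiny corpus, from which the next token is either picked greedily or sampled. It is an analogy only; a real LLM computes the same kind of conditional distribution with billions of learned parameters over a vastly longer context, but the loop of "predict the next token, append it, repeat" is essentially the one described above.

```python
import random
from collections import Counter, defaultdict

# Toy stand-in for an LLM: bigram counts over a tiny "training set".
# A real model estimates the same kind of conditional distribution,
# P(next token | context), but over an enormously longer context window
# and a corpus of trillions of tokens.
corpus = "the cat sat on the mat . the dog sat on the rug .".split()

bigrams = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    bigrams[prev][nxt] += 1

def next_token(context, greedy=True):
    """Return the most likely next token, or sample in proportion to frequency."""
    counts = bigrams[context]
    if greedy:
        return counts.most_common(1)[0][0]
    tokens, weights = zip(*counts.items())
    return random.choices(tokens, weights=weights)[0]

# Generate a short continuation of a prompt, one token at a time.
tokens = ["the"]
for _ in range(6):
    tokens.append(next_token(tokens[-1], greedy=False))
print(" ".join(tokens))
```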
Perhaps something similar is true of human beings; people certainly make decisions before being conscious of having done so and many if not all of their intentions are pre-conscious
[13]. Also, like us, LLMs are prediction machines
[14] and they do appear to make such predictions in a similar manner. However, as Clark
[15] argues, it is not possible to jump from this to a full explanation of human thought and reason, let alone intentional behaviour. Even if there are closer similarities with human minds, the stuff that such minds deal with is fundamentally different. Most significantly and unsurprisingly, because all it has learned is the processed signals humans (mostly intentionally) leave in the digital world, an LLM is nothing
but signals, with nothing that is signified underneath. The symbols have no meaning, and there is no self to which they could relate. Current systems have no concept of whether the words or media they churn out make sense in the context of the world, only whether they are likely to occur in the context of one another. If part of their output is a hallucination, then all of it is. The machines have no knowledge, no concepts, and no sense of how anything works in the context of a self because there is no identity, no purposive agent, and no being in the world to which the concept could relate. This may change as embodied AIs become more common and sophisticated but, even then, unless perhaps they are brought up like humans in a human society (a possibility fraught with huge ethical and practical concerns), they will be utterly unlike us.
Some might argue that none of this is important. If it walks like a duck, squawks like a duck, and flies like a duck then, to all intents and purposes, it might as well be called a duck. This is, again, to mistake the signal for the signified. While the output of an LLM may fool us into thinking that it is the work of an actual human, the creative choices people most value are expressions of their identity, their purposes, their passions, and their relationships to other people. They are things that have meaning in a social context, things that are situated in their lives and the lives of others. It matters so much, for example, that a piece of work was physically written by Gustav Mahler that someone was willing to pay over USD 5m for the handwritten score of his Second Symphony. People even care about everyday objects that were handled by particular humans; an inexpensive mass-produced guitar used by John Lennon in some of his early songwriting, for instance, can sell for roughly USD 2.4m more than one that was not. From a much-loved piece of hand-me-down furniture to the preservation of authorship on freely shared Creative Commons papers, technologies’ value lies as much or more in their relationship to us, and in how they mediate relationships between us, as in their more obvious utilitarian functions. More prosaically, people are normally unwilling to accept coursework written by an AI when it is presented as that of a student, even though it may be excellent, because the whole point is that it should have contributed to and should display the results of a human learning process. This is generalizable to all technologies; their form is only meaningful in relationship to other things, and when humans participate in the intertwingled web that connects them. It is not just our ability to generate many ideas but our ability to select the ones that matter, to make use of them in a social context, to express something personal, and to share something of ourselves that forms an inextricable part of their value. The functional roles of technologies, from painting techniques to nuts and bolts to public transit systems, are not ends in themselves; they are meant to support people in their personal and social lives.
Despite appearances, AGI is now little closer than it was 10 years ago. In fact, as Goertzel
[16] observed back then, people still struggle to define what “intelligence” even means. The illusion of human-like intelligence, though, driven as it is by the reified collective knowledge of so many humans and, for most large models, trained and fine-tuned by tens or hundreds of thousands more, is uncanny. To a greater extent than any previous technology, LLMs black-box the orchestration of words, images, audio, or moving images, resulting in something remarkably similar to the soft technique that was hitherto unique to humans and perhaps a few other species. Using nothing but those media and none of the thinking, passion, or personal history that went into making them, they can thus play many soft, creative, problem-solving, generative roles that were formerly the sole domain of people and, in many cases, substitute effectively for them. More than just tools, people may see them as partners, or as tireless and extremely knowledgeable (if somewhat unreliable) coworkers who work for far less than the minimum wage. Nowhere is this more true, and nowhere is it more a matter of concern, than in the field of education.
2. GAIs and Education
The broader field of AI has a long history of use in education for good reason. Education is a highly resource-intensive activity demanding much of its teachers. It has long been known that personal tuition offers a two-sigma advantage when compared with traditional classroom methods
[17] but, for most societies, it is economically and practically impossible to provide anything close to that for most students. There is therefore great appeal in automating some or all of the process, either to provide such tuition or to free up the time of human teachers to more easily do so. The use of automated teaching machines stretches back at least 70 years
[18][19], though it would be difficult to claim that such devices had more than the most rudimentary intelligence. AIs now support many arduous teaching roles. For instance, since at least as long ago as the 1990s, auto-marking systems using statistical approaches to identify similarity to model texts
[20], or latent semantic analysis with examples trained using human-graded student work
[21], have been able to grade free-text essays and assignments at least as reliably and consistently as expert teachers. For at least 20 years, some have even been able to provide formative feedback, albeit normally of a potted variety selected from a set of options
[22]. Use of intelligent tutoring systems that adapt to learner needs and that can play some (though never all) roles of teachers, such as selecting text, prompting thought or discussion, or correcting errors, goes back even farther, including uses of expert systems
[23], adaptive hypermedia that varies content or presentation or both according to rules adapted to user models
[24], as well as rule-based conversational agents (that might now be described as bots) mimicking some aspects of human intelligence from as far back as the 1960s, such as Coursewriter
[25], ELIZA
[26], or ALICE
[27][28]. Discriminative AIs performing human-like roles of classification have seen widespread use in, for example, analyzing sentiment in a classroom
[29], identifying engagement in online learning
[30], and identifying social presence in online classes
[31]. From the algorithms of search engines such as Google or Bing to grammar-checking, autocorrect, speech-to-text, and translation tools, the use of AIs of one form or another for performance support and task completion has been widespread for at least 25 years, and nowhere more than in education.
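As an illustration of the latent semantic analysis approach to auto-marking mentioned above, the following is a minimal sketch: essays are projected into a low-dimensional semantic space and a new submission is graded by its similarity to previously human-graded examples. It assumes the scikit-learn and NumPy libraries, and the essays, grades, and weighting scheme are invented for illustration; production systems of this kind are considerably more sophisticated.

```python
# Minimal sketch of LSA-style auto-marking: project essays into a latent
# semantic space (TF-IDF followed by truncated SVD), then grade a new essay
# from the human-assigned grades of the most similar graded examples.
# Requires scikit-learn and NumPy; the essays and grades are invented.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import TruncatedSVD
from sklearn.metrics.pairwise import cosine_similarity

graded_essays = [
    "Photosynthesis converts light energy into chemical energy in plants.",
    "Plants use sunlight, water and carbon dioxide to produce glucose and oxygen.",
    "Photosynthesis is when animals breathe oxygen at night.",
]
human_grades = np.array([0.90, 0.95, 0.20])  # grades given by human markers

new_essay = ["Using light, water and CO2, plants synthesize glucose and release oxygen."]

tfidf = TfidfVectorizer(stop_words="english").fit_transform(graded_essays + new_essay)
latent = TruncatedSVD(n_components=2, random_state=0).fit_transform(tfidf)

# Similarity of the new essay to each graded example in the latent space,
# used as weights for a simple weighted-average grade.
similarity = cosine_similarity(latent[-1:], latent[:-1])[0]
weights = np.clip(similarity, 0.0, None) + 1e-9
print(f"Predicted grade: {np.average(human_grades, weights=weights):.2f}")
```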
For all of the sometimes mixed benefits AIs have brought, and for all of the ways they have benefited students and teachers, until now they have been tools and resources that form parts of humans' own orchestrations, not orchestrators in their own right. They had neither the breadth of knowledge nor the range of insight needed to respond to novel situations, to act creatively, or to fool anyone for long into believing they are human. Now that this is possible, countless new adjacent possibilities have opened up. There has been an explosion of uses and proposed uses of GAIs in education, both by students and by teachers, performing all these past roles and more
[32][33]. For teachers, GAIs can augment and replace their roles as Socratic tutors, providers of meaningful feedback, participants in discussions, and curriculum guides
[34][35]. For students they can write assignments, perform research, summarize documents, and correct improper use of language
[36]. These examples merely scratch the surface of current uses.
The effects of GAIs on educational systems have already been profound. At the time of writing, less than a year after the meteorically successful launch of ChatGPT, recent surveys suggest that between 30% (
https://www.intelligent.com/nearly-1-in-3-college-students-have-used-chatgpt-on-written-assignments/ accessed on 25 November 2023) and 90% (
https://universitybusiness.com/chatgpt-survey-says-students-love-it-educators-not-fans/ accessed on 25 November 2023) of students are using it or its close cousins to assist with, or often to write, their assessed work. Teachers, though mostly slower to jump on the bandwagon, are using these tools for everything from the development of learning outcomes and lesson plans to intelligent tutors that interact with their students, and they are scrambling to devise ways of integrating GAIs with curricula and course processes
[33]. Already, in some cases, the bulk of both the students’ and the teachers’ work may therefore be done by a GAI. This has a number of significant implications.
Teachers, be they human or AI, are not only teaching the pattern of the cloth; they are teaching how to be the loom that makes it or, as Paul
[37] puts it, the mill as well as the grist of thought. Although the language of education is typically framed in terms of learning objectives (what teachers wish to teach) and learning outcomes (what it is hoped that students will learn), there is always far more learning that occurs than this; at the very least, whether positive or negative, students learn attitudes and values, approaches to problem solving, ways of thinking, ways of relating to others in this context, motivation, and ways of understanding. It is telling, for instance, that perceived boredom in a teacher results in greater actual boredom in students
[38]. Similarly, approaches to teaching and structural features of educational systems that disempower learners create attitudes of acquiescence and detract from their intrinsic motivation to learn
[39][40][41]. Equally, the enthusiasm of a teacher plays an important role in improving both measured learning outcomes and attitudes of students towards a subject
[42][43]. Such attitudinal effects only scratch the surface of the many different kinds of learning, ways of connecting ideas, and ways of being that accompany any intentional learning that involves other people, whether they are designated teachers, authors of texts, or designers of campuses. Often, teachers intentionally teach things that they did not set out to teach
[44]. There are aspects of social and conceptual relationships and values that matter
[40], idiosyncratic ways of organizing and classifying information, ethical values expressed in actions, and much, much more
[45]. There is a hidden curriculum underlying all educational systems
[46] that, in part, those educational systems themselves set out to teach, that in part is learned from observation and mimicry, and that in part comes from interacting with other students and all of the many teachers, from classroom designers to textbook authors, who contribute to the process, as well as all the many emergent phenomena arising from ways that they interact and entwine. Beyond that, there is also a tacit curriculum
[47] that is not just hidden but that cannot directly be expressed, codified, or measured, which emerges only through interaction and engagement with tasks and other people.
The tacit, implicit, and hidden curricula are not just side-effects of education but are a part of its central purpose. Educational systems prepare students to participate in the technologies of their various cultures in ways that are personally and socially valuable; they are there to support the personal and social growth of learners, and they teach us how to work and play with other humans. They are, ultimately, intended to create rich, happy, safe, caring, productive societies. If the means of doing so are delegated to simulated humans with no identity, no history, no intention, no personal relationship, and with literally no skin in the game, where a different persona can be conjured up through a single prompt and discarded as easily, and where the input is an averaged amalgam of the explicit written words (or other media) of billions of humans, then students are being taught ways of being human by machines that, though resembling humans, are emphatically not human. While there are many possible benefits to the use of AIs to support some of the process, especially in the development of hard technique, the long-term consequences of doing so raise some concerns.
The End and the Ends of Education
This is the dawn of an AI revolution, to which people bring what and how they have learned in the past, and so, as with all successful new technologies, they see great promise in the parts of themselves and of their systems that the machines can replace. All technologies are, however, Faustian bargains
[48] that cause as well as solve problems, and the dynamics of technological evolution mean that some of those problems only emerge at scale when technologies are in widespread use. Think, for example, of the large-scale effects of the widespread use of automobiles on the environment, health, safety, and well-being.
Generative AIs do not replace entire educational systems; they fit into those that already exist, replacing or augmenting some parts but leaving others—usually the harder, larger-scale, slower-changing parts, such as systems of accreditation, embedded power imbalances, well-established curricula, and so on—fully intact, at least for now. They are able to do so because they are extremely soft; that is, perhaps, their defining feature. Among the softest and most flexible of all technologies in educational systems are pedagogies (methods of teaching). Though pedagogies are the most critical and defining technologies in any assembly intended to teach, they never come first because they must fit in with harder technologies around them; in an institutional context, this includes regulations, timetables, classrooms or learning management systems, the needs of professional bodies, assessment requirements, and so on
[49]. Now that people have machines that can play those soft roles of enacting pedagogies, those machines must do so in the context of what already exists. Inevitably, therefore, they start by fitting into those existing structures rather than replacing them. This is, for example, proving to be problematic for teachers who have not adapted their slower-changing assessment processes to allow for the large-scale use of LLMs in writing assignments, although such approaches have long been susceptible to contract cheating, including uses of sites such as CourseHero to farm out the work at a very low cost. It is telling that a large majority of their uses in teaching are also meant to replace soft teaching roles, such as developing course outlines, acting as personal tutors, or writing learning outcomes. The fact that they can do so better than an average teacher (though not yet as well as the best) makes it very alluring to use them, if only as a starting point. The fact that they are able to do this so well, however, speaks to the structural uniformity of so many institutional courses. The softness that GAIs emulate means that it is not quite a cookie-cutter approach, but the results harden and reinforce norms. This is happening at a global scale.
Right now, for all of the widely expressed concerns about student use of AIs, it is easy to see the benefits of using them to support the learning process, and to integrate them fully into learning activities and outcomes. Indeed, it is essential that teachers and students do so, because they are not just reflections of humanity’s collective intelligence but, from now on, integral parts of it. They are not just aides to cognition but contributors to it, so they must now be part of human learning and its context. There are also solid arguments to be made that they provide educational opportunities to those who would otherwise have none, that they broaden the range of what may be taught in a single institution, that they help with the mundane aspects of being part of a machine so that teachers can focus on the softer, relational, human side of the process, that they can offer personal tuition at a scale that would otherwise be impossible, and that they therefore augment rather than replace human roles in a system. All of this is true today.
Here at the cusp of the AI revolution, people have grown up with and learned to operate the technologies that LLMs are now replacing, and the skills being replaced remain intact. This situation will change if it is allowed to do so. First, the more soft roles the machines take on, the less chance people will have to practice those roles, or even to learn them in the first place. It is important to emphasize that these are not skills like being able to sharpen a quill or to operate a slide rule, where humans are enacting hard technologies as part of another orchestration. These are the skills for which people develop such hard techniques: the creative, the situated, and the idiosyncratic techniques through which they perform the orchestration, and that are central to their identities as social beings.
Second, simple economics means that, if people carry on using them without making substantial changes to the rest of the educational machine, AIs will almost always be cheaper, faster, more responsive, and (notwithstanding their current tendency to confidently make things up) more reliable. In an endemically resource-hungry system, they will be used more and more and, as long as explicit learning outcomes are all that people choose to focus on, they will most likely achieve those outcomes more effectively than real humans. Discriminative AIs will measure such outcomes with greater speed and consistency than any human could achieve; they already can, in many fields of study.
To make things worse, current LLMs are largely trained on human-created content. As the sources increasingly come from prior LLMs, this will change. At best, the output will become more standardized and more average. At worst, the effect will be like that of photocopies of photocopies, each copy becoming less like the original. Fine-tuning by humans will limit this, at first, but those humans will themselves increasingly be products of an educational system more or less mediated by AIs. Already, there are serious concerns that the hidden guidelines and policies (which are themselves technologies) of the large organizations that train LLMs impose tacit cultural assumptions and biases that may not reflect those of consumers of their products
[50], and that may challenge or systematically suppress beliefs that are fundamental to the identities of large numbers of people
[51]. The fact that the ways in which this happens are inscrutable makes it all the more disturbing, especially when ownership of the systems lies in the hands of (say) partisan governments or corporations. There is much to be said for open LLMs as an antidote to such pernicious consequences.
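The "photocopies of photocopies" effect described above can be illustrated with a deliberately simple numerical analogy: each "generation" fits a distribution to a finite sample drawn from the previous generation's fit, so sampling error accumulates and diversity tends to shrink. The sketch below is an analogy only, and assumes nothing about any particular LLM training pipeline.

```python
# Toy analogy for training each new model on the outputs of the last:
# repeatedly fit a simple distribution to a finite sample drawn from the
# previous fit. Sampling error compounds across generations, so the fitted
# distribution drifts and its spread tends to shrink -- the statistical
# flavour of a photocopy of a photocopy. Purely illustrative: real model
# collapse involves far richer dynamics than a one-dimensional Gaussian.
import numpy as np

rng = np.random.default_rng(0)
mean, std = 0.0, 1.0        # the original "human-generated" distribution
sample_size = 200           # each generation learns from a finite sample

for generation in range(1, 11):
    sample = rng.normal(mean, std, sample_size)  # data produced by the previous model
    mean, std = sample.mean(), sample.std()      # the next model fits that data
    print(f"generation {generation:2d}: mean = {mean:+.3f}, std = {std:.3f}")
```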
The changes to people’s individual and collective cognition that result from this happening at scale will be a hard-to-predict mix of positives and negatives; the average capability to perform tasks, for instance, will likely improve, though perhaps the peaks will be lower, and valuable skills such as political reasoning may be lost
[32]. It is fairly certain, however, that such changes will occur. Unless people act now to re-evaluate what they want from their education systems, and how much of their soft cognition they wish to offload onto machines, it may be too late: the collective capacity of the human race to understand may be diminished and/or delegated to smarter machines with non-human goals.
This entry is adapted from the peer-reviewed paper https://doi.org/10.3390/digital3040020