Network-Level Examination of Correspondence between Human-Brain and ANN: Comparison
Please note this is a comparison between Version 1 by Trung Quang Pham and Version 2 by Wendy Huang.

Artificial neural networks (ANNs) that are heavily inspired by the human brain now achieve human-level performance across multiple task domains. ANNs have thus drawn attention in neuroscience, raising the possibility of providing a framework for understanding the information encoded in the human brain. However, the correspondence between ANNs and the brain cannot be measured directly. They differ in outputs and substrates, neurons vastly outnumber their ANN analogs (i.e., nodes), and the key algorithm responsible for most of modern ANN training (i.e., backpropagation) is likely absent from the brain. Neuroscientists have thus taken a variety of approaches to examine the similarity between the brain and ANNs at multiple levels of their information hierarchy.

  • human brain
  • artificial neural networks
  • correspondence
  • network-level
  • sequential alignment
  • gradient

1. Introduction

Understanding how the human brain processes information is one of the biggest challenges in neuroscience. In recent years, the artificial neural network (ANN) has become a powerful tool, performing at or above human levels in several domains, including image classification (AlexNet [1]), conversation (ChatGPT [2], LaMDA [3]), games (Go [4], StarCraft II [5]), and biological science (e.g., protein folding [6,7]). Interest has thus grown in the degree to which information processing in ANNs can inform our understanding of information processing in the brain.
Studies have shown that the processing of human perception is hierarchically distributed over the brain [8,9,10,11,12,13]. In the visual domain, for instance, V2 neurons appear to be sensitive to naturalistic texture stimuli [14], V4 neurons show increased selectivity for conjunctions of features representing surface shape (e.g., non-Cartesian gratings [15]), and IT neurons are selective for specific combinations of features, such as faces [16]. A similar hierarchy can be found in language processing [17,18,19], music processing [20,21,22], and tactile processing [23,24,25]. Taking a broader perspective, converging evidence points to a global hierarchical organization of the brain that goes beyond a collection of independent sensory hierarchies. At the cellular level, Murray et al. [11] found different decay rates of single-unit activity in early sensory areas and distant cortical regions. Sensory areas, which may need to respond rapidly to environmental changes, exhibit faster decay, whereas regions involved in more complex, integrative cognitive tasks exhibit slower decay, suggesting a hierarchical ordering of a measure as intrinsic as single-neuron spiking timescales. Neuroimaging evidence of a global sensorimotor-to-transmodal gradient supports this hierarchy of temporal dynamics, as does other converging evidence such as increasing intracortical myelination, functional connectivity, and semantic processing along the gradient [26,27].
Given the intrinsically hierarchical architecture of ANNs, it is natural to ask whether they can capture the information processing that occurs in the human brain and thus serve as a framework for understanding it [28,29]. While both the human brain and the ANN are “black boxes”, the latter is easier to customize and analyze. ANNs may therefore provide a useful model for understanding the brain, akin to how the atomic model can usefully convey the interaction between protons and electrons. As the statistician George Box once said, “All models are wrong, but some are useful”.
Relationships between the human brain and modern ANNs have been reported since the early stages of ANN development. Studies have revealed similarities between cognitive processing in the brain, such as vision and audition, and the hidden layers of ANNs [28,30,31,32,33,34,35,36]. The similarity is not limited to well-known “supervised” learning but extends to “unsupervised” and “self-supervised” learning [37,38]. The growing number of studies in this area offers promise for improving our understanding of the brain, as ANNs rapidly grow in sophistication and performance across problem domains. However, comparing ANNs to the brain to arrive at meaningful inferences is not a straightforward process. ANNs are inspired by the brain but are not replicas of it. Not only do they differ in substrate, but ANN nodes are also vastly outnumbered by neurons. Moreover, the principal algorithm that discovers the hierarchical circuitry of most modern ANNs (i.e., backpropagation) is unlikely to exist in the brain [39].
Conventionally, the similarity between the ANN and the human brain has been evaluated based on their performance in “intelligent” tasks (e.g., object detection, object classification, text generation, image generation, and game playing). However, this high-level comparison alone is inadequate for determining whether the ANN under the hood processes information in a way comparable to the brain. Neuroscientists have thus taken a variety of indirect approaches to evaluate the correspondence between ANNs and the brain.
Network-level correspondence compares the overall information flow inside an ANN with that of a comparable network in the brain, such as the hierarchical representations within a single modality or the multimodal integrative network across the whole brain. In relation to layer-level correspondence, a straightforward approach is to quantify the alignment between the sequence of ANN layers and the sequential processing expected in the brain. For example, one can compute the correlation between the two and count the nodes of each layer that are most associated with each region of interest (ROI), in order to test whether the distribution shifts from low-level to high-level cortices as layer depth increases. Given the intrinsically feed-forward character of the ANN, a sequential alignment between the brain and the ANN would indicate a hierarchical network-level correspondence.
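The node-counting procedure described above can be sketched as follows. This is a minimal illustration under stated assumptions, not a published pipeline: the function name layer_counts_per_roi, the array layouts, and the synthetic data are all hypothetical, and each node is assigned to the single ROI with which it has the highest Pearson correlation across stimuli.

```python
import numpy as np


def layer_counts_per_roi(node_acts, node_layer, roi_signals, n_layers):
    """Count, per layer, how many nodes correlate most strongly with each ROI.

    node_acts   : (n_stimuli, n_nodes) ANN node activations (hypothetical input)
    node_layer  : (n_nodes,) integer layer index of every node
    roi_signals : (n_stimuli, n_rois) one brain response per ROI and stimulus
    Returns an (n_layers, n_rois) integer count matrix.
    """
    n_stimuli = node_acts.shape[0]
    # z-score over stimuli so that the matrix product below yields Pearson r
    za = (node_acts - node_acts.mean(axis=0)) / node_acts.std(axis=0)
    zr = (roi_signals - roi_signals.mean(axis=0)) / roi_signals.std(axis=0)
    corr = za.T @ zr / n_stimuli                 # (n_nodes, n_rois)
    best_roi = np.argmax(corr, axis=1)           # most associated ROI per node
    counts = np.zeros((n_layers, roi_signals.shape[1]), dtype=int)
    np.add.at(counts, (node_layer, best_roi), 1)
    return counts


# Toy data (synthetic, for illustration only): three latent signals, each
# shared by one ANN layer and one ROI, ordered from low- to high-level.
rng = np.random.default_rng(0)
latents = rng.standard_normal((200, 3))
node_layer = np.repeat(np.arange(3), 20)         # 20 nodes per layer
node_acts = latents[:, node_layer] + 0.5 * rng.standard_normal((200, 60))
roi_signals = latents + 0.5 * rng.standard_normal((200, 3))

counts = layer_counts_per_roi(node_acts, node_layer, roi_signals, n_layers=3)
print(counts)  # rows: layers, columns: ROIs
```

If the rows of the count matrix peak at progressively higher-level ROIs as the layer index increases, the distribution shifts in the direction expected under sequential alignment. Real analyses would additionally need noise ceilings and statistical testing, which are omitted here.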

2. Sequential Alignment Approach

Early examinations of network-level correspondence were conducted for sensory networks (the visual and auditory networks) due to their interpretability [40]. For the visual network, the distribution of model-explained variance of neural activity in Yamins et al. [30] shows a clear shift from V1 to IT as the layers progress from the first to the top layer. A similar correspondence across layers was found for information extracted along the visual ventral pathway [41] as well as the dorsal pathway [42]. A recent study by Mineault et al. [43] confirmed a similar correspondence between the ANN and the visual dorsal pathway in non-human primates. Using an ANN decoding approach, Horikawa and Kamitani showed that dreaming recruits visual feature representations that correlate hierarchically across the visual system [33]. For the auditory network, Kell et al. [36] found that an ANN trained on speech and music correlated with the auditory processing hierarchy in the brain, with different layers processing different aspects of sound. In another study using ANNs trained to classify music genres, Güçlü et al. [44] showed a representational gradient along the superior temporal gyrus, where anterior regions were associated with shallower layers and posterior regions with deeper layers. Evaluating large-scale networks, such as those spanning modalities or the global brain hierarchy, poses an additional problem. For instance, the actual hierarchical correspondence between the human auditory system and visual ANNs remains unclear, as other studies have suggested a parallel organization [45]. Spatial locations such as brain coordinates may provide an intuitive correspondence but not concrete evidence of the brain’s structural–functional organization; for example, not all of the brain’s many functional networks may adhere to a clear posterior-to-anterior hierarchy.
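As a rough illustration of how a shift in model-explained variance across layers can be measured, the sketch below fits a ridge regression from each layer's features to each ROI's response and records the held-out R². It is an assumption-laden toy, not the procedure of any study cited above: the function heldout_r2, the fixed half-and-half split, the regularization strength, and the synthetic data (in which ROI k is driven by layer k) are all hypothetical.

```python
import numpy as np


def heldout_r2(features, response, alpha=1.0):
    """Ridge-regress one ROI response on one layer's features; return held-out R^2."""
    n_train = features.shape[0] // 2             # first half trains, second half tests
    x_tr, x_te = features[:n_train], features[n_train:]
    y_tr, y_te = response[:n_train], response[n_train:]
    mu_x, mu_y = x_tr.mean(axis=0), y_tr.mean()
    xc = x_tr - mu_x
    # closed-form ridge solution: (X'X + alpha I)^-1 X'y
    w = np.linalg.solve(xc.T @ xc + alpha * np.eye(xc.shape[1]), xc.T @ (y_tr - mu_y))
    pred = (x_te - mu_x) @ w + mu_y
    ss_res = np.sum((y_te - pred) ** 2)
    ss_tot = np.sum((y_te - y_te.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


rng = np.random.default_rng(1)
n_stimuli, n_features = 300, 10
roi_names = ["V1", "V2", "V4", "IT"]
# One synthetic feature matrix per ANN layer (first to top layer).
layers = [rng.standard_normal((n_stimuli, n_features)) for _ in roi_names]
# Synthetic assumption: ROI k is a noisy linear readout of layer k.
rois = [layer @ rng.standard_normal(n_features) + rng.standard_normal(n_stimuli)
        for layer in layers]

# r2[i, j] = variance of ROI j explained by layer i on held-out stimuli
r2 = np.array([[heldout_r2(layer, roi) for roi in rois] for layer in layers])
best_layer = r2.argmax(axis=0)
for name, layer_idx in zip(roi_names, best_layer):
    print(name, "is best explained by layer", layer_idx)
```

A best-explaining layer that increases monotonically from the earliest to the latest ROI in the processing pathway is the pattern that sequential alignment predicts. In practice, cross-validation over several splits and tuning of alpha would replace the single split used here.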

3. Gradient-Based Approach

Brain–ANN correspondence at the network level should also account for the sequence of the chosen ROIs and the design of the ANN, such as the features that each node processes, whether they are processed sequentially or in parallel, and how multiple modalities are integrated. A promising approach here is to use the principal gradient (PG) [26] as a reference. The PG is a global axis of brain organization that accounts for the greatest variance in human resting-state functional connectivity. It begins with multiple satellites of unimodal sensory information that converge transmodally and integrate with the default mode network (DMN). A meta-analysis using the NeuroSynth database [46] has reinforced the relationship between cognitive function and position along the PG, with sensory perception and motion occupying lower positions, and higher-order, abstract processes such as emotion and social recognition occupying higher positions [26]. The implications of the PG for the hierarchical organization of function are further supported by clinical evidence, such as the compression of the principal motor-to-supramodal gradient in patients with schizophrenia (96 patients with schizophrenia vs. 120 healthy controls) [47] and the decrease in PG values in neurodegenerative conditions such as Alzheimer’s disease [48].
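The idea of a gradient as the axis that accounts for the greatest variance in functional connectivity can be illustrated with a deliberately simplified sketch. Published gradient analyses typically rely on nonlinear methods such as diffusion map embedding; the code below substitutes plain principal component analysis of the connectivity matrix, and the function principal_gradient, the synthetic time series, and the region ordering are all hypothetical assumptions made for illustration.

```python
import numpy as np


def principal_gradient(timeseries):
    """Return first-component scores and explained-variance ratio of the FC matrix.

    timeseries : (n_timepoints, n_regions) resting-state signals (hypothetical input)
    """
    fc = np.corrcoef(timeseries.T)                        # (n_regions, n_regions)
    fc_centered = fc - fc.mean(axis=0, keepdims=True)     # centre each column for PCA
    u, s, _ = np.linalg.svd(fc_centered, full_matrices=False)
    explained = s ** 2 / np.sum(s ** 2)
    return u[:, 0] * s[0], explained[0]


# Synthetic regions placed along one axis: 0 = unimodal end, 1 = transmodal end.
rng = np.random.default_rng(2)
n_timepoints, n_regions = 600, 40
position = np.linspace(0.0, 1.0, n_regions)
unimodal = rng.standard_normal(n_timepoints)
transmodal = rng.standard_normal(n_timepoints)
timeseries = (np.outer(unimodal, 1.0 - position)
              + np.outer(transmodal, position)
              + 0.3 * rng.standard_normal((n_timepoints, n_regions)))

pg, ratio = principal_gradient(timeseries)
# The sign of a principal component is arbitrary, so compare in absolute value.
recovery = abs(np.corrcoef(pg, position)[0, 1])
print("explained variance ratio:", round(float(ratio), 3))
print("|correlation| with the built-in axis:", round(float(recovery), 3))
```

Each region receives one score on the first component, which plays the role of its position along the gradient. Because the sign of the component is arbitrary, real analyses orient the axis by convention, for example so that sensory regions lie at the low end.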
For evaluating correspondence at the global brain level, the PG provides an independent, quantifiable metric of the brain’s hierarchy, ranging from sensorimotor through transmodal to higher cognitive and affective information processing. In a study examining how subjective value emerges in the brain, ANNs individually trained to output subjective value from visual input were shown to correspond hierarchically to the PG in the brains of the same individuals experiencing a similar value during fMRI [49]. By contrast, Nonaka et al. [31] showed that most ANNs tend to have representations similar to the lower portion of the higher visual cortex (as divided by the PG), but not to the middle and higher portions, suggesting that detailed correspondence in local areas could be more complex than simple adherence to a global hierarchy.
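One simple way to quantify whether ANN depth tracks the PG is a rank correlation between each ROI's PG position and the typical depth of the ANN nodes associated with it. The sketch below is a hypothetical illustration and not the analysis used in the studies cited above: the count matrix, the PG values, and the helper names are invented toy inputs, and Spearman's rho is computed by hand as the Pearson correlation of ranks.

```python
import numpy as np


def average_ranks(values):
    """Rank values from 1..n, giving tied values their mean rank."""
    values = np.asarray(values, dtype=float)
    ranks = np.empty(values.size)
    ranks[np.argsort(values)] = np.arange(1, values.size + 1)
    for v in np.unique(values):
        tied = values == v
        ranks[tied] = ranks[tied].mean()
    return ranks


def spearman_rho(x, y):
    """Spearman correlation computed as the Pearson correlation of the ranks."""
    return np.corrcoef(average_ranks(x), average_ranks(y))[0, 1]


def layer_centre_of_mass(counts):
    """Count-weighted mean layer index of each ROI; counts is (n_layers, n_rois)."""
    layer_index = np.arange(counts.shape[0])[:, None]
    return (layer_index * counts).sum(axis=0) / counts.sum(axis=0)


# Toy numbers for illustration only (not taken from any study).
# counts[i, j] = number of layer-i nodes most associated with ROI j.
counts = np.array([
    [30, 20, 10,  5,  2],
    [15, 25, 15, 10,  5],
    [ 5, 10, 25, 20, 15],
    [ 2,  5, 10, 25, 30],
])
pg_position = np.array([-0.8, -0.4, 0.0, 0.5, 0.9])   # hypothetical PG value per ROI

depth = layer_centre_of_mass(counts)
rho = spearman_rho(pg_position, depth)
print("layer centre of mass per ROI:", np.round(depth, 3))
print("Spearman rho between PG position and ANN depth:", round(float(rho), 3))
```

A rho close to 1 would indicate that ROIs higher on the PG are associated with deeper layers, whereas a weak or non-monotonic relationship, as in the local findings of Nonaka et al. [31], would argue against simple adherence to a global hierarchy.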