Genomic Profiling for Breast Cancer Heterogeneity Analysis

Please note this is a comparison between Version 1 by Zijian Zhu and Version 2 by Dean Liu.

Breast cancer continues to pose a significant healthcare challenge worldwide because of its inherent molecular heterogeneity.

Keywords: breast cancer; heterogeneity; single-cell genome

1. Introduction

Breast cancer is one of the most prevalent cancers in women globally. In 2012, there were 464,000 diagnosed cases of breast cancer and 131,000 deaths among European women [1]. In 2020, breast cancer accounted for 2.26 million new cases globally, surpassing lung cancer (2.2 million cases) to become the most commonly diagnosed cancer worldwide (Figure 1). In the same year, the disease caused an estimated 685,000 deaths [1,2,3]. Projections suggest that by 2030, new cases will reach 3.9 million globally, with fatalities rising to 766,000 [4].

Figure 1. Worldwide estimated crude incidence rates of female breast cancer (2020) [2].

Despite considerable strides in both laboratory research and the clinical practice of breast cancer, the global incidence and mortality rates continue to rise. At the root of this persisting issue is the heterogeneous nature of breast cancer, which is not a monolithic disease, but a spectrum of distinct subtypes, each representing a unique malignancy within the breast’s cellular makeup. Current research categorizes these molecular subtypes into six major classes: (i) hormone-receptor-positive breast cancer (ER+), (ii) hormone receptor/HER2-positive breast cancer (ER+/HER2+), (iii) HER2-positive breast cancer (HER2+), (iv) basal-like breast cancer, (v) claudin-low subtype, and (vi) normal-like subtype [5]. The distinct subtypes of breast cancer each present unique clinical features and associated risk factors. This diversity extends to treatment responses and long-term patient survival, which differ significantly across the subtypes. This inherent complexity adds layers of challenge to the effective diagnosis and treatment of breast cancer. For instance, the basal-like subtype of breast cancer, characterized by high rates of cellular proliferation, is associated with distinct risk factors, which include the early onset of menstruation, a younger age at the first full-term pregnancy, and the accumulation of abdominal fat. In contrast, patients with the claudin-low subtype, marked by an enrichment of epithelial–mesenchymal transition markers, typically exhibit pronounced invasiveness. Such individuals often bear the burden of exposure to chemicals and radiation in their early years, leading to a high load of DNA damage induced by cancer genes and early chromosomal instability (CIN) [6].

The inherent heterogeneity of breast cancer presents considerable hurdles for conventional diagnostic and therapeutic approaches. Traditional methods generally depend on analyzing bulk tumor tissue samples, a process which, by considering only average expression levels, may obscure the underlying heterogeneity and complicate accurate tumor classification. However, emerging technologies such as single-cell analysis techniques offer promising alternatives and are already widely used in oncology research. These techniques, by investigating gene expression, phenotypes, protein levels, and other cellular properties at the level of individual cells, are well suited to address the challenge of tumor heterogeneity (Figure 2) [7,8,9]. Particularly for highly heterogeneous cancers like breast cancer, single-cell analysis can help to predict cellular evolution during tumor progression. The analysis of genetic and epigenetic variations as well as gene expression at the single-cell level is among the techniques that enhance the precision of predicting tumor development trends, evaluating treatment outcomes, and forecasting patient prognosis. Furthermore, single-cell analysis techniques play a crucial role in devising novel therapeutic strategies. These methods allow for the examination of genetic variations and phenotypic characteristics of tumor cells in detail, which can lead to the identification of new therapeutic targets. Consequently, this paves the way for the development of highly targeted, precise treatment strategies, enhancing the ability to predict treatment efficacy and potential drug resistance.
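The point that bulk averaging can hide distinct subpopulations can be illustrated with a minimal sketch. The expression values, the two groups, and the threshold of 50 are invented for illustration and are not taken from any cited study.

```python
from statistics import mean

# Hypothetical per-cell expression values (arbitrary units) for one marker
# gene in two tumor subpopulations; the numbers are illustrative only.
high_expressing_cells = [95.0, 102.0, 98.0, 105.0]
low_expressing_cells = [4.0, 6.0, 5.0, 5.0, 3.0, 7.0]

all_cells = high_expressing_cells + low_expressing_cells

# A bulk assay reports roughly one pooled value for the whole sample.
bulk_average = mean(all_cells)


def split_by_threshold(values, threshold):
    """Partition per-cell values into (below, at_or_above) a threshold."""
    below = [v for v in values if v < threshold]
    at_or_above = [v for v in values if v >= threshold]
    return below, at_or_above


# A single-cell assay keeps the per-cell values, so both groups stay visible.
low_group, high_group = split_by_threshold(all_cells, threshold=50.0)

print(f"bulk average: {bulk_average:.1f}")
print(f"cells below threshold: {len(low_group)}, at or above: {len(high_group)}")
```

In this toy sample the pooled value does not resemble any individual cell, which is the sense in which an average can misrepresent a mixed population.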

Figure 2. This diagram presents an overview of the structure of the article, summarizing the traditional bulk techniques and single-cell findings in genomics, transcriptomics, and proteomics. Moreover, these three groups demonstrate a progressive correlation with cancer cell behavior in the research on breast cancer heterogeneity.

Single-cell gene sequencing, leveraging the power of next-generation sequencing (NGS), has emerged as a pivotal tool for investigating breast cancer heterogeneity [10]. Unlike traditional Sanger sequencing, NGS systems use massively parallel sequencing to yield billions of DNA reads, each 36 to 150 base pairs long, which can be aligned to the human genome. This alignment allows for the detection of various genetic variations, including single-nucleotide mutations, small insertions/deletions, and copy number variations, offering a comprehensive view for streamlining the development of targeted treatment strategies. A case in point is HER2-positive breast cancer, where single-cell sequencing detects the diversity in HER2 gene amplification across different cells, facilitating the formulation of bespoke treatment plans [11]. The versatile utility of NGS extends to RNA sequencing (RNA-seq), which is a high-throughput technique enabling quantitative and sequence analyses of diverse RNA types, along with their expression levels in cells and tissues. RNA-seq facilitates an in-depth exploration of gene expression regulation, signaling, and metabolic pathways pertinent to breast cancer, thereby enriching the understanding of its molecular mechanisms. Intriguingly, beyond the recognized impact of non-coding RNAs on breast cancer progression, even the half-life of mRNA serves as an informative marker [12].
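To give a feel for what the read counts and read lengths mentioned above mean in practice, the sketch below computes average sequencing depth as total sequenced bases divided by genome size. The genome size of about 3.1 billion base pairs and the figure of one billion reads are assumptions chosen for illustration, and the calculation ignores unmapped or duplicate reads.

```python
def mean_coverage(num_reads, read_length_bp, genome_size_bp):
    """Average sequencing depth: total sequenced bases divided by genome size."""
    return num_reads * read_length_bp / genome_size_bp


# Assumed approximate haploid human genome size (illustrative).
HUMAN_GENOME_BP = 3_100_000_000

# Compare the two ends of the read-length range for a fixed number of reads.
for read_length in (36, 150):
    depth = mean_coverage(1_000_000_000, read_length, HUMAN_GENOME_BP)
    print(f"{read_length} bp reads: ~{depth:.1f}x average depth")
```

For the same number of reads, longer reads give proportionally higher depth, which is one reason read length matters when planning variant detection.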

In molecular profiling, the strength of the correlation between molecular patterns and cellular behavior is pivotal, as a higher correlation implies a more accurate reflection of the tumor’s actual condition [13]. Complementing single-cell genomics and transcriptomics, the emergence of single-cell proteomics offers another potent instrument for investigating breast cancer heterogeneity. This technique enables the detection and analysis of protein expression at the single-cell level. It provides a more precise view of protein expression compared to its RNA-sequencing counterpart, revealing insights into protein localization and intra-cellular interactions. For instance, immunohistochemistry provides a more direct appraisal of a patient’s tumor condition. This technique plays a pivotal role in breast cancer diagnosis, as it involves staining clinical samples of breast cancer tissues to reveal the expression of crucial proteins, including ER, PR, and HER2. The resultant immuno-stained samples offer a visual map for clinicians, aiding them in identifying the subtype of breast cancer and enabling effective pathological staging and treatment selection. Through techniques like immunohistochemistry, researchers can not only discern between these subtypes, but also detect signs of lymph node metastasis and monitor potential tumor recurrence. Such insights are pivotal for guiding treatment decisions and tracking the progression of the disease [14,15,16].

2. Genomic Profiling

2.1. Traditional Genomic Profiling

In 1994, the BRCA1 gene was identified through positional cloning, followed by the discovery of the BRCA2 gene in 1995. These genes play a crucial role in DNA damage repair and maintaining genomic stability, thereby reducing the risk of tumor development [17,18]. BRCA1 and BRCA2 are tumor suppressor genes involved in repairing double-strand DNA breaks. Mutations in these genes significantly increase the lifetime risk of developing breast cancer. Inherited mutations in BRCA1 and BRCA2 account for a small percentage of breast cancer cases. Tumors associated with BRCA1 mutations often exhibit a basal-like phenotype and a higher histological grade, while those linked to BRCA2 mutations resemble sporadic tumors more closely. Several specific scenarios can notably elevate the incidence of breast cancer: (1) sequence variants encoding premature termination codons such as nonsense or frameshift mutations occurring prior to the 1855th amino acid in BRCA1 and the 3309th amino acid in BRCA2; (2) mutations located at splice site consensus sequences, either the first or second base positioned upstream or downstream of an exon; (3) copy number loss mutations leading to frameshift mutations prior to the 1855th amino acid of BRCA1 and the 3309th amino acid of BRCA2, or mutations eliminating one or more exons not predicted or confirmed to produce functional in-frame RNA isoforms capable of restoring BRCA1/2 gene function; and (4) copy number repeat variations of any size resulting in the duplication of one or more exons, and proven to cause frameshift mutations before the 1855th amino acid of BRCA1 and the 3309th amino acid of BRCA2 [17,19]. Individuals harboring potential pathogenic variants in the BRCA1/2 genes can benefit significantly from timely education and early screening.
According to the European Society for Medical Oncology (ESMO) guidelines [20], women identified as carrying mutations in BRCA1, BRCA2, or other high-penetrance genes should begin breast cancer prevention education from the age of 18, maintain vigilant awareness of breast conditions, and comply with regular medical check-ups. Physicians advocate for annual clinical breast examinations complemented by breast X-ray imaging and magnetic resonance imaging (MRI) assessments starting from the age of 25 [21]. The insights gained from this early genetic knowledge did not immediately translate into clinical treatment strategies. The initial clinical data highlighted that BRCA-associated tumors exhibited high sensitivity to poly(ADP-ribose) polymerase (PARP) inhibitors. These inhibitors act on the PARP-mediated DNA damage repair mechanism, thereby disrupting the tumor’s ability to repair its DNA. As of now, PARP inhibitors are primarily accessible through clinical trials [22]. With the deepening understanding of breast cancer and the widespread use of NGS, more relevant genes have been discovered (Table 1).

Table 1. Breast-cancer-relevant genes, discovery year, involved process, and mutation risk.

Gene	Discovery Year	Involved Process	Mutation Risk	Reference
PTEN	1997	apoptosis, cell cycle, and signal transduction	activation of proliferation and survival signals	[23]
STK11	1997	cell cycle, metabolism, and energy balance	activation of cell proliferation and metabolic pathways	[24]
CHEK2	1999	DNA repair and cell apoptosis	impairments in DNA repair and cell apoptosis processes	[25]
PIK3CA	2004	regulation of signaling pathway	activation of survival signals	[26]
AKT1	2007	regulation of signaling pathway	activation of cell proliferation and survival signals	[27]
BARD1	2010	DNA repair and cell apoptosis	increased susceptibility to breast cancer	[28]
NF1	2015	regulation of signaling pathway	increased rate of developing breast cancer	[29]

A study of a large cohort of breast cancer patients analyzed gene mutations and copy number variations, further substantiating the prevalence of gene alterations in this population [30]. TP53 gene mutations are commonly found in the basal-like subtype, while the HER2-positive subtype also exhibits a high incidence of TP53 gene mutations. Additionally, the HER2-positive subtype shows a significant frequency of PIK3CA gene mutations. The basal-like and HER2-positive subtypes are characterized by genomic instability and susceptibility to changes in gene copy numbers. Conducting concurrent assessments of DNA copy number and gene mutations in breast cancer cells enables the prediction of the cellular subtype. For example, it is known that the amplification of Kras2 is associated with tumor progression, while insufficient Kras2 copy numbers delay tumor progression [31]. In an investigation involving 16 human basal-like breast tumors, none displayed Kras2 mutations; however, an increased DNA copy number at the Kras2 locus was observed in 9 of the tumors. These observations imply that Kras2 amplification may modulate cell phenotypes or earmark target cell types that are susceptible in basal-like tumors [32]. Furthermore, leveraging insights from patients’ DNA sequencing results proves invaluable in tailoring subsequent treatment strategies [33]. Taking the TP53 gene as an example, it plays a fundamental role as a key regulator of cellular processes, participating in controlling cell proliferation and maintaining genomic integrity and stability. Activated in response to an array of stress signals, the p53 tumor suppressor protein curbs cell transformation by precipitating cell cycle arrest, DNA repair, and apoptosis. In breast cancer patients, however, TP53 gene mutations may precipitate a partial or complete functional loss of the p53 tumor suppressor protein, thereby undermining its capacity to inhibit tumor development.
Significantly, Asian breast cancer patients exhibit a mutation frequency of 42.9% in the TP53 gene, which surpasses the mutation rate of 30% observed in Western breast cancer patients. This suggests a potentially higher degree of endocrine therapy resistance and lower survival rates among this demographic. Hence, breast cancer patients with inactivating mutations in the TP53 gene necessitate swift intervention with appropriate follow-ups, reexaminations, and immediate treatment. For those carrying TP53 mutations, related treatment strategies can be explored in a clinical setting, which might include Gendicine therapy either as a standalone treatment or in conjunction with radiation therapy, chemotherapy, or hyperthermia, among other treatment approaches. Overall, the use of such targeted treatment for TP53 mutations has achieved a complete response rate of 30–40% and a partial response rate of 50–60% in various clinical applications and studies, with an overall response rate reaching 90–96% [34].

2.2. Single-Cell Genomic Profiling

While conventional genetic profiling can offer valuable insights, it faces significant challenges, most notably its inability to differentiate between normal and tumorous tissues. Breast tumors typically comprise a heterogeneous mix of cancerous cells, healthy tissues, stromal components, and infiltrating leukocytes. Histopathological assessments have revealed that certain samples may contain around 60% normal cells and 35% cancer cells, with a significant presence of infiltrating leukocytes [35]. Signal from these normal tissues is included in the bulk results, where it may overshadow crucial tumor-specific information and even lead to erroneous conclusions. Additionally, traditional DNA analysis can only provide vague insights into cancer development because bulk analyses yield average DNA profiles for tumors, making it impossible to differentiate and track each tumor cell lineage within a tumor tissue [36,37]. Such issues can be addressed through single-cell DNA sequencing (scDNA-seq) [7]. A pivotal step in scDNA-seq involves extracting minuscule quantities of DNA from single cells, followed by whole-genome amplification (WGA). To minimize amplification-related errors and biases, PCR-based methods are commonly utilized for copy number variation (CNV) detection due to their ability to provide more uniform coverage. In contrast, for single-nucleotide variation (SNV) detection, multiple displacement amplification (MDA)-based techniques are favored owing to their use of high-fidelity DNA polymerases that function at room temperature and display a heightened sensitivity to single-base alterations [38,39]. CNVs are common in a wide range of cancer cell lines, with conventional genomic analyses indicating their substantial influence on the emergence and progression of breast cancer.
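The dilution effect of normal cells in a bulk sample can be made concrete with a small calculation. This is a sketch under simplifying assumptions that are not from the source: the variant is clonal (present in every tumor cell), it sits on one copy of a two-copy region, and the fraction of cancer cells is used directly as tumor purity. The function name `expected_vaf` is hypothetical.

```python
def expected_vaf(purity, mutant_copies=1, tumor_copy_number=2, normal_copy_number=2):
    """Expected fraction of reads carrying a clonal somatic variant in a mixed sample."""
    if not 0.0 <= purity <= 1.0:
        raise ValueError("purity must be between 0 and 1")
    # Alleles carrying the variant come only from tumor cells.
    mutant_alleles = purity * mutant_copies
    # All alleles at the locus come from both tumor and normal cells.
    total_alleles = purity * tumor_copy_number + (1.0 - purity) * normal_copy_number
    return mutant_alleles / total_alleles


# Bulk sample with about 35% cancer cells, as in the composition described above.
bulk_vaf = expected_vaf(purity=0.35)
# A single tumor cell sequenced on its own behaves like a sample of purity 1.
single_cell_vaf = expected_vaf(purity=1.0)

print(f"bulk: {bulk_vaf:.3f}, single tumor cell: {single_cell_vaf:.3f}")
```

Under these assumptions, the variant signal in the bulk sample is well below the level seen in an isolated tumor cell, and a subclonal variant would be diluted further.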
An assessment and identification of genomic regions bearing copy number alterations—both gains and losses—in tumor cells can facilitate the construction of lineage trees, elucidating shared ancestry among tumor subpopulations [40]. Single-cell DNA sequencing enables researchers to scrutinize CNVs and discern tumor cell subpopulations within solid tumors, even at the early stages of breast cancer development [35]. By employing a comparative analysis of CNV differences, researchers have categorized roughly 100 tumor cells into three distinct subpopulations: a diploid (D) cell subpopulation characterized by a flat morphology, a pseudodiploid (P) cell subpopulation showcasing varying degrees of deviation from diploidy, and a subpopulation embodying complex genomic rearrangements, mirroring the characteristics of a ‘late-stage’ tumor subpopulation. During the late stages of breast cancer progression, liver metastasis invariably becomes a significant concern. By employing pseudodiploid cells as reference standards, researchers have identified striking similarities between the copy number profiles of primary tumors and their metastatic counterparts. This strongly suggests that the origin of metastatic cells can be traced predominantly to late-stage amplifications, as opposed to intermediate stages or entirely divergent subpopulations [35]. Conceptually, information pertaining to SNVs within breast cancer cells can be extrapolated from pre-existing WGA data. While this methodology has proven to be adequate for detecting copy number variations, it falls short in resolving whole-genome mutations at a granular base pair resolution. A commonly used approach is to increase coverage by performing deep sequencing of these libraries. Researchers have developed a high-coverage whole-genome and exome single-cell sequencing method using high-fidelity DNA polymerase to amplify 22 chromosome-specific primer pairs. 
The amplified DNA is then incubated with Tn5 transposase, which fragments the DNA and ligates sequencing adapters [41]. This technology achieves a low false-positive rate for point mutations, equivalent to 1–2 errors per million base pairs [42,43]. Using this technique, researchers selected invasive ductal carcinoma from estrogen-receptor-positive (ER+/PR+/HER2−) breast cancer patients for bulk and single-cell sequencing. After filtering germline variations, several non-synonymous mutations were identified in the non-diploid tumor cell population, including TBX3, NOTCH2, JAK1, ARAF, NOTCH3, MAP3K4, NTRK1, AFF4, CDH6, SETBP1, AKAP9, MAP2K7, ECM2, and ECM1 [30]. Through a comprehensive analysis of the breast cancer dataset, investigators have pinpointed two key pathways that are notably disrupted during tumor evolution: the TGF-β signaling pathway and the extracellular matrix receptor signaling pathway [7]. This revelation carries substantial clinical implications, as it enables healthcare professionals to select appropriate chemotherapy drugs that specifically target these disrupted pathways. For example, TGF-β receptor I kinase inhibitors like LY2157299 can inhibit TGF-β signaling pathway transmission, slowing down tumor growth and metastasis [44]. Additionally, drugs targeting extracellular matrix receptor signaling are already being used in clinical practice, such as anti-HER2 therapy for HER2-positive breast cancer patients [7]. These findings offer new insights and approaches for breast cancer treatment. While single-cell gene profiling technology has greatly advanced the understanding of breast cancer tumor cells, offering high precision and efficiency and resolving some long-standing challenges, ongoing discoveries of genetically and phenotypically unique cellular subpopulations have prompted a shift in the perception of cancer.
The cancer narrative has evolved beyond viewing it as a homogeneous aggregation of tumor cells, and researchers now embrace the reality of it as a spectrum of heterogeneous and evolving cancer subtypes [45]. This underlines that a sole reliance on single-cell DNA detection may fall short in addressing the research demands presented by breast cancer heterogeneity and in guiding clinical interventions directly. Therefore, it is necessary to utilize other single-cell omics detection technologies to collectively address research questions in breast cancer.
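The CNV-based grouping of single cells described in Section 2.2 (diploid, pseudodiploid, and complex subpopulations) can be sketched in a few lines. This is a toy illustration, not the method of the cited studies: the copy-number profiles are made up, the cutoff of 0.3 is hypothetical, and real analyses work with thousands of genomic bins and dedicated clustering or phylogenetic tools.

```python
def fraction_altered(profile, baseline=2):
    """Fraction of genomic bins whose copy number differs from the diploid baseline."""
    return sum(1 for cn in profile if cn != baseline) / len(profile)


def classify_cell(profile, complex_cutoff=0.3):
    """Assign a coarse CNV class; the cutoff is a hypothetical illustration value."""
    altered = fraction_altered(profile)
    if altered == 0:
        return "diploid"
    if altered < complex_cutoff:
        return "pseudodiploid"
    return "complex"


def manhattan(a, b):
    """Sum of absolute copy-number differences between two profiles."""
    return sum(abs(x - y) for x, y in zip(a, b))


# Made-up integer copy-number profiles over 10 genomic bins.
cells = {
    "cell_1": [2, 2, 2, 2, 2, 2, 2, 2, 2, 2],
    "cell_2": [2, 2, 3, 2, 2, 2, 2, 2, 2, 2],
    "cell_3": [3, 3, 4, 1, 2, 2, 5, 1, 2, 3],
    "cell_4": [3, 3, 4, 1, 2, 2, 5, 1, 2, 4],
}

labels = {name: classify_cell(profile) for name, profile in cells.items()}
for name, label in labels.items():
    print(name, label)

# Small distances between profiles suggest shared ancestry between cells.
print("cell_3 vs cell_4:", manhattan(cells["cell_3"], cells["cell_4"]))
print("cell_1 vs cell_3:", manhattan(cells["cell_1"], cells["cell_3"]))
```

Pairwise distances of this kind are the usual starting point for building a lineage tree, since cells with nearly identical copy-number profiles are likely to belong to the same clone.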