PET-Based Criteria for Response Assessment in Lymphoma


Please note this is a comparison between Version 1 by Andrea Ciarmiello and Version 2 by Camila Xu.

Criteria for response assessment in lymphoma have evolved substantially over the last decade, assigning a prominent role to FDG-PET/CT. This path starts from the first lymphoma-specific CT-based criteria and leads to the PET-based Lugano Classification, which now represents the gold standard. The LYRIC criteria (Lymphoma Response to Immunomodulatory Therapy Criteria), a recent refinement of the Lugano Classification, were conceived to capture new patterns of response (delayed response and pseudoprogression) observed during treatment with the novel immunotherapy agents that have entered the clinic.

immunotherapy
Hodgkin lymphoma
FDG-PET
pseudoprogression
Lugano Classification
LYRIC

1. Introduction

In the last decade, novel biological agents with an immune mechanism have entered clinical practice; the newest are immune checkpoint inhibitors. Immune checkpoint inhibitors are now the standard of care for advanced melanoma, non-small-cell lung cancer, renal carcinoma and head and neck tumors [1,2,3]. The impressive results of phase I and II studies exploring the effectiveness and safety of PD-1 inhibitors in Hodgkin lymphoma (HL) [4,5] and primary mediastinal B cell lymphoma (PMBCL) [6] led to the accelerated approval of anti-PD-1 agents by the FDA without a confirmatory phase III study. In 2016, nivolumab was approved by the FDA for the treatment of relapsed/refractory classical HL (cHL) after autologous stem cell transplantation and brentuximab vedotin, as the first hematologic indication. Pembrolizumab was approved for relapsed/refractory cHL after at least three lines of therapy in 2017 and for relapsed PMBCL after the failure of two or more lines of therapy in 2018 (Keynote 013 study).

The impact of immune checkpoint inhibitors on the treatment of HL reflects a distinctive feature of the disease: the malignant cells (Reed–Sternberg cells) are only a minority, embedded in an abundant microenvironment whose cells overexpress PD1/PDL1 due to a genetic aberration in the 9p23-24 locus. Immune checkpoint inhibitors are of minor importance in non-Hodgkin lymphoma (NHL); no immune checkpoint inhibitor approval exists for NHL. However, for relapsed/refractory NHL, the option of chimeric antigen receptor T (CAR-T) cell therapy is gaining ground. CAR-T therapy was recently approved by the FDA and EMA for the treatment of relapsed/refractory diffuse large B cell lymphoma.

Because they work through an immune mechanism, immune checkpoint inhibitors may cause a transient, inflammation-driven increase in tumor burden, named pseudoprogression, and they may alter tumor metabolism, yielding false-positive and false-negative results on FDG-PET/CT. In recent years, novel response criteria have been designed in an attempt to capture these additional response patterns beyond those observed with conventional chemotherapy.

2. Immunobiology of Immune Checkpoints

Tumor cell growth is promoted by the ability of tumor cells to “escape” from the immune system and to induce immune tolerance. Tumor cells lose their immunogenic antigens and manipulate the microenvironment, dysregulating immune checkpoints so that inhibitory signals are expressed [7,8,9]. The rationale of immunotherapy is to restore a florid T-cell cytotoxic response directed against the tumor, and this can be achieved either by activating stimulatory checkpoints or by inhibiting inhibitory checkpoints [10]. The most relevant inhibitory checkpoints are programmed cell death receptor 1 (PD1) and cytotoxic T-lymphocyte-associated protein 4 (CTLA-4), both receptors expressed on the T-cell surface that induce T-cell anergy. PD1 inhibits the T-cell cytotoxic response through the interaction with its ligand, programmed cell death ligand 1 (PDL1), which is expressed on antigen-presenting cells (APCs), activated T cells and tumor cells [11]. CTLA-4 inhibits T-cell proliferation by blocking the costimulatory molecules of the B7-CD28 superfamily expressed on APCs [12]. Current knowledge about the expression of immune inhibitory checkpoints in hematologic malignancies has been summarized in a recent review by Witkowska and Smolewski [13]. HL widely overexpresses PD1/PDL1 due to a widespread genetic alteration in the locus 9p23-24 and the subsequent activation of Janus kinase 2 [14]. PMBCL shows a high expression of PD1 ligands, especially the EBV-positive subtype, probably mediated by latent viral proteins [15,16]. Follicular lymphoma (FL), which originates from B-cell germinal centers similarly to HL and PMBCL, may express PD1 ligands [17]. CTLA-4 expression, about which little is known, might be observed in T-cell lymphomas and Sezary syndrome [18].

3. Review of PET-Based Criteria for Response Assessment

3.1. Background and Assessment of Response to Conventional Chemotherapy

The first effort towards the standardization of assessments of the response to cancer treatment was a handbook published in 1979 and promoted by the World Health Organization (WHO) [19]. The WHO criteria introduced the concept of the bidimensional measurement of tumor burden as the sum of the products of the lesions' diameters before and after therapy and established the four response categories still in use: complete response, partial response, stable disease and progressive disease. The first guidelines to incorporate the metabolic data provided by FDG-PET/CT in response assessment were the European Organization for Research and Treatment of Cancer (EORTC) criteria, released in 1999 [20]. The reference region for complete metabolic response was the background adjacent to lesions. The main goal of the EORTC criteria was to evaluate the viability of residual masses: based on metabolic activity, it was feasible to discriminate fibrotic/necrotic changes from residual tumors. A new set of joint EORTC/National Cancer Institute CT-based guidelines for response assessment, the Response Evaluation Criteria in Solid Tumors (RECIST), was first published in 2000 [21] and then revised and updated in 2009 (RECIST 1.1) [22]. In contrast to the bidimensional assessment of the WHO criteria, which is laborious and time-consuming, the RECIST criteria rely on a unidimensional assessment of the largest axial diameters of the tumors [23]. Moreover, RECIST introduced the concept of target lesions, i.e., measurable (minimum size 10 mm), well-defined lesions representative of all the organs involved. In the same year, 2009, on the heels of RECIST 1.1, Wahl et al. published the PET Response Criteria in Solid Tumors (PERCIST) [24]. Similarly to RECIST, the PERCIST criteria rely on the assessment of residual metabolic activity in target lesions, defined as the hottest lesions.
The notable innovations of PERCIST are the introduction of SUV lean (SUL, SUV normalized for lean body mass) and SUL peak and the definition of the minimum measurable activity as 1.5 times hepatic activity. Due to the peculiarities of hematologic malignancies, specialized criteria for response assessment in lymphomas were developed. The first effort to design response criteria specific for lymphomas was the International Working Group (IWG) criteria [25], sponsored by the National Institute of Health and published in 1998. The IWG criteria were CT-based, and they introduced a fifth response category, namely, complete response/unconfirmed (CRu), defined as the persistence of residual nodal masses despite a reduction greater than 75% in the sum of the product of diameters. CRu reflects the difficulty of assessing the origin of residual masses based purely on radiological data. In the early 2000s, as the use of PET grew rapidly and PET/CT tomographs were developed, the gain in accuracy provided by PET, which can assess the viability of residual masses, was recognized, leading to the proposal of the so-called IWG+PET criteria by Juweid et al. in 2005 [26]. Soon after, in 2007, in the context of the International Harmonization Project, a project promoted by the German Study Group, two publications by Cheson et al. [27] and by Juweid et al. [28] updated the IWG criteria, incorporating PET in the response evaluation. These modified criteria were based on an integrated evaluation of CT and PET. The PET evaluation was qualitative and provided a positive or negative classification based on a comparison of activity in residual masses with activity in reference regions (mediastinal blood pool for residual masses greater than 2 cm and adjacent background for smaller lesions). The assessment of viability in residual tumors enabled by PET/CT led to the elimination of the ambiguous CRu category.
In 2009, an international workshop held in Deauville (France) formulated novel response criteria, the Deauville Score (DS) [29,30]. DS is a five-point scale based on a visual comparison of activity in residual tumors with activity in reference regions (mediastinal blood pool and liver). In 2013, at the 12th International Conference on Malignant Lymphomas, the Lugano Classification was developed [31], a body of consensus recommendations for staging and response assessment in lymphomas. According to the Lugano guidelines, both contrast CT and PET have to be performed in the setting of response assessment. Separate sets of response criteria for CT and PET evaluations were published. For PET interpretation, DS was adopted. Being simple and easy to implement, DS spread widely and underwent a process of standardization across centers, becoming the gold standard for response assessment in lymphomas. DS implementation found fertile ground in the emerging trend of interim response assessment, allowing early treatment adaptation. PET positivity (DS 4, 5) and PET negativity (DS 1–3) showed significant prognostic value, and interim-PET status offered the opportunity to guide early treatment changes. Following active research from the early 2000s, interim-PET became the standard of care in HL. Being purely qualitative, DS has some limitations and may be prone to optical misinterpretation and inter-reader variability. When the DS attribution is uncertain, research groups active in the field recommend confirming visual evaluations with the SUV ratio between residual tumors and reference regions [32]. Recently, quantitative extensions of DS were also developed, particularly qPET [33,34], but these methods have not yet been prospectively validated and need standardization. The evolution of the response criteria in oncology and hematology over time is presented in Figure 1.
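As a rough illustration of how the five-point scale maps onto PET positivity, the sketch below assigns a Deauville score by comparing lesion uptake with the two reference regions, in the spirit of the SUV-ratio confirmation mentioned above. DS itself is a visual scale, so this is only a numeric approximation. The function names, the `background` parameter and the cut-off separating "moderately" from "markedly" above liver (`marked_factor`, set to 2.0 here) are assumptions made for illustration and are not values given in this entry.

```python
def deauville_score(lesion, background, mbp, liver,
                    new_lesion=False, marked_factor=2.0):
    """Approximate the five-point Deauville Score from uptake values.

    lesion, background, mbp and liver are uptake values (e.g. SUVmax) of the
    residual lesion, adjacent background, mediastinal blood pool and liver.
    marked_factor is an ASSUMED cut-off for "markedly above liver".
    """
    if new_lesion:
        return 5                      # new lymphoma-related lesions score 5
    if lesion <= background:
        return 1                      # no uptake above background
    if lesion <= mbp:
        return 2                      # uptake at or below mediastinal blood pool
    if lesion <= liver:
        return 3                      # above blood pool, at or below liver
    if lesion <= marked_factor * liver:
        return 4                      # moderately above liver
    return 5                          # markedly above liver


def is_pet_positive(score):
    """DS 4-5 is PET-positive; DS 1-3 is PET-negative."""
    return score >= 4


if __name__ == "__main__":
    score = deauville_score(lesion=4.0, background=1.2, mbp=1.8, liver=2.5)
    print(score, is_pet_positive(score))
```

In practice the visual read remains the reference; a numeric helper like this would only serve as a consistency check in borderline cases.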

Figure 1. Evolution of criteria for response to cancer treatment. Timeline illustrating the evolution of response criteria over time in oncology and hematology, outlining the differences in method of tumor measurement, PET interpretation and assessment of progression of disease. SPD: sum of products of diameters. SLD: sum of longest axial diameters. MBP: mediastinal blood pool. SUL: standardized uptake value normalized for lean body mass. DS: Deauville Score. CRu: unconfirmed complete response. irPD: immune-related progression of disease. IR: indeterminate response. iUPD: immune-unconfirmed progression of disease.

3.2. Pseudoprogression and Hyperprogression

The Lugano Classification was designed to assess the response to traditional chemotherapy or conventional chemo-immunotherapeutic regimens, including rituximab. The patterns of response to immunotherapy differ from the patterns observed with conventional treatments. Usually, response occurs early after immunotherapy, and, consequently, an early response evaluation after two to three cycles of therapy is advisable. Response assessment may be confounded by the phenomena of delayed response and flare/pseudoprogression. Delayed response consists of a late objective response in the course of treatment, after initial tumor growth and apparent progression of the disease. Flare/pseudoprogression was first described in patients with lymphomas and chronic lymphocytic leukemia receiving lenalidomide as a rapid increase in the size of lymph nodes, often painful, accompanied by fever and lymphocytosis [35,36,37]. Flare/pseudoprogression is defined as an increase in the size of baseline lesions and even the appearance of new lesions when the patient is clinically improving. It represents an apparent progression on imaging, in the absence of clinical deterioration of the patient, and it is followed by a response. Pseudoprogression usually occurs early during treatment. The increase in the size of baseline lesions is an inflammatory phenomenon due to T-cell recruitment, NK activation and a massive release of cytokines [38]. It is crucial to recognize pseudoprogression and not to discontinue treatment before clinical benefit is achieved. Hyperprogression, defined as a rapid acceleration of tumor growth, is a new aggressive pattern reported in a fraction of lung cancer, melanoma, renal carcinoma [39] and head and neck carcinoma [40] cases treated with anti-PD-1/PD-L1. Compared to the pseudoprogression described above, hyperprogression is a disruptive phenomenon, and it is not prone to uncertainty in interpretation.

3.3. Assessment of Response to Immunotherapy

Atypical responses encountered in patients under immune checkpoint blockade (delayed responses, pseudoprogression and other response patterns beyond those of conventional chemotherapy classified by the WHO and RECIST criteria) were shown to be associated with a survival benefit comparable to that of typical responses [41] and needed to be taken into account in response assessment. There have been efforts to characterize these phenomena and to incorporate them into novel response criteria. In 2009, a publication by Wolchok et al. proposed the Immune-Related Response Criteria (IRC) [41], novel CT-based immune therapy response criteria adapted from the WHO criteria, based on the experience of community workshops using data from patients with advanced melanoma treated with ipilimumab. Across this cohort of patients, four patterns of response to ipilimumab were reported. Two patterns were captured by conventional response criteria: (1) a shrinkage in baseline lesions without new lesions and (2) “stable” disease, eventually followed by a slow, steady decline of tumor burden (TB). The other two were new and fell outside conventional response assessment: (3) response after an initial increase in TB and (4) a reduction in overall TB concomitant with the appearance of new lesions. The main statements of the IRC can be summarized as follows:

Immune-related progression of disease (irPD) needs to be confirmed in a subsequent scan at least 4 weeks later, in the absence of clinical deterioration and of worsening laboratory parameters, in order to uncover delayed response and pseudoprogression phenomena. PD confirmation is required before withdrawing treatment.
Stable disease (SD) is considered a therapeutic effect and a surrogate end-point for clinical benefit, in contrast to the assessment of the response to conventional cytotoxic therapy.
Total TB has to be measured, even when new lesions appear. The thresholds for SD and partial response (PR) are the same as those in the WHO criteria, but new lesions are included in the TB assessment.
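The statements above can be sketched in a few lines of code. The example below computes total TB as the bidimensional sum of products of diameters over index lesions plus any new measurable lesions, and then applies WHO-style thresholds. The function names and the category labels other than irPD are hypothetical; the 50% decrease threshold for PR, the comparison of increases against the nadir, and the 28-day reading of "at least 4 weeks" are assumptions made for illustration rather than details stated in this entry.

```python
from datetime import date


def tumor_burden(index_lesions, new_lesions=()):
    """Total TB: sum of products of perpendicular diameters (mm x mm).

    Each lesion is a (long_diameter, perpendicular_diameter) pair.
    New measurable lesions are added to the burden instead of
    automatically defining progression.
    """
    return sum(a * b for a, b in list(index_lesions) + list(new_lesions))


def percent_change(current, reference):
    """Signed percent change of current relative to reference."""
    return 100.0 * (current - reference) / reference


def irc_category(baseline_tb, nadir_tb, current_tb):
    """Classify one time point (thresholds are assumptions, see lead-in)."""
    if current_tb == 0:
        return "irCR"
    if nadir_tb == 0 or percent_change(current_tb, nadir_tb) >= 25:
        return "irPD (unconfirmed)"   # must be confirmed on a later scan
    if percent_change(current_tb, baseline_tb) <= -50:
        return "irPR"
    return "irSD"


def irpd_confirmed(first_scan, second_scan, second_category):
    """irPD counts only if a scan at least 4 weeks later still shows it."""
    late_enough = (second_scan - first_scan).days >= 28
    return late_enough and second_category.startswith("irPD")


if __name__ == "__main__":
    baseline = tumor_burden([(20, 10), (30, 20)])
    followup = tumor_burden([(10, 10), (20, 10)], new_lesions=[(10, 5)])
    print(baseline, followup, irc_category(baseline, followup, followup))
```

Note how the follow-up scan in the usage example contains a new lesion yet is not classified as progression, because the overall burden has fallen; this corresponds to response pattern (4) described above.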

The IRC have been implemented in clinical trials evaluating immune checkpoint inhibitors in solid tumors. In 2013, the IRC were adapted to the unidimensional RECIST criteria and called Immune-Related RECIST (irRECIST) [42]. In 2017, the RECIST working group adapted the RECIST 1.1 criteria to the new body of knowledge about the patterns of response to immunotherapy in solid tumors and developed the so-called Immune-RECIST (i-RECIST) [43]. i-RECIST include a new response category of “immune unconfirmed progression” that requires confirmation on a subsequent scan within 6–8 weeks, accounting for the occurrence of pseudoprogression and delayed response. In the studies on immune checkpoint blockade in HL and NHL, an incidence of delayed response and flare/pseudoprogression and response patterns similar to those reported in solid tumors have been observed. However, merely transferring the IRC to response assessment in lymphomas was not considered entirely appropriate, for the following reasons. First, the response criteria for solid tumors and for lymphomas evolved independently over time: response in solid tumors is assessed using morphologic unidimensional criteria, the RECIST criteria, whereas response in lymphomas is evaluated using the Lugano Classification, based on PET/CT and on a bidimensional assessment of lymph node size on CT. Second, progression is defined by the WHO criteria as an increase >25% in the sum of the product of the diameters of solid tumors, whereas in lymphomas, an increase in the size of a single lymph node accompanied by PET positivity is sufficient to establish progression. Third, response assessment in solid tumors is based on a dimensional evaluation of masses, which are always considered abnormal, whereas in lymphomas, residual masses do not have a single interpretation, since they can represent fibrotic/necrotic changes if metabolic activity is low.
To address these issues, in 2016, the LYRIC criteria (Lymphoma Response to Immunomodulatory Therapy Criteria) [44] were developed as a refinement of the Lugano Classification accounting for features specific to immunotherapy. In the LYRIC criteria, a CT-based size assessment and a PET/CT evaluation are integrated. LYRIC introduced the novel category of indeterminate response (IR) to account for flare/pseudoprogression and delayed response, requiring a confirmatory study, either a biopsy or subsequent imaging within 12 weeks. Three types of IR were identified:

IR(1): Progression (defined as a ≥50% increase in overall TB) in the first 12 weeks of therapy without clinical deterioration.
IR(2): Appearance of new lesions in the context of overall TB stability (Figure 2).
IR(3): Increase in uptake of existing lesions without a concomitant increase in lesion size and number (Figure 3).

Figure 2. IR(2): Pseudoprogression in a patient on nivolumab for Hodgkin lymphoma. Panel (A) shows baseline disease. Panel (B) (II–III) shows the appearance of new nodal lesions (red arrows) in early PET evaluation after four cycles of immunotherapy. PET/CT evaluation at a later time point (C) demonstrates regression of the nodal flares and metabolic response.

Figure 3. IR(3): Panel (A) shows baseline lesions (red arrow). Early PET evaluation (B) during nivolumab for Hodgkin lymphoma shows increase in FDG uptake (red arrow) in baseline lesions without concomitant increase in size. At subsequent PET evaluation (C), there is a concordant increment in size (red arrow), and criteria for true progression are met.

The LYRIC criteria were applied in studies assessing the response to immunotherapy in lymphomas and were compared with the Lugano Classification. In 2017, with the aim of unifying the response criteria in lymphoma with the response criteria in solid tumors in the context of clinical trials evaluating new therapeutic agents in a mixed population of patients with lymphoma and patients with solid tumors, an international working group developed the Response Evaluation Criteria in Lymphoma (RECIL) [45]. RECIL draws on the RECIST criteria, proposing a unidimensional evaluation of the sum of the longest axial diameters in a maximum of three target lesions, instead of the sum of the product of diameters in up to six target lesions as suggested by the Lugano criteria. Based on the hypothesis that new therapeutic agents can alter a tumor’s metabolism and, thus, have the potential to increase false-positive and false-negative FDG-PET results, RECIL decreased the role of PET in response assessment in lymphomas. In the Lugano Classification, complete response (CR) is represented by PET negativity (DS 1–3) regardless of lesion size, whereas in RECIL, the CR category requires a shrinkage >30% of lesions in addition to PET negativity. The PR category was also modified to capture the mixed responses encountered with novel treatments. In the Lugano Classification, an increase in size >50% of a single lesion is sufficient to define PD, even if other lesions concomitantly decrease in size. In contrast, in RECIL, similarly to the IRC and LYRIC described above, the overall tumor burden is considered, and such a case may be classified as PR, defined as a decrease in size >30% of overall TB accompanied by PET positivity (DS 4 or 5).
Moreover, RECIL introduced a novel provisional category of minor response, defined as a shrinkage of lesions ≥10% and <30% accompanied by any PET status, aiming to account for a response that does not fulfill the criteria for traditional response categories but may be associated with survival benefit. A comparison of the Lugano Classification, LYRIC and RECIL 2017 is presented in Table 1.

Table 1. Comparison between Lugano lymphoma classification, LYRIC and RECIL 2017.

Criterion	Lugano	LYRIC	RECIL
Number of target lesions	Up to 6	Up to 6	Up to 3
Measurement method	Bidimensional: perpendicular diameters	Bidimensional: perpendicular diameters	Unidimensional: long diameter of any target lesion
Complete response (CR)	PET negativity (DS 1–3) with or without a residual mass	Same as Lugano	PET negativity (DS 1–3) plus reduction in SLD > 30%
Minor response (MR)	No	Same as Lugano	Yes: reduction in SLD ≥ 10% and < 30%
Partial response (PR)	Reduced FDG-PET uptake (DS 4–5); decrease in SPD ≥ 50%	Same as Lugano	Reduction in SLD ≥ 30% not meeting criteria for CR; new lesions are included in overall TB
Stable disease (SD)	Stable FDG-PET uptake (DS 4–5); decrease in SPD < 50%	Same as Lugano	Decrease < 10% to increase ≤ 20% in SLD
Progression of disease (PD)	Increased FDG-PET uptake (DS 4–5); increase in SPD ≥ 50%; new lesions	As with Lugano, with the exception of IR. IR(1): ≥ 50% increase in SPD in first 12 weeks. IR(2): < 50% increase in SPD with new lesion(s). IR(3): increase in FDG uptake without a concomitant increase in lesion size	Increase in SLD by 20%. For relapse from CR, at least one lesion should measure 2 cm in the long axis with or without PET activity

SPD: sum of product of perpendicular diameters of target lesions. SLD: sum of the longest diameters of target lesions. IR: indeterminate response.
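The RECIL column of Table 1 lends itself to a small worked example. The sketch below computes the SLD over at most three target lesions and maps the percent change and the Deauville score onto the RECIL categories using the thresholds in the table. The function names are hypothetical, and the sketch is simplified on purpose: changes are measured against a single baseline value, PD is taken as an increase above 20%, and the rules for new lesions and for relapse from CR are not modeled.

```python
def sld(longest_diameters_mm):
    """Sum of the longest diameters of up to three target lesions (mm)."""
    if len(longest_diameters_mm) > 3:
        raise ValueError("RECIL uses at most three target lesions")
    return sum(longest_diameters_mm)


def recil_category(baseline_sld, current_sld, deauville):
    """Map SLD change and Deauville score onto RECIL response categories."""
    change = 100.0 * (current_sld - baseline_sld) / baseline_sld
    pet_negative = deauville <= 3          # DS 1-3 is PET-negative

    if change > 20:
        return "PD"                        # increase in SLD above 20%
    if change < -30 and pet_negative:
        return "CR"                        # > 30% shrinkage plus PET negativity
    if change <= -30:
        return "PR"                        # >= 30% shrinkage not meeting CR
    if change <= -10:
        return "MR"                        # >= 10% and < 30% shrinkage
    return "SD"                            # < 10% decrease up to 20% increase


if __name__ == "__main__":
    baseline = sld([30, 20, 10])
    for current, ds in [(36, 2), (36, 4), (51, 4), (60, 3), (75, 5)]:
        print(current, ds, recil_category(baseline, current, ds))
```

The same 40% shrinkage yields CR or PR depending only on PET status, which is the point at which RECIL departs from the Lugano rule that PET negativity alone defines CR.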

For an assessment of the response to immunotherapy in lymphomas, FDG-PET should be performed at baseline and repeated after three to four cycles (at 9–12 weeks). Immune checkpoint inhibitors induce inflammation that can translate into increased FDG uptake and even into the appearance of new lesions in the absence of true progression. In the assessment of patients with lymphoma during the course of immunotherapy, collaboration between clinicians, radiologists and PET readers in a multidisciplinary approach is advisable, especially in equivocal and challenging cases, to discriminate treatment-induced inflammation/pseudoprogression from true progression. Decisions must be based on a repeated scan taken 12 weeks later. A re-biopsy, when feasible, might be necessary in cases of persistent FDG uptake, and it is encouraged in cases with the appearance of new lesions of indeterminate origin. A possible algorithmic approach to patients with HL on immunotherapy is illustrated in Figure 4.
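To make the decision logic concrete, the following sketch classifies a single follow-up assessment into the three LYRIC indeterminate response types or true progression. It is a simplified reading of the IR definitions given earlier, not a reproduction of the flowchart in Figure 4: the function name, its parameters and the returned labels are hypothetical, and treating a ≥50% increase with clinical deterioration, or after week 12, as PD is an assumption made for illustration.

```python
def lyric_assessment(spd_change_pct, weeks_on_therapy, new_lesions,
                     fdg_uptake_increased, clinical_deterioration):
    """Classify one follow-up scan under a simplified LYRIC reading.

    spd_change_pct         percent change in overall tumor burden (SPD)
    weeks_on_therapy       time since the start of immunotherapy
    new_lesions            True if new lesions appeared
    fdg_uptake_increased   True if uptake rose in existing lesions
    clinical_deterioration True if the patient is clinically worsening
    """
    if spd_change_pct >= 50:
        # IR(1) applies only early in therapy and in a clinically stable patient
        if weeks_on_therapy <= 12 and not clinical_deterioration:
            return "IR(1)"
        return "PD"
    if new_lesions:
        return "IR(2)"        # new lesions with overall burden below the 50% mark
    if fdg_uptake_increased:
        return "IR(3)"        # higher uptake without growth in size or number
    return "not IR"           # fall back to the standard Lugano categories


if __name__ == "__main__":
    print(lyric_assessment(60, 8, False, False, False))
    print(lyric_assessment(10, 8, True, False, False))
    print(lyric_assessment(0, 8, False, True, False))
```

Any IR result would then trigger the confirmatory step described above, a biopsy or repeat imaging within 12 weeks, before treatment is withdrawn.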

Figure 4. Flowchart of assessment of response to immunotherapy in lymphoma.