Addition of docetaxel to hormonal therapy in low- and high-burden metastatic hormone sensitive prostate cancer: long-term survival results from the STAMPEDE trial

Abstract
Background STAMPEDE has previously reported that the use of upfront docetaxel improved overall survival (OS) for metastatic hormone naïve prostate cancer patients starting long-term androgen deprivation therapy. We report on long-term outcomes stratified by metastatic burden for M1 patients.
Methods We randomly allocated patients in a 2 : 1 ratio to standard-of-care (SOC; control group) or SOC + docetaxel. Metastatic disease burden was categorised using retrospectively-collected baseline staging scans where available. Analysis used Cox regression models, adjusted for stratification factors, with emphasis on restricted mean survival time where hazards were non-proportional.
Results Between 05 October 2005 and 31 March 2013, 1086 M1 patients were randomised to receive SOC (n = 724) or SOC + docetaxel (n = 362). Metastatic burden was assessable for 830/1086 (76%) patients; 362 (44%) had low and 468 (56%) high metastatic burden. Median follow-up was 78.2 months. There were 494 deaths on SOC (41% more than the previous report). There was good evidence of benefit of docetaxel over SOC on OS (HR = 0.81, 95% CI 0.69–0.95, P = 0.009) with no evidence of heterogeneity of docetaxel effect between metastatic burden sub-groups (interaction P = 0.827). Analysis of other outcomes found evidence of benefit for docetaxel over SOC in failure-free survival (HR = 0.66, 95% CI 0.57–0.76, P < 0.001) and progression-free survival (HR = 0.69, 95% CI 0.59–0.81, P < 0.001) with no evidence of heterogeneity of docetaxel effect between metastatic burden sub-groups (interaction P > 0.5 in each case). There was no evidence that docetaxel resulted in late toxicity compared with SOC: after 1 year, G3-5 toxicity was reported for 28% SOC and 27% docetaxel (in patients still on follow-up at 1 year without prior progression).
Conclusions The clinically significant benefit in survival for upfront docetaxel persists at longer follow-up, with no evidence that benefit differed by metastatic burden. We advocate that upfront docetaxel is considered for metastatic hormone naïve prostate cancer patients regardless of metastatic burden.


Introduction
The primary analysis of STAMPEDE's 'docetaxel comparison', reporting an improvement in survival, was triggered by reaching a pre-specified number of control group deaths [1]. The trial team agreed to update this analysis when there was a meaningful increase in the number of primary outcome measure events after further follow-up, expected to occur approximately 3 years later. During that time, the Intermediate Clinical Endpoints in Cancer of the Prostate surrogacy work showed that measures based on metastatic progression could be used as a surrogate for survival in patients presenting with M0 disease, allowing trials in that setting to achieve increased power sooner [2]. Given that the prognosis for metastatic and non-metastatic patients is now very different and that other trials of first-line docetaxel have kept metastatic and non-metastatic patients separate, the STAMPEDE team agreed that the long-term follow-up results would be analysed separately for these two groups of patients.
Since that initial STAMPEDE report, first-line systemic combination treatment options given with ADT in metastatic hormone naïve prostate cancer (mHNPC) have expanded to include abiraterone, enzalutamide and apalutamide as well as docetaxel [1,3-8]. However, there is still controversy about patient stratification and selection for treatment. Metastatic burden sub-group analyses of the CHAARTED and GETUG-15 trials have led some to conclude that docetaxel should not be given as a first-line treatment of patients presenting with 'low metastatic burden' disease [3,9-11]. This represents approximately 40% of patients presenting with de novo mHNPC [12,13]. These were retrospective analyses of relatively small subgroups of these trials and a number of groups have not been persuaded by these exploratory retrospective analyses. Reflecting this uncertainty, major treatment guidelines offer conflicting advice about whether all metastatic patients should receive combination treatment or whether this should be restricted only to those with 'high-burden', as specified in the CHAARTED trial [11,14-16].
To address the hypothesis raised by CHAARTED, bone scans from the M1 docetaxel comparison cohort were collected retrospectively to determine the metastatic burden for STAMPEDE patients (independently of treatment assignment and outcome) and to undertake a stratified sub-group analysis. Outcome for the sub-groups categorised by individual metastatic burden was then determined using the extended patient follow-up now available in the STAMPEDE M1 cohort.

Study design
The multi-arm, multi-stage STAMPEDE trial enrols patients with advanced high-risk or metastatic prostate cancer. Between 05 October 2005 and 31 March 2013, men with newly diagnosed metastatic prostate cancer were randomised on a 2 : 1 basis either to a standard-of-care (SOC) control group ('Control') or SOC + docetaxel treatment group ('docetaxel') [1]. SOC in M1 patients comprised long-term androgen deprivation therapy (ADT), the intervention being lifelong ADT and six cycles of docetaxel at standard dose. Randomisation followed a minimisation algorithm with a random element of 20%, stratified for hospital site, age at randomisation (<70 years old versus ≥70 years old at randomisation), WHO performance status (a score of 0 versus 1/2), nodal involvement (negative, positive or unspecified), planned ADT and usage (yes/no) of aspirin or other non-steroidal anti-inflammatory drugs (NSAIDs). The randomisation algorithm was developed and maintained centrally at the MRC Clinical Trials Unit at UCL. The trial was carried out in accordance with Good Clinical Practice guidelines, with full regulatory and ethical approval.
The first efficacy results for docetaxel within STAMPEDE [1] used data from the initial randomisation in October 2005 to the data freeze with follow-up to March 2015. The date of that analysis was predetermined by accumulation of events in the control group. In this paper, we report a pre-planned long-term efficacy analysis of the M1 patient cohort, with updated results using extended follow-up data to July 2018.

Procedures
All procedures relating to administration and reporting of docetaxel as a research treatment have been reported previously [1]. In brief, patients were randomised to lifelong ADT with or without six cycles of docetaxel (75 mg/m²) given 3-weekly with prednisolone 5 mg twice daily during the 18-week period of therapy.
The follow-up schedule, defined in the trial protocol, involved follow-up visits every 6 weeks until 6 months after randomisation, then every 12 weeks to 2 years, every 6 months to 5 years and annually thereafter. Toxicity was reported routinely at follow-up visits. Adverse event classification and grades followed the National Cancer Institute Common Terminology Criteria for Adverse Events (CTCAE) version 4.0. Since the primary report for this comparison, data on baseline metastatic burden have been retrospectively collected where possible for UK patients, blinded to treatment outcomes. Metastatic burden was assessed using whole-body scintigraphy and computed tomography or magnetic resonance imaging staging scans, categorised following the definition used in the CHAARTED trial [3], where high-burden patients had either four or more bone metastases including one or more outside the vertebral body or pelvis, or any visceral metastases, or both. All other patients metastatic at baseline were categorised as having low metastatic burden according to this definition.
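The high/low burden rule described above can be expressed as a small predicate. The sketch below is illustrative only: the `BaselineStaging` record and its field names are hypothetical and are not taken from the trial's data model, and it assumes that lesion counts have already been read from the baseline staging scans.

```python
from dataclasses import dataclass


@dataclass
class BaselineStaging:
    """Hypothetical summary of one patient's baseline staging scans."""
    bone_metastases: int         # total number of bone metastases
    bone_outside_axial: int      # bone metastases outside vertebral body and pelvis
    visceral_metastases: bool    # any visceral metastasis present


def metastatic_burden(staging: BaselineStaging) -> str:
    """Classify a patient who is metastatic at baseline as 'high' or 'low' burden.

    High burden: four or more bone metastases with at least one outside the
    vertebral body or pelvis, or any visceral metastasis, or both.
    """
    high_bone = staging.bone_metastases >= 4 and staging.bone_outside_axial >= 1
    return "high" if (high_bone or staging.visceral_metastases) else "low"


# Example: five bone lesions, all confined to spine and pelvis, no visceral disease
example = BaselineStaging(bone_metastases=5, bone_outside_axial=0,
                          visceral_metastases=False)
print(metastatic_burden(example))
```

Note that the number of bone lesions alone does not decide the category: a patient with many lesions confined to the vertebral bodies and pelvis, and no visceral disease, is still low burden under this definition.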

Outcomes
Overall survival was specified as the primary efficacy outcome and defined as time from randomisation to death from any cause. Secondary outcomes for long-term efficacy data included: failure-free survival (FFS; time from randomisation to the first of any: biochemical, lymph node, distant metastatic progression or prostate cancer death); progression-free survival (PFS; time from randomisation to the first FFS event, not including biochemical progression); metastatic progression-free survival (mPFS; time from randomisation to either new metastases, progression of existing metastases or prostate cancer death) and prostate cancer-specific survival (PCSS; time from randomisation to prostate cancer death). Biochemical progression was assessed using prostate-specific antigen (PSA) measurements which were reported at each follow-up visit. The lowest PSA value within the first 24 weeks after enrolment was used to define a nadir, from which an increase of 50% and to a minimum level of 4 ng/ml indicated biochemical progression. For a small proportion of patients, PSA levels did not fall after enrolment and a nadir could not be estimated; these patients were considered to have biochemical progression at the date of randomisation. Patients without any event of interest reported by the time of the data freeze were censored in the analyses at the time they were last recorded in the trial as without an event.
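The biochemical progression rule can be sketched in a few lines. This is an illustration under stated assumptions, not the trial's implementation: the function name and input layout are hypothetical, "did not fall" is read as "no PSA value in the first 24 weeks is below the baseline value", progression is only looked for after the nadir visit, and a patient with no PSA value in the 24-week window is treated as not assessable.

```python
from typing import Optional, Sequence, Tuple

NADIR_WINDOW_WEEKS = 24   # nadir is the lowest PSA within this window
RISE_FACTOR = 1.5         # an increase of 50% over the nadir
MIN_PSA_NG_ML = 4.0       # and to a minimum level of 4 ng/ml


def biochemical_progression_week(
    baseline_psa: float,
    followup: Sequence[Tuple[float, float]],  # (weeks since randomisation, PSA ng/ml)
) -> Optional[float]:
    """Return the week of biochemical progression, 0.0 if PSA never fell,
    or None if no progression is observed (or the nadir is not assessable)."""
    visits = sorted(followup)
    window = [(w, p) for w, p in visits if 0 < w <= NADIR_WINDOW_WEEKS]
    if not window:
        return None  # assumption: no PSA in the window means not assessable
    nadir_week, nadir = min(window, key=lambda visit: visit[1])
    if nadir >= baseline_psa:
        return 0.0   # PSA did not fall: progression at the date of randomisation
    for week, psa in visits:
        if week > nadir_week and psa >= RISE_FACTOR * nadir and psa >= MIN_PSA_NG_ML:
            return week
    return None


# Example: PSA falls to a nadir of 1.0 at week 24, then rises
visits = [(6, 20.0), (12, 2.0), (24, 1.0), (36, 1.4), (48, 3.0), (60, 5.0)]
print(biochemical_progression_week(50.0, visits))
```

Both conditions must hold at the same visit: in the example, the week-48 value is more than 50% above the nadir but still below 4 ng/ml, so it does not count as progression.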
The cause of death was assigned as prostate cancer/not prostate cancer where possible, using a set of rules pre-specified by the Trial Management Group's End point Review Committee. Any deaths not meeting the prespecified rules were individually reviewed by a trial clinician to assign a cause of death independent of the allocated treatment.

Statistical analysis
Sample size calculations for trial design used the nstage function in Stata, based on a target HR of 0.75 for the primary efficacy analysis. Long-term outcome analysis was planned for approximately 3 years after the first analysis, when there was projected to be a 40% increase in control group deaths. For the long-term analysis by metastatic disease burden, there was 66% and 77% power to detect a hazard ratio of 0.75 in the low and high-burden sub-groups, respectively. This calculation used the observed accrual and previous event rate in the control group (without reference to any accumulating differences between arms/sub-groups).
Efficacy analyses, following intention-to-treat principles, included all patients allocated to a trial arm. Toxicity and adverse event data are presented with patients grouped according to whether or not docetaxel treatment was reported as having been started (29 patients randomised to docetaxel who did not receive the drug are reported as Controls for the purposes of comparing toxicity between the treatments). See Figure 1 for full details of patients included in the analyses.
All time-to-event analyses followed standard survival analysis methods using Stata (version 15.1). Median duration of follow-up period was estimated using the Kaplan-Meier method with reverse-censoring of any reported deaths. Hazard ratios were estimated from Cox proportional hazards regression models, adjusted for minimisation factors used in the randomisation algorithm (nodal stage, age at randomisation, WHO performance score, use of aspirin/NSAIDs, planned use of SOC radiotherapy) and stratified by time period, as defined according to the other arms open to accrual in STAMPEDE (i.e. changes through closure or opening of other trial arms on the platform) or SOC practice changes. Nonparametric stratified log-rank tests were used to test for differences between the control and docetaxel groups, with stratification the same as that of the Cox regression models. Flexible parametric models were used to generate 5-year survival estimates, fitted using (5,5) degrees of freedom with adjustment variables as specified above. PCSS was analysed using Fine and Gray regression methods for competing risks analysis [17]. For all outcomes, models were tested for evidence of non-proportional hazards, and where found, interpretation of the results emphasises restricted mean survival time (RMST), which was calculated using a t* of 120 months, estimated as described previously [18]. For all statistical tests, two-sided tests were used and 95% confidence intervals and P-values are reported. Sub-group analyses are presented for all outcomes for metastatic burden sub-groups (low- and high-burden sub-groups).
Although our emphasis is on metastatic burden, exploratory sub-group analyses are also presented for the primary outcome in order to give detail on the consistency of treatment effect across baseline factors of potential interest for this patient population: nodal status (N0, N+ or NX), Gleason sum score (≤7, 8-10 or unknown), patient age (<70 or ≥70) and WHO performance score (0 or 1-2).
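Two of the quantities named above, median follow-up by reverse censoring and RMST up to a horizon t*, can be illustrated with a plain Kaplan-Meier estimator. The sketch below is a simplified, nonparametric stand-in written in Python; it is not the Stata procedure used in the trial, and the RMST method of reference [18] may differ. All function names and the toy data are hypothetical.

```python
from typing import List, Optional, Sequence, Tuple

Curve = List[Tuple[float, float]]  # (time, survival probability just after time)


def kaplan_meier(times: Sequence[float], events: Sequence[bool]) -> Curve:
    """Kaplan-Meier estimate; events[i] is True for an event, False if censored."""
    curve: Curve = []
    surv = 1.0
    for t in sorted(set(times)):
        at_risk = sum(1 for x in times if x >= t)
        n_events = sum(1 for x, e in zip(times, events) if x == t and e)
        if n_events:
            surv *= 1 - n_events / at_risk
            curve.append((t, surv))
    return curve


def median_time(curve: Curve) -> Optional[float]:
    """First time at which the curve drops to 0.5 or below, if it ever does."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None


def median_follow_up(times: Sequence[float], died: Sequence[bool]) -> Optional[float]:
    """Reverse censoring: censoring becomes the 'event', deaths are censored."""
    return median_time(kaplan_meier(times, [not d for d in died]))


def rmst(curve: Curve, t_star: float) -> float:
    """Area under the step survival curve from 0 to t_star."""
    area, last_t, last_s = 0.0, 0.0, 1.0
    for t, s in curve:
        if t >= t_star:
            break
        area += last_s * (t - last_t)
        last_t, last_s = t, s
    return area + last_s * (t_star - last_t)


# Toy data (months); True marks a death, False a censored observation
times = [10, 20, 30, 40, 50]
died = [True, False, True, False, True]
km = kaplan_meier(times, died)
print(median_time(km), median_follow_up(times, died), rmst(km, 40))
```

A between-arm RMST difference is then `rmst(curve_docetaxel, 120) - rmst(curve_control, 120)`, which has a direct interpretation in months of life gained over the 120-month horizon and does not rely on the proportional hazards assumption.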

Results
One thousand and eighty-six metastatic patients (724 control/362 docetaxel) were recruited to STAMPEDE's 'docetaxel comparison' between 05 October 2005 and 31 March 2013. The dataset for this analysis was frozen on 13 July 2018. As reported previously, baseline patient characteristics were well-balanced across trial arms (Table 1). We additionally report sub-group analyses according to metastatic burden, which was assessed using bone scans available from 830/1086 (76%) of all recruited patients. The patients included in these sub-group analyses were well-balanced across arms (Table 2 and supplementary Table S1, available at Annals of Oncology online). In addition, a comparison of Tables 1 and 2 demonstrates that the subset of patients included in the metastatic burden sub-group analyses was representative of the metastatic patients in the comparison as a whole, with the exception of the year of randomisation: the patients included in the metastatic burden sub-group analyses were enrolled in the later years of recruitment to the comparison. Figure 1 shows a CONSORT diagram with full details of patient numbers included in the analyses.
The median duration of follow-up was 78.2 months (interquartile range (IQR): 62.9-96.3). There were 719 deaths reported; 494/724 (68%) control patients died compared with 225/362 (62%) docetaxel patients. Control group patients had a median survival of 43.1 months and an estimated 5-year survival of 37% (95% CI 34% to 41%), whereas patients receiving docetaxel had a median survival of 59.1 months and 5-year survival of 49% (95% CI 44% to 54%). There was good evidence of a benefit from docetaxel on survival (stratified log-rank test P = 0.003, HR = 0.81, 95% CI 0.69-0.95; Figure 2A). As there was evidence (P = 0.016) of non-proportional hazards in the treatment effect, the interpretation of these results focuses on the difference in RMST between arms. This method showed evidence of a benefit of docetaxel, with an estimated difference of 6.0 months (95% CI 0.7-11.4) in RMST (over 120 months) between groups.
There was no evidence of heterogeneity of treatment effect on survival over metastatic burden sub-groups (interaction P = 0.827). For the low-burden patients (n = 362, deaths = 166; Figure 2C), the median survival in control was 76.7 months, with 5-year survival 57% (95% CI 51% to 64%), compared with a median of 93.2 months and 5-year survival of 72% (95% CI 65% to 80%) for docetaxel. Results for high-burden patients (n = 468, deaths = 360) were similar in terms of docetaxel effect, although, as expected, the high-burden patients generally had shorter survival time than low-burden patients (Figure 2E). The median survival for control was 35.2 months, with a 5-year survival estimate of 24% (95% CI 20% to 29%), compared with 39.9 months and a 5-year survival of 34% (95% CI 27% to 42%) for docetaxel. The hazard ratios were consistent in the low-burden (HR = 0.76, 95% CI 0.54-1.07, P = 0.107) and high-burden (HR = 0.81, 95% CI 0.64-1.02, P = 0.064) sub-groups. The consistency of docetaxel treatment effect on survival across other baseline characteristics was examined as exploratory sub-group analyses, summarised in Figure 3. There is no good evidence that the docetaxel effect varies across any of the sub-groups included (nodal status, Gleason sum score, age or WHO performance score).
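As a rough plausibility check on the sub-group results, the two reported hazard ratios can be compared on the log scale using only the published point estimates and confidence intervals. The sketch below is a back-of-envelope approximation, not the interaction test from the adjusted, stratified Cox model, so it is not expected to reproduce the reported interaction P-value exactly. It assumes the confidence intervals are symmetric on the log scale and that the two sub-group estimates are independent; the function names are hypothetical.

```python
import math

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def log_hr_and_se(hr: float, ci_low: float, ci_high: float):
    """Recover log(HR) and its standard error from a 95% confidence interval."""
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * Z_95)
    return math.log(hr), se


def two_sided_p(z: float) -> float:
    """Two-sided P-value for a standard normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2))


def heterogeneity_test(subgroup_a, subgroup_b):
    """Approximate z-test for a difference between two independent log hazard ratios."""
    log_a, se_a = log_hr_and_se(*subgroup_a)
    log_b, se_b = log_hr_and_se(*subgroup_b)
    z = (log_a - log_b) / math.sqrt(se_a ** 2 + se_b ** 2)
    return z, two_sided_p(z)


# Reported overall-survival estimates: (HR, lower 95% limit, upper 95% limit)
low_burden = (0.76, 0.54, 1.07)
high_burden = (0.81, 0.64, 1.02)
z, p = heterogeneity_test(low_burden, high_burden)
print(f"z = {z:.2f}, approximate P = {p:.2f}")
```

The approximation gives a P-value well above 0.5, in line with the reported absence of heterogeneity; the difference from the published value reflects the covariate adjustment and stratification in the full model.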
Treatment adherence to docetaxel was reported previously [1]; all patients had completed docetaxel treatment before the first efficacy analysis. Twenty-nine metastatic patients allocated to Docetaxel never reported starting chemotherapy. They are included in Control for toxicity analysis (see Figure 1) which is summarised in Table 4 and supplementary Table S2 and Figure S1, available at Annals of Oncology online. A comparison of toxicity reported across groups in the first year of follow-up shows higher toxicity in docetaxel (42% docetaxel reported G3-5 toxicity versus 24% in Control). However, toxicity reports for subsequent follow-up, after the initial year, are balanced across groups (27% docetaxel reported G3-5 toxicity compared with 28% control), with no good evidence of increased toxicity in the docetaxel group after the first year of follow-up. Table 5 shows some evidence that control patients were more likely to report starting second-line (or subsequent) treatment following disease progression, with 80% reporting starting at least one further line of treatment compared with 68% for Docetaxel. The types of further therapy reported are broadly similar between the two groups.
A sensitivity analysis was undertaken without the M1 patients retrospectively found not to have met all of the strict protocol eligibility criteria, mostly concerning a review of baseline blood pressure measurements. Removing these 120 patients (11%) did not change the primary outcome measure results (HR = 0.81, 95% CI 0.68-0.96; P = 0.013).

Discussion
In this updated long-term analysis, the relative survival benefit for adding docetaxel to ADT in mHNPC confirmed our previous findings. A statistically and clinically significant improvement in survival and a delayed time to metastatic progression was demonstrated with the combination treatment compared with ADT alone. Importantly, this benefit is seen irrespective of metastatic burden, with no evidence of heterogeneity between the low and high metastatic burden sub-groups across any outcome measures. This reinforces the principle that ADT and docetaxel can be considered as an effective first-line treatment option for men with mHNPC regardless of metastatic burden.
Two other trials have also evaluated the combination of docetaxel with ADT over ADT alone in mHNPC. The first trial, GETUG-AFU-15, enrolled 385 patients and reported with median follow-up of 84 months. This showed no clear evidence of improvement in survival by adding docetaxel to ADT over ADT alone (HR = 0.88, 95% CI 0.68-1.14, P = 0.3) [19]. The second trial, CHAARTED, enrolled 790 patients and reported with median follow-up of 54 months, demonstrating clear evidence of an improvement in survival for adding docetaxel (HR = 0.72, 95% CI 0.59-0.89, P = 0.0018) [3,9]. STAMPEDE is the largest of the three trials. However, in contrast to the CHAARTED and GETUG-15 trials, we found no good evidence in this study of a difference in benefit between the high and low metastatic burden sub-groups for survival and all other outcome measures. Indeed, the point estimate of the benefit for 'low-burden' patients was of 'greater' magnitude than that for the high-burden group. Prior sub-group analysis conducted as part of the CHAARTED and GETUG-15 trials suggested a smaller overall survival benefit associated with the combination of docetaxel + ADT in patients with low metastatic burden compared with high metastatic burden [10]. Based on this, docetaxel was recommended by those authors as a first-line option only for high but not low-burden mHNPC [11]. That view was not universally accepted and the inherent limitations of the previous sub-group analyses can possibly explain the discordance with our new findings [14-16]. Nearly 25% of the patients enrolled in the other two trials presented with metastatic disease after previous radical treatment. That group of patients has a different natural history to those presenting with de-novo M1 disease [20,21]. Consequently, the low metastatic burden sub-groups in the CHAARTED and the GETUG-15 trials had fewer than 160 de-novo mHNPC patients each [10].
In our updated report, approximately 95% of patients had de-novo M1, with a total of 362 patients in the low metastatic burden sub-group. This larger sample size provides a stronger basis for estimating a treatment effect in this sub-group. Furthermore, there was no evidence of a difference in benefit associated with docetaxel with ADT over ADT alone when considering metastatic burden (interaction P = 0.827). Based on these findings, the combination of docetaxel with ADT should be a first-line treatment option for newly diagnosed mHNPC patients regardless of metastatic burden.
Our results have to be interpreted in light of more recently recruited trials. Since the last report, a number of other systemic treatments such as abiraterone, apalutamide and enzalutamide have also shown statistically and clinically significant survival benefits over SOC alone when used as first-line agents in mHNPC [4-7]. None of the trials using androgen receptor pathway targeting have shown any evidence of heterogeneity of effect by metastatic burden. Our previous analysis of patients randomised contemporaneously within STAMPEDE to a docetaxel group or an abiraterone group did not show any difference in overall survival. Whilst that was acknowledged as an underpowered, opportunistic comparison, it remains the only direct randomised comparison of docetaxel and abiraterone, and the results are in keeping with both agents being valid first-line options when combined with ADT [22].

Conclusion
This updated report, with long-term follow-up and metastatic burden sub-group analysis, reinforces the benefits of adding docetaxel to ADT in mHNPC. The combination treatment

[Table fragment: adverse event reporting in the safety population; column headings and earlier rows were not recovered]
(row label not recovered): 2 (<1%), 1 (<1%), 1 (<1%), 0 (0%)
No FU/SAE reported: 18 (n/a), 1 (n/a), 18 (n/a), 1 (n/a)
Not on FU after one year: n/a (n/a), n/a (n/a), 281 (n/a), 77 (n/a)
Total b: 753 (100%), 333 (100%), 753 (100%), 333 (100%)
a Timed from randomisation.
b Total numbers shown for safety population, where 29 patients allocated to the Docetaxel Group never started Docetaxel treatment and are therefore included in the standard-of-care (SOC) group for safety reporting.
Note that 'missing' data refers to patients who did not report AE data after this point (either died or withdrawn from the trial, or not reporting AEs after disease progression as specified in the trial protocol).