Effects of dose change on the success of clinical trials

This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).

Abstract

The search for disease-modifying therapies in Alzheimer’s disease (AD) has recently led to promising results but has also revealed design issues in the clinical trials themselves. Of particular importance are the statistical challenges that can arise when dosages change after an interim analysis, which is not uncommon in contemporary AD trials. Following the recent aducanumab trials, we sought to study the implications of dose changes for the statistical power of an AD trial. We conducted extensive simulations to calculate statistical power when the relationship between treatment effect size and time is linear or non-linear, and when the investigated drug does or does not have a delayed treatment effect. Statistical power depends on many design factors, including the dose change time, correlation, population homogeneity, and treatment effect time. We recommend that researchers conduct simulation studies at the interim analysis to justify any modification of the sample size and/or follow-up time while keeping the type I and II error rates controlled.

Keywords: Aducanumab trial, Alzheimer’s disease, Dose change, Statistical power

1. Introduction

Clinical trial designs for disease-modifying therapies (DMTs) for Alzheimer’s disease (AD) have become increasingly complex. Several clinical trial programs, including those involving gantenerumab [1], solanezumab (A4 study), and aducanumab (EMERGE and ENGAGE studies), have incorporated mid-trial dose adjustments [2], [3], [4]. The Dominantly Inherited Alzheimer Network-Trials Unit (DIAN-TU) reported their first trial for patients with dominantly inherited Alzheimer’s disease (DIAD) treated with gantenerumab or solanezumab (two anti-amyloid monoclonal antibodies) [1]. During that trial, the doses of both drugs were escalated. Interim analyses or new information have led to alterations in drug dosing during the study that were not part of the initial trial design [5], [6]. Given the time frame needed to show disease modification in AD clinical trials (usually a minimum of 18 months) and the expense associated with such large-scale and complicated studies, interim analyses [7] are often done for a number of reasons, including to (1) gauge the likelihood of success of the agent and (2) modify the trial if needed, including dose adjustment. Reasons for changing the dose within a given study vary, but the rates of adverse events and differences in rates of target engagement to biomarker effects are often cited as rationale [8]. Beyond the methodological changes themselves, particular attention must also be paid to the potential impact of dose changes on the originally formulated statistical plans. Failure to properly account for the effects of dose changes, as was seen in the EMERGE/ENGAGE trials [2], [3], [4], can have far-reaching effects on study outcomes.

The EMERGE and ENGAGE phase III trials in particular illustrate the difficulties of instituting dose changes during a clinical trial [4], [9], [10]. In the aducanumab trials, there were initially three groups: placebo, low dose, and high dose. Based on the information gathered during the study, the low dose group had their dose increased to the high dose. However, one of the statistical challenges raised by the aducanumab trials was that the trial duration for participants with the dose change remained the same, as did the timing of an interim futility analysis. While the trials were subsequently stopped for futility, the final analysis, which included additional follow-up, showed a positive result for the primary outcome measure in one of the two trials [11], [12].

As more potential AD DMTs go into clinical trials, and often earlier in the disease process, it is critical to develop statistical methods that are able to adapt to the data that is collected and analyzed during the study itself and prospectively integrate these methods at the outset. We propose a means to calculate how the sample size or duration of a trial could be modified in response to changes made in dosing during a clinical trial. The purpose of this paper is to examine the effects of dose changes on study design and power analyses and to describe methodology for modeling dose changes. Finally, we apply our approach to the experience of the EMERGE/ENGAGE studies [2], [3], [4] to demonstrate the importance of correctly modeling dose changes.

2. Methods

In this section we propose a method for modeling the effects of dose change on clinical trial designs. Suppose $T_s$ is the originally planned total follow-up time and $T_l$ is the time when a patient is switched from low dose to high dose (see Fig. 1). The time on high dose is $T_h = T_s - T_l$. Over the course of the trial, the accumulated total dose for a patient switched from low dose to high dose is less than that for a patient from the original high dose group because of the lower dose from $T_0$ (baseline) to $T_l$. The total dose difference between these two subpopulations depends on the time of dose change $T_l$. When $T_l$ is close to baseline, the difference is very small; when $T_l$ is close to $T_s$, a large difference is expected. When a dose change is scheduled to occur during an ongoing trial based on the results of an interim analysis, it is critical to understand how this dose modification would affect the statistical power.

Fig. 1. Dose change flow chart.

In practice, participants with dose changes are often included in the high dose group because their final dose at time $T_s$ is the high dose [1], [2], [3], [6], [13]. However, including these participants in analyses and treating them the same as those originally randomized to the high dose group, without accounting for their lower accumulated dose, may ultimately decrease the overall treatment effect for the high dose group. If enough patients are changed from a low to a high dose in a trial, the diluted treatment effect becomes more likely to lead to the failure of an investigational new drug. In addition, some drugs might have delayed treatment effects on improving symptoms, delaying onset, and slowing progression. In that case, the treatment effect at time $T_s$ for patients with dose change could be smaller than that for patients who are always on high dose, which could further mask the disease-modifying effects of treatment. Therefore, it is important to adjust the follow-up time for patients with dose change when the planned dose has been changed.

To account for mid-trial dosage changes, we propose extending the follow-up time from $T_s$ to $T_m$, with additional time $T_a = T_m - T_s$ on high dose, as illustrated in Fig. 1. As mentioned above, if participants who are switched are analyzed only at $T_s$, their cumulative treatment exposure is lower than that of participants on high dose for the duration of the trial, and the two groups are thus not comparable. The additional time $T_a$ would therefore be added for participants who have undergone a dose change to increase their dose exposure at time $T_m$ (blue in Fig. 2), where $T_m$ would account for the time on the lower dose and match the cumulative dose exposure to that of participants who were started on the higher dose at baseline (red in Fig. 2).
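The exposure-matching idea can be illustrated with a minimal sketch. It assumes, which the text above does not state, that each dose level delivers a constant amount of drug per month, so that cumulative exposure is dose rate times time. Under that assumption, equating the cumulative dose of a switched patient at $T_m$ with that of an always-high-dose patient at $T_s$ gives $T_a = T_l (1 - d_l / d_h)$, where $d_l$ and $d_h$ are the (hypothetical) low and high dose rates. The function name and the example dose values are illustrative only.

```python
def extra_follow_up(t_switch, dose_low, dose_high):
    """Additional follow-up time T_a that equalizes cumulative dose.

    Assumption (not from the source): constant dose per unit time at each
    level, so a switched patient has received
        dose_low * t_switch + dose_high * (T_m - t_switch)
    by time T_m, which is set equal to dose_high * T_s.
    Solving for T_a = T_m - T_s gives t_switch * (1 - dose_low / dose_high).
    """
    if not 0 <= dose_low < dose_high:
        raise ValueError("expected 0 <= dose_low < dose_high")
    if t_switch < 0:
        raise ValueError("switch time must be non-negative")
    return t_switch * (1 - dose_low / dose_high)


# Hypothetical example: switch at month 12, dose rates of 3 and 10 units/month.
t_a = extra_follow_up(t_switch=12, dose_low=3, dose_high=10)
print(f"additional follow-up T_a = {t_a:.1f} months")
```

Under this simple rule, $T_a$ is zero when the switch happens at baseline and grows in proportion to $T_l$, which matches the qualitative statement that later switches leave a larger exposure gap.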

Fig. 2. Treatment effect sizes for a study with or without dose change.

Multiple other factors contribute to the estimation of the treatment effect at $T_s$ for patients always on high dose and at $T_m$ for patients with dose change. Suppose the parameter of interest in designing a clinical trial is the change in the primary outcome from $T_0$ to $T_s$. The primary factor is the estimated difference between groups, $\Delta$. The correlation $r$ between the outcome at $T_0$ and that at $T_s$ is another important factor, since correlation affects the sample size calculation in a before–after study [14]. In addition, the variances of the outcome at baseline and at the end of the study are two further factors in the power analysis. The estimated values of these four parameters are traditionally used in the sample size calculation. Suppose $Y_0$ and $Y_s$ are the outcomes at $T_0$ and $T_s$. The sample size calculation based on the score change $\Delta = E(Y_s - Y_0)$ [15] is

$$N = \frac{2\,(Z_{1-\alpha/2} + Z_{1-\beta})^2\, s^2}{\Delta^2},$$

where $s^2 = \mathrm{var}(Y_0) + \mathrm{var}(Y_s) - 2r\sqrt{\mathrm{var}(Y_0) \times \mathrm{var}(Y_s)}$ is the variance of $Y_s - Y_0$, and $Z_b$ is the $b$th quantile of the standard normal distribution.
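As a minimal sketch, the sample size formula above can be evaluated directly with the Python standard library. The function names and default values for $\alpha$ and power are assumptions made for illustration; only the formula itself comes from the text.

```python
import math
from statistics import NormalDist


def change_score_variance(var0, var_s, r):
    """Variance of Y_s - Y_0 given the two variances and their correlation r."""
    return var0 + var_s - 2 * r * math.sqrt(var0 * var_s)


def sample_size_per_group(delta, var0, var_s, r, alpha=0.05, power=0.80):
    """N = 2 (Z_{1-alpha/2} + Z_{1-beta})^2 s^2 / Delta^2, with power = 1 - beta."""
    std_normal = NormalDist()
    z_sum = std_normal.inv_cdf(1 - alpha / 2) + std_normal.inv_cdf(power)
    s2 = change_score_variance(var0, var_s, r)
    return 2 * z_sum ** 2 * s2 / delta ** 2


# Illustrative inputs only: unit variances and a one-point difference.
for r in (0.0, 0.5):
    n = sample_size_per_group(delta=1.0, var0=1.0, var_s=1.0, r=r)
    print(f"r = {r:.1f}: N per group = {math.ceil(n)}")
```

Because $s^2$ decreases as $r$ increases, a higher correlation between baseline and follow-up reduces the required sample size for a fixed $\Delta$.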

Other factors could affect the statistical power, including the relationship between treatment effect and time (linear, or non-linear for a drug with a delayed treatment effect) and the ratio of patients who are always on high dose to those with dose change. Finally, the modified time $T_m$ for patients with dose change is another factor to consider. These factors are studied in the power analysis in the following section.

3. Numerical results

Suppose we conduct a hypothetical study with a dose change incorporated during the conduct of the study. In this study there are three arms: a low dose group, a high dose group, and a placebo group. Suppose it is a randomized balanced clinical trial with sample size of N = 300 per group.

3.1. Model for linear relationship

The estimated mean value of the primary outcome at baseline is assumed to be the same for the three groups, $E(Y_0) = 15$. For purposes of illustration, higher scores indicate worse performance (e.g., the Alzheimer’s Disease Assessment Scale–Cognitive subscale (ADAS-Cog)). Let the scheduled follow-up time be $T_s = 24$ months. After 24 months, suppose the placebo group declines by 6 points ($E(Y_{24}) = 21$), the low dose group by 4 points ($E(Y_{24}) = 19$), and the high dose group by 2 points ($E(Y_{24}) = 17$).

At baseline, patients are more likely to be similar across the groups. The standard deviation (SD) at baseline is assumed to be the same for the three groups: $\sigma_0 = 6$. In the placebo group, the SD is assumed to be constant over the trial. In the treatment groups, however, the variance of the primary outcome could increase as time goes on. The SD at $T_s$ is $\sigma_l = 7.2$ in the low dose group and $\sigma_h = 7.8$ in the high dose group. Six correlation coefficients between $Y_0$ and $Y_{T_s}$ are studied: $\rho = 0.05$, 0.10, 0.15, 0.20, 0.35, and 0.55.

In Fig. 3, we present the computed statistical power for a study with dose change as a function of the modified follow-up time under the assumption of a linear treatment effect within each group. The expected change from baseline at time $T_b \in [0, T_s]$ is $E(Y_b - Y_0) = E(Y_s - Y_0) / T_s \times T_b$. For patients with dose change, we assume that the change time is uniformly distributed from baseline to $T_s$. We compute power for a study with three different proportions of patients who are always on high dose: 60 (20%), 150 (50%), and 240 (80%). When 80% of patients are always on high dose, the remaining 20% have a dose change from low dose to high dose during the study. The far left value (“High” on the x-axis) is the statistical power for comparing the placebo group with the high dose group. In this configuration, all patients in the high dose group received the high dose from baseline.
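The linear model can be sketched in a few lines using the means from this subsection. The first function implements the linear interpolation stated above. The second function rests on an additional assumption that the text does not spell out: a switched patient is taken to progress at the low dose slope until the switch time and at the high dose slope afterwards. Both function names are hypothetical.

```python
def linear_mean(t, baseline, end_mean, t_s=24.0):
    """Expected outcome at month t when the change from baseline is linear in time."""
    return baseline + (end_mean - baseline) * t / t_s


def switched_mean_at_ts(t_switch, baseline=15.0, low_end=19.0, high_end=17.0, t_s=24.0):
    """Expected outcome at T_s for a patient switched at t_switch.

    Assumption (illustrative, not from the source): piecewise-linear change,
    low dose slope before the switch and high dose slope after it.
    """
    low_slope = (low_end - baseline) / t_s
    high_slope = (high_end - baseline) / t_s
    return baseline + low_slope * t_switch + high_slope * (t_s - t_switch)


print(linear_mean(12, baseline=15.0, end_mean=21.0))  # placebo at month 12
for t_switch in (0, 6, 12, 18, 24):
    print(t_switch, switched_mean_at_ts(t_switch))
```

Under this piecewise assumption, the expected outcome at $T_s$ for a switched patient lies between the high dose value (switch at baseline) and the low dose value (switch at $T_s$), which is why analyzing switched patients at $T_s$ as if they were always on high dose dilutes the observed effect.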

Fig. 3. Power of a study with dose change as a function of additional follow-up time under the assumption of a linear treatment effect, when the proportion of high dose patients is 20%, 50%, and 80%. Six correlation values are studied: $\rho = 0.05$, 0.10, 0.15, 0.20, 0.35, and 0.55. “High” on the far left is for the case that all patients are in the high dose group without any dose change.

When the extended follow-up time is 0, the study conducts the interim analysis as originally planned. As a result, patients with dose change are not exposed to the same total dose as those always on high dose. We expect that the observed treatment effect size could be smaller for patients with dose change than for the original high dose group. As the extended follow-up time increases in the dose change group, the statistical power increases when the proportion of patients always on high dose is not too large (e.g., 20% or 50%), and it is close to that of a study without dose change when an additional $T_a = 12$ months is added to the follow-up. Within each plot, we observe that the power of a study goes up when more patients are from the high dose group without dose change. Between plots, statistical power increases as the correlation goes up.
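For the reference case without dose change (“High”), the power can be approximated in closed form rather than by simulation. The sketch below uses a two-sided normal approximation for the difference in mean change scores between the placebo and high dose groups, with the means, SDs, and correlations from this subsection. It is an illustrative approximation with hypothetical function names, not a reproduction of the simulation behind Fig. 3.

```python
import math
from statistics import NormalDist


def change_variance(sd0, sd_s, rho):
    """Variance of the change score Y_s - Y_0."""
    return sd0 ** 2 + sd_s ** 2 - 2 * rho * sd0 * sd_s


def approx_power(delta, var_a, var_b, n, alpha=0.05):
    """Two-sided normal-approximation power for comparing two mean changes."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    z = abs(delta) / math.sqrt((var_a + var_b) / n)
    return std_normal.cdf(z - z_crit) + std_normal.cdf(-z - z_crit)


# Placebo changes by 6 points and high dose by 2 points, so delta = 4.
for rho in (0.05, 0.10, 0.15, 0.20, 0.35, 0.55):
    var_placebo = change_variance(6.0, 6.0, rho)
    var_high = change_variance(6.0, 7.8, rho)
    power = approx_power(4.0, var_placebo, var_high, n=300)
    print(f"rho = {rho:.2f}: approximate power = {power:.4f}")
```

The approximation reflects the same direction of effect noted above: a higher correlation lowers the change-score variance and therefore raises power.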

3.2. Model for delayed treatment effect

Many agents for neurodegenerative diseases may have delayed treatment effects, and it is important to account for such effects when modeling dose changes. For this model, we assume a non-linear relationship between outcome and time in the treatment groups (but not in the placebo group), with a baseline value of $E(Y_0) = 22$ in this subsection. In the placebo group, a linear change is assumed, from 22 at baseline to $E(Y_{24}) = 27$. In the two treatment groups, we assume the treatment effect follows an exponential distribution with rate parameter $\lambda = 0.4$, with the relationship between the outcome and time given by $f(t, \theta) = 22 + (1 - \exp(\lambda t / 10)) \times \theta$, where $t$ is time in months, $\theta$ is a scale parameter, and the term $1 - \exp(\lambda t / 10)$ is based on the cumulative distribution function of an exponential distribution. The scale parameter is determined by solving $f(24, \theta) = 26.32$ for the low dose group, giving $\theta = -2.68$, and $f(24, \theta) = 25.46$ for the high dose group, giving $\theta = -2.15$. The three curves are presented in Fig. 4.
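The scale parameters quoted above can be checked numerically. The sketch below reads the exponent as $+\lambda t / 10$, exactly as the formula is written; this reading is an assumption, but it is the one that reproduces the reported values of $\theta$. The function names are hypothetical.

```python
import math

LAMBDA = 0.4      # rate parameter from the text
BASELINE = 22.0   # E(Y_0) in this subsection


def outcome(t, theta, lam=LAMBDA, baseline=BASELINE):
    """f(t, theta) = baseline + (1 - exp(lam * t / 10)) * theta, t in months."""
    return baseline + (1 - math.exp(lam * t / 10)) * theta


def solve_theta(target, t=24.0, lam=LAMBDA, baseline=BASELINE):
    """Solve f(t, theta) = target for theta; f is linear in theta."""
    return (target - baseline) / (1 - math.exp(lam * t / 10))


theta_low = solve_theta(26.32)
theta_high = solve_theta(25.46)
print(f"low dose theta  = {theta_low:.2f}")
print(f"high dose theta = {theta_high:.2f}")

# Compare treated and placebo means at a few time points.
for month in (6, 12, 18, 24):
    placebo = BASELINE + 5.0 * month / 24
    print(month, round(placebo, 2), round(outcome(month, theta_low), 2),
          round(outcome(month, theta_high), 2))
```

Because $f$ is linear in $\theta$, no root finder is needed; each scale parameter follows from a single division.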

Fig. 4. Non-linear curve.

In Fig. 5, we present the statistical power under the assumption of a non-linear treatment effect [1], given that the number of patients who are always on high dose is 60 (20%), 150 (50%), or 240 (80%). The six correlation values from the linear treatment effect configurations are also studied here. The figure shows that power increases as the modified follow-up time goes up, and that a study with a higher correlation often has higher statistical power. When the extended time is long enough (e.g., 12 months), a study with a low correlation could have slightly higher power than studies with higher correlations.

Fig. 5. Power of a study with dose change as a function of additional follow-up time under the assumption of a non-linear treatment effect, when the proportion of high dose patients is 20%, 50%, and 80%. Six correlation values are studied: $\rho = 0.05$, 0.10, 0.15, 0.20, 0.35, and 0.55. “High” on the far left is for the case that all patients are in the high dose group without any dose change.

3.3. Studies with different variances

Neurodegenerative diseases are multifactorial diseases that may be affected by unaccounted-for factors such as genetics, medical comorbidities, and pharmacokinetics/dynamics, likely resulting in differential response rates to treatment. In Fig. 6, we set the estimated mean value of the primary outcome at baseline for the three groups to $E(Y_0) = 22$. After 24 months, suppose the placebo group declines by 6 points ($E(Y_{24}) = 28$), the low dose group by 5 points ($E(Y_{24}) = 27$), and the high dose group by 4 points ($E(Y_{24}) = 26$).

Fig. 6. Power of a study with dose change as a function of additional follow-up time under the assumption of a linear treatment effect when $\rho = 0.1$ (first row) and 0.2 (second row), for $(\sigma_0, \sigma_l, \sigma_h) = (6.0, 6.0, 6.6)$ for small variances, (6.0, 6.6, 7.2) for medium variances, and (6.0, 7.8, 8.4) for large variances. “High” on the far left is for the case that all patients are in the high dose group without any dose change.

In Fig. 6, we present the statistical power as a function of the extended follow-up time for studies with different variances: $(\sigma_0, \sigma_l, \sigma_h) = (6.0, 6.0, 6.6)$ for small variances, (6.0, 6.6, 7.2) for medium variances, and (6.0, 7.8, 8.4) for large variances. As expected, studies with larger variances in both the low and high dose groups have smaller statistical power. As the extended time increases, the estimated statistical power increases. For a given proportion of participants on high dose, a larger correlation is often associated with higher power. When more participants are on the high dose (e.g., 50% in the right plots), additional follow-up time beyond 3 months only slightly increases the statistical power. A study with a larger proportion of participants on high dose has much higher statistical power when $T_m$ is short, and this advantage disappears as the extended follow-up time becomes longer.

3.4. Example

If we apply the methodology described in the previous sections, we can begin to get a clearer picture of how the EMERGE and ENGAGE studies [2], [3], [4] failed their futility analysis. The trials were designed with three arms: a high dose treatment arm, a low dose treatment arm, and a placebo group. The primary outcome of the studies was the change in Clinical Dementia Rating Scale-Sum of Boxes (CDR-SOB) from baseline to week 78. The CDR-SOB is a composite index summed from six cognitive and functional domains: memory, orientation, judgment and problem-solving, community affairs, home and hobbies, and personal care. The range of the CDR-SOB is 0–18, with higher scores indicating greater impairment.

For these studies, the expected increase in the CDR-SOB score was 2 in the placebo group. The assumed baseline value was 2.45 in all groups. Based on data from earlier Phase Ib studies, an increase of 1.74 was expected in the low dose treatment arm and 1.5 in the high dose treatment arm. The improvement of 0.5 in the high dose group relative to placebo represents a 25% reduction in the primary outcome (e.g., clinical worsening or slowed rate of decline).

In our model we apply several assumptions: variances in CDR-SOB at baseline of 1.50 in all groups, and variances at 78 weeks of 1.50 in the placebo group, 1.55 in the low dose group, and 1.60 in the high dose group. The correlation between CDR-SOB at baseline and at week 78 is assumed to be the same across the groups, and we study five correlation coefficients: 0.05, 0.10, 0.20, 0.30, and 0.40. Suppose that in the high dose group only 60% of participants (240 out of 400 for a study with 400 per arm) were actually exposed to the high dose from baseline to week 78.

Fig. 7 shows the statistical power calculated with a sample size of 400 per group, with or without dose change. The computed statistical power increases as the extended follow-up time goes up. For this particular example, power is similar across configurations when the extended follow-up time is 3 months or more. Power for a study with an extended time of 9 months is slightly below that of a study with participants only on the high dose. When the correlation is 0.10 or above, a study with 3 months of extended time has power above 90%. For a study with a low correlation (e.g., 0.05), the estimated power for a study with a dose change could be below 90% even with extended follow-up.
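A small Monte Carlo sketch illustrates how treating switched participants as high dose participants can dilute power in this example. It uses the means and variances assumed above and draws baseline and week-78 scores from a bivariate normal model. Several elements are assumptions made only for illustration: the mean change assigned to switched participants (`switched_change`, which the source does not give), the use of the high dose variance for switched participants, the z-test on change scores, and all function names. It is not a reproduction of the simulation behind Fig. 7.

```python
import math
import random
from statistics import NormalDist


def simulate_changes(rng, n, mean_change, var0, var_s, r, mean0=2.45):
    """Draw n change scores Y_78 - Y_0 from a bivariate normal model."""
    sd0, sd_s = math.sqrt(var0), math.sqrt(var_s)
    changes = []
    for _ in range(n):
        z0, z1 = rng.gauss(0, 1), rng.gauss(0, 1)
        y0 = mean0 + sd0 * z0
        # Week-78 score correlated with baseline at level r.
        ys = mean0 + mean_change + sd_s * (r * z0 + math.sqrt(1 - r * r) * z1)
        changes.append(ys - y0)
    return changes


def mean_and_variance(values):
    n = len(values)
    m = sum(values) / n
    return m, sum((x - m) ** 2 for x in values) / (n - 1)


def simulated_power(r, switched_change=None, frac_always_high=0.6,
                    n=400, n_reps=400, alpha=0.05, seed=1):
    """Share of simulated trials in which placebo vs. high dose is significant.

    switched_change is a hypothetical mean change for switched participants;
    None means every participant is on high dose from baseline.
    """
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    n_high = n if switched_change is None else round(frac_always_high * n)
    rejections = 0
    for _ in range(n_reps):
        placebo = simulate_changes(rng, n, 2.0, 1.5, 1.5, r)
        treated = simulate_changes(rng, n_high, 1.5, 1.5, 1.6, r)
        if n_high < n:
            treated += simulate_changes(rng, n - n_high, switched_change, 1.5, 1.6, r)
        m_p, v_p = mean_and_variance(placebo)
        m_t, v_t = mean_and_variance(treated)
        z = (m_p - m_t) / math.sqrt(v_p / n + v_t / n)
        rejections += abs(z) > z_crit
    return rejections / n_reps


print("all on high dose:", simulated_power(r=0.05))
# Pessimistic illustration: switched participants change like the low dose arm.
print("40% switched    :", simulated_power(r=0.05, switched_change=1.74))
```

Varying `switched_change` between the low dose and high dose values gives a rough sense of how much of the lost power additional exposure on high dose would need to recover.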

Fig. 7. Example from the aducanumab trial using the CDR-SOB score. “High” on the far left is for the case that all patients are in the high dose group without any dose change.