The Continual Reassessment Method (CRM) for dose-finding is introduced in the CRM vignette. This vignette reproduces the analysis in a real dose-finding trial that used the CRM by Levy et al. (2006). The chosen example is particularly attractive because the authors present sequential analyses of their dose decisions in their Table 1, allowing us the opportunity to reproduce their analysis.
The authors investigate five doses of semisynthetic homoharringtonine (ssHHT), seeking a dose associated with a dose-limiting toxicity (DLT) probability of approximately 33%. The specific doses and the investigators’ initial beliefs on the probabilities of DLT are:
library(dplyr)
tibble(
`Dose-level` = 1:5,
`Dose (mg/m2 per day)` = c(0.5, 1, 3, 5, 6),
`Prior Pr(DLT)` = c(0.05, 0.1, 0.15, 0.33, 0.5)
%>% knitr::kable() )
Dose-level | Dose (mg/m2 per day) | Prior Pr(DLT) |
---|---|---|
1 | 0.5 | 0.05 |
2 | 1.0 | 0.10 |
3 | 3.0 | 0.15 |
4 | 5.0 | 0.33 |
5 | 6.0 | 0.50 |
We are not told the exact details of the model form or the parameter prior(s) used. However, we are told that the BPCT software was used to calculate the recommended doses. Zohar et al. (2003) describe the software as using a one-parameter logistic model with fixed-value intercept taking possible values 1, 2, 3 or 4, and exponential or uniform priors on the gradient term with expected value 0.5, 1 or 2. For the purposes of this article, we assume that the investigators used a one-parameter logistic CRM with the intercept term fixed at \(a_0 = 1\) and an Exponential(1) = Gamma(1, 1) prior on \(\beta\). In the appendix, we describe an attempt to identify the exact parameterisation used by comparing a broad array of model fits.
library(trialr)
<- 0.33
target <- c(0.05, 0.1, 0.15, 0.33, 0.5)
skeleton
<- crm_prior_beliefs(skeleton, target, model = 'logistic_gamma',
fit0 a0 = 1, beta_shape = 1, beta_inverse_scale = 1)
fit0#> No patients have been treated.
#>
#> Dose Skeleton N Tox ProbTox MedianProbTox ProbMTD
#> 1 1 0.05 0 0 0.250 0.165 0.3920
#> 2 2 0.10 0 0 0.290 0.245 0.0602
#> 3 3 0.15 0 0 0.321 0.307 0.1047
#> 4 4 0.33 0 0 0.420 0.467 0.1862
#> 5 5 0.50 0 0 0.527 0.583 0.2567
#>
#> The model targets a toxicity level of 0.33.
#> The dose with estimated toxicity probability closest to target is 3.
#> The dose most likely to be the MTD is 1.
#> Model entropy: 1.43
Notice that both the prior mean (ProbTox
) and median
(MedianProbTox
) probabilities of toxicity are quite far
from the skeleton.
In an ordinary regression model, fixing the intercept will affect the gradient. The same is true here:
library(tidyr)
library(purrr)
library(ggplot2)
<- function(a0) {
get_prior_fit crm_prior_beliefs(skeleton, target,
model = 'logistic_gamma', a0 = a0,
beta_shape = 1, beta_inverse_scale = 1)
}
tibble(a0 = c(-1, 0, 1, 2, 3, 4)) %>%
mutate(Mod = map(a0, get_prior_fit)) %>%
mutate(
Dose = Mod %>% map("dose_indices"),
ProbTox = Mod %>% map("prob_tox"),
%>%
) select(-Mod) %>%
unnest(cols = c(Dose, ProbTox)) %>%
mutate(a0 = factor(a0)) %>%
ggplot(aes(x = Dose, y = ProbTox, group = a0, col = a0)) +
geom_line() +
ylim(0, 1) +
labs(title = 'Prior Prob(DLT) location is affected by the fixed intercept, a0')
Please note that the behaviour of unnest
changed in v1.0
of the tidyr
package. For older versions, the command
should be:
tibble(a0 = c(-1, 0, 1, 2, 3, 4)) %>%
mutate(Mod = map(a0, get_prior_fit)) %>%
mutate(
Dose = Mod %>% map("dose_indices"),
ProbTox = Mod %>% map("prob_tox"),
%>%
) select(-Mod) %>%
unnest(cols = c(Dose, ProbTox)) %>%
mutate(a0 = factor(a0)) %>%
ggplot(aes(x = Dose, y = ProbTox, group = a0, col = a0)) +
geom_line() +
ylim(0, 1) +
labs(title = 'Prior Prob(DLT) location is affected by the fixed intercept, a0')
The trial starts at the lowest dose, 0.5 mg/m2 per day. Three patients are treated and none experiences DLT.
<- stan_crm(outcome_str = '1NNN',
fit1 skeleton = skeleton, target = target, model = 'logistic_gamma',
a0 = 1, beta_shape = 1, beta_inverse_scale = 1,
seed = 123, control = list(adapt_delta = 0.99))
fit1#> Patient Dose Toxicity Weight
#> 1 1 1 0 1
#> 2 2 1 0 1
#> 3 3 1 0 1
#>
#> Dose Skeleton N Tox ProbTox MedianProbTox ProbMTD
#> 1 1 0.05 3 0 0.083 0.0214 0.0865
#> 2 2 0.10 0 0 0.117 0.0517 0.0398
#> 3 3 0.15 0 0 0.149 0.0876 0.1120
#> 4 4 0.33 0 0 0.266 0.2520 0.2940
#> 5 5 0.50 0 0 0.414 0.4446 0.4677
#>
#> The model targets a toxicity level of 0.33.
#> The dose with estimated toxicity probability closest to target is 4.
#> The dose most likely to be the MTD is 5.
#> Model entropy: 1.30
We see that the cohort has a great bearing on the predicted rates of
toxicity. The trialists report estimated DLT probabilities of 0.001,
0.003, 0.006, 0.035, 0.11. The values calculated by trialr
diverge somewhat from those values. It is difficult to know why this is
given such a small sample size.
Dose-level 5 is recommended for the next cohort. The trialists understandably resist the desire to skip three doses in going from the lowest to highest dose, electing instead to skip the second dose and treat the next cohort at dose-level 3.
This cohort of three was treated at 3 mg/m2 per day. One DLT was seen.
<- stan_crm(outcome_str = '1NNN 3NNT',
fit2 skeleton = skeleton, target = target, model = 'logistic_gamma',
a0 = 1, beta_shape = 1, beta_inverse_scale = 1,
seed = 123, control = list(adapt_delta = 0.99))
fit2#> Patient Dose Toxicity Weight
#> 1 1 1 0 1
#> 2 2 1 0 1
#> 3 3 1 0 1
#> 4 4 3 0 1
#> 5 5 3 0 1
#> 6 6 3 1 1
#>
#> Dose Skeleton N Tox ProbTox MedianProbTox ProbMTD
#> 1 1 0.05 3 0 0.147 0.101 0.1588
#> 2 2 0.10 0 0 0.203 0.171 0.0963
#> 3 3 0.15 3 1 0.250 0.230 0.2190
#> 4 4 0.33 0 0 0.397 0.407 0.3842
#> 5 5 0.50 0 0 0.533 0.548 0.1417
#>
#> The model targets a toxicity level of 0.33.
#> The dose with estimated toxicity probability closest to target is 4.
#> The dose most likely to be the MTD is 4.
#> Model entropy: 1.49
The trialists report Prob(DLT) = (0.07, 0.14, 0.19, 0.39, 0.55). The
median estimates from trialr
are now quite close to these
values. Despite the observation of a DLT and the observed DLT-rate at
dos-level 3 matching the target of 33%, the model advocates escalation
to dose-level 4. That is what the investigators did.
This cohort of three was treated at dose-level 4, which corresponds to 5 mg/m2 per day. Once again, one DLT was seen.
<- stan_crm(outcome_str = '1NNN 3NNT 4NNT',
fit3 skeleton = skeleton, target = target, model = 'logistic_gamma',
a0 = 1, beta_shape = 1, beta_inverse_scale = 1,
seed = 123, control = list(adapt_delta = 0.99))
fit3#> Patient Dose Toxicity Weight
#> 1 1 1 0 1
#> 2 2 1 0 1
#> 3 3 1 0 1
#> 4 4 3 0 1
#> 5 5 3 0 1
#> 6 6 3 1 1
#> 7 7 4 0 1
#> 8 8 4 0 1
#> 9 9 4 1 1
#>
#> Dose Skeleton N Tox ProbTox MedianProbTox ProbMTD
#> 1 1 0.05 3 0 0.129 0.0904 0.1027
#> 2 2 0.10 0 0 0.186 0.1569 0.0875
#> 3 3 0.15 3 1 0.235 0.2153 0.2370
#> 4 4 0.33 3 1 0.389 0.3935 0.4557
#> 5 5 0.50 0 0 0.531 0.5402 0.1170
#>
#> The model targets a toxicity level of 0.33.
#> The dose with estimated toxicity probability closest to target is 4.
#> The dose most likely to be the MTD is 4.
#> Model entropy: 1.40
The trialists report Prob(DLT) = (0.07, 0.13, 0.19, 0.38, 0.54) after this cohort, barely shifting from the previous estimates. This time the model advocates to remain at dose-level 4 and that is exactly what the trialists did.
This cohort of three was also treated at dose-level 4. No DLTs were seen in this cohort.
<- stan_crm(outcome_str = '1NNN 3NNT 4NNT 4NNN',
fit4 skeleton = skeleton, target = target, model = 'logistic_gamma',
a0 = 1, beta_shape = 1, beta_inverse_scale = 1,
seed = 123, control = list(adapt_delta = 0.99))
fit4#> Patient Dose Toxicity Weight
#> 1 1 1 0 1
#> 2 2 1 0 1
#> 3 3 1 0 1
#> 4 4 3 0 1
#> 5 5 3 0 1
#> 6 6 3 1 1
#> 7 7 4 0 1
#> 8 8 4 0 1
#> 9 9 4 1 1
#> 10 10 4 0 1
#> 11 11 4 0 1
#> 12 12 4 0 1
#>
#> Dose Skeleton N Tox ProbTox MedianProbTox ProbMTD
#> 1 1 0.05 3 0 0.0723 0.0428 0.0328
#> 2 2 0.10 0 0 0.1184 0.0886 0.0338
#> 3 3 0.15 3 1 0.1616 0.1361 0.1452
#> 4 4 0.33 6 1 0.3191 0.3145 0.5320
#> 5 5 0.50 0 0 0.4842 0.4896 0.2562
#>
#> The model targets a toxicity level of 0.33.
#> The dose with estimated toxicity probability closest to target is 4.
#> The dose most likely to be the MTD is 4.
#> Model entropy: 1.19
The trialists report Prob(DLT) = (0.03, 0.07, 0.11, 0.27, 0.45) after
this cohort. ProbMTD
shows the implied probability that
each dose is the maximum tolerable dose, that is, the dose with
Prob(DLT) closest to the toxicity target, 33%. The observation of no
DLTs in this cohort means it is now very unlikely that dose-levels 1 and
2 are the true MTD. The amount of entropy in the experiment has fallen
to reflect this. Once again, the model advocates to remaining at
dose-level 4.
This cohort was also treated at dose-level 4. One-out-of-three DLTs were seen.
<- stan_crm(outcome_str = '1NNN 3NNT 4NNT 4NNN 4NTN',
fit5 skeleton = skeleton, target = target, model = 'logistic_gamma',
a0 = 1, beta_shape = 1, beta_inverse_scale = 1,
seed = 123, control = list(adapt_delta = 0.99))
fit5#> Patient Dose Toxicity Weight
#> 1 1 1 0 1
#> 2 2 1 0 1
#> 3 3 1 0 1
#> 4 4 3 0 1
#> 5 5 3 0 1
#> 6 6 3 1 1
#> 7 7 4 0 1
#> 8 8 4 0 1
#> 9 9 4 1 1
#> 10 10 4 0 1
#> 11 11 4 0 1
#> 12 12 4 0 1
#> 13 13 4 0 1
#> 14 14 4 1 1
#> 15 15 4 0 1
#>
#> Dose Skeleton N Tox ProbTox MedianProbTox ProbMTD
#> 1 1 0.05 3 0 0.0726 0.0501 0.0255
#> 2 2 0.10 0 0 0.1212 0.1001 0.0260
#> 3 3 0.15 3 1 0.1667 0.1502 0.1507
#> 4 4 0.33 9 2 0.3288 0.3302 0.6085
#> 5 5 0.50 0 0 0.4927 0.5001 0.1893
#>
#> The model targets a toxicity level of 0.33.
#> The dose with estimated toxicity probability closest to target is 4.
#> The dose most likely to be the MTD is 4.
#> Model entropy: 1.09
The trialists report Prob(DLT) = (0.04, 0.08, 0.12, 0.29, 0.46).
This cohort was also treated at dose-level 4. Two-out-of-three DLTs were seen.
<- stan_crm(outcome_str = '1NNN 3NNT 4NNT 4NNN 4NTN 4TNT',
fit6 skeleton = skeleton, target = target, model = 'logistic_gamma',
a0 = 1, beta_shape = 1, beta_inverse_scale = 1,
seed = 123, control = list(adapt_delta = 0.99))
fit6#> Patient Dose Toxicity Weight
#> 1 1 1 0 1
#> 2 2 1 0 1
#> 3 3 1 0 1
#> 4 4 3 0 1
#> 5 5 3 0 1
#> 6 6 3 1 1
#> 7 7 4 0 1
#> 8 8 4 0 1
#> 9 9 4 1 1
#> 10 10 4 0 1
#> 11 11 4 0 1
#> 12 12 4 0 1
#> 13 13 4 0 1
#> 14 14 4 1 1
#> 15 15 4 0 1
#> 16 16 4 1 1
#> 17 17 4 0 1
#> 18 18 4 1 1
#>
#> Dose Skeleton N Tox ProbTox MedianProbTox ProbMTD
#> 1 1 0.05 3 0 0.104 0.0751 0.0490
#> 2 2 0.10 0 0 0.161 0.1364 0.0653
#> 3 3 0.15 3 1 0.211 0.1925 0.2313
#> 4 4 0.33 12 4 0.374 0.3728 0.5647
#> 5 5 0.50 0 0 0.524 0.5275 0.0897
#>
#> The model targets a toxicity level of 0.33.
#> The dose with estimated toxicity probability closest to target is 4.
#> The dose most likely to be the MTD is 4.
#> Model entropy: 1.20
The trialists report Prob(DLT) = (0.06, 0.12, 0.17, 0.36, 0.53). Prob(MTD) actually decreased and the scenario entropy increased after the evaluation of these three patients, because the observation of two-in-three DLTs was slightly surprising to the model. Nevertheless, the trialists stopped here and concluded that 5 mg/m2 per day was probably the MTD. They reported a 95% credibility interval for the DLT rate at this dose to be (15.8, 58.6%). We can verify this:
apply(as.data.frame(fit6, pars = 'prob_tox'), 2, quantile,
probs = c(0.025, 0.975))
#> prob_tox[1] prob_tox[2] prob_tox[3] prob_tox[4] prob_tox[5]
#> 2.5% 0.007936399 0.02356285 0.04562227 0.1788696 0.3828175
#> 97.5% 0.339954375 0.41377685 0.46175719 0.5694487 0.6406686
The 95% CI at dose-level 4 is very close to that reported.
They say “The estimated DLT probability associated with the dose level of 5mg/m2 would have been expected to change by less than 5% even if three further patients were included.” It is possible to show this with DTPs (see for example the dose pathways vignette).
The prevailing probability of toxicity given the patients evaluated thus far is:
<- fit6$prob_tox[fit6$recommended_dose]
prob_tox_mtd
prob_tox_mtd#> [1] 0.3743222
We calculate the future pathways for a single additional cohort of 3 patients, conditional on the outcomes observed hitherto:
<- crm_dtps(skeleton = skeleton, target = target,
paths model = 'logistic_gamma', cohort_sizes = c(3),
previous_outcomes = '1NNN 3NNT 4NNT 4NNN 4NTN 4TNT',
a0 = 1, beta_shape = 1, beta_inverse_scale = 1,
seed = 123, control = list(adapt_delta = 0.99), refresh = 0)
library(tibble)
<- as_tibble(paths) df
The putative future inferences on the probability of toxicity at dose-level 4 conditional on each of the four possible cohort outcomes are:
library(dplyr)
library(purrr)
library(tidyr)
%>%
df mutate(prob_tox = map(fit, 'prob_tox')) %>%
select(-fit, -parent_fit) %>%
unnest(cols = c(dose_index, prob_tox)) %>%
filter(dose_index == 4)
#> # A tibble: 5 × 7
#> .node .parent .depth outcomes next_dose dose_index prob_tox
#> <dbl> <dbl> <dbl> <chr> <dbl> <dbl> <dbl>
#> 1 1 NA 0 "" 4 4 0.374
#> 2 2 1 1 "NNN" 4 4 0.334
#> 3 3 1 1 "NNT" 4 4 0.370
#> 4 4 1 1 "NTT" 4 4 0.410
#> 5 5 1 1 "TTT" 3 4 0.443
We can put this data into a format conducive to reproducing the trialists’ claim.
%>%
df filter(.depth > 0) %>%
mutate(prob_tox = map(fit, 'prob_tox')) %>%
select(-fit, -parent_fit) %>%
unnest(cols = c(dose_index, prob_tox)) %>%
filter(dose_index == 4) %>%
select(outcomes, prob_tox) %>%
bind_cols(
lik = dbinom(x = 0:3, size = 3, prob = prob_tox_mtd)) -> future_scenario
future_scenario#> # A tibble: 4 × 3
#> outcomes prob_tox lik
#> <chr> <dbl> <dbl>
#> 1 NNN 0.334 0.245
#> 2 NNT 0.370 0.440
#> 3 NTT 0.410 0.263
#> 4 TTT 0.443 0.0524
Above we see the likelihood of the four possible outcomes, inferred
using binomial probabilities and the prevailing probability of toxicity
at dose-level 4. The outcome TTT
would increase the
probability by over 5% but that scenario is unlikely, given what we
know.
The expected change is indeed less than 5%:
%>%
future_scenario mutate(prob_tox_change = abs(prob_tox - prob_tox_mtd)) %>%
summarise(expected_change = sum(lik * prob_tox_change))
#> # A tibble: 1 × 1
#> expected_change
#> <dbl>
#> 1 0.0244
This concludes the main case study on Levy et al. (2006). The section below details an honest but ultimately unsuccessful attempt to infer the precise parameterisation used by the trialists.
At the end of the trial, the investigators reported the estimated
probabilities of DLT (0.06, 0.12, 0.17, 0.36, 0.53). We will fit models
using an exponential prior and each combination of
a_0 = 1, 2, 3, 4
and
beta_inverse_scale = 0.5, 1, 2
to the complete set of all
outcomes observed. We seek the fit that yields inference closest to that
of the investigators.
The investigators concluded:
<- tibble(
levy_reported Dose = 1:5,
ProbTox = c(0.06, 0.12, 0.17, 0.36, 0.53),
)
We also define a helper function to fit the models:
<- function(outcomes, a0, beta_inverse_scale) {
fit_levy_crm stan_crm(outcome_str = outcomes,
skeleton = skeleton, target = target,
model = 'logistic_gamma',
a0 = a0, beta_shape = 1,
beta_inverse_scale = beta_inverse_scale,
control = list(adapt_delta = 0.99),
seed = 123, refresh = 0)
}
This code block calculates the parameter combinations, fits the model to each, and extracts the posterior mean probability of DLT:
expand.grid(a0 = 1:4, beta_inverse_scale = c(0.5, 1, 2)) %>%
mutate(Series = rownames(.)) %>%
mutate(Mod = map2(a0, beta_inverse_scale, fit_levy_crm,
outcomes = '1NNN 3NNT 4NNT 4NNN 4NTN 4TNT')) %>%
mutate(
Dose = Mod %>% map("dose_indices"),
ProbTox = Mod %>% map("prob_tox"),
%>%
) select(-Mod) %>%
unnest(cols = c(Dose, ProbTox)) %>%
mutate(a0 = factor(a0),
beta_inverse_scale = factor(beta_inverse_scale)) -> all_fits
We then plot our inferences with the investigators’ inferences
superimposed in bright green. Plots are grouped by the value for
beta_inverse_scale
in columns, and values for
a0
are reflected by colour:
%>%
all_fits ggplot(aes(x = Dose, y = ProbTox)) +
geom_line(aes(group = Series, col = a0)) +
geom_line(data = levy_reported, col = 'green', size = 1.2) +
facet_wrap(~ beta_inverse_scale) +
ylim(0, 0.7) +
labs(title = "None of the exponential models quite matches the investigators' inferences")
We see that there is broad agreement at the higher doses but none of
the series quite matches the investigators’ inferences. There are many
possible explanations for the difference. The investigators might not
have used an exponential prior. They might have use parameters we have
not tested. They might have reported some statistic other than the
posterior mean. trialr
or their code or indeed both might
be wrong. Either way, there is enough agreement to agree on the probable
identity of the MTD.
There are many vignettes illustrating the CRM and other dose-finding
models in trialr
. Be sure to check them out.
trialr
and the escalation
packageescalation
is an R package that provides a grammar for specifying dose-finding
clinical trials. For instance, it is common for trialists to say
something like ‘I want to use this published design… but I want it to
stop once \(n\) patients have been
treated at the recommended dose’ or ‘…but I want to prevent dose
skipping’ or ‘…but I want to select dose using a more risk-averse metric
than merely closest-to-target’.
trialr
and escalation
work together to
achieve these goals. trialr
provides model-fitting
capabilities to escalation
, including the CRM methods
described here. escalation
then provides additional classes
to achieve all of the above custom behaviours, and more.
escalation
also provides methods for running simulations
and calculating dose-paths. Simulations are regularly used to appraise
the operating characteristics of adaptive clinical trial designs.
Dose-paths are a tool for analysing and visualising all possible future
trial behaviours. Both are provided for a wide array of dose-finding
designs, with or without custom behaviours like those identified above.
There are many examples in the escalation
vignettes at https://cran.r-project.org/package=escalation.
trialr
is available at https://github.com/brockk/trialr and https://CRAN.R-project.org/package=trialr