| Version: | 0.2.6 |
| Title: | Sample Size and Simulation for Negative Binomial Outcomes |
| Description: | Provides tools for planning and simulating recurrent event trials with overdispersed count endpoints analyzed using negative binomial (or Poisson) rate models. Implements sample size and power calculations for fixed designs with variable accrual, dropout, maximum follow-up, and event gaps, including methods of Zhu and Lakkis (2014) <doi:10.1002/sim.5947> and Friede and Schmidli (2010) <doi:10.3414/ME09-02-0060>. Supports group sequential designs by adding calendar-time analysis schedules compatible with the 'gsDesign' package and by estimating blinded information at interim looks. Includes simulation utilities for recurrent events (including seasonal rates), interim data truncation, and Wald-based inference for treatment rate ratios. |
| License: | GPL (≥ 3) |
| URL: | https://keaven.github.io/gsDesignNB/, https://github.com/keaven/gsDesignNB |
| BugReports: | https://github.com/keaven/gsDesignNB/issues |
| Encoding: | UTF-8 |
| Depends: | R (≥ 3.5.0) |
| Imports: | data.table, gsDesign, simtrial, stats, MASS |
| Suggests: | testthat (≥ 3.0.0), knitr, rmarkdown, ggplot2, dplyr, gt, scales |
| VignetteBuilder: | knitr |
| Config/testthat/edition: | 3 |
| RoxygenNote: | 7.3.3 |
| NeedsCompilation: | no |
| Packaged: | 2026-02-12 01:18:28 UTC; Anderkea |
| Author: | Keaven Anderson [aut, cre],
Nan Xiao [ctb],
Hongtao Zhang [ctb],
Merck & Co., Inc., Rahway, NJ, USA and its affiliates
|
| Maintainer: | Keaven Anderson <keaven_anderson@merck.com> |
| Repository: | CRAN |
| Date/Publication: | 2026-02-16 17:40:09 UTC |
Blinded sample size re-estimation for recurrent events
Description
Estimates the blinded dispersion and event rate from aggregated interim data and calculates the required sample size to maintain power, assuming the planned treatment effect holds. This function supports constant rates (Friede & Schmidli 2010) and accommodates future extensions for time-varying rates (Schneider et al. 2013) by using the exposure-adjusted rate.
Usage
blinded_ssr(
data,
ratio = 1,
lambda1_planning,
lambda2_planning,
rr0 = 1,
power = 0.8,
alpha = 0.025,
method = "friede",
accrual_rate,
accrual_duration,
trial_duration,
dropout_rate = 0,
max_followup = NULL,
event_gap = NULL
)
Arguments
data |
A data frame containing the blinded interim data. Must include
columns |
ratio |
Planned allocation ratio (experimental / control). Default is 1. |
lambda1_planning |
Planned event rate for the control group used in original calculation. |
lambda2_planning |
Planned event rate for the experimental group used in original calculation. |
rr0 |
Rate ratio under the null hypothesis (lambda2/lambda1). Default is 1. |
power |
Target power (1 - beta). Default is 0.8. |
alpha |
One-sided significance level. Default is 0.025. |
method |
Method for sample size recalculation. Currently "friede" (Friede & Schmidli 2010) is implemented, which uses the blinded nuisance parameter estimates. |
accrual_rate |
Vector of accrual rates (patients per unit time). |
accrual_duration |
Vector of durations for each accrual rate. Must be same length
as |
trial_duration |
Total planned duration of the trial. |
dropout_rate |
Dropout rate (hazard rate). Default is 0. |
max_followup |
Maximum follow-up time for any patient. Default is NULL (infinite). |
event_gap |
Gap duration after each event during which no new events are counted. Default is NULL (no gap). |
Value
A list containing:
- n_total_blinded
Re-estimated total sample size using blinded estimates.
- dispersion_blinded
Estimated dispersion parameter (k) from blinded data.
- lambda_blinded
Estimated overall event rate from blinded data.
- info_fraction
Estimated information fraction at interim (blinded information / target information).
- blinded_info
Estimated statistical information from the blinded interim data.
- target_info
Target statistical information required for the planned power.
References
Friede, T., & Schmidli, H. (2010). Blinded sample size reestimation with count data: methods and applications in multiple sclerosis. Statistics in Medicine, 29(10), 1145–1156. doi:10.1002/sim.3861
Schneider, S., Schmidli, H., & Friede, T. (2013). Blinded sample size re-estimation for recurrent event data with time trends. Statistics in Medicine, 32(30), 5448–5457. doi:10.1002/sim.5977
Examples
interim <- data.frame(events = c(1, 2, 1, 3), tte = c(0.8, 1.0, 1.2, 0.9))
blinded_ssr(
interim,
ratio = 1,
lambda1_planning = 0.5,
lambda2_planning = 0.3,
power = 0.8,
alpha = 0.025,
accrual_rate = 10,
accrual_duration = 12,
trial_duration = 18
)
Calculate blinded statistical information
Description
Estimates the blinded dispersion and event rate from aggregated interim data and calculates the observed statistical information for the log rate ratio, assuming the planned allocation ratio and treatment effect.
Usage
calculate_blinded_info(
data,
ratio = 1,
lambda1_planning,
lambda2_planning,
event_gap = NULL
)
Arguments
data |
A data frame containing the blinded interim data. Must include
columns |
ratio |
Planned allocation ratio (experimental / control). Default is 1. |
lambda1_planning |
Planned event rate for the control group. |
lambda2_planning |
Planned event rate for the experimental group. |
event_gap |
Optional. Gap duration (numeric) to adjust planning rates if provided. If provided, planning rates are adjusted as lambda / (1 + lambda * gap). |
Value
A list containing:
- blinded_info
Estimated statistical information.
- dispersion_blinded
Estimated dispersion parameter (k).
- lambda_blinded
Estimated overall event rate.
- lambda1_adjusted
Re-estimated control rate.
- lambda2_adjusted
Re-estimated experimental rate.
Examples
interim <- data.frame(events = c(1, 2, 1, 3), tte = c(0.8, 1.0, 1.2, 0.9))
calculate_blinded_info(
interim,
ratio = 1,
lambda1_planning = 0.5,
lambda2_planning = 0.3
)
Check group sequential bounds
Description
Updates the group sequential design boundaries based on observed information and checks if boundaries have been crossed.
Usage
check_gs_bound(sim_results, design, info_scale = c("blinded", "unblinded"))
Arguments
sim_results |
Data frame of simulation results (from |
design |
The planning |
info_scale |
Character. "blinded" (default) or "unblinded" information to use for bounds. |
Value
A data frame with added columns:
- cross_upper
Logical, true if upper bound crossed (efficacy)
- cross_lower
Logical, true if lower bound crossed (futility)
Examples
design <- gsDesign::gsDesign(k = 2, n.fix = 100, test.type = 2, timing = c(0.5, 1))
sim_df <- data.frame(
sim = c(1, 1, 2, 2),
analysis = c(1, 2, 1, 2),
z_stat = c(2.5, NA, -0.2, 2.2),
blinded_info = c(50, 100, 50, 100),
unblinded_info = c(50, 100, 50, 100)
)
check_gs_bound(sim_df, design)
Compute statistical information at analysis time
Description
Computes the statistical information for the log rate ratio at a given analysis time, accounting for staggered enrollment and varying exposure times.
Usage
compute_info_at_time(
analysis_time,
accrual_rate,
accrual_duration,
lambda1,
lambda2,
dispersion,
ratio = 1,
dropout_rate = 0,
event_gap = 0,
max_followup = Inf
)
Arguments
analysis_time |
The calendar time of the analysis. |
accrual_rate |
The enrollment rate (subjects per time unit). |
accrual_duration |
The duration of the enrollment period. |
lambda1 |
Event rate for group 1 (control). |
lambda2 |
Event rate for group 2 (treatment). |
dispersion |
The negative binomial dispersion parameter. |
ratio |
Allocation ratio (n2/n1). Default is 1. |
dropout_rate |
Dropout rate (hazard rate). Default is 0. |
event_gap |
Gap duration after each event during which no new events are counted. Default is 0. |
max_followup |
Maximum follow-up time per subject. Exposure time is
truncated at this value. Default is |
Value
The statistical information (inverse of variance) at the analysis time.
Examples
compute_info_at_time(
analysis_time = 12,
accrual_rate = 10,
accrual_duration = 10,
lambda1 = 0.5,
lambda2 = 0.3,
dispersion = 0.1
)
Cut data for completers analysis
Description
Subsets the data to all subjects randomized by the specified date,
and prepares the data for analysis. This is a wrapper for cut_data_by_date()
typically used with a date determined by cut_date_for_completers().
Usage
cut_completers(data, cut_date, event_gap = 0)
Arguments
data |
Data generated by |
cut_date |
Calendar time (relative to trial start) at which to cut the data. |
event_gap |
Gap duration after each event during which no new events are counted.
Can be a numeric value (default |
Value
A data frame with one row per subject randomized prior to cut_date.
Contains the truncated follow-up time (tte) and total number of observed events (events).
Examples
enroll_rate <- data.frame(rate = 20 / (5 / 12), duration = 5 / 12)
fail_rate <- data.frame(treatment = c("Control", "Experimental"), rate = c(0.5, 0.3))
dropout_rate <- data.frame(
treatment = c("Control", "Experimental"),
rate = c(0.1, 0.05), duration = c(100, 100)
)
sim <- nb_sim(enroll_rate, fail_rate, dropout_rate, max_followup = 2, n = 20)
# Find date when 5 subjects have completed
date_5 <- cut_date_for_completers(sim, 5)
# Get analysis dataset for this cut date (includes partial follow-up)
cut_completers(sim, date_5)
Cut simulated trial data at a calendar date
Description
Censors follow-up at a specified calendar time and aggregates events per subject. Returns one row per subject randomized before the cut date, with the total number of observed events and follow-up times.
Usage
cut_data_by_date(data, cut_date, event_gap = 0, ...)
## Default S3 method:
cut_data_by_date(data, cut_date, event_gap = 0, ...)
## S3 method for class 'nb_sim_data'
cut_data_by_date(data, cut_date, event_gap = 0, ...)
## S3 method for class 'nb_sim_seasonal'
cut_data_by_date(data, cut_date, event_gap = 0, ...)
Arguments
data |
Data generated by |
cut_date |
Calendar time (relative to trial start) at which to censor follow-up. |
event_gap |
Gap duration after each event during which no new events are counted.
Can be a numeric value (default |
... |
Additional arguments passed to methods. |
Value
A data frame with one row per subject randomized prior to cut_date containing:
- id
Subject identifier
- treatment
Treatment group
- enroll_time
Time of enrollment relative to trial start
- tte
Time at risk (total follow-up minus event gap periods)
- tte_total
Total follow-up time (calendar time, not adjusted for gaps)
- events
Number of observed events
A data frame with one row per subject randomized prior to cut_date.
This method stops with an error for unsupported classes.
A data frame with one row per subject randomized prior to cut_date.
Includes total events and follow-up time within the cut window.
A data frame with one row per subject randomized prior to cut_date.
Includes season and follow-up time within the cut window.
Methods (by class)
-
cut_data_by_date(default): Default method. -
cut_data_by_date(nb_sim_data): Method fornb_simdata. -
cut_data_by_date(nb_sim_seasonal): Method fornb_sim_seasonaldata.
Examples
enroll_rate <- data.frame(rate = 20 / (5 / 12), duration = 5 / 12)
fail_rate <- data.frame(treatment = c("Control", "Experimental"), rate = c(0.5, 0.3))
dropout_rate <- data.frame(
treatment = c("Control", "Experimental"),
rate = c(0.1, 0.05), duration = c(100, 100)
)
sim <- nb_sim(enroll_rate, fail_rate, dropout_rate, max_followup = 2, n = 20)
cut_data_by_date(sim, cut_date = 1)
Find calendar date for target completer count
Description
Finds the calendar time (since start of randomization) at which a specified number of subjects have completed their follow-up.
Usage
cut_date_for_completers(data, target_completers)
Arguments
data |
A data frame of simulated data, typically from |
target_completers |
Integer. The target number of completers. |
Value
Numeric. The calendar date when target_completers is achieved.
If the dataset contains fewer than target_completers completers, returns the maximum
calendar time in the dataset and prints a message.
Examples
enroll_rate <- data.frame(rate = 20 / (5 / 12), duration = 5 / 12)
fail_rate <- data.frame(treatment = c("Control", "Experimental"), rate = c(0.5, 0.3))
dropout_rate <- data.frame(
treatment = c("Control", "Experimental"),
rate = c(0.1, 0.05), duration = c(100, 100)
)
sim <- nb_sim(enroll_rate, fail_rate, dropout_rate, max_followup = 2, n = 20)
cut_date_for_completers(sim, target_completers = 5)
Method of Moments Estimation for Negative Binomial Parameters
Description
Estimates the event rate(s) and common dispersion parameter (k) for negative binomial count data using the method of moments. This is a robust alternative to Maximum Likelihood Estimation (MLE), especially when MLE fails to converge or produces boundary estimates.
Usage
estimate_nb_mom(data, group = NULL)
Arguments
data |
A data frame containing the data. Must include columns
|
group |
Optional character string specifying the grouping column name (e.g., "treatment"). If provided, rates are estimated separately for each group, while a common dispersion parameter is estimated across groups. If NULL (default), a single rate and dispersion are estimated (blinded case). |
Details
The method of moments estimator for the dispersion parameter k is derived
by equating the theoretical variance to the observed second central moment,
accounting for varying exposure times.
For a given group with rate \lambda, the expected count for subject i
is \mu_i = \lambda t_i. The variance is V_i = \mu_i + k \mu_i^2.
The estimator is calculated as:
\hat{k} = \max\left(0, \frac{\sum (y_i - \hat{\mu}_i)^2 - \sum y_i}{\sum \hat{\mu}_i^2}\right)
where y_i is the number of events, t_i is the exposure time,
and \hat{\mu}_i = \hat{\lambda} t_i is the estimated expected count.
When multiple groups are present, the numerator and denominator are summed
across all groups to estimate a common k.
Value
A list containing:
lambda |
Estimated event rate(s). A single numeric value if |
dispersion |
Estimated common dispersion parameter (k). |
Examples
# Blinded estimation (single group)
df <- data.frame(events = c(1, 2, 0, 3), tte = c(1, 1.2, 0.5, 1.5))
estimate_nb_mom(df)
# Unblinded estimation (two groups)
df_group <- df
df_group$group <- c("A", "A", "B", "B")
estimate_nb_mom(df_group, group = "group")
Find calendar date for target event count
Description
Finds the calendar time (since start of randomization) at which a specified total number of events is reached in the simulated dataset.
Usage
get_analysis_date(data, planned_events, event_gap = 5/365.25)
Arguments
data |
A data frame of simulated data, typically from |
planned_events |
Integer. The target number of events. |
event_gap |
Gap duration after each event during which no new events are counted.
Can be a numeric value (default |
Value
Numeric. The calendar date when planned_events is achieved.
If the dataset contains fewer than planned_events, returns the maximum
calendar time in the dataset and prints a message.
Examples
enroll_rate <- data.frame(rate = 20 / (5 / 12), duration = 5 / 12)
fail_rate <- data.frame(treatment = c("Control", "Experimental"), rate = c(0.5, 0.3))
dropout_rate <- data.frame(
treatment = c("Control", "Experimental"),
rate = c(0.1, 0.05), duration = c(100, 100)
)
sim <- nb_sim(enroll_rate, fail_rate, dropout_rate, max_followup = 2, n = 40)
get_analysis_date(sim, planned_events = 15)
Determine analysis date based on criteria
Description
Finds the earliest calendar date at which all specified criteria are met. Criteria can include a specific calendar date, a target number of events, a target number of completers, or a target amount of blinded information.
Usage
get_cut_date(
data,
planned_calendar = NULL,
target_events = NULL,
target_completers = NULL,
target_info = NULL,
event_gap = 0,
ratio = 1,
lambda1 = NULL,
lambda2 = NULL,
min_date = 0,
max_date = Inf
)
Arguments
data |
A data frame of simulated data (from |
planned_calendar |
Numeric. Target calendar time. |
target_events |
Integer. Target number of observed events. |
target_completers |
Integer. Target number of subjects with complete follow-up. |
target_info |
Numeric. Target blinded information. |
event_gap |
Numeric. Gap duration for event counting and info calculation. |
ratio |
Numeric. Randomization ratio (experimental/control) for info calculation. |
lambda1 |
Numeric. Planned control rate for info calculation. |
lambda2 |
Numeric. Planned experimental rate for info calculation. |
min_date |
Numeric. Minimum possible date (e.g., 0 or previous analysis time). |
max_date |
Numeric. Maximum possible date (e.g., trial duration). |
Value
Numeric. The calendar date satisfying the criteria. If criteria cannot be met
within max_date (or data limits), returns max_date (or max data time).
Examples
set.seed(456)
enroll_rate <- data.frame(rate = 15, duration = 1)
fail_rate <- data.frame(
treatment = c("Control", "Experimental"),
rate = c(0.6, 0.4)
)
sim_data <- nb_sim(enroll_rate, fail_rate, max_followup = 1, n = 20)
get_cut_date(sim_data, planned_calendar = 0.5, target_events = 5, event_gap = 0)
Group sequential design for negative binomial outcomes
Description
Creates a group sequential design for negative binomial outcomes based on
sample size calculations from sample_size_nbinom().
Usage
gsNBCalendar(
x,
k = 3,
test.type = 4,
alpha = 0.025,
beta = 0.1,
astar = 0,
delta = 0,
sfu = gsDesign::sfHSD,
sfupar = -4,
sfl = gsDesign::sfHSD,
sflpar = -2,
tol = 1e-06,
r = 18,
usTime = NULL,
lsTime = NULL,
analysis_times = NULL
)
Arguments
x |
An object of class |
k |
Number of analyses (interim + final). Default is 3. |
test.type |
Test type as in
Default is 4. |
alpha |
Type I error (one-sided). Default is 0.025. |
beta |
Type II error (1 - power). Default is 0.1. |
astar |
Allocated Type I error for lower bound for test.type = 5 or 6. Default is 0. |
delta |
Standardized effect size. Default is 0 (computed from design). |
sfu |
Spending function for upper bound. Default is |
sfupar |
Parameter for upper spending function. Default is -4. |
sfl |
Spending function for lower bound. Default is |
sflpar |
Parameter for lower spending function. Default is -2. |
tol |
Tolerance for convergence. Default is 1e-06. |
r |
Integer controlling grid size for numerical integration. Default is 18. |
usTime |
Spending time for upper bound (optional). |
lsTime |
Spending time for lower bound (optional). |
analysis_times |
Vector of calendar times for each analysis.
Must have length k. These times are stored in the |
Value
An object of class gsNB which inherits from gsDesign
and sample_size_nbinom_result.
While the final sample size would be planned total enrollment, interim analysis
sample sizes are the expected number enrolled at the times specified in analysis_times.
Output value contains all elements from
gsDesign::gsDesign() plus:
- nb_design
The original
sample_size_nbinom_resultobject- n1
A vector with sample size per analysis for group 1
- n2
A vector with sample size per analysis for group 2
- n_total
A vector with total sample size per analysis
- events
A vector with expected total events per analysis
- events1
A vector with expected events per analysis for group 1
- events2
A vector with expected events per analysis for group 2
- exposure
A vector with expected average calendar exposure per analysis
- exposure_at_risk1
A vector with expected at-risk exposure per analysis for group 1
- exposure_at_risk2
A vector with expected at-risk exposure per analysis for group 2
- variance
A vector with variance of log rate ratio per analysis
- T
Calendar time at each analysis (if
analysis_timesprovided)
Note that n.I in the returned object represents the statistical information
at each analysis.
References
Jennison, C. and Turnbull, B.W. (2000), Group Sequential Methods with Applications to Clinical Trials. Boca Raton: Chapman and Hall.
Examples
# First create a sample size calculation
nb_ss <- sample_size_nbinom(
lambda1 = 0.5, lambda2 = 0.3, dispersion = 0.1, power = 0.9,
accrual_rate = 10, accrual_duration = 20, trial_duration = 24
)
# Then create a group sequential design with analysis times
gs_design <- gsNBCalendar(nb_ss,
k = 3, test.type = 4,
analysis_times = c(10, 18, 24)
)
Wald test for treatment effect using negative binomial model (Mutze et al.)
Description
Fits a negative binomial (or Poisson) log-rate model to the aggregated
subject-level data produced by cut_data_by_date(). The method matches the
Wald test described by Mutze et al. (2019) for comparing treatment arms with
recurrent event outcomes.
Usage
mutze_test(
data,
method = c("nb", "poisson"),
conf_level = 0.95,
sided = 1,
poisson_threshold = 1000
)
## S3 method for class 'mutze_test'
print(x, ...)
Arguments
data |
A data frame with at least the columns |
method |
Type of model to fit: "nb" (default) uses a negative binomial
GLM via |
conf_level |
Confidence level for the rate ratio interval. Default 0.95. |
sided |
Number of sides for the test: 1 (default) or 2. |
poisson_threshold |
When |
x |
An object of class |
... |
Additional arguments (currently ignored). |
Value
An object of class mutze_test containing the fitted model summary with elements:
-
method: A string indicating the test method used. -
estimate: log rate ratio (experimental vs control). -
se: standard error for the log rate ratio. -
z: Wald statistic. -
p_value: one-sided or two-sided p-value. -
rate_ratio: estimated rate ratio and its confidence interval. -
dispersion: estimated dispersion (theta) whenmethod = "nb". -
group_summary: observed subjects/events/exposure per treatment.
Invisibly returns the input object.
Methods (by generic)
-
print(mutze_test): Print method formutze_testobjects.
Examples
enroll_rate <- data.frame(rate = 20 / (5 / 12), duration = 5 / 12)
fail_rate <- data.frame(treatment = c("Control", "Experimental"), rate = c(0.5, 0.3))
dropout_rate <- data.frame(
treatment = c("Control", "Experimental"),
rate = c(0.1, 0.05), duration = c(100, 100)
)
sim <- nb_sim(enroll_rate, fail_rate, dropout_rate, max_followup = 2, n = 40)
cut <- cut_data_by_date(sim, cut_date = 1.5)
mutze_test(cut)
Simulate recurrent events with fixed follow-up
Description
Simulates recurrent events for a clinical trial with piecewise constant enrollment, exponential failure rates (Poisson process), and piecewise exponential dropout.
Usage
nb_sim(
enroll_rate,
fail_rate,
dropout_rate = NULL,
max_followup = NULL,
n = NULL,
block = c(rep("Control", 2), rep("Experimental", 2)),
event_gap = 0
)
Arguments
enroll_rate |
A data frame with columns |
fail_rate |
A data frame with columns |
dropout_rate |
A data frame with columns |
max_followup |
Numeric. Maximum duration of follow-up for each individual (relative to their randomization time). |
n |
Total sample size. If NULL, it is estimated from |
block |
Block vector for treatment allocation. Default is
|
event_gap |
Numeric. Gap duration after each event during which no new events are counted. Default is 0. |
Details
The simulation generates data consistent with the negative binomial models described by Friede and Schmidli (2010) and Mütze et al. (2019). Specifically, it simulates a Gamma-distributed frailty variable for each individual (if dispersion > 0), which acts as a multiplier for that individual's event rate. Events are then generated according to a Poisson process with this subject-specific rate.
More explicitly, for a subject with baseline rate \lambda and exposure time
t, the model used here is a Gamma–Poisson mixture:
\Lambda_i \sim \mathrm{Gamma}(\text{shape}=1/k,\ \text{scale}=k\lambda), \quad Y_i \mid \Lambda_i \sim \mathrm{Poisson}(\Lambda_i t).
Marginally, Y_i follows a negative binomial distribution with
\mathrm{E}[Y_i]=\mu=\lambda t and \mathrm{Var}(Y_i)=\mu + k\mu^2.
This k is the package dispersion parameter (and corresponds to
1/\theta in MASS::glm.nb() terminology).
Value
A data frame (tibble) with columns:
- id
Subject identifier
- treatment
Treatment group
- enroll_time
Time of enrollment relative to trial start
- tte
Time to event or censoring relative to randomization
- calendar_time
Calendar time of event or censoring (enroll_time + tte)
- event
Binary indicator: 1 for event, 0 for censoring
Multiple rows per subject are returned (one for each event, plus one for the final censoring time).
References
Friede, T., & Schmidli, H. (2010). Blinded sample size reestimation with count data: methods and applications in multiple sclerosis. Statistics in Medicine, 29(10), 1145–1156. doi:10.1002/sim.3861
Mütze, T., Glimm, E., Schmidli, H., & Friede, T. (2019). Group sequential designs for negative binomial outcomes. Statistical Methods in Medical Research, 28(8), 2326–2347. doi:10.1177/0962280218773115
Examples
enroll_rate <- data.frame(rate = 20 / (5 / 12), duration = 5 / 12)
fail_rate <- data.frame(treatment = c("Control", "Experimental"), rate = c(0.5, 0.3))
dropout_rate <- data.frame(
treatment = c("Control", "Experimental"),
rate = c(0.1, 0.05), duration = c(100, 100)
)
sim <- nb_sim(enroll_rate, fail_rate, dropout_rate, max_followup = 2, n = 20)
head(sim)
Simulate recurrent events with seasonal rates
Description
Simulates recurrent events where event rates depend on the season.
Usage
nb_sim_seasonal(
enroll_rate,
fail_rate,
dropout_rate = NULL,
max_followup = NULL,
randomization_start_date = NULL,
n = NULL,
block = c(rep("Control", 2), rep("Experimental", 2))
)
Arguments
enroll_rate |
A data frame with columns |
fail_rate |
A data frame with columns |
dropout_rate |
A data frame with columns |
max_followup |
Numeric. Max follow-up duration (years). |
randomization_start_date |
Date. Start of randomization. |
n |
Integer. Total sample size. |
block |
Character vector for block randomization. |
Value
A data frame of class nb_sim_seasonal with columns:
id, treatment, season, enroll_time, start, end, event, calendar_start, calendar_end.
Rows represent intervals of risk or events. event=1 indicates an event at end.
event=0 indicates censoring or end of a seasonal interval at end.
Examples
enroll_rate <- data.frame(rate = 20 / (5 / 12), duration = 5 / 12)
fail_rate <- data.frame(
treatment = rep(c("Control", "Experimental"), each = 4),
season = rep(c("Winter", "Spring", "Summer", "Fall"), times = 2),
rate = c(0.6, 0.5, 0.4, 0.5, 0.4, 0.3, 0.2, 0.3)
)
sim <- nb_sim_seasonal(
enroll_rate = enroll_rate,
fail_rate = fail_rate,
max_followup = 1,
randomization_start_date = as.Date("2020-01-01"),
n = 20
)
head(sim)
Print method for gsNBsummary objects
Description
Print method for gsNBsummary objects
Usage
## S3 method for class 'gsNBsummary'
print(x, ...)
Arguments
x |
An object of class |
... |
Additional arguments (currently ignored). |
Value
Invisibly returns the input object.
Examples
nb_ss <- sample_size_nbinom(
lambda1 = 0.5, lambda2 = 0.3, dispersion = 0.1, power = 0.9,
accrual_rate = 10, accrual_duration = 20, trial_duration = 24
)
gs_design <- gsNBCalendar(nb_ss, k = 3, analysis_times = c(12, 18, 24))
s <- summary(gs_design)
print(s)
Print method for sample_size_nbinom_result objects
Description
Prints a concise summary of the sample size calculation results.
Usage
## S3 method for class 'sample_size_nbinom_result'
print(x, ...)
Arguments
x |
An object of class |
... |
Additional arguments (currently ignored). |
Value
Invisibly returns the input object.
Examples
x <- sample_size_nbinom(
lambda1 = 0.5, lambda2 = 0.3, dispersion = 0.1, power = 0.8,
accrual_rate = 10, accrual_duration = 20, trial_duration = 24
)
print(x)
Print method for sample_size_nbinom_summary objects
Description
Print method for sample_size_nbinom_summary objects
Usage
## S3 method for class 'sample_size_nbinom_summary'
print(x, ...)
Arguments
x |
An object of class |
... |
Additional arguments (currently ignored). |
Value
Invisibly returns the input object.
Examples
x <- sample_size_nbinom(
lambda1 = 0.5, lambda2 = 0.3, dispersion = 0.1, power = 0.8,
accrual_rate = 10, accrual_duration = 20, trial_duration = 24
)
s <- summary(x)
print(s)
Objects exported from other packages
Description
These objects are imported from other packages. Follow the links below to see their documentation.
- gsDesign
gsBoundSummary,gsDesign,sfBetaDist,sfCauchy,sfExponential,sfExtremeValue,sfExtremeValue2,sfGapped,sfHSD,sfLDOF,sfLDPocock,sfLinear,sfLogistic,sfNormal,sfPoints,sfPower,sfStep,sfTDist,sfTrimmed,sfTruncated,sfXG1,sfXG2,sfXG3
Sample size calculation for negative binomial distribution
Description
Computes the sample size for comparing two treatment groups assuming a negative binomial distribution for the outcome.
Usage
sample_size_nbinom(
lambda1,
lambda2,
dispersion,
power = NULL,
alpha = 0.025,
sided = 1,
ratio = 1,
rr0 = 1,
accrual_rate,
accrual_duration,
trial_duration,
dropout_rate = 0,
max_followup = NULL,
event_gap = NULL
)
Arguments
lambda1 |
Rate in group 1 (control). |
lambda2 |
Rate in group 2 (treatment). |
dispersion |
Dispersion parameter |
power |
Power of the test (1 - beta). Default is 0.9. |
alpha |
Significance level. Default is 0.025. |
sided |
One-sided or two-sided test. 1 for one-sided, 2 for two-sided. Default is 1. |
ratio |
Allocation ratio n2/n1. Default is 1. |
rr0 |
Rate ratio under the null hypothesis (lambda2/lambda1). Default is 1 (superiority). For non-inferiority, use a value > 1 (e.g., 1.1). For super-superiority, use a value < 1 (e.g., 0.8). |
accrual_rate |
Vector of accrual rates (patients per unit time). |
accrual_duration |
Vector of durations for each accrual rate. Must be same length
as |
trial_duration |
Total planned duration of the trial. If |
dropout_rate |
Dropout rate (hazard rate). Default is 0. Can be a vector of length 2. |
max_followup |
Maximum follow-up time for any patient. Default is NULL (infinite). |
event_gap |
Gap duration after each event during which no new events are counted. Default is NULL (no gap). If provided, the effective event rate is reduced. |
Value
An object of class sample_size_nbinom_result, which is a list containing:
- inputs
Named list of the original function arguments.
- n1
Sample size for group 1
- n2
Sample size for group 2
- n_total
Total sample size
- alpha
Significance level
- sided
One-sided or two-sided test
- power
Power of the test
- exposure
Average exposure time used in calculation (calendar time). Vector of length 2.
- exposure_at_risk_n1
Average at-risk exposure time for group 1 (accounts for event gap)
- exposure_at_risk_n2
Average at-risk exposure time for group 2 (accounts for event gap)
- events_n1
Expected number of events in group 1
- events_n2
Expected number of events in group 2
- total_events
Total expected number of events
- variance
Variance of the log rate ratio
- accrual_rate
Accrual rate used in calculation
- accrual_duration
Accrual duration used in calculation
References
Zhu, H., & Lakkis, H. (2014). Sample size calculation for comparing two negative binomial rates. Statistics in Medicine, 33(3), 376–387. doi:10.1002/sim.5947
Friede, T., & Schmidli, H. (2010). Blinded sample size reestimation with negative binomial counts in superiority and non-inferiority trials. Methods of Information in Medicine, 49(06), 618–624. doi:10.3414/ME09-02-0060
Mütze, T., Glimm, E., Schmidli, H., & Friede, T. (2019). Group sequential designs for negative binomial outcomes. Statistical Methods in Medical Research, 28(8), 2326–2347. doi:10.1177/0962280218773115
See Also
vignette("sample-size-nbinom", package = "gsDesignNB")
for a detailed explanation of the methodology.
Examples
# Calculate sample size for lambda1 = 0.5, lambda2 = 0.3, dispersion = 0.1
# with fixed recruitment of 10/month for 20 months, 24 month trial duration
x <- sample_size_nbinom(
lambda1 = 0.5, lambda2 = 0.3, dispersion = 0.1, power = 0.8,
accrual_rate = 10, accrual_duration = 20, trial_duration = 24
)
class(x)
summary(x)
# With piecewise accrual
# 5 patients/month for 3 months, then 10 patients/month for 3 months
# Trial ends at month 12.
x2 <- sample_size_nbinom(
lambda1 = 0.5, lambda2 = 0.3, dispersion = 0.1, power = 0.8,
accrual_rate = c(5, 10), accrual_duration = c(3, 3),
trial_duration = 12
)
summary(x2)
Simulate group sequential clinical trial for negative binomial outcomes
Description
Simulates multiple replicates of a group sequential clinical trial with negative binomial outcomes, performing interim analyses at specified calendar times.
Usage
sim_gs_nbinom(
n_sims,
enroll_rate,
fail_rate,
dropout_rate = NULL,
max_followup,
event_gap = 0,
analysis_times = NULL,
n_target = NULL,
design = NULL,
data_cut = cut_data_by_date,
cuts = NULL
)
Arguments
n_sims |
Number of simulations to run. |
enroll_rate |
Enrollment rates (data frame with |
fail_rate |
Failure rates (data frame with |
dropout_rate |
Dropout rates (data frame with |
max_followup |
Maximum follow-up time. |
event_gap |
Event gap duration. |
analysis_times |
Vector of calendar times for interim and final analyses.
Optional if |
n_target |
Total sample size to enroll (optional, if not defined by |
design |
An object of class |
data_cut |
Function to cut data for analysis. Defaults to |
cuts |
A list of cutting criteria for each analysis. Each element of the list
should be a list of arguments for |
Value
A data frame containing simulation results for each analysis of each trial. Columns include:
- sim
Simulation ID
- analysis
Analysis index
- analysis_time
Calendar time of analysis
- n_enrolled
Number of subjects enrolled
- n_ctrl
Number of subjects in control group
- n_exp
Number of subjects in experimental group
- events_total
Total events observed
- events_ctrl
Events in control group
- events_exp
Events in experimental group
- exposure_at_risk_ctrl
Exposure at risk in control group (adjusted for event gaps)
- exposure_at_risk_exp
Exposure at risk in experimental group (adjusted for event gaps)
- exposure_total_ctrl
Total exposure in control group (calendar follow-up)
- exposure_total_exp
Total exposure in experimental group (calendar follow-up)
- z_stat
Z-statistic from the Wald test (positive favors experimental if rate ratio < 1)
- estimate
Estimated log rate ratio from the model
- se
Standard error of the estimate
- method_used
Method used for inference ("nb" or "poisson")
- dispersion
Estimated dispersion parameter from the model
- blinded_info
Estimated blinded statistical information (ML)
- unblinded_info
Observed unblinded statistical information (ML)
- info_unblinded_ml
Observed unblinded statistical information (ML)
- info_blinded_ml
Estimated blinded statistical information (ML)
- info_unblinded_mom
Observed unblinded statistical information (Method of Moments)
- info_blinded_mom
Estimated blinded statistical information (Method of Moments)
Examples
set.seed(123)
enroll_rate <- data.frame(rate = 10, duration = 3)
fail_rate <- data.frame(
treatment = c("Control", "Experimental"),
rate = c(0.6, 0.4),
dispersion = 0.2
)
dropout_rate <- data.frame(
treatment = c("Control", "Experimental"),
rate = c(0.05, 0.05),
duration = c(6, 6)
)
design <- sample_size_nbinom(
lambda1 = 0.6, lambda2 = 0.4, dispersion = 0.2, power = 0.8,
accrual_rate = enroll_rate$rate, accrual_duration = enroll_rate$duration,
trial_duration = 6
)
cuts <- list(
list(planned_calendar = 2),
list(planned_calendar = 4)
)
sim_results <- sim_gs_nbinom(
n_sims = 2,
enroll_rate = enroll_rate,
fail_rate = fail_rate,
dropout_rate = dropout_rate,
max_followup = 4,
n_target = 30,
design = design,
cuts = cuts
)
head(sim_results)
Summarize group sequential simulation results
Description
Provides a summary of the operating characteristics of the group sequential design based on simulation results.
Usage
summarize_gs_sim(x)
Arguments
x |
A data frame returned by |
Value
A list containing:
- n_sim
Number of simulations
- power
Overall power (probability of crossing upper bound)
- futility
Overall futility rate (probability of crossing lower bound and not upper)
- analysis_summary
Data frame with per-analysis statistics (events, crossings)
Examples
design <- gsDesign::gsDesign(k = 2, n.fix = 80, test.type = 2, timing = c(0.5, 1))
sim_df <- data.frame(
sim = c(1, 1, 2, 2),
analysis = c(1, 2, 1, 2),
z_stat = c(2.4, NA, -0.5, 1.9),
blinded_info = c(40, 80, 40, 80),
unblinded_info = c(40, 80, 40, 80),
n_enrolled = c(30, 60, 30, 60),
events_total = c(12, 25, 10, 22)
)
bounds_checked <- check_gs_bound(sim_df, design)
summarize_gs_sim(bounds_checked)
Summary for gsNB objects
Description
Provides a textual summary of a group sequential design for negative binomial
outcomes, similar to the summary provided by gsDesign::gsDesign().
For tabular output, use gsDesign::gsBoundSummary() directly on
the gsNB object.
Usage
## S3 method for class 'gsNB'
summary(object, ...)
Arguments
object |
An object of class |
... |
Additional arguments (currently ignored). |
Value
A character string summarizing the design (invisibly). The summary is also printed to the console.
Examples
nb_ss <- sample_size_nbinom(
lambda1 = 0.5, lambda2 = 0.3, dispersion = 0.1, power = 0.9,
accrual_rate = 10, accrual_duration = 20, trial_duration = 24
)
gs_design <- gsNBCalendar(nb_ss, k = 3, analysis_times = c(12, 18, 24))
summary(gs_design)
# For tabular bounds summary, use gsBoundSummary() directly:
gsBoundSummary(gs_design)
Summary for sample_size_nbinom_result objects
Description
Provides a textual summary of the sample size calculation for negative binomial outcomes, similar to the summary for gsNB objects.
Usage
## S3 method for class 'sample_size_nbinom_result'
summary(object, ...)
Arguments
object |
An object of class |
... |
Additional arguments (currently ignored). |
Value
A character string summarizing the design (invisibly). The summary is also printed to the console.
Examples
x <- sample_size_nbinom(
lambda1 = 0.5, lambda2 = 0.3, dispersion = 0.1, power = 0.8,
accrual_rate = 10, accrual_duration = 20, trial_duration = 24
)
class(x)
summary(x)
Convert group sequential design to integer sample sizes
Description
Generic function to round sample sizes in a group sequential design to integers.
This extends the gsDesign::toInteger() function from the gsDesign
package to work with gsNB objects.
Usage
toInteger(x, ...)
## S3 method for class 'gsDesign'
toInteger(x, ratio = x$ratio, roundUpFinal = TRUE, ...)
## S3 method for class 'gsNB'
toInteger(x, ratio = x$nb_design$inputs$ratio, roundUpFinal = TRUE, ...)
Arguments
x |
An object of class |
... |
Additional arguments passed to methods. |
ratio |
Randomization ratio (n2/n1). If an integer is provided, rounding
is done to a multiple of |
roundUpFinal |
If |
Details
This function rounds sample sizes at each analysis to integers while maintaining the randomization ratio and ensuring monotonically increasing sample sizes across analyses. Only the final analysis sample size is rounded to an integer; interim sample sizes remain as expected (non-integer) values based on the information fraction.
When analysis_times were provided to gsNBCalendar(),
the statistical information (n.I) is recomputed at each analysis
time based on the new sample size and expected exposures.
Value
An object of the same class as input with integer sample sizes.
Methods (by class)
-
toInteger(gsDesign): Method forgsDesignobjects (callsgsDesign::toInteger()). -
toInteger(gsNB): Method forgsNBobjects.Rounds sample sizes in a group sequential negative binomial design to integers, respecting the randomization ratio.
Examples
nb_ss <- sample_size_nbinom(
lambda1 = 0.5, lambda2 = 0.3, dispersion = 0.1, power = 0.9,
accrual_rate = 10, accrual_duration = 20, trial_duration = 24
)
gs_design <- gsNBCalendar(nb_ss, k = 3, analysis_times = c(12, 18, 24))
gs_integer <- toInteger(gs_design)
Unblinded sample size re-estimation for recurrent events
Description
Estimates the event rates and dispersion from unblinded interim data and calculates the required sample size to maintain power, assuming the planned treatment effect holds (or using the observed control rate).
Usage
unblinded_ssr(
data,
ratio = 1,
lambda1_planning,
lambda2_planning,
rr0 = 1,
power = 0.8,
alpha = 0.025,
accrual_rate,
accrual_duration,
trial_duration,
dropout_rate = 0,
max_followup = NULL,
event_gap = NULL
)
Arguments
data |
A data frame containing the unblinded interim data. Must include
columns |
ratio |
Planned allocation ratio (experimental / control). Default is 1. |
lambda1_planning |
Planned event rate for the control group used in original calculation. |
lambda2_planning |
Planned event rate for the experimental group used in original calculation. |
rr0 |
Rate ratio under the null hypothesis (lambda2/lambda1). Default is 1. |
power |
Target power (1 - beta). Default is 0.8. |
alpha |
One-sided significance level. Default is 0.025. |
accrual_rate |
Vector of accrual rates (patients per unit time). |
accrual_duration |
Vector of durations for each accrual rate. Must be same length
as |
trial_duration |
Total planned duration of the trial. |
dropout_rate |
Dropout rate (hazard rate). Default is 0. |
max_followup |
Maximum follow-up time for any patient. Default is NULL (infinite). |
event_gap |
Gap duration after each event during which no new events are counted. Default is NULL (no gap). |
Value
A list containing:
- n_total_unblinded
Re-estimated total sample size using unblinded estimates.
- dispersion_unblinded
Estimated dispersion parameter (k) from unblinded data.
- lambda1_unblinded
Estimated control event rate from unblinded data.
- lambda2_unblinded
Estimated experimental event rate from unblinded data.
- info_fraction
Estimated information fraction at interim (unblinded information / target information).
- unblinded_info
Estimated statistical information from the unblinded interim data.
- target_info
Target statistical information required for the planned power.
Examples
interim <- data.frame(
events = c(1, 2, 1, 3),
tte = c(0.8, 1.0, 1.2, 0.9),
treatment = c("Control", "Control", "Experimental", "Experimental")
)
unblinded_ssr(
interim,
ratio = 1,
lambda1_planning = 0.5,
lambda2_planning = 0.3,
power = 0.8,
alpha = 0.025,
accrual_rate = 10,
accrual_duration = 12,
trial_duration = 18
)