% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/simulateCRT.R
\name{simulateCRT}
\alias{simulateCRT}
\title{Simulation of cluster randomized trial with spillover}
\usage{
simulateCRT(
  trial = NULL,
  effect = 0,
  outcome0 = NULL,
  generateBaseline = TRUE,
  matchedPair = TRUE,
  scale = "proportion",
  baselineNumerator = "base_num",
  baselineDenominator = "base_denom",
  denominator = NULL,
  ICC_inp = NULL,
  kernels = 200,
  sigma_m = NULL,
  spillover_interval = NULL,
  tol = 0.005
)
}
\arguments{
\item{trial}{an object of class \code{"CRTsp"} or a data frame containing locations in (x,y) coordinates, cluster
assignments (factor \code{cluster}), and arm assignments (factor \code{arm}). Each location may also be
assigned a \code{propensity} (see details).}

\item{effect}{numeric. The simulated effect size (defaults to 0)}

\item{outcome0}{numeric. The anticipated value of the outcome in the absence of intervention}

\item{generateBaseline}{logical. If \code{TRUE} then baseline data and the \code{propensity} will be simulated}

\item{matchedPair}{logical. If \code{TRUE} then the function tries to carry out randomization
using pair-matching on the baseline data (see details)}

\item{scale}{measurement scale of the outcome. Options are: 'proportion' (the default); 'count'; 'continuous'.}

\item{baselineNumerator}{optional name of numerator variable for pre-existing baseline data}

\item{baselineDenominator}{optional name of denominator variable for pre-existing baseline data}

\item{denominator}{optional name of denominator variable for the outcome}

\item{ICC_inp}{numeric. Target intra-cluster correlation coefficient (ICC), provided as input when baseline data are to be simulated}

\item{kernels}{number of kernels used to generate a de novo \code{propensity}}

\item{sigma_m}{numeric. standard deviation of the normal kernel measuring spatial smoothing leading to spillover}

\item{spillover_interval}{numeric. input spillover interval}

\item{tol}{numeric. tolerance of output ICC}
}
\value{
A list of class \code{"CRTsp"} containing the following components:
\tabular{lll}{
\code{geom_full}\tab list: \tab summary statistics describing the site,
cluster assignments, and randomization \cr
\code{design}\tab list: \tab values of input parameters to the design \cr
\code{trial} \tab data frame: \tab rows correspond to geolocated points, as follows:\cr
\tab \code{x} \tab numeric vector:  x-coordinates of locations \cr
\tab \code{y} \tab numeric vector:  y-coordinates of locations \cr
\tab\code{cluster} \tab factor:  assignments to cluster of each location  \cr
\tab\code{arm} \tab factor:  assignments to \code{control} or \code{intervention} for each location \cr
\tab\code{nearestDiscord} \tab numeric vector:  signed Euclidean distance to nearest discordant location (km) \cr
\tab\code{propensity} \tab numeric vector:  propensity for each location \cr
\tab\code{base_denom} \tab numeric vector:  denominator for baseline \cr
\tab\code{base_num} \tab numeric vector:  numerator for baseline \cr
\tab\code{denom} \tab numeric vector:  denominator for the outcome \cr
\tab\code{num} \tab numeric vector:  numerator for the outcome \cr
\tab\code{...} \tab other objects included in the input \code{"CRTsp"} object
or \code{data.frame}\cr
}
}
\description{
\code{simulateCRT} generates simulated data for a cluster randomized trial (CRT) with geographic spillover between arms.
}
\details{
Synthetic data are generated by sampling around the values of
variable \code{propensity}, which is a numerical vector
(taking positive values) of length equal to the number of locations.
There are three ways in which \code{propensity} can arise:
\enumerate{
\item \code{propensity} can be provided as part of the input \code{trial} object.
\item Baseline numerators and denominators (values of \code{baselineNumerator}
and \code{baselineDenominator}) may be provided.
\code{propensity} is then generated as the numerator:denominator ratio
for each location in the input object.
\item Otherwise \code{propensity} is generated using a 2D Normal
kernel density. The \href{https://rdrr.io/cran/OOR/man/StoSOO.html}{\code{OOR::StoSOO}}
function is used to achieve an intra-cluster correlation coefficient (ICC) that approximates
the value of \code{ICC_inp} by searching for an appropriate value of the kernel bandwidth.
}
\code{num[i]}, the synthetic outcome for location \code{i},
is simulated with expectation: \cr
\deqn{E(num[i]) = outcome0[i] * propensity[i] * denom[i] * (1 - effect*I[i])/mean(outcome0[] * propensity[])} \cr
The sampling distribution of \code{num[i]} depends on the value of \code{scale} as follows: \cr
\itemize{
\item \code{scale = 'continuous'}: Values of \code{num} are sampled from
Normal distributions with means \code{E(num[i])}
and variance determined by the fitting to \code{ICC_inp}.\cr
\item \code{scale = 'count'}: Simulated events are allocated to locations via multivariate hypergeometric distributions
parameterised with \code{E(num[i])}.\cr
\item \code{scale = 'proportion'}: Simulated events are allocated to locations via multinomial distributions
parameterised with \code{E(num[i])}.\cr
}

\code{denominator} may name a variable of numeric (non-zero) values
in the input \code{"CRTsp"} or \code{data.frame}, which is returned
as variable \code{denom}. It acts as a scale-factor for continuous outcomes, rate-multiplier
for counts, or denominator for proportions. For discrete data all values of \code{denom}
must be > 0.5 and are rounded to the nearest integer in calculations of \code{num}.\cr\cr
By default, \code{denom} is generated as a vector of ones, leading to simulation of
dichotomous outcomes if \code{scale = 'proportion'}.\cr

If baseline numerators and denominators are provided then the output vectors
\code{base_denom} and \code{base_num} are set to the input values. If baseline numerators and denominators
are not provided then the synthetic baseline data are generated by sampling around \code{propensity} in the same
way as the outcome data, but with the effect size set to zero.

If \code{matchedPair} is \code{TRUE} then pair-matching on the baseline data is used in the randomization, provided
that there is an even number of clusters. If there is an odd number of clusters then matched pairs are not generated and
an unmatched randomization is output.
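
As an illustration only, pair-matched randomization of this general kind can be sketched in base R as shown here.
This is not the package's implementation: the data frame \code{clusters}, its values, and the rule of pairing
clusters that are adjacent when ranked by baseline proportion are all assumptions made for the sketch, which also
assumes an even number of clusters.
\preformatted{
set.seed(1)
# Hypothetical cluster-level baseline summaries (illustrative values only)
clusters <- data.frame(
  cluster = factor(1:6),
  base_num = c(12, 30, 18, 25, 9, 21),
  base_denom = c(50, 60, 45, 50, 40, 60)
)
# Rank clusters by baseline proportion and pair clusters with adjacent ranks
ord <- order(clusters$base_num / clusters$base_denom)
clusters$pair <- NA_integer_
clusters$pair[ord] <- rep(seq_len(nrow(clusters) / 2), each = 2)
# Within each pair, assign one cluster at random to each arm
clusters$arm <- NA_character_
for (p in unique(clusters$pair)) {
  idx <- which(clusters$pair == p)
  clusters$arm[idx] <- sample(c("control", "intervention"))
}
clusters$arm <- factor(clusters$arm)
# Each pair should contribute exactly one cluster to each arm
table(clusters$pair, clusters$arm)
}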

Either \code{sigma_m} or \code{spillover_interval} must be provided. If both are provided then
the value of \code{sigma_m} is overwritten
by the standard deviation implicit in the value of \code{spillover_interval}.
Spillover is simulated as arising from a diffusion-like process.

For further details see \href{https://edoc.unibas.ch/85228/}{Multerer (2021)}.
}
\examples{
{smalltrial <- readdata('smalltrial.csv')
 simulation <- simulateCRT(smalltrial,
  effect = 0.25,
  ICC_inp = 0.05,
  outcome0 = 0.5,
  matchedPair = FALSE,
  scale = 'proportion',
  sigma_m = 0.6,
  tol = 0.05)
 summary(simulation)
 }
}
