This vignette describes a scoring method similar to that of Mogg and Bradley (1999) for Visual Probe Task (VPT) data. The score is the mean reaction time (RT) on probe-at-control trials minus the mean RT on probe-at-test trials, computed over correct responses after removing RTs below 200 ms and above 520 ms.
Load the included VPT dataset and inspect its documentation.
data("ds_vpt", package = "splithalfr")
?ds_vpt
The columns used in this example are:
- UserID, which identifies participants
- block_type, in order to select assessment blocks only
- patt, in order to compare trials in which the probe is at the test or at the control stimulus
- response, in order to select correct responses only
- rt, in order to drop RTs outside of the range [200, 520] and calculate means per level of patt
- thor, which is the horizontal position of the test stimulus
- keep, which indicates whether the probe was superimposed on the stimuli or replaced them
Only select trials from assessment blocks:
ds_vpt <- subset(ds_vpt, block_type == "assess")
The variables patt, thor, and keep were counterbalanced. Below we illustrate this for the first participant.
ds_1 <- subset(ds_vpt, UserID == 1)
table(ds_1$patt, ds_1$thor, ds_1$keep)
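To show what a counterbalanced table looks like, here is a minimal base R sketch on a made-up design. The level names used for thor and keep are assumptions for illustration and may not match the coding in ds_vpt. In a fully counterbalanced design, every cell of the three-way table holds the same count.

```r
# Hypothetical design: the level names for thor and keep are assumptions
design <- expand.grid(
  patt = c("yes", "no"),
  thor = c("left", "right"),
  keep = c("yes", "no"),
  stringsAsFactors = FALSE
)

# Repeat each of the 8 combinations 4 times, giving 32 trials
trials <- design[rep(seq_len(nrow(design)), times = 4), ]

# Cross-tabulate the three trial properties
cell_counts <- table(trials$patt, trials$thor, trials$keep)

# In a fully counterbalanced design every cell holds the same count
is_balanced <- length(unique(as.vector(cell_counts))) == 1
```

If is_balanced were FALSE for real data, stratifying splits by these properties would still work, but the strata would differ in size.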
The scoring function calculates the score of a single participant as follows:
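fn_score <- function (ds) {
  # Keep correct responses with RTs in the range [200, 520]
  ds_keep <- ds[ds$response == 1 & ds$rt >= 200 & ds$rt <= 520, ]
  # Mean RT for probe-at-test (patt == "yes") and probe-at-control (patt == "no")
  rt_yes <- mean(ds_keep[ds_keep$patt == "yes", ]$rt)
  rt_no <- mean(ds_keep[ds_keep$patt == "no", ]$rt)
  # Score: probe-at-control mean minus probe-at-test mean
  return (rt_no - rt_yes)
}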
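To make the filtering and subtraction concrete, the sketch below applies the same logic to a small made-up dataset for one participant. The data values and the names ds_toy and fn_score_toy are hypothetical; the function body repeats the logic of fn_score so that the example runs on its own.

```r
# Hypothetical toy data for one participant; values are made up for illustration
ds_toy <- data.frame(
  patt     = c("yes", "yes", "yes", "no", "no", "no"),
  response = c(1, 1, 0, 1, 1, 1),
  rt       = c(400, 440, 410, 480, 500, 150)
)

# Same logic as fn_score, repeated so this sketch is self-contained
fn_score_toy <- function(ds) {
  ds_keep <- ds[ds$response == 1 & ds$rt >= 200 & ds$rt <= 520, ]
  rt_yes <- mean(ds_keep$rt[ds_keep$patt == "yes"])
  rt_no <- mean(ds_keep$rt[ds_keep$patt == "no"])
  rt_no - rt_yes
}

# The incorrect trial (rt 410) and the too-fast trial (rt 150) are dropped,
# leaving means of 420 (probe-at-test) and 490 (probe-at-control)
toy_score <- fn_score_toy(ds_toy)
```

Here toy_score is 70, meaning responses were on average 70 ms faster when the probe appeared at the test stimulus.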
Let’s calculate the VPT score for the participant with UserID 23. NB - This score has also been calculated manually via Excel in the splithalfr repository.
fn_score(subset(ds_vpt, UserID == 23))
To calculate the VPT score for each participant, we will use R’s native by function and convert the result to a data frame.
scores <- by(
  ds_vpt,
  ds_vpt$UserID,
  fn_score
)
data.frame(
  UserID = names(scores),
  score = as.vector(scores)
)
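The conversion from the output of by to a data frame can be tried out on a small made-up dataset. The sketch below uses hypothetical data and a simplified scoring function (fn_diff, which skips the filtering step) purely to show the shape of the result.

```r
# Hypothetical data: two participants, one trial per condition each
ds_toy <- data.frame(
  UserID = c(1, 1, 2, 2),
  patt   = c("yes", "no", "yes", "no"),
  rt     = c(400, 450, 500, 480)
)

# Simplified score without filtering: control mean minus test mean
fn_diff <- function(ds) {
  mean(ds$rt[ds$patt == "no"]) - mean(ds$rt[ds$patt == "yes"])
}

# by() returns a one-dimensional array named by the grouping values
scores_toy <- by(ds_toy, ds_toy$UserID, fn_diff)

# names() recovers the participant IDs, as.vector() the scores
scores_df <- data.frame(
  UserID = names(scores_toy),
  score = as.vector(scores_toy)
)
```

Note that names() returns the IDs as character strings, so UserID in the resulting data frame is no longer numeric.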
To calculate split-half scores for each participant, use the function by_split. The first three arguments of this function are the same as for by. An additional set of arguments allows you to specify how to split the data and how often. In this vignette we will calculate scores for 1000 permutated splits. The trial properties patt, thor, and keep were counterbalanced in the VPT design, so we will stratify splits by these trial properties. See the vignette on splitting methods for more ways to split the data.
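As a rough illustration of what stratifying a split means, the base R sketch below randomly divides the trials within each stratum into two equally sized parts. It is not the implementation used by by_split; the data and the names trials, stratum, and part are hypothetical.

```r
set.seed(1)  # for a reproducible illustration

# Hypothetical trials: 4 strata with 6 trials each
trials <- data.frame(
  stratum = rep(c("a", "b", "c", "d"), each = 6),
  rt = round(runif(24, 300, 500))
)

# Within each stratum, randomly assign half of the trials to part 1
# and the other half to part 2
trials$part <- ave(
  seq_len(nrow(trials)),
  trials$stratum,
  FUN = function(idx) sample(rep(1:2, length.out = length(idx)))
)

# Each stratum contributes the same number of trials to both parts
part_counts <- table(trials$stratum, trials$part)
```

Because every stratum is divided evenly, both parts keep the balance of trial properties that the full design had.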
The by_split function returns a data frame with the following columns:
- participant, which identifies participants
- replication, which counts replications
- score_1 and score_2, which are the scores calculated for each of the split datasets
Calculating the split scores may take a while. By default, by_split uses all available CPU cores, but no progress bar is displayed. Setting ncores = 1 will display a progress bar, but processing will be slower.
split_scores <- by_split(
  ds_vpt,
  ds_vpt$UserID,
  fn_score,
  replications = 1000,
  stratification = paste(ds_vpt$patt, ds_vpt$thor, ds_vpt$keep)
)
Next, the output of by_split can be analyzed in order to estimate reliability. By default, functions are provided that calculate Spearman-Brown adjusted Pearson correlations (spearman_brown), Flanagan-Rulon (flanagan_rulon), Angoff-Feldt (angoff_feldt), and Intraclass Correlation (short_icc) coefficients. Each of these coefficient functions can be used with split_coefs to calculate the corresponding coefficients per split, which can then be plotted or averaged via a simple mean. A bias-corrected and accelerated bootstrap confidence interval can be calculated via split_ci. Note that estimating the confidence interval involves very intensive calculations, so it can take a long time to complete.
# Spearman-Brown adjusted Pearson correlations per replication
coefs <- split_coefs(split_scores, spearman_brown)
# Distribution of coefficients
hist(coefs)
# Mean of coefficients
mean(coefs)
# Confidence interval of coefficients
split_ci(split_scores, spearman_brown)
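For intuition about the Spearman-Brown adjustment, the sketch below applies the textbook formula 2r / (1 + r) to made-up scores from a single replication. This is an illustration in base R only; the spearman_brown function in the package may treat edge cases, such as negative correlations, differently. The values of score_1 and score_2 are hypothetical.

```r
# Hypothetical split scores for five participants in a single replication
score_1 <- c(10, 25, -5, 40, 15)
score_2 <- c(12, 20, 0, 35, 22)

# Pearson correlation between the two half-test scores
r_half <- cor(score_1, score_2)

# Textbook Spearman-Brown adjustment to full test length
r_sb <- 2 * r_half / (1 + r_half)
```

For a positive correlation below 1, the adjusted coefficient is always larger than the half-test correlation, reflecting that the full test contains twice as many trials as each half.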