Coefficients Covariance Matrix Adjustment

Here we describe the variance-covariance matrix adjustment of coefficients.

Introduction

To estimate the covariance matrix of coefficients, there are many ways. In mmrm package, we implemented asymptotic, empirical, Jackknife and Kenward-Roger methods. For simplicity, the following derivation are all for unweighted mmrm. For weighted mmrm, we can follow the details of weighted least square estimator.

Asymptotic Covariance

Asymptotic covariance are derived based on the estimate of \(\beta\).

Following the definition in details in model fitting, we have

\[ \hat\beta = (X^\top W X)^{-1} X^\top W Y \]

\[ cov(\hat\beta) = (X^\top W X)^{-1} X^\top W cov(\epsilon) W X (X^\top W X)^{-1} = (X^\top W X)^{-1} \]

Where \(W\) is the block diagonal matrix of inverse of covariance matrix of \(\epsilon\).

Empirical Covariance

Empirical covariance, also known as the robust sandwich estimator, or “CR0”, is derived by replacing the covariance matrix of \(\epsilon\) by observed covariance matrix.

\[ cov(\hat\beta) = (X^\top W X)^{-1}(\sum_{i}{X_i^\top W_i \hat\epsilon_i\hat\epsilon_i^\top W_i X_i})(X^\top W X)^{-1} = (X^\top W X)^{-1}(\sum_{i}{X_i^\top L_{i} L_{i}^\top \hat\epsilon_i\hat\epsilon_i^\top L_{i} L_{i}^\top X_i})(X^\top W X)^{-1} \]

Where \(W_i\) is the block diagonal part for subject \(i\) of \(W\) matrix, \(\hat\epsilon_i\) is the observed residuals for subject i, \(L_i\) is the Cholesky factor of \(\Sigma_i^{-1}\) (\(W_i = L_i L_i^\top\)).

See the detailed explanation of these formulas in the Weighted Least Square Empirical Covariance vignette.

Jackknife Covariance

Jackknife method in mmrm is the “leave-one-cluster-out” method. It is also known as “CR3”. Following McCaffrey and Bell (2003), we have

\[ cov(\hat\beta) = (X^\top W X)^{-1}(\sum_{i}{X_i^\top L_{i} (I_{i} - H_{ii})^{-1} L_{i}^\top \hat\epsilon_i\hat\epsilon_i^\top L_{i} (I_{i} - H_{ii})^{-1} L_{i}^\top X_i})(X^\top W X)^{-1} \]

where

\[H_{ii} = X_i(X^\top X)^{-1}X_i^\top\]

Please note that in the paper there is an additional scale parameter \(\frac{n-1}{n}\) where \(n\) is the number of subjects, here we do not include this parameter.

Bias-Reduced Covariance

Bias-reduced method, also known as “CR2”, provides unbiased under correct working model. Following McCaffrey and Bell (2003), we have \[ cov(\hat\beta) = (X^\top W X)^{-1}(\sum_{i}{X_i^\top L_{i} (I_{i} - H_{ii})^{-1/2} L_{i}^\top \hat\epsilon_i\hat\epsilon_i^\top L_{i} (I_{i} - H_{ii})^{-1} L_{i}^\top X_i})(X^\top W X)^{-1} \]

where

\[H_{ii} = X_i(X^\top X)^{-1}X_i^\top\]

Kenward-Roger Covariance

Kenward-Roger covariance is an adjusted covariance matrix for small sample size. Details can be found in Kenward-Roger

McCaffrey, Daniel F, and Robert M Bell. 2003. “Bias Reduction in Standard Errors for Linear Regression with Multi-Stage Samples.” Quality Control and Applied Statistics 48 (6): 677–82.