A talk by Awan Afiaz on 18 September 2024
September 18 @ 10:00 am - 11:30 am
Title: Optimal Sandwich Variance Estimator in Penalized GEE for Nearly Separated Longitudinal Binary Data with Small Samples
Venue and time: ISRT, 10:00 am
Speaker: Awan Afiaz, PhD candidate at the Department of Biostatistics, University of Washington, Seattle, WA, USA, and ISRT alumnus
Abstract:
Data separation arises in both independent and correlated binary data in biomedical studies and poses a substantial challenge that can lead to unreliable estimates and misleading inferences. This problem can occur due to a small sample size, a rare exposure or event, a very strong predictor or linear combination of predictors, high within-subject correlation (intraclass correlation, ICC), or any combination of these issues. Penalized generalized estimating equations (GEE), together with bias-corrected sandwich variance estimators, have been shown to be the superior approach for handling separation in binary longitudinal data. Although the sandwich variance estimator is valid under misspecification of the working correlation structure in GEE, it is biased downward in small samples and requires large samples for its asymptotic advantages to take effect. This has led to the development of several modified robust variance estimators for GEE with small samples, which motivates the search for the optimal sandwich estimator in the context of penalized GEE when there is near separation (sparsity) in the data. The current study proposed a bias-corrected sandwich variance estimator for penalized GEE and compared its performance with ten extant sandwich estimators for nearly separated data using a simulation study. To motivate the need for an optimal sandwich estimator in penalized GEE, we demonstrated that the existing small-sample estimators provided contradictory results when applied to dermatophyte-toe onychomycosis trial data. The proposed sandwich estimator does not require any assumptions beyond those already employed by the original sandwich estimator for GEE. We evaluated the proposed sandwich estimator by assessing the ratio of the average standard error (SE) to the empirical standard deviation (SD) and by calculating the type-I error rates for Wald tests of the regression coefficients.
Our simulation studies showed that the proposed estimator yielded nominal-level type-I error rates based on Wald tests of regression coefficients, regardless of whether the working correlation model was correctly specified. Furthermore, while existing approaches performed well when the number of subjects was large, the proposed estimator achieved nominal type-I error rates with sample sizes as low as 10, even in the most extreme scenarios. Even though all existing sandwich estimators performed better as the number of subjects increased, exhibiting the usual asymptotic behavior of sandwich estimators, no other estimator uniformly achieved optimal performance faster (with respect to the number of subjects and ICC) than our proposed estimator.
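The two evaluation criteria named in the abstract (the ratio of the average SE to the empirical SD, and the type-I error rate of Wald tests) can be illustrated with a minimal Python sketch. This is not the speaker's code: the function name `evaluate_variance_estimator`, the toy data, and the use of a standard normal reference distribution for the Wald test are all assumptions made here for illustration.

```python
import numpy as np
from statistics import NormalDist


def evaluate_variance_estimator(beta_hats, ses, beta_null=0.0, alpha=0.05):
    """Summarize one variance estimator across simulation replicates.

    beta_hats: coefficient estimates, one per simulated data set
    ses:       the matching standard errors from the variance estimator
    Returns (SE/SD ratio, empirical type-I error rate of the Wald test).
    Hypothetical helper; a normal reference distribution is assumed.
    """
    beta_hats = np.asarray(beta_hats, dtype=float)
    ses = np.asarray(ses, dtype=float)

    # Ratio near 1 means the SEs track the true sampling variability;
    # a ratio below 1 indicates downward-biased standard errors.
    se_sd_ratio = ses.mean() / beta_hats.std(ddof=1)

    # Two-sided Wald test of H0: beta = beta_null at level alpha.
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    wald_z = (beta_hats - beta_null) / ses
    type1_error = float(np.mean(np.abs(wald_z) > z_crit))

    return float(se_sd_ratio), type1_error


# Toy replicates generated under the null (true coefficient 0, true SD 0.5).
rng = np.random.default_rng(0)
beta_hats = rng.normal(loc=0.0, scale=0.5, size=5000)

# A well-calibrated estimator versus one whose SEs are 20% too small.
calibrated = evaluate_variance_estimator(beta_hats, np.full(5000, 0.5))
too_small = evaluate_variance_estimator(beta_hats, np.full(5000, 0.4))
print(calibrated, too_small)
```

In this toy setting, standard errors that are too small push the ratio below 1 and inflate the rejection rate above the nominal 5% level, which is the small-sample behavior the abstract attributes to the uncorrected sandwich estimator.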