Properties of analysis methods that account for clustering in volume-outcome studies when the primary predictor is cluster size
Data Interpretation, Statistical
In recent years, health services researchers have conducted 'volume-outcome' studies to evaluate whether providers (hospitals or surgeons) who treat many patients for a specialized condition have better outcomes than those who treat few patients. These studies, and the inherent clustering of events by provider, present an unusual statistical problem: 'volume' is both the primary factor under study and the cluster size. Consequently, the assumptions underlying available methods that correct for clustering might be violated in this setting. To address this issue, we investigate via simulation the properties of three estimation procedures for the analysis of cluster-correlated data, specifically in the context of volume-outcome studies. We examine and compare the validity and efficiency of widely available statistical techniques that have been used in volume-outcome studies: generalized estimating equations (GEE) using both the independence and exchangeable correlation structures; random effects models; and the weighted GEE approach proposed by Williamson et al. (Biometrics 2003; 59:36-42) to account for informative clustering. Using data generated from either an underlying true random effects model or a cluster-correlated model, we show that both random effects models and GEE with an exchangeable correlation structure have generally good properties, with relatively low bias for estimating the volume parameter and its variance. By contrast, the cluster-weighted GEE method is inefficient.
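The contrast between unweighted and cluster-weighted estimation can be illustrated with a minimal Python sketch. It is not the simulation design used in the study: the function names, the parameter values, the range of cluster sizes, and the dichotomization of volume at 100 patients are all assumptions made for illustration. Data are drawn from a random-intercept logistic model in which the covariate is derived from cluster size. With a single binary covariate, the independence-GEE estimating equations have a closed-form solution (the log odds ratio of weighted event proportions), so no iterative fitting is needed; setting each observation's weight to the inverse of its cluster size gives a cluster-weighted estimate in the spirit of Williamson et al.

```python
import math
import random


def logit(p):
    return math.log(p / (1.0 - p))


def expit(x):
    return 1.0 / (1.0 + math.exp(-x))


def simulate_clusters(n_clusters, beta0, beta_volume, sigma_u, rng):
    """Simulate binary outcomes from a random-intercept logistic model.

    Each cluster (provider) draws a size n_i; the covariate is a
    high-volume indicator derived from n_i (an illustrative assumption).
    Returns a list of (high_volume, n_i, events_i) tuples.
    """
    clusters = []
    for _ in range(n_clusters):
        n_i = rng.randint(5, 200)            # assumed range of volumes
        high = 1 if n_i >= 100 else 0        # assumed cut-point
        u_i = rng.gauss(0.0, sigma_u)        # provider random intercept
        p_i = expit(beta0 + beta_volume * high + u_i)
        events = sum(1 for _ in range(n_i) if rng.random() < p_i)
        clusters.append((high, n_i, events))
    return clusters


def log_odds_ratio(clusters, cluster_weighted):
    """Closed-form independence-GEE estimate for a binary covariate.

    Unweighted: every patient counts equally (pooled proportions).
    Cluster-weighted: each patient gets weight 1 / n_i, so every
    cluster contributes equally regardless of its size.
    """
    props = {}
    for group in (0, 1):
        num = den = 0.0
        for high, n_i, events in clusters:
            if high != group:
                continue
            w = 1.0 / n_i if cluster_weighted else 1.0
            num += w * events
            den += w * n_i
        props[group] = num / den
    return logit(props[1]) - logit(props[0])


if __name__ == "__main__":
    rng = random.Random(2024)
    data = simulate_clusters(400, beta0=-2.0, beta_volume=-0.5,
                             sigma_u=0.5, rng=rng)
    print("unweighted GEE (independence):", log_odds_ratio(data, False))
    print("cluster-weighted GEE:         ", log_odds_ratio(data, True))
```

In this sketch cluster size affects the outcome only through the covariate, so both estimators target the same marginal log odds ratio. Repeating the simulation over many seeds and comparing the spread of the two estimates would be one way to examine relative efficiency under these assumed settings.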