A Simulation Approach for the Calculation
of Statistical Power in Longitudinal Experimental Designs that
Include Missing Values
Simcha Pollack and Robert
Fireworker, Department of Computer Information
Systems/Decision Sciences, The Peter J. Tobin College of
Business
Leonard Presby, William Paterson
University
Abstract: Designing an experiment
in almost any area of research necessitates a power analysis for
sample size determination. Results from the power analysis
enable researchers to choose a sample size such that, if
the alternative hypothesis is true, they will have a high
probability of reporting statistically significant
findings. Many computer programs and formulas exist for
calculating power when the research design and statistical analysis
are relatively simple. These include independent and paired
t-tests, one-way analysis of variance, and multiple
regression. When designs become more complex, it is
difficult or impossible to do a proper power analysis with the
available tools. One example of such a complex design is the
longitudinal study with missing data points. Suppose one
sample of workers, motivated by Method A, is observed for 5
months. The relevant measure of productivity has a
correlation of .4 between time points, and mean productivity
increases by 1% from one month to the next. Another sample of
workers, motivated by Method B (the experimental approach), is
similar in every way except that the mean is hypothesized to
increase by 2% each month. In both groups the residual variance is
3 at all time points. The correlation structure assumed between
time points (e.g., compound symmetry) greatly affects the findings.
Missing data, which often occur in experiments on human
subjects, complicate both the statistical analysis and the power
analysis. No analytical formula exists to determine the proper
sample size under these conditions. This poster reports on
work toward modeling this situation. Using SAS (Statistical
Analysis System) code, we demonstrate how to estimate power by simulation
for this situation and how the program can easily be modified to handle
an even wider range of models. A live computer
demonstration will also be given.
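
The general idea of estimating power by simulation for the two-group example above can be sketched as follows. This is not the authors' SAS program; it is a minimal illustration in Python under several assumptions that the abstract does not state: a baseline productivity of 100 units (so that 1% and 2% monthly increases are approximated by linear slopes of 1 and 2 units), observations missing completely at random with probability 0.2, and a simple stand-in analysis that fits a least-squares slope per worker and compares the group mean slopes with a normal-approximation test. All function and parameter names are hypothetical.

```python
import math
import random
from statistics import NormalDist, mean, variance


def simulate_subject(rng, slope, n_times=5, baseline=100.0, var=3.0,
                     rho=0.4, p_missing=0.2):
    """Return the observed (month, value) pairs for one worker.

    A shared subject effect plus independent noise gives total variance
    `var` and correlation `rho` between any two months (compound symmetry).
    Each observation is dropped with probability `p_missing` (assumed MCAR).
    """
    shared = rng.gauss(0.0, math.sqrt(rho * var))
    points = []
    for t in range(n_times):
        noise = rng.gauss(0.0, math.sqrt((1.0 - rho) * var))
        y = baseline + slope * t + shared + noise
        if rng.random() >= p_missing:
            points.append((t, y))
    return points


def ols_slope(points):
    """Least-squares slope of value on month; None if fewer than 2 points."""
    if len(points) < 2:
        return None
    t_bar = mean(t for t, _ in points)
    y_bar = mean(y for _, y in points)
    sxx = sum((t - t_bar) ** 2 for t, _ in points)
    return sum((t - t_bar) * (y - y_bar) for t, y in points) / sxx


def group_slopes(rng, n, slope, **kw):
    """Per-worker slopes for one group, skipping workers with too few points."""
    slopes = (ols_slope(simulate_subject(rng, slope, **kw)) for _ in range(n))
    return [s for s in slopes if s is not None]


def estimate_power(n_per_group, slope_a=1.0, slope_b=2.0, n_sims=1000,
                   alpha=0.05, seed=1, **kw):
    """Fraction of simulated experiments that reject equal mean slopes."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    rejections = 0
    for _ in range(n_sims):
        a = group_slopes(rng, n_per_group, slope_a, **kw)
        b = group_slopes(rng, n_per_group, slope_b, **kw)
        if len(a) < 2 or len(b) < 2:
            continue  # too few usable workers; counted as a non-rejection
        se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
        if abs(mean(b) - mean(a)) / se > z_crit:
            rejections += 1
    return rejections / n_sims


if __name__ == "__main__":
    for n in (5, 10, 20):
        print(n, estimate_power(n))
```

Calling estimate_power over a range of group sizes and choosing the smallest size that reaches the target power mirrors the sample size planning described above. Setting slope_b equal to slope_a gives a rough check of the type I error rate of the stand-in test; changing rho, var, or p_missing shows how sensitive the estimate is to those assumptions.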