Compute sample size and power for a logistic regression in Excel
This tutorial explains how to calculate the sample size and power for a logistic regression in Excel using XLSTAT.
What is the power of a statistical test?
When testing a hypothesis using a statistical test, there are several decisions to take:
The null hypothesis H0 and the alternative hypothesis Ha.
The statistical test to use.
The type I error also known as alpha. It occurs when one rejects the null hypothesis when it is true. It is set a priori for each test and is 5%.
The type II error or beta is less studied but is of great importance. In fact, it represents the probability that one does not reject the null hypothesis when it is false. We can not fix it upfront, but based on other parameters of the model we can try to minimize it. The power of a test is calculated as 1-beta and represents the probability that we reject the null hypothesis when it is false.
We therefore wish to maximize the power of the test. XLSTAT calculates the power (and beta) when other parameters are known. For a given power, it also allows to calculate the sample size that is necessary to reach that power.
The statistical power calculations are usually done before the experiment is conducted. The main application of power calculations is to estimate the number of observations necessary to properly conduct an experiment.
Goal of this tutorial
We want to study the relationship between cancer occurence and a diet program. The baseline probability of occurrence of this cancer is 6%. It is assumed that the odds ratio is 2, and that 25% of people in the sample are following this particular diet program.
In this tutorial, we will find out what is the right sample size to perform this study in order to obtain a power of 0.9.
Setting up the sample size calculation for a logistic regression
Once XLSTAT has been launched, click on the Power icon and choose Logistic regression.
Once the button has been clicked, the dialog box pops up.
You must then choose the Find sample size objective.
The alpha is 0.05. The desired power is 0.9.
As explained earlier, we take a reference probability of 0.06 and an odds ratio of 2.
In the Chart tab, the simulation plot option is activated and the size of sample 1 will be represented on the vertical axis and the power on the horizontal axis.
The power varies between 0.8 and 0.95 in intervals of 0.01.
Once you click on the OK button, the calculations are done and then the results are displayed.
Interpret the results of sample size calculations for a logistic regression
The first table gathers the parameters used as input.
The second table gathers the results of the calculation as well as an interpretation.
We see that 3011 observations are needed per sample to obtain a power as close as possible to 0.9.
The following table gathers the calculations obtained for each value of the power between 0.8 and 0.95.
The simulation plot shows the evolution of the sample size as a function of the power. We see that for a power of 0.8, 2190 observations are enough and that for a power of 0.95 we arrive at 3788 observations.
XLSTAT is therefore a powerful tool both for finding the sample size required for an analysis and for calculating the power of a test. If we are not sure of our sample parameters, it is also possible to restart the analysis with a different odds ratio or a percentage of N with X1 = 1 which can reassure us about the sample size to choose.
Was this article useful?