ANOVA: What It Means
In a Six Sigma DMAIC project, it is often necessary to compare data for different groups or conditions to determine whether any differences exist. In the Analyze phase, we seek to validate root causes by verifying whether manipulating an input variable affects process performance or another outcome measure such as customer satisfaction. During the Improve phase, we seek to confirm that implementing process improvements actually results in a change in one or more process metrics. Other projects that involve data measurement may also require a means of comparing group data.
To compare group data when the input variable is discrete and the output variable is continuous, use the analysis of variance (ANOVA) test. This test compares the variance within each group to the variance between the groups to determine whether any differences among the group means exist. In other words, it determines whether differences among samples in different groups exist solely because of random variation affecting all groups or whether something specific about a condition itself creates a difference.
The ANOVA is based on calculation of the F-statistic, which is the result when you divide the variance between groups by the variance within groups. If there are no differences among groups, those two values are expected to be about equal, resulting in an F value close to 1. If F is significantly greater than 1, as determined by comparing it with the critical value in an F table for the appropriate degrees of freedom, you would reject the null hypothesis that all group means are equal and conclude that there is at least one group that differs from at least one other group.
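As a rough illustration of the calculation, the following Python sketch computes the F-statistic by hand using only the standard library. The function name f_statistic and the cycle-time data are hypothetical and chosen only to make the arithmetic easy to follow; in practice a statistics package would do this work and also report a p-value.

```python
from statistics import mean


def f_statistic(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual values around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    df_between = k - 1
    df_within = n_total - k

    # Each "variance" is a sum of squares divided by its degrees of freedom.
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return ms_between / ms_within, df_between, df_within


# Hypothetical cycle times (minutes) under three process conditions.
groups = [
    [4.0, 5.0, 6.0],
    [6.0, 7.0, 8.0],
    [9.0, 10.0, 11.0],
]

f_value, df_b, df_w = f_statistic(groups)
print(f"F = {f_value:.2f} with ({df_b}, {df_w}) degrees of freedom")
```

For this made-up data the between-group variance is 19 and the within-group variance is 1, so F is 19 with 2 and 6 degrees of freedom. The tabled critical value at the 0.05 level for those degrees of freedom is about 5.14, so the null hypothesis would be rejected here.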
Interpreting Analysis of Variance
While ANOVA can confirm that there is a difference among groups, it does not tell you which groups are significantly different from which other groups. You will need to examine a chart of the data and possibly conduct additional testing to confirm which groups differ. When you run an ANOVA, a statistical program such as Minitab usually provides a full analysis that includes a table of means and standard deviations for each group plus a chart showing means and confidence intervals. If ANOVA shows that there is a difference among groups, you can start by identifying the two groups that differ most. They generally have non-overlapping confidence intervals.
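The first step of that follow-up, building a table of group means and standard deviations and finding the pair of groups whose means are farthest apart, might be sketched in Python as follows. The group labels and values are hypothetical, and this is only a screening step under those assumptions; it does not compute confidence intervals or replace a formal multiple-comparison test.

```python
from itertools import combinations
from statistics import mean, stdev

# Hypothetical measurements for three process conditions.
data = {
    "A": [4.0, 5.0, 6.0],
    "B": [6.0, 7.0, 8.0],
    "C": [9.0, 10.0, 11.0],
}

# Table of (mean, sample standard deviation) for each group.
summary = {name: (mean(values), stdev(values)) for name, values in data.items()}
for name, (m, s) in summary.items():
    print(f"{name}: mean = {m:.2f}, std dev = {s:.2f}")

# Rank every pair of groups by the absolute difference in their means.
pairs = sorted(
    combinations(summary, 2),
    key=lambda pair: abs(summary[pair[0]][0] - summary[pair[1]][0]),
    reverse=True,
)
largest_gap = pairs[0]
print("Groups that differ most:", largest_gap)
```

The pair at the top of the ranking is the natural place to start when examining the chart of means and confidence intervals.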
The ANOVA carries with it several assumptions:
- The variances of the groups are equal. This is known as homoscedasticity. Your statistics software may conduct a test for equal variance, such as Bartlett’s test or Levene’s test, as part of the ANOVA, or you may need to run this analysis yourself.
- The sample data is representative of the population overall. This holds true for all statistical testing.
- The data in each group are distributed normally. Run a test of normality to confirm this.
- For process data as in a Six Sigma project, the process is stable, meaning only common cause variation is present and there are no trends.
If these assumptions do not hold for your data, you run the risk of drawing incorrect conclusions about differences among groups.
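To show what an equal-variance check involves when your software does not run one for you, here is a minimal Python sketch of a Levene-type statistic. It assumes the median-centered variant (often called the Brown-Forsythe form), which amounts to running a one-way ANOVA on each value's absolute deviation from its group median. The function names and the data are hypothetical.

```python
from statistics import mean, median


def one_way_f(groups):
    """F-statistic of a one-way ANOVA on the supplied groups."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n_total - k))


def levene_statistic(groups):
    """Levene-type statistic using deviations from each group's median."""
    # Replace every value with its absolute distance from the group median,
    # then test whether the average distance differs among groups.
    deviations = [[abs(x - median(g)) for x in g] for g in groups]
    return one_way_f(deviations)


# Hypothetical data in which the third group is more spread out.
groups = [
    [4.0, 5.0, 6.0],
    [6.0, 7.0, 8.0],
    [8.0, 10.0, 12.0],
]

w = levene_statistic(groups)
print(f"Levene-type statistic W = {w:.3f}")
```

The statistic is compared with an F critical value using the same degrees of freedom as the ANOVA itself (number of groups minus 1, and total observations minus number of groups). A large value suggests the group variances are not equal.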
Note that a different test, the Analysis of Means (ANOM), allows you to determine whether any groups differ from the overall average of the groups. Depending on your analysis needs, you may choose to conduct the ANOM instead of or in addition to the ANOVA.