One-Way ANOVA for Your Thesis: Complete Guide with Post-Hoc Tests, Effect Size, and APA Templates

ANOVA is an omnibus test - it tells you that at least one group mean differs, but not which ones. That is why a significant F-statistic is only the beginning: you still need post-hoc tests to identify the specific pairs and effect size to quantify the magnitude. This guide walks through the three assumptions to check before running ANOVA, the Tukey vs. Games-Howell decision, eta-squared interpretation, and ready-to-paste APA sentences.

Key takeaways

- ANOVA's F-test is omnibus - a significant result means at least one group differs, not which ones. Always follow up with post-hoc tests.
- Post-hoc rule: Tukey when variances are equal (Levene's p > .05); Games-Howell when variances are unequal (Levene's p < .05).
- Never run post-hoc tests when F is not significant - doing so is p-hacking regardless of any individual pair result.
- Effect size η² benchmarks: small = .01, medium = .06, large = .14 - always report it alongside F and p.
- If normality fails (Shapiro-Wilk p < .05), switch to Kruskal-Wallis - the non-parametric equivalent for 3+ groups.

What ANOVA Does - And Why Not Just Use Multiple T-Tests?

ANOVA (Analysis of Variance) tests whether the means of three or more independent groups differ significantly. It compares the variance between groups to the variance within groups - if between-group variance is substantially larger, the F-statistic is large and the p-value is small.

Approach	Groups	Type I Error Rate	Correct?
Three separate t-tests	A vs B, A vs C, B vs C	~14% (inflated)	No
One-way ANOVA	A, B, C simultaneously	5% (controlled)	Yes

Warning: Running multiple t-tests for 3+ groups inflates your Type I error rate (false positive risk). With three tests each at α = .05, the family-wise error rate rises to roughly 1 − (1 − .05)³ ≈ .14, or ~14%, if the tests are treated as independent. ANOVA controls this by testing all groups simultaneously.
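The ~14% figure above can be reproduced with a few lines of Python. This is a minimal sketch that assumes the individual tests are independent (pairwise t-tests on the same groups are not strictly independent, so treat the result as an approximation); the function names are illustrative, not from any library.

```python
def n_pairwise_comparisons(n_groups: int) -> int:
    """Number of distinct pairs among n_groups groups: k(k - 1) / 2."""
    return n_groups * (n_groups - 1) // 2


def familywise_error_rate(alpha: float, n_tests: int) -> float:
    """Chance of at least one false positive across n_tests independent tests."""
    return 1 - (1 - alpha) ** n_tests


# How quickly the false positive risk grows with the number of groups
for k in (3, 4, 5):
    m = n_pairwise_comparisons(k)
    fwer = familywise_error_rate(0.05, m)
    print(f"{k} groups -> {m} t-tests -> family-wise error rate = {fwer:.3f}")
```

With three groups the loop reports the same ~14% shown in the table, and the rate keeps climbing as groups are added.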

Three Assumptions You Must Check Before Running ANOVA

Check all three before running your analysis. Document the results in your methods section.

Assumption	Test to Run	Decision Rule	If Violated
Normality per group	Shapiro-Wilk (each group)	p > .05 = met	Use Kruskal-Wallis
Equal variances	Levene's test	p > .05 = met	Use Welch's ANOVA + Games-Howell
Independence	Study design check	No participant in multiple groups	Cannot fix post-hoc
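The decision rules in the table can be written down as a small helper. This is only a sketch of the table's logic: the function name is hypothetical, the p-values in the example are made up for illustration, and it assumes you have already obtained Shapiro-Wilk and Levene p-values from your statistics software.

```python
from typing import Mapping


def choose_omnibus_test(shapiro_p_by_group: Mapping[str, float],
                        levene_p: float,
                        alpha: float = 0.05) -> str:
    """Apply the assumption table: normality first, then equal variances."""
    # Normality is checked per group; one failing group is enough to switch
    if any(p < alpha for p in shapiro_p_by_group.values()):
        return "Kruskal-Wallis"
    # Normality holds, but the variances differ across groups
    if levene_p < alpha:
        return "Welch's ANOVA"
    return "One-way ANOVA"


# Hypothetical p-values, for illustration only
shapiro = {"spaced": 0.41, "massed": 0.22, "control": 0.63}
print(choose_omnibus_test(shapiro, levene_p=0.011))
```

Independence is not included because it is a property of the study design, not something a p-value can confirm.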

The F-Statistic: What It Measures and What It Does Not Tell You

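The F-statistic = variance between groups ÷ variance within groups. More precisely, F = MS_between ÷ MS_within, where MS_between = SS_between ÷ (k − 1) and MS_within = SS_within ÷ (N − k) for k groups and N observations in total.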

A large F means group means vary more than expected from random sampling - evidence against the null hypothesis that all means are equal.
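To make the ratio concrete, here is a minimal from-scratch sketch of the calculation using only the Python standard library. The function name and the toy data are made up for illustration; in a real thesis you would let SPSS, R, or a statistics library do this and also give you the p-value.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of scores."""
    all_scores = [x for g in groups for x in g]
    grand_mean = mean(all_scores)

    ss_between = 0.0  # how far group means sit from the grand mean
    ss_within = 0.0   # how far scores sit from their own group mean
    for g in groups:
        group_mean = mean(g)
        ss_between += len(g) * (group_mean - grand_mean) ** 2
        ss_within += sum((x - group_mean) ** 2 for x in g)

    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Toy data, for illustration only
toy_groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f_stat, df1, df2 = one_way_anova_f(toy_groups)
print(f"F({df1}, {df2}) = {f_stat:.2f}")
```

In the toy data the third group sits well above the other two while scores within each group are tightly packed, which is exactly the pattern that produces a large F.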

What F does NOT tell you:
- Which specific groups differ (post-hoc tests do that)
- How large the effect is (η² does that)
- Whether the difference is practically meaningful (effect size + context do that)

ANOVA is an omnibus test. Treat a significant F as permission to proceed, not as your final conclusion.

Post-Hoc Tests: Tukey vs. Games-Howell - Which Groups Differ?

Run post-hoc tests only after a significant F. Choose based on Levene's test result.

Post-Hoc Test	When to Use	Variances Assumed Equal?	Controls Type I Error?
Tukey's HSD	Equal group sizes, Levene's p > .05	Yes	Yes (conservative)
Bonferroni	Specific planned comparisons only	Yes	Yes (very conservative)
Games-Howell	Levene's p < .05 (unequal variances)	No	Yes (robust)
LSD (Fisher)	Exploratory only, not recommended	Yes	No (liberal)
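The rule this guide follows (post-hoc only after a significant F, then Tukey or Games-Howell depending on Levene's test) can be sketched as a short function. The name `choose_posthoc` is hypothetical and the p-values in the example are illustrative inputs, not results from real data.

```python
from typing import Optional


def choose_posthoc(anova_p: float,
                   levene_p: float,
                   alpha: float = 0.05) -> Optional[str]:
    """Return the post-hoc test to run, or None if the omnibus F is not significant."""
    if anova_p >= alpha:
        return None  # stop here: no post-hoc tests after a non-significant F
    if levene_p < alpha:
        return "Games-Howell"  # unequal variances
    return "Tukey's HSD"       # equal variances


print(choose_posthoc(anova_p=0.0005, levene_p=0.40))
print(choose_posthoc(anova_p=0.0005, levene_p=0.011))
print(choose_posthoc(anova_p=0.294, levene_p=0.40))
```

Bonferroni and LSD are left out on purpose: the table reserves the first for planned comparisons and does not recommend the second.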

Effect Size: Eta-Squared (η²) - How Much Variance Is Explained?

η² = SS_between ÷ SS_total. It represents the proportion of total variance in the outcome explained by group membership.

Cohen's benchmarks:
- Small: η² = .01
- Medium: η² = .06
- Large: η² = .14

SPSS reports Partial Eta Squared (ηp²) in the output. For one-way ANOVA with a single factor, η² and ηp² are identical.

If your η² = .16, it means 16% of the variance in your outcome is explained by group - a large effect in Cohen's framework.
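If your output gives F and the degrees of freedom but not the sums of squares, η² for a one-way ANOVA can be recovered as (F × df_between) ÷ (F × df_between + df_within), which is algebraically the same as SS_between ÷ SS_total. The sketch below shows both routes; the function names and the "negligible" label for values under .01 are assumptions added for illustration, and the benchmarks are treated as lower bounds.

```python
def eta_squared(ss_between: float, ss_total: float) -> float:
    """Proportion of total variance explained by group membership."""
    return ss_between / ss_total


def eta_squared_from_f(f_stat: float, df_between: int, df_within: float) -> float:
    """Same quantity, recovered from F and its degrees of freedom (one-way only)."""
    return (f_stat * df_between) / (f_stat * df_between + df_within)


def cohen_label(eta_sq: float) -> str:
    """Map eta-squared onto Cohen's benchmarks (.01 / .06 / .14)."""
    if eta_sq >= 0.14:
        return "large"
    if eta_sq >= 0.06:
        return "medium"
    if eta_sq >= 0.01:
        return "small"
    return "negligible"  # label assumed here; Cohen gives no name for this range


# Values taken from the reporting template in this guide: F(2, 87) = 8.42
eta = eta_squared_from_f(8.42, 2, 87)
print(f"eta squared = {eta:.2f} ({cohen_label(eta)})")
```

This is also a quick way to check that the η² you report is consistent with the F and degrees of freedom in the same sentence.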

APA Reporting Templates (Copy and Adapt)

Complete report - significant ANOVA:
"A one-way ANOVA revealed a significant effect of study condition on exam score, F(2, 87) = 8.42, p < .001, η² = .16. Post-hoc comparisons using Tukey's HSD indicated that the spaced practice group (M = 78.3, SD = 9.1) scored significantly higher than the massed practice group (M = 68.7, SD = 11.4, p = .003) and the control group (M = 65.2, SD = 10.8, p < .001). The massed practice and control groups did not differ significantly (p = .421)."

Welch's ANOVA (unequal variances):
"Levene's test indicated unequal variances across groups (p = .011), so Welch's ANOVA was used. A significant effect was found, F(2, 41.3) = 6.87, p = .003, η² = .13. Post-hoc comparisons used Games-Howell."

Non-significant ANOVA:
"A one-way ANOVA revealed no significant difference between groups, F(2, 87) = 1.24, p = .294, η² = .03."
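If you assemble results in a script, a small formatter can help keep the statistics string consistent with the templates above. This is a sketch under assumptions: the function names are made up, and the formatting choices (two decimals for F and η², three for p, no leading zero, "p < .001" as the floor) follow the way the templates in this guide are written.

```python
def format_p(p: float) -> str:
    """APA-style p-value: three decimals, no leading zero, floor at < .001."""
    if p < 0.001:
        return "p < .001"
    return "p = " + f"{p:.3f}".lstrip("0")


def apa_anova(f_stat: float, df_between, df_within, p: float, eta_sq: float) -> str:
    """Build the statistics part of an ANOVA sentence, e.g. F(2, 87) = 8.42, ..."""
    eta_text = f"{eta_sq:.2f}".lstrip("0")  # .16 rather than 0.16
    return f"F({df_between}, {df_within}) = {f_stat:.2f}, {format_p(p)}, η² = {eta_text}"


# Numbers taken from the templates above
print(apa_anova(8.42, 2, 87, 0.0005, 0.162))
print(apa_anova(1.24, 2, 87, 0.294, 0.028))
```

The group means, standard deviations, and the wording around the statistics still need to be written by hand.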

Frequently asked questions

What is the difference between one-way and two-way ANOVA?

One-way ANOVA has one independent variable (factor) with three or more levels. Two-way ANOVA has two independent variables and tests main effects of each factor plus the interaction between them. Two-way ANOVA is used when your design involves two grouping variables - for example, studying the effects of both treatment type and gender on an outcome.

Do I still need post-hoc tests if my ANOVA is not significant?

No. If the omnibus F-test is not significant (p ≥ .05), you stop there. A non-significant ANOVA means there is insufficient evidence that any group means differ. Running post-hoc tests anyway and reporting whichever pair happens to reach significance is p-hacking.

What do I do if Levene's test is significant?

A significant Levene's test (p < .05) means the equal-variances assumption is violated. Switch to Welch's ANOVA and use Games-Howell for post-hoc comparisons. SPSS can print Welch's F alongside the standard ANOVA if you tick Welch under Options in the One-Way ANOVA dialog. Report it the same way but note that Welch's correction was applied.

What is Kruskal-Wallis and when should I use it instead of ANOVA?

Kruskal-Wallis is the non-parametric equivalent of one-way ANOVA. Use it when the normality assumption fails (Shapiro-Wilk p < .05) and you cannot justify treating the data as approximately normal. It tests whether the rank distributions of three or more groups differ, rather than comparing means directly. Report the H-statistic, degrees of freedom, and p-value.

What is the difference between eta-squared and partial eta-squared in ANOVA?

Eta-squared (η²) = SS_between ÷ SS_total: the proportion of total variance explained by the factor. Partial eta-squared (ηp²) = SS_between ÷ (SS_between + SS_error): the proportion of variance explained after removing variance from other factors in the model. For one-way ANOVA with a single factor, both values are identical. For factorial ANOVA with multiple factors, report ηp² for each factor separately.

How many participants do I need per group for a reliable one-way ANOVA?

A common rule of thumb is 20–30 participants per group, but that is usually too few to detect a medium effect reliably. Run a G*Power analysis before data collection to confirm: for a one-way ANOVA with 3 groups, a medium effect (f = .25), α = .05, and 80% power, you need approximately 159 participants in total (53 per group). Smaller groups reduce your ability to detect real differences.

Statoria Team

Statistics educators & software developers

We build Statoria to help bachelor and master students get through their thesis data analysis without stress. Our guides are written by researchers with experience in social science statistics and student supervision.