T-Test for Your Thesis: Complete Guide with Assumption Checks, Effect Size, and APA Copy-Paste Templates

The t-test is the most reported statistical test in student theses - and the one most often reported incorrectly. Skip the Shapiro-Wilk normality check and your supervisor will flag it in your defense. Report p < .05 without Cohen's d and you are missing half the story. This guide gives you exact software menu paths for three t-test types, a step-by-step assumption checklist, Cohen's d benchmarks, and ready-to-paste APA sentences for every outcome.

Key takeaways

Always run Shapiro-Wilk before a t-test - if p < .05, switch to Mann-Whitney U (independent) or Wilcoxon (paired) instead.
Welch's t-test handles unequal variances automatically - most software reports it alongside the standard version; use it when Levene's p < .05.
Cohen's d is mandatory - small = 0.2, medium = 0.5, large = 0.8 - always report it next to your p-value.
For paired t-tests, check normality of the difference scores, not the raw scores - this is the assumption that matters.
Your methods section must state the test type, software version, and results of Shapiro-Wilk and Levene's - not just the final t and p.

What the T-Test Actually Measures

A t-test compares two means and asks: is the difference between them large enough - relative to the variability in the data - that it is unlikely to have arisen by chance alone?

The result is a t-statistic and a p-value. A significant result (p < .05) means you reject the null hypothesis that the two means are equal in the population.

Three versions exist for three different designs:
- Independent samples t-test: two separate groups
- Paired samples t-test: same subjects measured twice
- One-sample t-test: one group compared to a known value
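As a rough illustration of what the independent samples version computes, the sketch below derives the pooled-variance t-statistic from summary statistics alone. The function name `pooled_t` is hypothetical, and the group size of 25 per group is an assumption made for illustration; the means and standard deviations are example values that match the reporting template later in this guide.

```python
import math

def pooled_t(m1, s1, n1, m2, s2, n2):
    """Student's t for two independent groups, assuming equal variances."""
    df = n1 + n2 - 2
    # Pooled variance: weighted average of the two sample variances.
    pooled_var = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / df
    # Standard error of the difference between the two means.
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (m1 - m2) / se, df

# Assumed group sizes: 25 per group (illustrative only).
t, df = pooled_t(74.2, 8.1, 25, 68.5, 9.3, 25)
print(f"t({df}) = {t:.2f}")  # t(48) = 2.31
```

The p-value itself comes from the t distribution with `df` degrees of freedom, which SPSS, Jamovi, or JASP look up for you.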

Independent Samples T-Test: Software Menu Paths and What to Check

Use when: comparing two distinct, non-overlapping groups on a metric outcome.

Examples: male vs. female on exam score; treatment vs. control group; two university programmes.

Software paths:
SPSS → Analyze → Compare Means → Independent Samples T Test
Jamovi → Analyses → T-Tests → Independent Samples T-Test
JASP → T-Tests → Independent Samples T-Test

Assumption	Test to Run	What to Check	If Violated
Normality	Shapiro-Wilk (each group)	W statistic, p > .05	Use Mann-Whitney U
Equal variances	Levene's test	F statistic, p > .05	Use Welch's t-test
Independence	Study design review	No subject appears twice	Cannot fix post-hoc
Metric scale	Variable type check	Interval or ratio scale	Use Mann-Whitney U
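The decision logic in the table above can be sketched as a small helper. This is only one way to encode it: the function name `choose_independent_test` and the default threshold of .05 are assumptions for illustration, and the p-values passed in are taken from your software output, not computed here.

```python
def choose_independent_test(shapiro_p_a, shapiro_p_b, levene_p, alpha=0.05):
    """Pick a test for two independent groups from assumption-check p-values."""
    # Normality must hold in each group; otherwise fall back to a rank-based test.
    if shapiro_p_a < alpha or shapiro_p_b < alpha:
        return "Mann-Whitney U"
    # Normality holds but variances differ: use Welch's correction.
    if levene_p < alpha:
        return "Welch's t-test"
    return "Independent samples t-test"

# Example p-values: both Shapiro-Wilk checks and Levene's test non-significant.
print(choose_independent_test(0.14, 0.22, 0.342))  # Independent samples t-test
```

Independence and measurement scale are deliberately left out of the helper, since they come from the study design rather than from a test statistic.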

Warning: Never skip Shapiro-Wilk. Running a t-test on non-normally distributed data without justification is one of the most common triggers for questions at a thesis defense. Run it, report it, and state whether the assumption is met.

Paired T-Test: Same Subjects Measured Twice

Use when: the same participants are measured at two time points, or you have matched pairs.

Examples: pre-test vs. post-test scores; scores before and after an intervention; matched sibling pairs.

Key assumption: normality applies to the difference scores (post − pre), not the raw scores. Run Shapiro-Wilk on the computed difference variable.

Software paths:
SPSS → Analyze → Compare Means → Paired-Samples T Test
Jamovi → Analyses → T-Tests → Paired Samples T-Test

If normality of differences fails → use Wilcoxon signed-rank test.

The paired test is usually more powerful than the independent test because it removes between-subject variability from the error term - when the two measurements are positively correlated, you need fewer participants to detect the same effect.
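To make the role of the difference scores concrete, here is a minimal sketch using only Python's standard library. The six pre/post scores are invented for illustration, and `d_z` (mean difference divided by the SD of the differences) is one common way to standardize a paired effect, not the only one.

```python
import math
import statistics

pre = [60, 62, 58, 65, 70, 55]    # hypothetical pre-test scores
post = [66, 65, 63, 70, 74, 62]   # hypothetical post-test scores, same participants

# The paired test works on a single column: the per-participant differences.
diffs = [after - before for before, after in zip(pre, post)]

n = len(diffs)
mean_diff = statistics.mean(diffs)
sd_diff = statistics.stdev(diffs)          # sample SD (n - 1 denominator)

t = mean_diff / (sd_diff / math.sqrt(n))   # paired t-statistic
df = n - 1
d_z = mean_diff / sd_diff                  # standardized mean of the differences

print(diffs)                  # [6, 3, 5, 5, 4, 7]
print(f"t({df}) = {t:.2f}")   # t(5) = 8.66
```

The `diffs` list is the variable the normality check applies to; the raw `pre` and `post` columns are not tested separately.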

One-Sample T-Test: Comparing Against a Known Value

Use when: you have one group and want to test whether its mean differs from a known reference value.

Examples: "Does my sample's mean (M = 67) differ from the population mean of 70?" or "Is customer satisfaction in my sample above the industry benchmark of 3.5 on a 5-point scale?"

Software paths:
SPSS → Analyze → Compare Means → One-Sample T Test → enter Test Value
Jamovi → Analyses → T-Tests → One Sample T-Test → set the test value

This test is less common in theses but useful when comparing against published norms, national statistics, or theoretical benchmarks.
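The arithmetic behind the one-sample test is short enough to sketch directly. The sample mean of 67 and the reference value of 70 come from the example above; the standard deviation of 10 and the sample size of 25 are assumed values added purely for illustration, and `one_sample_t` is a hypothetical helper name.

```python
import math

def one_sample_t(sample_mean, sample_sd, n, test_value):
    """t-statistic and degrees of freedom for a one-sample t-test."""
    se = sample_sd / math.sqrt(n)   # standard error of the mean
    return (sample_mean - test_value) / se, n - 1

# SD = 10 and n = 25 are assumed values, not taken from real data.
t, df = one_sample_t(sample_mean=67, sample_sd=10, n=25, test_value=70)
print(f"t({df}) = {t:.2f}")  # t(24) = -1.50
```

The negative sign simply reflects that the sample mean lies below the reference value.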

Effect Size: Cohen's d - What It Means and How to Calculate It

Cohen's d measures how many standard deviations the two means differ by. It tells you the practical importance of your finding, independent of sample size.

Cohen's d Value	Effect Size	Plain-Language Meaning
0.2	Small	Means differ by 0.2 SDs - subtle, may need large N to detect
0.5	Medium	Means differ by 0.5 SDs - noticeable in practice
0.8	Large	Means differ by 0.8 SDs - clearly visible difference
> 1.0	Very large	Means differ by more than 1 SD - strong, practical effect
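For two independent groups, Cohen's d is commonly computed as the mean difference divided by the pooled standard deviation. The sketch below assumes that definition; the helper names `cohens_d` and `label_effect` are hypothetical, the group size of 25 per group is an assumption, and the cut-offs simply mirror the benchmark table above.

```python
import math

def cohens_d(m1, s1, n1, m2, s2, n2):
    """Cohen's d for two independent groups, using the pooled SD."""
    pooled_sd = math.sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))
    return (m1 - m2) / pooled_sd

def label_effect(d):
    """Map |d| onto the benchmark rows of the table above."""
    size = abs(d)
    if size > 1.0:
        return "very large"
    if size >= 0.8:
        return "large"
    if size >= 0.5:
        return "medium"
    if size >= 0.2:
        return "small"
    return "below the small benchmark"

# Example means and SDs; n = 25 per group is an assumed value.
d = cohens_d(74.2, 8.1, 25, 68.5, 9.3, 25)
print(f"d = {d:.2f} ({label_effect(d)})")  # d = 0.65 (medium)
```

The benchmarks are conventions rather than hard thresholds, so the label is a starting point for interpretation in your field, not a verdict.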

APA Reporting Templates (Copy and Adapt)

Independent t-test - significant result:
"An independent samples t-test revealed a significant difference between Group A (M = 74.2, SD = 8.1) and Group B (M = 68.5, SD = 9.3), t(48) = 2.31, p = .025, d = 0.65."

Independent t-test - non-significant result:
"No significant difference was found between Group A (M = 72.1, SD = 8.4) and Group B (M = 70.3, SD = 9.1), t(48) = 0.72, p = .476, d = 0.21."

Paired t-test - significant result:
"A paired samples t-test indicated that scores increased significantly from pre-test (M = 61.3, SD = 9.4) to post-test (M = 71.8, SD = 8.7), t(29) = 5.42, p < .001, d = 0.99."

Welch's t-test (unequal variances):
"Levene's test indicated unequal variances (p = .018), so Welch's correction was applied. A significant difference was found, t(41.3) = 2.67, p = .011, d = 0.73."

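If you report many tests, a small formatter can keep the statistics fragment of these sentences consistent. The sketch below is one possible approach, with hypothetical helper names `format_p` and `format_t_result`; it follows the conventions used in the templates above (no leading zero on p, "p < .001" for very small values, two decimals for t and d).

```python
def format_p(p):
    """APA style: exact p to three decimals without a leading zero, or p < .001."""
    if p < 0.001:
        return "p < .001"
    return "p = " + f"{p:.3f}".lstrip("0")

def format_t_result(t, df, p, d):
    """Build the statistics fragment of an APA sentence."""
    # Welch's test yields fractional df; show one decimal only when needed.
    df_text = str(int(df)) if float(df).is_integer() else f"{df:.1f}"
    return f"t({df_text}) = {t:.2f}, {format_p(p)}, d = {d:.2f}"

print(format_t_result(2.31, 48, 0.025, 0.65))    # t(48) = 2.31, p = .025, d = 0.65
print(format_t_result(2.67, 41.3, 0.011, 0.73))  # t(41.3) = 2.67, p = .011, d = 0.73
print(format_p(0.0004))                          # p < .001
```

The surrounding sentence (group names, means, and standard deviations) still needs to be written by hand.
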
Common T-Test Mistakes in Thesis Research

These are the errors supervisors flag most often during thesis reviews.

Mistake	Why It Is Wrong	How to Fix It
Skipping Shapiro-Wilk	Cannot justify parametric assumption	Always run and report W and p
Reporting p = .000	A p-value is never exactly zero; the software has rounded it	Write p < .001 instead
No Cohen's d	Shows statistical significance but not practical significance	Calculate d and benchmark it
Using t-test on 3+ groups	Inflates Type I error rate	Use one-way ANOVA instead
Ignoring Levene's test	Variance inequality distorts results	Check and use Welch's if p < .05
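The Type I error inflation from running pairwise t-tests on three or more groups can be illustrated with a short calculation. This sketch assumes the pairwise comparisons are independent, which is only an approximation, and the helper name `familywise_error` is hypothetical.

```python
from math import comb

def familywise_error(n_groups, alpha=0.05):
    """Chance of at least one false positive across all pairwise t-tests."""
    n_tests = comb(n_groups, 2)   # number of pairwise comparisons
    # Assumes independent comparisons, which is only an approximation.
    return n_tests, 1 - (1 - alpha) ** n_tests

for k in (3, 4, 5):
    n_tests, fwer = familywise_error(k)
    print(f"{k} groups: {n_tests} tests, familywise error ~ {fwer:.3f}")
# 3 groups: 3 tests, familywise error ~ 0.143
# 4 groups: 6 tests, familywise error ~ 0.265
# 5 groups: 10 tests, familywise error ~ 0.401
```

Even with three groups, the chance of at least one false positive is roughly 14% instead of the nominal 5%, which is why a single one-way ANOVA is preferred.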

Frequently asked questions

What is the difference between a t-test and a z-test?

A z-test requires that the population standard deviation is known; with large samples (N > 30) the two tests give nearly identical results. In practice, you almost never know the population SD, so the t-test is used instead. For thesis work, default to a t-test unless specifically instructed otherwise.

How many participants do I need for a t-test to be valid?

Technically, a t-test can be run on very small samples (even n = 5 per group), but the power to detect effects is low. A power analysis in G*Power (free software) will tell you the required sample size for your expected effect size. For a medium effect (Cohen's d = 0.5) with 80% power at α = .05 (two-tailed), you need approximately 64 participants per group for an independent t-test.
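For a quick sanity check before opening a power-analysis tool, the required sample size for a two-sided independent t-test can be approximated with the normal distribution. This is a rough sketch, not the exact method dedicated software uses; the function name `n_per_group` is hypothetical, and α = .05 and 80% power are assumed defaults.

```python
import math
from statistics import NormalDist

def n_per_group(d, alpha=0.05, power=0.80):
    """Approximate n per group for a two-sided independent t-test (normal approximation)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)   # two-sided critical value
    z_power = NormalDist().inv_cdf(power)           # quantile for the target power
    return math.ceil(2 * ((z_alpha + z_power) / d) ** 2)

print(n_per_group(0.5))   # 63 for a medium effect
print(n_per_group(0.8))   # 25 for a large effect
```

Because the normal approximation ignores the extra uncertainty from estimating the standard deviation, exact t-based tools such as G*Power typically return one or two more participants per group.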

Can I use a t-test on Likert scale data?

Strictly no - t-tests assume metric (interval/ratio) data. For single Likert items, use Mann-Whitney U (independent) or Wilcoxon (paired). For composite scores averaged across 5+ Likert items with Cronbach's alpha ≥ .70, many researchers use t-tests and defend this on the grounds that composites approximate interval measurement.

What is Welch's t-test and when should I use it?

Welch's t-test is a variant of the independent samples t-test that does not assume equal variances. Use it when Levene's test is significant (p < .05). SPSS prints both versions in the same output table - report the row labelled "Equal variances not assumed" and note the adjusted (usually non-integer) degrees of freedom.
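The adjusted degrees of freedom come from the Welch-Satterthwaite approximation. The sketch below shows that calculation from summary statistics; the helper name `welch_t` and the two groups' values are hypothetical and chosen only to show how unequal spreads pull the degrees of freedom below n1 + n2 - 2.

```python
import math

def welch_t(m1, s1, n1, m2, s2, n2):
    """Welch's t and Welch-Satterthwaite degrees of freedom from summary statistics."""
    v1, v2 = s1**2 / n1, s2**2 / n2   # squared standard error of each mean
    t = (m1 - m2) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return t, df

# Hypothetical groups with clearly different spreads and sizes.
t, df = welch_t(m1=75.0, s1=6.0, n1=30, m2=70.0, s2=12.0, n2=20)
print(f"t({df:.1f}) = {t:.2f}")
```

When both groups have the same size and the same standard deviation, the formula returns exactly n1 + n2 - 2, so Welch's version costs nothing when the equal-variance assumption actually holds.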

What effect size is considered large for a t-test in thesis research?

For Cohen's d: small = 0.2, medium = 0.5, large = 0.8. A large effect (d = 0.8) means the two group means differ by 0.8 standard deviations - a clearly noticeable difference. Most thesis research in social sciences reports medium effects (d = 0.4–0.6). Always interpret effect size alongside practical significance for your field.

What should I write in my thesis methods section when reporting a t-test?

State the test type, the dependent and independent variables, and the assumption check results. Example: "An independent samples t-test was used to compare exam scores between the treatment and control groups. Normality was assessed using Shapiro-Wilk and was not violated for either group (p = .14 and p = .22, respectively). Levene's test indicated that the assumption of equal variances was met, F(1, 48) = 0.92, p = .342. All analyses were conducted using SPSS version 28."

Statoria Team

Statistics educators & software developers

We build Statoria to help bachelor and master students get through their thesis data analysis without stress. Our guides are written by researchers with experience in social science statistics and student supervision.