
T-Test for Your Thesis: Complete Guide with Assumption Checks, Effect Size, and APA Copy-Paste Templates

The t-test is the most reported statistical test in student theses - and the one most often reported incorrectly. Skip the Shapiro-Wilk normality check and your supervisor will flag it in your defense. Report p < .05 without Cohen's d and you are missing half the story. This guide gives you exact software menu paths for three t-test types, a step-by-step assumption checklist, Cohen's d benchmarks, and ready-to-paste APA sentences for every outcome.

Key takeaways

  • Always run Shapiro-Wilk before a t-test - if p < .05, switch to Mann-Whitney U (independent) or Wilcoxon (paired) instead.
  • Welch's t-test handles unequal variances automatically - most software reports it alongside the standard version; use it when Levene's p < .05.
  • Cohen's d is mandatory - small = 0.2, medium = 0.5, large = 0.8 - always report it next to your p-value.
  • For paired t-tests check normality of the difference scores, not the raw scores - this is the assumption that matters.
  • Your methods section must state the test type, software version, and results of Shapiro-Wilk and Levene's - not just the final t and p.

What the T-Test Actually Measures

A t-test compares two means and asks: is the difference between them large enough - relative to the variability in the data - that it is unlikely to have arisen by chance alone?

The result is a t-statistic and a p-value. A significant result (p < .05) means you reject the null hypothesis that the two means are equal in the population.

Three versions exist for three different designs:

  • Independent samples t-test: two separate groups
  • Paired samples t-test: same subjects measured twice
  • One-sample t-test: one group compared to a known value
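
To make "difference relative to variability" concrete, here is a minimal sketch of the independent samples t-statistic in plain Python. The function name `pooled_t_statistic` and the exam scores are made up for illustration, and the calculation assumes equal variances in both groups. Your statistics package does the same arithmetic and also supplies the p-value.

```python
import math
import statistics


def pooled_t_statistic(group_a, group_b):
    """Student's t for two independent groups, assuming equal variances."""
    n_a, n_b = len(group_a), len(group_b)
    mean_diff = statistics.mean(group_a) - statistics.mean(group_b)
    # Pooled variance: the two sample variances weighted by their degrees of freedom.
    pooled_var = (
        (n_a - 1) * statistics.variance(group_a)
        + (n_b - 1) * statistics.variance(group_b)
    ) / (n_a + n_b - 2)
    # Standard error of the difference between the two means.
    std_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return mean_diff / std_error, n_a + n_b - 2


# Hypothetical exam scores, for illustration only.
treatment = [78, 74, 81, 69, 77, 80]
control = [70, 68, 75, 66, 72, 69]
t_value, df = pooled_t_statistic(treatment, control)
print(f"t({df}) = {t_value:.2f}")
```

The numerator is the difference in means and the denominator is its standard error, so a larger t means the difference is large compared with the noise in the data.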

Independent Samples T-Test: Software Menu Paths and What to Check

Use when: comparing two distinct, non-overlapping groups on a metric outcome.

Examples: male vs. female on exam score; treatment vs. control group; two university programmes.

Software paths:

  • SPSS → Analyze → Compare Means → Independent Samples T Test
  • Jamovi → Analyses → T-Tests → Independent Samples T-Test
  • JASP → T-Tests → Independent Samples T-Test

| Assumption | Test to Run | What to Check | If Violated |
|---|---|---|---|
| Normality | Shapiro-Wilk (each group) | W statistic, p > .05 | Use Mann-Whitney U |
| Equal variances | Levene's test | F statistic, p > .05 | Use Welch's t-test |
| Independence | Study design review | No subject appears twice | Cannot fix post-hoc |
| Metric scale | Variable type check | Interval or ratio scale | Use Mann-Whitney U |
⚠️ Never skip Shapiro-Wilk. Running a t-test on non-normally distributed data without justification is one of the most common triggers for questions at a thesis defense. Run the check, report it, and state whether the assumption is met.
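
The decision rules for the independent samples design can be written down as a small helper. This is only a sketch: `choose_independent_test` is a hypothetical function, not part of any statistics package, and it expects the Shapiro-Wilk and Levene's p-values you have already read off your software output.

```python
def choose_independent_test(shapiro_p_a, shapiro_p_b, levene_p, alpha=0.05):
    """Map assumption-check p-values to the test this guide recommends."""
    # Normality must hold in each group; otherwise use the rank-based alternative.
    if shapiro_p_a < alpha or shapiro_p_b < alpha:
        return "Mann-Whitney U"
    # Normal data but unequal variances: apply Welch's correction.
    if levene_p < alpha:
        return "Welch's t-test"
    return "Independent samples t-test"


# Both groups pass Shapiro-Wilk, Levene's test is significant.
print(choose_independent_test(0.14, 0.22, 0.018))
```

Normality is checked first because a violation there changes the whole test family, while unequal variances only change which version of the t-test you report.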

Paired T-Test: Same Subjects Measured Twice

Use when: the same participants are measured at two time points, or you have matched pairs.

Examples: pre-test vs. post-test scores; scores before and after an intervention; matched sibling pairs.

Key assumption: normality applies to the difference scores (post − pre), not the raw scores. Run Shapiro-Wilk on the computed difference variable.

Software paths:

  • SPSS → Analyze → Compare Means → Paired-Samples T Test
  • Jamovi → Analyses → T-Tests → Paired Samples T-Test

If normality of differences fails → use Wilcoxon signed-rank test.

The paired test is usually more powerful than the independent samples test because it removes between-subject variability from the error term - so you typically need fewer participants to detect the same effect.
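
The paired test is, in effect, a test on the difference scores. The sketch below computes them explicitly in plain Python. The function name `paired_t_statistic` and the scores are invented for illustration, and the effect size returned is the difference-score version (mean difference divided by the SD of the differences), which is one of several conventions for paired designs.

```python
import math
import statistics


def paired_t_statistic(pre, post):
    """Paired t-test computed from the difference scores (post - pre)."""
    if len(pre) != len(post):
        raise ValueError("pre and post must contain the same participants")
    # This is the variable to run Shapiro-Wilk on, not pre or post themselves.
    diffs = [after - before for before, after in zip(pre, post)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)
    t_value = mean_diff / (sd_diff / math.sqrt(n))
    d_z = mean_diff / sd_diff  # effect size based on the difference scores
    return diffs, t_value, n - 1, d_z


# Hypothetical pre-test and post-test scores for six participants.
pre = [60, 55, 70, 62, 58, 65]
post = [68, 59, 75, 70, 61, 72]
diffs, t_value, df, d_z = paired_t_statistic(pre, post)
print(diffs)
print(f"t({df}) = {t_value:.2f}, d = {d_z:.2f}")
```

Only the spread of the differences enters the standard error, so stable individual differences between participants drop out of the calculation.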

One-Sample T-Test: Comparing Against a Known Value

Use when: you have one group and want to test whether its mean differs from a known reference value.

Examples: "Does my sample's mean (M = 67) differ from the population mean of 70?" or "Is customer satisfaction in my sample above the industry benchmark of 3.5 on a 5-point scale?"

Software paths:

  • SPSS → Analyze → Compare Means → One-Sample T Test → enter Test Value
  • Jamovi → Analyses → T-Tests → One Sample T-Test → set the test value

This test is less common in theses but useful when comparing against published norms, national statistics, or theoretical benchmarks.
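
As a rough sketch of what the software computes, the one-sample statistic divides the gap between the sample mean and the test value by the standard error of the mean. The function name `one_sample_t` and the composite satisfaction scores below are hypothetical; the benchmark of 3.5 mirrors the example above.

```python
import math
import statistics


def one_sample_t(scores, test_value):
    """One-sample t-test of the sample mean against a fixed reference value."""
    n = len(scores)
    mean = statistics.mean(scores)
    sd = statistics.stdev(scores)
    t_value = (mean - test_value) / (sd / math.sqrt(n))
    d = (mean - test_value) / sd  # Cohen's d for a one-sample design
    return t_value, n - 1, d


# Hypothetical composite satisfaction scores (means of several items).
satisfaction = [4.0, 3.2, 4.6, 3.8, 4.2, 3.4, 4.4, 4.0]
t_value, df, d = one_sample_t(satisfaction, test_value=3.5)
print(f"t({df}) = {t_value:.2f}, d = {d:.2f}")
```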

Effect Size: Cohen's d - What It Means and How to Calculate It

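Cohen's d measures how many standard deviations the two means differ by. Because it is expressed in standard deviation units rather than test statistics, it does not grow just because the sample is larger, which is what lets you judge the practical importance of a finding. For two independent groups, d = (M1 − M2) / pooled SD.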

| Cohen's d Value | Effect Size | Plain-Language Meaning |
|---|---|---|
| 0.2 | Small | Means differ by 0.2 SDs - subtle, may need large N to detect |
| 0.5 | Medium | Means differ by 0.5 SDs - noticeable in practice |
| 0.8 | Large | Means differ by 0.8 SDs - clearly visible difference |
| > 1.0 | Very large | Means differ by more than 1 SD - strong, practical effect |
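
If your software does not print Cohen's d, it can be computed from the descriptive statistics you already report. The sketch below uses the pooled-SD formula for two independent groups. The function names are hypothetical, treating each benchmark as a lower bound is an assumption about how to read the table, and the group size of 25 is assumed for illustration.

```python
import math


def cohens_d(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Cohen's d for two independent groups, using the pooled standard deviation."""
    pooled_sd = math.sqrt(
        ((n_a - 1) * sd_a**2 + (n_b - 1) * sd_b**2) / (n_a + n_b - 2)
    )
    return (mean_a - mean_b) / pooled_sd


def label_effect(d):
    """Label d against the benchmarks, treating each value as a lower bound."""
    size = abs(d)
    if size > 1.0:
        return "very large"
    if size >= 0.8:
        return "large"
    if size >= 0.5:
        return "medium"
    if size >= 0.2:
        return "small"
    return "below the small benchmark"


# Illustrative values: M = 74.2, SD = 8.1 vs. M = 68.5, SD = 9.3, 25 per group.
d = cohens_d(74.2, 8.1, 25, 68.5, 9.3, 25)
print(f"d = {d:.2f} ({label_effect(d)})")
```

The benchmarks are conventions, not cut-offs, so the label is a starting point for interpretation rather than a verdict.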

APA Reporting Templates (Copy and Adapt)

  • Independent t-test - significant result: "An independent samples t-test revealed a significant difference between Group A (M = 74.2, SD = 8.1) and Group B (M = 68.5, SD = 9.3), t(48) = 2.31, p = .025, d = 0.65."
  • Independent t-test - non-significant result: "No significant difference was found between Group A (M = 72.1, SD = 8.4) and Group B (M = 70.3, SD = 9.1), t(48) = 0.72, p = .476, d = 0.21."
  • Paired t-test - significant result: "A paired samples t-test indicated that scores increased significantly from pre-test (M = 61.3, SD = 9.4) to post-test (M = 71.8, SD = 8.7), t(29) = 5.42, p < .001, d = 0.99."
  • Welch's t-test (unequal variances): "Levene's test indicated unequal variances (p = .018), so Welch's correction was applied. A significant difference was found, t(41.3) = 2.67, p = .011, d = 0.73."
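
If you export results to a script, a small formatter can keep the statistics string consistent across your thesis. This is a sketch with hypothetical helper names (`format_p`, `format_t_result`); it follows the conventions used in the templates above: no leading zero on p, "p < .001" for very small values, and one decimal for fractional degrees of freedom.

```python
def format_p(p):
    """Format a p-value without a leading zero; never print 'p = .000'."""
    if p < 0.001:
        return "p < .001"
    return "p = " + f"{p:.3f}".lstrip("0")


def format_t_result(t_value, df, p, d):
    """Build the 't(df) = ..., p = ..., d = ...' part of an APA sentence."""
    # Whole-number df print as integers; Welch's fractional df keep one decimal.
    df_text = f"{df:g}" if float(df).is_integer() else f"{df:.1f}"
    return f"t({df_text}) = {t_value:.2f}, {format_p(p)}, d = {d:.2f}"


print(format_t_result(2.31, 48, 0.025, 0.65))
print(format_t_result(5.42, 29, 0.00001, 0.99))
print(format_t_result(2.67, 41.3, 0.011, 0.73))
```

The means and standard deviations still need to be written into the sentence by hand, since they depend on how you name your groups.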

Common T-Test Mistakes in Thesis Research

These are the errors supervisors flag most often during thesis reviews.

| Mistake | Why It Is Wrong | How to Fix It |
|---|---|---|
| Skipping Shapiro-Wilk | Cannot justify parametric assumption | Always run and report W and p |
| Reporting p = .000 | Mathematically impossible value | Write p < .001 instead |
| No Cohen's d | Statistical without practical significance | Calculate d and benchmark it |
| Using t-test on 3+ groups | Inflates Type I error rate | Use one-way ANOVA instead |
| Ignoring Levene's test | Variance inequality distorts results | Check and use Welch's if p < .05 |

Frequently asked questions

What is the difference between a t-test and a z-test?

A z-test is used when the population standard deviation is known and the sample is large (N > 30). In practice, you almost never know the population SD, so t-tests are used instead. For thesis purposes, use a t-test unless specifically instructed otherwise.

How many participants do I need for a t-test to be valid?

Technically a t-test can run on very small samples (even N = 5 per group), but the power to detect effects is low. A power analysis using G*Power (free software) will tell you the required sample size for your expected effect size. For a medium effect (Cohen's d = 0.5) with 80% power in a two-tailed test at α = .05, you need approximately 64 participants per group for an independent t-test.
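
For a quick sanity check before opening G*Power, the required group size can be approximated with the normal distribution. This is a sketch under stated assumptions: a two-tailed independent samples test with equal group sizes, and a hypothetical function name `n_per_group`. Because it uses z rather than t quantiles, it tends to land a participant or two below an exact calculation, so treat the result as a lower bound.

```python
import math
from statistics import NormalDist


def n_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate n per group for a two-tailed independent samples t-test."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for alpha
    z_power = NormalDist().inv_cdf(power)          # quantile for the target power
    return math.ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)


for d in (0.2, 0.5, 0.8):
    print(f"d = {d}: about {n_per_group(d)} per group")
```

The required sample size grows with the inverse square of the effect size, which is why small effects need far larger samples than medium ones.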

Can I use a t-test on Likert scale data?

Strictly no - t-tests assume metric (interval/ratio) data. For single Likert items, use Mann-Whitney U (independent) or Wilcoxon (paired). For composite scores averaged across 5+ Likert items with Cronbach's alpha ≥ .70, many researchers use t-tests and defend this on the grounds that composites approximate interval measurement.

What is Welch's t-test and when should I use it?

Welch's t-test is a variant of the independent samples t-test that does not assume equal variances. Use it when Levene's test is significant (p < .05). SPSS output includes both versions side by side - report the "Equal variances not assumed" row, which is the Welch result, and note the adjusted degrees of freedom.
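
The adjusted degrees of freedom come from the Welch-Satterthwaite approximation, which is why they are usually fractional. The sketch below computes Welch's t and df from summary statistics. The function name `welch_t` and the input values are invented for illustration, and the p-value is left to your statistics software.

```python
import math


def welch_t(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Welch's t and Welch-Satterthwaite df from group summary statistics."""
    # Squared standard error of each group mean; the variances are not pooled.
    se2_a, se2_b = sd_a**2 / n_a, sd_b**2 / n_b
    t_value = (mean_a - mean_b) / math.sqrt(se2_a + se2_b)
    df = (se2_a + se2_b) ** 2 / (se2_a**2 / (n_a - 1) + se2_b**2 / (n_b - 1))
    return t_value, df


# Hypothetical groups with clearly unequal spread and unequal sizes.
t_value, df = welch_t(74.2, 6.0, 30, 68.5, 11.0, 20)
print(f"t({df:.1f}) = {t_value:.2f}")
```

The resulting df is smaller than the n1 + n2 − 2 used by the standard test, which makes the test more conservative when variances differ.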

What effect size is considered large for a t-test in thesis research?

For Cohen's d: small = 0.2, medium = 0.5, large = 0.8. A large effect (d = 0.8) means the two group means differ by 0.8 standard deviations - a clearly noticeable difference. Effects in thesis research in the social sciences often fall in the medium range (roughly d = 0.4–0.6). Always interpret effect size alongside practical significance for your field.

What should I write in my thesis methods section when reporting a t-test?

State the test type, the dependent and independent variables, the assumption check results, and the software version. Example: "An independent samples t-test was used to compare exam scores between the treatment and control groups. Normality was assessed using Shapiro-Wilk and was not violated for either group (p = .14 and p = .22, respectively). Levene's test indicated no violation of the equal-variance assumption, F(1, 48) = 0.92, p = .342. All analyses were conducted using SPSS version 28."

Statoria Team

Statistics educators & software developers

We build Statoria to help bachelor and master students get through their thesis data analysis without stress. Our guides are written by researchers with experience in social science statistics and student supervision.
