Histogram Explained: How to Create, Read, and Use One for Thesis Normality Checks

April 2026 · 5 min read

A histogram is the first plot you should create for any metric variable in your thesis. It shows the shape of your distribution at a glance and gives a first indication of whether parametric tests such as the t-test and ANOVA are appropriate before you run a single analysis. This guide shows how to create a histogram in SPSS and Excel, how to read the five common distribution shapes, and how to combine it correctly with Shapiro-Wilk for a complete normality check.

Key takeaways

Create a histogram for every metric variable before running any parametric test - it is the fastest visual normality check.
A roughly bell-shaped histogram + Shapiro-Wilk p > .05 together justify using parametric tests.
Skewness direction: right-skewed = long tail to the right, mean pulled up; left-skewed = long tail to the left, mean pulled down.
Bimodal histogram (two peaks) suggests two subgroups in your data - check if a grouping variable explains the split.
Combine histogram + Shapiro-Wilk for your methods section: visual check alone is not sufficient evidence of normality.

What a Histogram Shows - And What It Does Not

A histogram divides the range of your variable into equal-width intervals (bins) and shows how many observations fall into each bin. The height of each bar = frequency in that interval.
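The binning step can be sketched in a few lines of Python. This is a minimal illustration, not what SPSS or Excel run internally: the function name `bin_counts`, the choice of four bins, and the exam scores are all made up for the example.

```python
def bin_counts(values, n_bins):
    """Split the range of `values` into n_bins equal-width bins and count each bin."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("all values are identical, so there is no range to bin")
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for v in values:
        # The maximum value is placed in the last bin instead of opening a new one.
        idx = min(int((v - lo) / width), n_bins - 1)
        counts[idx] += 1
    edges = [lo + i * width for i in range(n_bins + 1)]
    return edges, counts


# Hypothetical exam scores, invented for this sketch.
scores = [52, 58, 61, 63, 65, 67, 68, 70, 71, 72, 73, 75, 76, 78, 81, 84, 92]
edges, counts = bin_counts(scores, 4)

# One text bar per bin: bar length = frequency in that interval.
for left, right, count in zip(edges, edges[1:], counts):
    print(f"{left:5.1f} to {right:5.1f} | {'#' * count}")
```

Each printed row corresponds to one bar of the histogram; the counts always add up to the number of observations.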

What it shows: distribution shape (normal, skewed, bimodal), spread, centre, potential outliers.

What it does NOT show: individual data points (use a dot plot for that), exact values, or relationships between variables (use a scatterplot).

Histograms are essential diagnostic tools for assumption checking before t-tests, ANOVA, and Pearson correlation.

How to Create a Histogram in SPSS and Excel

SPSS Method 1 (via Explore - recommended):
Analyze → Descriptive Statistics → Explore → move variable to Dependent List → Plots → check Histogram → OK
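Output includes the histogram. Explore does not draw a normal curve by default; to add one, double-click the chart to open the Chart Editor and choose Elements → Show Distribution Curve, or use Method 2.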

SPSS Method 2 (via Frequencies):
Analyze → Descriptive Statistics → Frequencies → move variable to Variable(s) → Charts → Histograms → check 'Show normal curve on histogram' → Continue → OK

Excel:
Select the data column → Insert → Charts → Insert Statistic Chart → Histogram
Right-click x-axis → Format Axis → adjust Bin Width
For Excel 2013 and earlier: enable the Analysis ToolPak add-in, then Data → Data Analysis → Histogram

How to Read Distribution Shapes in a Histogram

Each shape has a different meaning for your analysis.

Shape	What It Looks Like	Statistical Implication
Normal (bell-shaped)	Symmetric peak in centre, tails taper equally	Parametric tests appropriate - proceed with t-test / ANOVA
Right-skewed (positive)	Peak left, long tail to the right, mean > median	Consider non-parametric tests; common in income, response times
Left-skewed (negative)	Peak right, long tail to the left, mean < median	Common in easy exam scores (ceiling effect); consider non-parametric tests
Bimodal (two peaks)	Two distinct humps	Two subpopulations present - investigate grouping variable
Uniform	Bars roughly equal height	Rare in social science; check if variable makes sense as continuous
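The skew directions in the table can also be checked numerically. The sketch below compares mean and median and computes the adjusted Fisher-Pearson skewness coefficient (the formula used by Excel's SKEW function). The cut-off of 0.5 for calling a distribution skewed is an illustrative assumption, not a fixed standard, and the response times are invented.

```python
import statistics


def skewness(values):
    """Adjusted Fisher-Pearson sample skewness (positive = long right tail)."""
    n = len(values)
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample SD with n - 1 in the denominator
    return n / ((n - 1) * (n - 2)) * sum(((v - mean) / sd) ** 3 for v in values)


def describe_shape(values, threshold=0.5):
    """Label the skew direction; `threshold` is an assumed rule of thumb."""
    g1 = skewness(values)
    if g1 > threshold:
        return "right-skewed"
    if g1 < -threshold:
        return "left-skewed"
    return "roughly symmetric"


# Hypothetical response times in seconds, invented for this sketch.
response_times = [1, 2, 2, 2, 3, 3, 3, 4, 5, 9, 14]

print("mean:", round(statistics.mean(response_times), 2))
print("median:", statistics.median(response_times))
print("shape:", describe_shape(response_times))
```

For these values the mean lies above the median, which matches the right-skewed row of the table. A numeric label like this supports the histogram but does not replace looking at it.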

Using a Histogram to Check Normality Before Parametric Tests

Use histogram + Shapiro-Wilk together. Neither alone is sufficient.

Step 1: Create the histogram in SPSS (Frequencies adds the normal curve overlay directly; in Explore, add it via the Chart Editor)
Step 2: Run Shapiro-Wilk: Analyze → Descriptive Statistics → Explore → Plots → check 'Normality plots with tests'
Step 3: Interpret together:

Combine histogram and Shapiro-Wilk for every normality check. If both agree (bell-shaped histogram + p > .05), parametric tests are justified. If they conflict (e.g., histogram looks normal but Shapiro-Wilk p < .05 in a large sample), do not rely on Shapiro-Wilk alone: it is oversensitive with N > 100 and flags even trivial deviations, so inspect Q-Q plots before deciding.
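The decision logic can be written down as a small helper. This is only a sketch of the rule described above: the function name, the wording of the returned messages, and the default cut-offs (alpha = .05, large sample above N = 100) are assumptions taken from this guide, and the visual judgement of the histogram still has to come from you.

```python
def normality_verdict(histogram_bell_shaped, shapiro_p, n, alpha=0.05, large_n=100):
    """Combine a visual judgement with the Shapiro-Wilk p-value."""
    shapiro_ok = shapiro_p > alpha
    if histogram_bell_shaped and shapiro_ok:
        return "both agree: parametric tests justified"
    if not histogram_bell_shaped and not shapiro_ok:
        return "both agree: consider non-parametric tests"
    # The two checks disagree, so a Q-Q plot is the tie-breaker.
    if n > large_n:
        return "conflict in a large sample: inspect the Q-Q plot"
    return "conflict: inspect the Q-Q plot"


print(normality_verdict(True, 0.21, 60))
print(normality_verdict(True, 0.03, 250))
print(normality_verdict(False, 0.012, 45))
```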

Writing Up Histogram Findings in Your Thesis Methods Section

Copy and adapt these templates for your methods section:

Normality met:
"Normality was assessed for each variable using the Shapiro-Wilk test and visual inspection of histograms. All variables showed approximately normal distributions (all W > .95, all p > .05), supporting the use of parametric tests."

Normality violated:
"Visual inspection of histograms revealed a right-skewed distribution for [variable]. The Shapiro-Wilk test confirmed a significant deviation from normality, W(N) = .87, p = .012. Non-parametric alternatives were therefore used for this variable."
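If you report many variables, a small formatter helps keep the statistics consistent with APA style (no leading zero for values that cannot exceed 1, and p < .001 for very small p-values). The helper names and the degrees of freedom in the example call are hypothetical.

```python
def apa_decimal(value, places):
    """Format a number without a leading zero, e.g. 0.87 -> '.87'."""
    text = f"{value:.{places}f}"
    if text.startswith("0."):
        return text[1:]
    return text


def apa_shapiro(w, p, df):
    """Build an APA-style Shapiro-Wilk string such as 'W(48) = .87, p = .012'."""
    p_text = "p < .001" if p < 0.001 else f"p = {apa_decimal(p, 3)}"
    return f"W({df}) = {apa_decimal(w, 2)}, {p_text}"


# Hypothetical values matching the 'normality violated' template.
print(apa_shapiro(0.87, 0.012, 48))
```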

Frequently asked questions

What is the difference between a histogram and a bar chart?

A histogram displays the distribution of a continuous variable - the x-axis is a continuous scale and bars touch each other because the data is continuous. A bar chart displays frequencies or values for discrete categories - the x-axis is categorical and bars are separated. Never use a bar chart to display the distribution of a continuous variable.

How many bins should my histogram have?

A common rule of thumb is 5–20 bins, depending on sample size. Too few bins hide the shape; too many create noise. For a sample of N = 50–100, 8–12 bins is usually appropriate. Most tools choose bins automatically: Excel's histogram chart defaults to Scott's normal reference rule, and another widely used default is Sturges' rule (k = 1 + log₂N). You can adjust manually if the automatic choice is misleading.
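Sturges' rule is easy to compute by hand or in code. The sketch below rounds k up to the next whole number, which is a common convention but an assumption here, since the rule itself does not fix the rounding.

```python
import math


def sturges_bins(n):
    """Number of bins suggested by Sturges' rule, k = 1 + log2(N), rounded up."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return math.ceil(1 + math.log2(n))


for n in (30, 50, 100, 200):
    print(f"N = {n}: {sturges_bins(n)} bins")
```

For N = 50 to 100 this gives 7 to 8 bins, at or just below the 8–12 range suggested above, so treat the result as a starting point and adjust if the shape looks too coarse.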

How do I describe a histogram in my thesis?

Describe the shape, centre, and spread. Example: "The histogram for exam score (Figure 1) indicates an approximately normal distribution with most values clustering between 60 and 80 (M = 72.4, SD = 8.3). No extreme outliers were identified." If the distribution is skewed, describe the direction and note the impact on test selection.

Can I use a histogram for Likert scale data?

Technically no - histograms assume continuous data with meaningful bin widths. For ordinal data (Likert scales), use a bar chart showing frequencies for each category value. The difference matters: a histogram implies the distance between bars is meaningful, which is not true for ordinal scales.

What is a Q-Q plot and how does it differ from a histogram for checking normality?

A Q-Q (quantile-quantile) plot compares your data's distribution to a theoretical normal distribution point by point. If your data are normally distributed, the points fall roughly on a diagonal line. Q-Q plots are more sensitive to deviations in the tails than histograms. Use both: histogram for overall shape, Q-Q plot for detail. SPSS produces Q-Q plots alongside the Shapiro-Wilk test when you check 'Normality plots with tests' in Explore.
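To see what a Q-Q plot actually plots, the sketch below computes the point pairs with Python's standard library. It assumes the plotting position (i - 0.5) / n; statistics packages may use a slightly different convention, so the numbers will not necessarily match SPSS output exactly. The scores are invented.

```python
from statistics import NormalDist, mean, stdev


def qq_points(values):
    """Pair each sorted observation with the normal quantile expected at its rank."""
    n = len(values)
    fitted = NormalDist(mean(values), stdev(values))
    points = []
    for i, observed in enumerate(sorted(values), start=1):
        p = (i - 0.5) / n  # assumed plotting position; other conventions exist
        points.append((fitted.inv_cdf(p), observed))
    return points


# Hypothetical exam scores, invented for this sketch.
scores = [52, 58, 61, 63, 65, 67, 68, 70, 71, 72, 73, 75, 76, 78, 81, 84, 92]

for expected, observed in qq_points(scores):
    print(f"expected {expected:6.2f}   observed {observed:3d}")
```

Because the expected quantiles come from a normal distribution with the sample's own mean and standard deviation, approximately normal data give pairs where expected and observed are close. Large gaps at the first or last few rows point to heavy or light tails.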

How many observations do I need for a histogram to be meaningful?

With fewer than 20–30 observations, histograms can be misleading because small samples produce irregular shapes by chance. For very small samples, use a dot plot or stem-and-leaf plot instead, and rely on the Shapiro-Wilk test rather than visual inspection for normality assessment. Histograms become informative and reliable from around N = 50 upward.
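A quick simulation makes the small-sample problem visible. The sketch draws 15 and 500 values from the same normal population (mean 70, SD 10, both chosen arbitrarily) and prints a text histogram for each; the seed and the bin width of 5 are also assumptions made for the example.

```python
import random


def text_histogram(values, bin_width=5):
    """Return one text line per fixed-width bin, with one '#' per observation."""
    starts = [int(v // bin_width) * bin_width for v in values]
    lines = []
    for start in range(min(starts), max(starts) + bin_width, bin_width):
        bar = "#" * starts.count(start)
        lines.append(f"{start:3d} to {start + bin_width:3d} | {bar}")
    return lines


rng = random.Random(1)  # fixed seed so the sketch is repeatable
small = [rng.gauss(70, 10) for _ in range(15)]
large = [rng.gauss(70, 10) for _ in range(500)]

print("N = 15")
print("\n".join(text_histogram(small)))
print("N = 500")
print("\n".join(text_histogram(large)))
```

Re-running with different seeds shows how much the N = 15 shape changes from draw to draw, even though the underlying population is normal every time.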

Statoria Team

Statistics educators & software developers

We build Statoria to help bachelor and master students get through their thesis data analysis without stress. Our guides are written by researchers with experience in social science statistics and student supervision.