What is the difference between the Shapiro-Wilk and Kolmogorov-Smirnov tests?

The Shapiro-Wilk test is generally more powerful and is recommended for samples under 2,000 cases. The Kolmogorov-Smirnov test (with Lilliefors correction) is less sensitive, especially in small samples, and may fail to detect non-normality that the Shapiro-Wilk test catches. Most statisticians and thesis committees prefer the Shapiro-Wilk test. SPSS outputs both tests simultaneously through the Explore procedure, so you can report both if your committee expects it.

My Shapiro-Wilk test is significant (p < .05). Does this mean I cannot use parametric tests?

Not necessarily. A significant Shapiro-Wilk result means the distribution deviates statistically from perfect normality, but parametric tests such as t-tests and ANOVA are robust to moderate violations, especially with samples larger than 30. Evaluate the practical severity by checking skewness/kurtosis values and Q-Q plots. If skewness falls between -1 and +1, the deviation is usually mild enough that parametric tests remain appropriate. For severe violations (skewness beyond +/-2 or clearly curved Q-Q plots), consider non-parametric alternatives or data transformations.

How do I report normality test results in APA format?

Report the test statistic, degrees of freedom, and p-value. For Shapiro-Wilk: 'The Shapiro-Wilk test indicated that Satisfaction scores were normally distributed, W(148) = .991, p = .472.' For Kolmogorov-Smirnov: 'The Kolmogorov-Smirnov test was not significant, D(148) = .054, p = .200, suggesting normality.' Include supplementary evidence such as skewness/kurtosis values or a statement that Q-Q plots showed no systematic deviation from the diagonal.

Should I test normality on the raw variables or on the residuals?

It depends on the analysis. For t-tests and ANOVA, normality of the dependent variable within each group (or of the residuals) is the relevant assumption. For regression, the assumption applies to the residuals, not the raw predictor or outcome variables. In practice, many thesis committees accept normality checks on the raw variables as a preliminary screening step, with residual checks performed during the actual analysis. If you are running regression, test normality of the residuals after fitting the model.

What skewness and kurtosis values indicate a normal distribution?

A perfectly normal distribution has a skewness of 0 and excess kurtosis of 0. In practice, values between -1 and +1 for both measures suggest approximate normality suitable for parametric tests. Values between -2 and +2 are often considered acceptable, particularly with larger samples. Beyond +/-2, the distribution deviates substantially, and you should consider non-parametric tests or data transformations. Always interpret these values alongside visual methods (histograms, Q-Q plots) rather than relying on a single threshold.

How large does my sample need to be for a normality test?

The Shapiro-Wilk test works well with samples from about 3 to 2,000 cases. For very small samples (n 300), even trivial, practically unimportant deviations from normality will produce a significant result. In large samples, prioritize visual methods and skewness/kurtosis values over the p-value from formal tests.

Can I check normality for different groups separately in SPSS?

Yes. In the Explore procedure, add your grouping variable (e.g., Gender) to the Factor List box. SPSS will produce separate normality tests, histograms, and Q-Q plots for each group. This is important when your subsequent analysis compares groups (independent t-test, ANOVA), because the normality assumption applies within each group, not to the combined sample.

What is the difference between normality of the variable and normality of residuals?

Normality of the variable means the scores themselves follow a bell-shaped distribution. Normality of residuals means the errors (differences between observed and predicted values) from a statistical model are normally distributed. For simple comparisons like t-tests, checking the variable within each group is sufficient. For regression and ANOVA, the formal assumption concerns the residuals. In practice, if the raw variables are approximately normal, the residuals usually are as well, but it is good practice to verify after fitting the model.

Is it possible to have a normal Q-Q plot but a significant Shapiro-Wilk test?

Yes, and this is common with larger samples. With 200 or more cases, the Shapiro-Wilk test can detect very minor deviations that are invisible on a Q-Q plot and have no practical impact on parametric test results. When the Q-Q plot looks acceptable but the Shapiro-Wilk p-value is below .05, report both findings and explain that the deviation is statistically detectable but practically negligible. Most committees will accept this reasoning, especially if skewness and kurtosis values are within acceptable ranges.

How to Check Normality in SPSS (Shapiro-Wilk, K-S, Q-Q Plots)

Q: My Shapiro-Wilk test is significant (p < .05). Does this mean I cannot use parametric tests?

Not necessarily. A significant Shapiro-Wilk result means the distribution deviates statistically from perfect normality, but parametric tests such as t-tests and ANOVA are robust to moderate violations, especially with samples larger than 30. Evaluate the practical severity by checking skewness/kurtosis values and Q-Q plots. If skewness falls between -1 and +1, the deviation is usually mild enough that parametric tests remain appropriate. For severe violations (skewness beyond +/-2 or clearly curved Q-Q plots), consider non-parametric alternatives or data transformations.

Q: How do I report normality test results in APA format?

Report the test statistic, degrees of freedom, and p-value. For Shapiro-Wilk: 'The Shapiro-Wilk test indicated that Satisfaction scores were normally distributed, W(148) = .991, p = .472.' For Kolmogorov-Smirnov: 'The Kolmogorov-Smirnov test was not significant, D(148) = .054, p = .200, suggesting normality.' Include supplementary evidence such as skewness/kurtosis values or a statement that Q-Q plots showed no systematic deviation from the diagonal.

Q: Should I test normality on the raw variables or on the residuals?

It depends on the analysis. For t-tests and ANOVA, normality of the dependent variable within each group (or of the residuals) is the relevant assumption. For regression, the assumption applies to the residuals, not the raw predictor or outcome variables. In practice, many thesis committees accept normality checks on the raw variables as a preliminary screening step, with residual checks performed during the actual analysis. If you are running regression, test normality of the residuals after fitting the model.

Q: What skewness and kurtosis values indicate a normal distribution?

A perfectly normal distribution has a skewness of 0 and excess kurtosis of 0. In practice, values between -1 and +1 for both measures suggest approximate normality suitable for parametric tests. Values between -2 and +2 are often considered acceptable, particularly with larger samples. Beyond +/-2, the distribution deviates substantially, and you should consider non-parametric tests or data transformations. Always interpret these values alongside visual methods (histograms, Q-Q plots) rather than relying on a single threshold.

Q: How large does my sample need to be for a normality test?

The Shapiro-Wilk test works well with samples from about 3 to 2,000 cases. For very small samples (n 300), even trivial, practically unimportant deviations from normality will produce a significant result. In large samples, prioritize visual methods and skewness/kurtosis values over the p-value from formal tests.

Q: Can I check normality for different groups separately in SPSS?

Yes. In the Explore procedure, add your grouping variable (e.g., Gender) to the Factor List box. SPSS will produce separate normality tests, histograms, and Q-Q plots for each group. This is important when your subsequent analysis compares groups (independent t-test, ANOVA), because the normality assumption applies within each group, not to the combined sample.

Q: What is the difference between normality of the variable and normality of residuals?

Normality of the variable means the scores themselves follow a bell-shaped distribution. Normality of residuals means the errors (differences between observed and predicted values) from a statistical model are normally distributed. For simple comparisons like t-tests, checking the variable within each group is sufficient. For regression and ANOVA, the formal assumption concerns the residuals. In practice, if the raw variables are approximately normal, the residuals usually are as well, but it is good practice to verify after fitting the model.

Q: Is it possible to have a normal Q-Q plot but a significant Shapiro-Wilk test?

Yes, and this is common with larger samples. With 200 or more cases, the Shapiro-Wilk test can detect very minor deviations that are invisible on a Q-Q plot and have no practical impact on parametric test results. When the Q-Q plot looks acceptable but the Shapiro-Wilk p-value is below .05, report both findings and explain that the deviation is statistically detectable but practically negligible. Most committees will accept this reasoning, especially if skewness and kurtosis values are within acceptable ranges.

Normality testing is one of the most common steps thesis committees expect to see before any parametric analysis. T-tests, ANOVA, Pearson correlation, and linear regression all assume that the data (or the residuals) are approximately normally distributed. Skipping this step is one of the most frequent criticisms in thesis defenses.

SPSS provides four methods for checking normality: the Shapiro-Wilk test, the Kolmogorov-Smirnov test, Q-Q plots, and histograms with normal curves. This guide covers all four, explains how to interpret the output, and shows what to do when your data fails the normality assumption.

Key Takeaways:

Use the Explore procedure (Analyze > Descriptive Statistics > Explore) to run all normality checks in one step: Shapiro-Wilk, K-S, histograms, and Q-Q plots
The Shapiro-Wilk test is more powerful than Kolmogorov-Smirnov and is preferred for most thesis work (samples up to 2,000)
A significant test (p < .05) means the data deviates from normality, but practical severity matters more than the p-value, especially in large samples
Skewness between -1 and +1 generally indicates acceptable normality for parametric tests; values beyond +/-2 indicate a serious problem
When normality fails, you have three options: non-parametric alternatives, data transformations, or proceeding with justification (robust parametric tests)

Before you begin: This guide assumes you have your data loaded in SPSS with variables properly defined in Variable View. If you have not run descriptive statistics yet, start with our guide on descriptive statistics in SPSS. You should also verify that your continuous variables have the measurement level set to Scale in Variable View.

Why Normality Matters for Your Thesis

Parametric statistical tests (t-tests, ANOVA, Pearson correlation, linear regression) rely on the assumption that the data follows a roughly bell-shaped, symmetric distribution. When this assumption is severely violated, the p-values produced by these tests can be inaccurate, leading to incorrect conclusions about your hypotheses.

The practical question for thesis work is not whether your data is perfectly normal. No real-world dataset is. The question is whether the deviation from normality is severe enough to compromise your analysis. This distinction matters because:

Parametric tests are robust to moderate violations. Research consistently shows that t-tests and ANOVA perform well even when the normality assumption is modestly violated, particularly with sample sizes above 30 (Glass, Peckham, & Sanders, 1972; Schmider et al., 2010). A slightly skewed distribution will not invalidate your results.

With large samples, formal tests become overly sensitive. The Shapiro-Wilk test will flag a trivial deviation from normality as "significant" when you have 300 or more cases, even though the deviation has zero practical impact on your parametric test results.

With small samples, formal tests lack power. When you have fewer than 20 cases, both the Shapiro-Wilk and K-S tests may fail to detect genuinely non-normal distributions. Visual methods become essential.

This is why experienced researchers use multiple methods together rather than relying on a single test or a single threshold.

Which Normality Method to Use

SPSS offers several approaches to assessing normality. Each has strengths and limitations, and your thesis is strongest when you combine at least two.

Method	Type	Strengths	Limitations	Best For
Shapiro-Wilk	Statistical test	Most powerful formal test; widely recommended	Overly sensitive in large samples (n > 300); limited to n < 2,000	Primary formal test for most thesis work
Kolmogorov-Smirnov (Lilliefors)	Statistical test	No upper sample size limit	Low power in small samples; less sensitive than Shapiro-Wilk	Secondary test; required by some committees
Q-Q Plot	Visual	Shows exactly where and how the distribution deviates; unaffected by sample size	Requires subjective interpretation	Essential complement to formal tests
Histogram	Visual	Intuitive; reveals distribution shape, gaps, and clusters	Hard to judge with small samples; bin width affects appearance	Quick initial screening
Skewness/Kurtosis	Numeric	Simple thresholds; easy to report	No formal hypothesis test; thresholds are conventions, not rules	Supplementary evidence for APA reporting

Table 1: Comparison of normality assessment methods in SPSS

Recommended approach for thesis work: Run the Shapiro-Wilk test as your primary formal test, inspect the Q-Q plot for visual confirmation, and report skewness/kurtosis values as supplementary evidence. Include histograms in your appendix if your committee wants to see the distribution shapes.

Example Dataset

This tutorial uses the same thesis dataset from our descriptive statistics guide. You can download it from the sidebar. The dataset includes 150 respondents with variables specifically designed to demonstrate both normal and non-normal distributions:

Satisfaction (Scale, 1.00 to 5.00): approximately normally distributed based on skewness and kurtosis, though the Shapiro-Wilk test flags it as significant due to sample size sensitivity.
StudyHours (Scale, 3.40 to 35.00): right-skewed with two mild outliers. This variable has genuine normality violations visible in both the test statistics and the visual output.
GPA (Scale, 1.80 to 4.00): approximately normally distributed.
Age (Scale, 18 to 40): approximately normally distributed.

Working with both normal and non-normal variables in the same analysis is intentional. It also creates the exact scenario most thesis students encounter: a significant Shapiro-Wilk result that does not necessarily mean the data is unusable for parametric tests.

Step-by-Step: Running Normality Tests in SPSS

The Explore procedure is the most efficient way to check normality because it produces all four methods (Shapiro-Wilk, K-S, Q-Q plots, and histograms) in a single run.

Step 1: Open the Explore Dialog

Go to Analyze > Descriptive Statistics > Explore.

Figure 1: Navigate to Analyze > Descriptive Statistics > Explore in the SPSS menu

Step 2: Select Your Variables

Move the continuous variables you want to test into the Dependent List box. For this tutorial, move Satisfaction, StudyHours, GPA, and Age into the Dependent List.

Do not include categorical variables (Gender, EducationLevel) in the Dependent List. Normality testing applies to continuous variables only.

If you need to test normality within groups (for example, Satisfaction scores for males and females separately), add the grouping variable to the Factor List box. This is important when your subsequent analysis compares groups, because the normality assumption applies within each group.

SPSS Explore dialog box for normality testing with four continuous variables in the Dependent List

Figure 2: Explore dialog with Satisfaction, StudyHours, GPA, and Age in the Dependent List

Step 3: Enable Normality Plots

Click the Plots button in the Explore dialog. In the Plots sub-dialog:

Check Normality plots with tests. This enables the Shapiro-Wilk and Kolmogorov-Smirnov tests along with Q-Q plots.
Under Boxplots, keep Factor levels together selected (the default).
Optionally check Histogram if you want histograms included in the output.

Click Continue to return to the main dialog.

SPSS Explore Plots sub-dialog with Normality plots with tests and Histogram options checked

Figure 3: Explore Plots dialog: check "Normality plots with tests" and optionally "Histogram"

Step 4: Run the Analysis

Click OK in the main Explore dialog. SPSS will produce the output in the Output Viewer, including:

A Descriptives table (with skewness and kurtosis values)
The Tests of Normality table (Shapiro-Wilk and Kolmogorov-Smirnov)
Normal Q-Q plots for each variable
Detrended Normal Q-Q plots
Histograms (if selected)
Boxplots

Interpreting the Output

The Tests of Normality Table

This is typically the first table your committee will look at. SPSS produces both the Kolmogorov-Smirnov (with Lilliefors correction) and Shapiro-Wilk test results side by side.

SPSS Tests of Normality output table with Shapiro-Wilk and Kolmogorov-Smirnov test results for four variables

Figure 4: Tests of Normality table with Kolmogorov-Smirnov and Shapiro-Wilk results for all four variables

How to read this table:

Statistic: The test statistic value. For Shapiro-Wilk, values closer to 1.000 indicate stronger normality. For K-S, smaller values indicate better fit to a normal distribution.
df: Degrees of freedom (equals your sample size for each variable, minus any missing cases).
Sig.: The p-value. This is the number that determines your conclusion.

Decision rule:

If Sig. > .05: The data does not significantly deviate from normality. You can proceed with parametric tests.
If Sig. < .05: The data significantly deviates from normality. Further investigation is needed (check the severity using skewness/kurtosis and Q-Q plots).

In the example output, notice that all four variables have significant Shapiro-Wilk results (p < .05). This is a common scenario with samples of 100 or more cases, and it illustrates why you should never rely on the p-value alone.

The critical distinction is between statistically significant and practically important deviations. StudyHours (W = .910, p < .001) is genuinely non-normal with a skewness of 1.39 and kurtosis of 3.05. But Satisfaction (W = .969, p = .003), GPA (W = .976, p = .014), and Age (W = .954, p < .001) all have skewness and kurtosis values well within the -1 to +1 range. The Shapiro-Wilk test detected trivial deviations in these three variables that have no practical impact on parametric test results. The Q-Q plots and skewness/kurtosis values (covered below) confirm this.

Also notice that the K-S test tells a different story: Satisfaction (p = .090) and GPA (p = .200) are non-significant, while StudyHours (p < .001) and Age (p < .001) are significant. This demonstrates the power difference between the two tests. K-S missed the minor deviations that Shapiro-Wilk detected.

Which test to report? Use the Shapiro-Wilk result as your primary test. It is more powerful (better at detecting non-normality) and is recommended by most statistics textbooks (Field, 2018; Pallant, 2020). Report the K-S result alongside it if your committee or journal requires both. But always supplement the formal test with skewness/kurtosis values and Q-Q plots, as this example demonstrates.

The Q-Q Plot

The Normal Q-Q plot (quantile-quantile plot) graphs the expected values from a normal distribution on the X-axis against the observed values from your data on the Y-axis. If the data is perfectly normal, all points fall on the diagonal line.

Normal Q-Q plots in SPSS comparing a normally distributed variable versus a right-skewed variable

Figure 5: Q-Q plots side by side: Satisfaction (left) follows the diagonal closely, indicating approximate normality; StudyHours (right) curves away at the upper end, indicating right skew

How to interpret Q-Q plots:

Points on or near the line: Data is approximately normal. Minor scatter around the line is expected and acceptable.
Systematic curve (S-shape or banana shape): Data is skewed. An upward curve at the right end indicates right skew (as with StudyHours). A downward curve at the left end indicates left skew.
Points departing at both ends: Heavy tails (leptokurtic distribution). Points fall below the line at the left and above it at the right.
Individual points far from the line: Potential outliers. If only one or two points at the extremes deviate substantially, these are likely outliers rather than evidence of non-normality in the overall distribution.

Q-Q plots are particularly valuable for large samples where the Shapiro-Wilk test may be overly sensitive. A Q-Q plot that looks reasonably straight is strong evidence that your data is acceptable for parametric tests, even if the formal test produces a significant p-value.

Histograms with Normal Curve

Histograms provide an intuitive view of your distribution's shape. SPSS overlays a normal curve on the histogram when you request it through the Explore procedure.

SPSS histograms with normal curve overlay comparing symmetric and right-skewed distributions

Figure 6: Histograms side by side: Satisfaction (left) shows a roughly symmetric distribution; StudyHours (right) is visibly right-skewed with a long tail

Histograms are best used for initial screening rather than formal assessment. They are affected by bin width (the number of bars), and with small samples the shape can look irregular even when the underlying distribution is normal. Use them alongside Q-Q plots and formal tests, not as your sole method.

Skewness and Kurtosis Values

The Explore procedure includes skewness and kurtosis in the Descriptives table (the same table that contains means and standard deviations). If you already ran descriptive statistics on these variables, you have these values from the earlier output.

Variable	Skewness (SE)	Kurtosis (SE)	Interpretation
Satisfaction	-0.477 (0.203)	-0.409 (0.404)	Within +/-1: approximately normal
StudyHours	1.392 (0.203)	3.053 (0.404)	Skewness > 1: right-skewed; kurtosis > 2: heavy tails
GPA	-0.161 (0.203)	-0.718 (0.404)	Within +/-1: approximately normal
Age	0.415 (0.203)	-0.500 (0.404)	Within +/-1: approximately normal

Table 2: Skewness and kurtosis values from the Explore descriptive output

Interpretation thresholds:

Range	Skewness Interpretation	Kurtosis Interpretation	Action
-1 to +1	Approximately symmetric	Approximately normal tails	Safe for parametric tests
-2 to -1 or +1 to +2	Moderately skewed	Moderately non-normal tails	Usually acceptable, especially with n > 30
Beyond +/-2	Substantially skewed	Substantially non-normal tails	Consider non-parametric tests or transformations

Table 3: Skewness and kurtosis interpretation thresholds for normality assessment

Note that these thresholds are conventions, not absolute rules. Different sources recommend different cutoffs. George and Mallery (2019) use +/-1 for "excellent" and +/-2 for "acceptable." Hair et al. (2019) consider values beyond +/-1.96 as significant at the .05 level for samples over 50. The key point is to state which threshold you applied in your thesis and apply it consistently across all variables.

What to Do When Normality Fails

When one or more variables fail normality testing, you have three options. The best choice depends on how severely the distribution deviates and which analysis you plan to run.

Option 1: Use Non-Parametric Alternatives

Non-parametric tests do not assume normality and are the most straightforward solution when the violation is severe.

Parametric Test	Non-Parametric Alternative	When to Switch
Independent samples t-test	Mann-Whitney U test	Severe skewness or small sample with non-normal data
Paired samples t-test	Wilcoxon signed-rank test	Severe skewness in the difference scores
One-way ANOVA	Kruskal-Wallis H test	Severe skewness in one or more groups
Pearson correlation	Spearman rank correlation	Non-linear relationship or ordinal data

Table 4: Parametric tests and their non-parametric alternatives

Option 2: Transform the Data

Data transformations can reduce skewness and bring a variable closer to a normal distribution. Common transformations in SPSS:

For right-skewed data (positive skewness):

Log transformation: Transform > Compute Variable > LN(variable). Best for moderate right skew.
Square root transformation: Transform > Compute Variable > SQRT(variable). Milder correction than log.

For left-skewed data (negative skewness):

Reflect and log: Transform > Compute Variable > LN(K - variable) where K is one more than the maximum value.

After transforming, re-run the normality tests on the transformed variable. If it passes, use the transformed variable in your analysis. Report both the original descriptive statistics (for interpretability) and note that the transformed variable was used in the inferential tests.

The downside of transformations is that interpretation becomes less intuitive. A log-transformed satisfaction score does not have the same straightforward meaning as the original scale.

Option 3: Proceed with Parametric Tests (with Justification)

If the normality violation is mild to moderate, you can often proceed with parametric tests and provide justification. This approach is supported by research showing that t-tests and ANOVA are robust to non-normality with samples above 30 (Schmider et al., 2010).

Your justification in the thesis should include:

The specific skewness/kurtosis values (demonstrating the violation is moderate)
The sample size (larger samples make parametric tests more robust)
A citation supporting the robustness claim (e.g., Field, 2018; Glass et al., 1972)
Visual evidence (Q-Q plot showing the deviation is minor)

This is often the most practical approach for thesis work. In our example, Satisfaction has a skewness of -0.48 and the Shapiro-Wilk test was significant (p = .003), but the skewness is well within the -1 to +1 range, and the Q-Q plot confirms approximate normality. Even StudyHours with a skewness of 1.39 does not necessarily invalidate a t-test with 142 cases, though the kurtosis of 3.05 warrants more caution.

Reporting Normality in APA Format

In-Text Reporting

When the Shapiro-Wilk test is significant but skewness/kurtosis are acceptable:

Normality was assessed using the Shapiro-Wilk test, visual inspection of Q-Q plots, and skewness/kurtosis values. Although the Shapiro-Wilk test was significant for Satisfaction, W(142) = .969, p = .003, the skewness (-0.48, SE = 0.20) and kurtosis (-0.41, SE = 0.40) fell within the -1 to +1 range, and the Q-Q plot showed no systematic deviation from the diagonal. Given the sensitivity of the Shapiro-Wilk test with samples above 100 and the robustness of parametric tests to minor normality violations (Schmider et al., 2010), parametric analysis was retained.

When reporting multiple variables with acceptable skewness/kurtosis:

Shapiro-Wilk tests were significant for Satisfaction (W = .969, p = .003), GPA (W = .976, p = .014), and Age (W = .954, p < .001). However, skewness and kurtosis values for all three variables fell within the acceptable range of -1 to +1 (see Table X), and Q-Q plots indicated approximate normality. These results are consistent with the known sensitivity of the Shapiro-Wilk test in samples exceeding 100 cases (Field, 2018).

When normality is genuinely violated:

The Shapiro-Wilk test indicated that StudyHours was not normally distributed, W(142) = .910, p < .001, with a skewness of 1.39 (SE = 0.20) and kurtosis of 3.05 (SE = 0.40). Visual inspection of the Q-Q plot confirmed right skewness, with data points curving away from the diagonal at the upper end. Consequently, the Mann-Whitney U test was used instead of the independent samples t-test for group comparisons involving this variable.

When proceeding despite a moderate violation:

Although the Shapiro-Wilk test was significant for StudyHours, W(142) = .910, p < .001, the skewness (1.39) fell within the -2 to +2 range that Hair et al. (2019) consider acceptable. Given the sample size (N = 142) and the known robustness of the independent samples t-test to moderate normality violations (Schmider et al., 2010), parametric analysis was retained.

APA-Formatted Normality Summary Table

Variable	W	p	Skewness (SE)	Kurtosis (SE)	Decision
Satisfaction	.969	.003	-0.48 (0.20)	-0.41 (0.40)	Acceptable (skewness/kurtosis within +/-1)
StudyHours	.910	< .001	1.39 (0.20)	3.05 (0.40)	Non-normal (skewness > 1, kurtosis > 2)
GPA	.976	.014	-0.16 (0.20)	-0.72 (0.40)	Acceptable (skewness/kurtosis within +/-1)
Age	.954	< .001	0.42 (0.20)	-0.50 (0.40)	Acceptable (skewness/kurtosis within +/-1)

Table 5: Summary of normality assessment results from the Explore procedure

This table demonstrates a common real-world scenario: all Shapiro-Wilk tests are significant, but only StudyHours has skewness and kurtosis values that indicate a genuine problem. For the other three variables, the "Decision" column reflects the practical assessment based on skewness/kurtosis thresholds and Q-Q plot inspection, not the Shapiro-Wilk p-value alone.

Common Mistakes When Checking Normality in SPSS

1. Relying Solely on the Shapiro-Wilk p-Value

As the example in this guide demonstrates, all four variables produced significant Shapiro-Wilk results, yet only StudyHours had a genuine normality problem. Always supplement the formal test with Q-Q plots and skewness/kurtosis values.

2. Testing the Wrong Thing

For regression, the normality assumption applies to the residuals, not the raw variables. For t-tests and ANOVA, it applies within each group. Running a global normality test on the entire sample of a variable without considering the analysis context can lead to incorrect conclusions. Test normality at the level that matches your planned analysis.

3. Using Kolmogorov-Smirnov as the Primary Test

Many older textbooks and online tutorials recommend the K-S test as the default. The Kolmogorov-Smirnov test has substantially lower statistical power than the Shapiro-Wilk test, meaning it is more likely to miss genuine non-normality (a Type II error). Use Shapiro-Wilk as your primary test unless your committee specifically requires K-S.

4. Ignoring Visual Evidence

Some students report "the Shapiro-Wilk test was non-significant, so normality is assumed" without looking at the actual distribution. A non-significant test with a small sample might simply mean the test lacked power. Always inspect the Q-Q plot and histogram. If the visual methods reveal clear problems (strong curvature, bimodality, extreme outliers), investigate further regardless of the p-value.

5. Applying a One-Size-Fits-All Threshold

Using +/-2 for skewness as an absolute cutoff without considering sample size or the specific analysis is overly rigid. With a sample of 15, even skewness of 1.5 might be problematic. With a sample of 500, skewness of 1.5 may have negligible impact on your t-test results. Report the values and justify your decision based on the specific context.

6. Forgetting to Test Normality by Group

When comparing groups (independent t-test, one-way ANOVA), the normality assumption applies within each group. Testing the overall distribution of the variable (ignoring groups) can produce misleading results. Two non-normal groups can combine into what appears to be a normal distribution, and vice versa. Use the Factor List in the Explore procedure to test each group separately.

7. Transforming Data Without Re-Checking

After applying a log or square root transformation, some students proceed to their analysis without verifying that the transformation actually worked. Always re-run the normality tests on the transformed variable. If the transformation does not achieve approximate normality, a non-parametric test may be more appropriate than stacking multiple transformations.

What Your Thesis Committee Will Ask

"How did you assess normality?" The strongest answer demonstrates multiple methods: "I assessed normality using the Shapiro-Wilk test, visual inspection of Q-Q plots, and skewness/kurtosis values. The Shapiro-Wilk test was non-significant for all variables, the Q-Q plots showed no systematic deviation from the diagonal, and skewness and kurtosis values fell within the -1 to +1 range." This shows thoroughness rather than reliance on a single indicator.

"Why did you use the Shapiro-Wilk test rather than the Kolmogorov-Smirnov test?" Explain that the Shapiro-Wilk test has higher statistical power, meaning it is better at detecting non-normality. It is recommended by Field (2018) and Razali and Wah (2011) as the preferred test for samples up to 2,000. SPSS outputs both tests simultaneously, so you can report K-S alongside Shapiro-Wilk if the committee prefers.

"Your Shapiro-Wilk test was significant. Why did you still use a parametric test?" This question comes up frequently. Cite the specific skewness and kurtosis values, reference the robustness literature (Schmider et al., 2010), point to the Q-Q plot, and note your sample size. A well-prepared answer addresses both the statistical evidence and the methodological justification.

"What would you have done differently if normality had been severely violated?" Demonstrate that you considered alternatives: "If the skewness exceeded +/-2 or the Q-Q plot showed systematic curvature, I would have considered data transformation (log or square root) and re-tested normality. If transformation was unsuccessful, I would have used the non-parametric equivalent, such as the Mann-Whitney U test instead of the independent samples t-test."

"Do you need to test normality for each group separately?" For analyses comparing groups (t-tests, ANOVA), the answer is yes. Explain that you used the Factor List in the Explore procedure to produce separate normality tests per group. This demonstrates awareness that the normality assumption applies within groups, not globally.

Frequently Asked Questions

Next Steps After Normality Testing

With normality assessed, your next step depends on what you found:

If all variables passed normality testing, proceed to the appropriate parametric test for your research design. For comparing two group means, run an independent samples t-test or paired samples t-test. For three or more groups, run a one-way ANOVA or repeated measures ANOVA. For relationships between continuous variables, run a Pearson correlation or linear regression.

If one or more variables failed normality testing with severe deviations (skewness beyond +/-2), consider the three options covered above: non-parametric alternatives, data transformations, or proceeding with justification. Your choice should be guided by the severity of the violation, your sample size, and your committee's expectations.

If you have not yet checked the reliability of any multi-item scales in your dataset, do that before running inferential tests. See our guide on Cronbach's Alpha in SPSS.

Regardless of the normality outcome, document your assessment thoroughly. A well-documented normality section shows your committee that you understand the assumptions behind your chosen tests, not just the mechanics of running them.

References

American Psychological Association. (2020). Publication manual of the American Psychological Association (7th ed.). American Psychological Association.

Field, A. (2018). Discovering statistics using IBM SPSS statistics (5th ed.). SAGE Publications.

George, D., & Mallery, P. (2019). IBM SPSS statistics 26 step by step: A simple guide and reference (16th ed.). Routledge.

Glass, G. V., Peckham, P. D., & Sanders, J. R. (1972). Consequences of failure to meet assumptions underlying the fixed effects analyses of variance and covariance. Review of Educational Research, 42(3), 237-288.

Hair, J. F., Babin, B. J., Anderson, R. E., & Black, W. C. (2019). Multivariate data analysis (8th ed.). Cengage Learning.

Pallant, J. (2020). SPSS survival manual (7th ed.). Open University Press.

Razali, N. M., & Wah, Y. B. (2011). Power comparisons of Shapiro-Wilk, Kolmogorov-Smirnov, Lilliefors and Anderson-Darling tests. Journal of Statistical Modeling and Analytics, 2(1), 21-33.

Schmider, E., Ziegler, M., Danay, E., Beyer, L., & Bühner, M. (2010). Is it really robust? Reinvestigating the robustness of ANOVA against violations of the normal distribution assumption. Methodology, 6(4), 147-151.

Tabachnick, B. G., & Fidell, L. S. (2019). Using multivariate statistics (7th ed.). Pearson.