Non-Parametric Statistical Tests
Statistical tools for analyzing data that doesn’t meet the assumptions of parametric tests. StatFusion’s non-parametric calculators provide robust analysis options for non-normal data, ordinal measurements, or small samples.
Non-parametric tests make fewer assumptions about your data than their parametric counterparts. They’re essential when working with data that doesn’t follow a normal distribution, contains outliers, uses ordinal scales, or comes from small samples. These distribution-free methods provide valid statistical inference even when parametric assumptions are violated.
What Are Non-Parametric Tests?
Non-parametric tests (also called distribution-free methods) are statistical procedures that don’t require specific assumptions about the underlying population distribution. Unlike parametric tests such as the t-test and ANOVA, which assume a normal distribution, non-parametric tests typically:
- Work with ranks rather than raw values
- Make fewer assumptions about the shape of distributions
- Are robust to outliers and skewed data
- Can be used with ordinal data and small samples
- Maintain valid results when parametric assumptions are violated
While non-parametric tests are generally less powerful than their parametric equivalents when parametric assumptions are met, they provide more reliable results when those assumptions are violated.
When to Use Non-Parametric Tests
Non-parametric methods are appropriate in several situations:
- Non-normal data: When your data deviates substantially from a normal distribution
- Small sample sizes: When you have too few observations to verify normality
- Ordinal data: When data is measured on an ordinal scale (rankings, Likert scales)
- Presence of outliers: When data contains extreme values that can’t be removed
- Heterogeneous variances: When group variances differ substantially
Each non-parametric test is the alternative to a specific parametric test, designed for similar research questions but with fewer assumptions about the data.
Available Non-Parametric Tests
StatFusion offers a comprehensive suite of non-parametric tests organized by purpose and design.
One-Sample Tests
Tests for comparing a single sample to a hypothesized value or distribution.
Median Tests
Compare sample median to a hypothesized value:
- Wilcoxon Signed-Rank Test (Single) - Non-parametric alternative to one-sample t-test
- Sign Test - Test if median differs from a hypothesized value
Distribution Tests
Compare sample distribution to a theoretical distribution:
- Kolmogorov-Smirnov One-Sample Test - Compare sample to theoretical distribution
- Chi-Square Goodness-of-Fit Test - Test if categorical data follows expected proportions
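As a rough illustration, both one-sample distribution tests can be run in a few lines of Python with SciPy; the sample, the reference normal distribution, and the expected category proportions below are made-up examples, not StatFusion output.

```python
# Sketch: one-sample distribution tests with SciPy (illustrative data only).
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
x = rng.normal(loc=10, scale=2, size=40)   # sample to be tested

# Kolmogorov-Smirnov one-sample test against a fully specified normal distribution
ks_stat, ks_p = stats.kstest(x, "norm", args=(10, 2))
print(f"KS test: D = {ks_stat:.3f}, p = {ks_p:.3f}")

# Chi-square goodness-of-fit: do observed counts match expected proportions?
observed = [18, 22, 20, 40]                # observed category counts
expected = [25, 25, 25, 25]                # e.g. a hypothesis of equal proportions
chi2_stat, chi2_p = stats.chisquare(observed, f_exp=expected)
print(f"Chi-square GOF: chi2 = {chi2_stat:.2f}, p = {chi2_p:.4f}")
```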
The Wilcoxon Signed-Rank Test and the Sign Test both compare a sample to a hypothesized median, but they differ in how they use the data:
Wilcoxon Signed-Rank Test:
- Uses both the sign and magnitude of differences
- More powerful when the distribution is symmetric
- Non-parametric alternative to the one-sample t-test
- Assumes symmetric distribution of differences (though not necessarily normal)
Sign Test:
- Uses only the sign (direction) of differences, not their magnitude
- Less powerful but makes fewer assumptions
- More robust when the distribution is asymmetric
- Appropriate when you can only determine if values are higher or lower than the median
The Wilcoxon test is generally preferred unless the distribution is highly skewed or you’re working with ordinal data where only the direction (not magnitude) can be determined.
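A minimal Python sketch of both one-sample tests, assuming an illustrative sample and a hypothesized median of 50 (neither comes from StatFusion): the Wilcoxon test ranks the signed differences, while the sign test reduces each observation to a direction and applies a binomial test.

```python
# Sketch: one-sample Wilcoxon signed-rank test vs. sign test (illustrative data).
import numpy as np
from scipy import stats

x = np.array([47.2, 51.8, 49.5, 53.1, 55.0, 48.3, 52.6, 50.9, 46.7, 54.4])
m0 = 50  # hypothesized median

# Wilcoxon signed-rank: uses both the sign and the magnitude of the differences
w_stat, w_p = stats.wilcoxon(x - m0)
print(f"Wilcoxon signed-rank: W = {w_stat:.1f}, p = {w_p:.3f}")

# Sign test: uses only the direction of the differences (binomial test on the signs)
diffs = x - m0
n_pos = int(np.sum(diffs > 0))
n_nonzero = int(np.sum(diffs != 0))  # zero differences are discarded
sign_p = stats.binomtest(n_pos, n_nonzero, p=0.5).pvalue
print(f"Sign test: {n_pos}/{n_nonzero} positive differences, p = {sign_p:.3f}")
```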
Two-Sample Tests
Tests for comparing two samples or groups.
Independent Samples Tests
For comparing two unrelated groups:
- Mann-Whitney U Test - Non-parametric alternative to independent t-test
- Wilcoxon Rank-Sum Test - Equivalent to the Mann-Whitney U test
- Kolmogorov-Smirnov Two-Sample Test - Compare shapes of distributions
- Mood’s Median Test - Compare medians with minimal assumptions
Paired Samples Tests
For comparing two related measurements:
- Wilcoxon Signed-Rank Test (Paired) - Non-parametric alternative to paired t-test
- Sign Test (Paired) - Simple test for paired differences
- McNemar’s Test - Compare paired proportions (binary outcomes)
The Mann-Whitney U test and Wilcoxon Rank-Sum test are mathematically equivalent procedures with different historical origins and slightly different computational approaches. Both:
- Compare distributions between two independent groups
- Use ranks rather than raw values
- Serve as non-parametric alternatives to the independent samples t-test
- Test whether one group tends to have larger values than the other
These tests are suitable when:
- Data doesn’t meet normality assumptions
- Sample sizes are small
- Data contains outliers
- You’re working with ordinal data
The default interpretation tests whether one population tends to have larger values than the other, though additional assumptions allow testing for differences in medians specifically.
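Here is a small SciPy sketch of a Mann-Whitney U test on two illustrative samples. The probability-of-superiority calculation assumes the returned statistic is U for the first group, which is how recent SciPy versions report it.

```python
# Sketch: Mann-Whitney U test for two independent groups (illustrative data).
import numpy as np
from scipy import stats

group_a = np.array([12.1, 14.3, 11.8, 15.2, 13.7, 16.4, 12.9, 14.8])
group_b = np.array([10.2, 11.5, 9.8, 12.3, 10.9, 11.1, 9.4])

u_stat, p_value = stats.mannwhitneyu(group_a, group_b, alternative="two-sided")

# Probability of superiority: the chance that a random value from group_a
# exceeds a random value from group_b (u_stat is U for group_a here).
prob_superiority = u_stat / (len(group_a) * len(group_b))

print(f"Mann-Whitney U = {u_stat:.1f}, p = {p_value:.4f}")
print(f"Probability of superiority = {prob_superiority:.2f}")
```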
The Wilcoxon Signed-Rank Test for paired samples is the non-parametric equivalent of the paired samples t-test. It’s used when:
- You have two related measurements (before/after, matched pairs)
- The differences don’t follow a normal distribution
- Your sample size is small
- The data contains outliers
The test works by:
1. Calculating the differences between paired measurements
2. Ranking the absolute differences
3. Summing the ranks for positive and negative differences
4. Using the smaller sum as the test statistic
A significant result indicates that one measurement tends to be consistently higher or lower than the other. While it doesn’t compare means specifically (like the t-test), it tells you about the consistent direction and magnitude of differences between paired observations.
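A minimal sketch with SciPy, assuming illustrative before/after measurements for ten subjects; the library performs the ranking and summing steps described above internally.

```python
# Sketch: Wilcoxon signed-rank test for paired (before/after) measurements.
import numpy as np
from scipy import stats

before = np.array([78, 85, 90, 72, 88, 95, 80, 83, 76, 91])
after = np.array([82, 87, 95, 75, 86, 99, 84, 88, 80, 94])

# SciPy ranks the absolute differences and sums the positive and negative ranks,
# mirroring steps 1-4 above.
stat, p_value = stats.wilcoxon(before, after)
print(f"Wilcoxon signed-rank (paired): W = {stat:.1f}, p = {p_value:.4f}")
```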
Multiple-Sample Tests
Tests for comparing three or more groups or samples.
Independent Groups Tests
For comparing multiple unrelated groups:
- Kruskal-Wallis Test - Non-parametric alternative to one-way ANOVA
- Jonckheere-Terpstra Test - Test for ordered alternatives across groups
- Median Test - Compare medians across multiple groups
The Kruskal-Wallis test is the non-parametric alternative to one-way ANOVA, used when:
- Comparing three or more independent groups
- The data doesn’t follow a normal distribution
- There are outliers that can’t be removed
- The data is ordinal or the variances are unequal
How it works:
1. All observations are ranked from lowest to highest, ignoring group membership
2. The ranks are summed within each group
3. The test statistic (H) is calculated based on these rank sums
4. A significant result indicates that at least one group differs from the others
Like ANOVA, a significant Kruskal-Wallis result doesn’t tell you which specific groups differ. You’ll need to perform post-hoc tests (like Dunn’s test) to identify the specific differences between groups.
The Kruskal-Wallis test is particularly useful in biomedical research, psychology, ecology, and other fields where data often doesn’t meet parametric assumptions.
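A short SciPy sketch with three illustrative groups follows; the commented-out Dunn’s post-hoc call assumes the third-party scikit-posthocs package is available.

```python
# Sketch: Kruskal-Wallis test for three independent groups (illustrative data).
from scipy import stats

g1 = [23, 41, 54, 66, 78]
g2 = [45, 55, 60, 70, 72]
g3 = [18, 30, 34, 40, 44]

h_stat, p_value = stats.kruskal(g1, g2, g3)
print(f"Kruskal-Wallis H = {h_stat:.2f}, p = {p_value:.4f}")

# Pairwise post-hoc comparisons (only meaningful if the omnibus test is significant),
# e.g. Dunn's test via the scikit-posthocs package (assumed installed):
# import scikit_posthocs as sp
# sp.posthoc_dunn([g1, g2, g3], p_adjust="bonferroni")
```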
The Friedman test is the non-parametric alternative to repeated measures ANOVA, used when:
- The same subjects are measured under multiple conditions or time points
- The data doesn’t follow a normal distribution
- The sample size is small
- The measurements are ordinal
How it works:
1. Data is arranged in a table with subjects as rows and conditions as columns
2. Values are ranked across conditions separately for each subject
3. The test statistic is calculated based on the sums of ranks for each condition
4. A significant result indicates differences across conditions
The Friedman test is commonly used in:
- Clinical trials measuring patient responses across multiple time points
- Sensory evaluation studies comparing multiple products
- Behavioral studies with repeated measurements under different conditions
- Educational research tracking performance across different methods
Follow up a significant Friedman test with post-hoc tests (like Nemenyi’s test) to identify specific differences between conditions.
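A brief SciPy sketch, assuming illustrative scores for six subjects measured under three conditions:

```python
# Sketch: Friedman test for repeated measures across three conditions.
# Each list holds one condition's scores for the same six subjects (illustrative).
from scipy import stats

cond_1 = [7.0, 9.9, 8.5, 5.1, 10.3, 8.9]
cond_2 = [5.3, 5.7, 4.7, 3.5, 7.7, 6.0]
cond_3 = [4.9, 7.6, 5.5, 2.8, 8.4, 6.2]

chi2_stat, p_value = stats.friedmanchisquare(cond_1, cond_2, cond_3)
print(f"Friedman chi-square = {chi2_stat:.2f}, p = {p_value:.4f}")
```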
Post-Hoc Tests for Non-Parametric Analyses
Follow-up tests after finding significant differences in multiple-group non-parametric tests.
Post-Hoc for Kruskal-Wallis
Follow-up tests after significant Kruskal-Wallis results:
- Dunn’s Test - Pairwise comparisons after Kruskal-Wallis
- Conover-Iman Test - More powerful alternative to Dunn’s test
- Steel-Dwass-Critchlow-Fligner Test - Controls familywise error rate
Post-Hoc for Friedman
Follow-up tests after significant Friedman results:
- Nemenyi Test - Pairwise comparisons after Friedman
- Conover Test for Friedman - More powerful than Nemenyi
- Wilcoxon Signed-Rank with Bonferroni - Multiple pairwise comparisons
Correlation and Association Tests
Non-parametric tests for measuring relationships between variables.
Correlation Tests
Measure monotonic relationships:
- Spearman’s Rank Correlation - Non-parametric measure of association based on ranks
- Kendall’s Tau - Alternative rank correlation with different properties
- Goodman-Kruskal Gamma - Measure of association for ordinal variables
Categorical Association Tests
Test relationships between categorical variables:
- Chi-Square Test - Test independence between categorical variables
- Fisher’s Exact Test - Alternative for small expected frequencies
- Cramer’s V - Effect size for Chi-Square tests
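As a rough sketch, the chi-square test of independence, Fisher’s exact test, and Cramér’s V can all be computed with SciPy and NumPy; the 2×2 table below contains made-up counts.

```python
# Sketch: tests of association for a 2x2 contingency table (illustrative counts).
import numpy as np
from scipy import stats

table = np.array([[12, 8],
                  [5, 15]])

# Chi-square test of independence (appropriate when expected counts are large enough)
chi2, p_chi2, dof, expected = stats.chi2_contingency(table)
print(f"Chi-square: chi2({dof}) = {chi2:.2f}, p = {p_chi2:.4f}")

# Fisher's exact test (preferred for small expected frequencies in 2x2 tables)
odds_ratio, p_fisher = stats.fisher_exact(table)
print(f"Fisher's exact: OR = {odds_ratio:.2f}, p = {p_fisher:.4f}")

# Cramer's V as an effect size for the chi-square test
n = table.sum()
cramers_v = np.sqrt(chi2 / (n * (min(table.shape) - 1)))
print(f"Cramer's V = {cramers_v:.2f}")
```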
Both Spearman’s rank correlation (ρ) and Kendall’s Tau (τ) are non-parametric measures of association based on ranks, but they have different properties:
Spearman’s Correlation:
- More widely used and recognized
- Easier to calculate and interpret (similar to Pearson’s r)
- Measures the strength of monotonic relationships
- More sensitive to errors and outliers
Kendall’s Tau:
- More robust to outliers and errors in data
- Better statistical properties for small samples
- More accurate p-values for non-normal distributions
- Better interpretation for ordinal data
- More directly interpretable as probability of concordance minus probability of discordance
When to choose Kendall’s Tau:
- When you have small sample sizes
- When you have many tied ranks
- When robustness is a primary concern
- When you’re working with ordinal data
When to choose Spearman’s:
- When communicating to audiences familiar with correlation
- When computational simplicity is important
- When you want direct comparison with Pearson’s r
In practice, both measures often lead to similar conclusions about statistical significance, though the numerical values differ.
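A quick SciPy sketch comparing the two coefficients on the same illustrative data; tau typically comes out smaller in magnitude than rho even when both tests reach the same conclusion.

```python
# Sketch: Spearman's rho vs. Kendall's tau on the same illustrative data.
import numpy as np
from scipy import stats

x = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
y = np.array([2, 1, 4, 3, 6, 5, 8, 7, 10, 9])

rho, rho_p = stats.spearmanr(x, y)
tau, tau_p = stats.kendalltau(x, y)

print(f"Spearman's rho = {rho:.3f}, p = {rho_p:.4f}")
print(f"Kendall's tau  = {tau:.3f}, p = {tau_p:.4f}")
```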
How to Choose the Right Non-Parametric Test
Selecting the appropriate non-parametric test depends on your research question and study design:
Decision Guide for Non-Parametric Tests
- Identify your research question and study design (comparison, relationship, etc.)
- Determine what parametric test would be appropriate if assumptions were met
- Select the corresponding non-parametric alternative based on your data characteristics
For Comparing Groups/Samples
Instead of One-Sample t-Test:
- Wilcoxon Signed-Rank Test (Single)
- Sign Test (for highly skewed distributions)
Instead of Independent Samples t-Test:
- Mann-Whitney U Test (equivalently, the Wilcoxon Rank-Sum Test)
- Mood’s Median Test (minimal assumptions)
Instead of Paired Samples t-Test:
- Wilcoxon Signed-Rank Test (Paired)
- Sign Test (Paired) (for highly skewed distributions)
Instead of One-Way ANOVA:
- Kruskal-Wallis Test
- Median Test (more robust but less powerful)
Instead of Repeated Measures ANOVA:
- Friedman Test
- Cochran’s Q Test (for binary outcomes)
For Measuring Relationships
Instead of Pearson Correlation:
- Spearman’s Rank Correlation
- Kendall’s Tau
Instead of Linear Regression:
- Theil-Sen Estimator
- Quantile Regression
For Categorical Variables:
- Chi-Square Test (large samples)
- Fisher’s Exact Test (small samples)
- McNemar’s Test (paired proportions)
For Survival Analysis:
- Log-Rank Test
- Cox Proportional Hazards (semi-parametric)
Advantages and Limitations of Non-Parametric Tests
Advantages
- Fewer assumptions about the underlying distribution
- Robust to outliers and extreme values
- Applicable to ordinal data (rankings, Likert scales)
- Valid for small samples where normality is difficult to verify
- Simple interpretations often based on medians or ranks
- Useful when transformations fail to normalize data
Limitations
- Less statistical power when parametric assumptions are actually met
- Less precise confidence intervals
- Limited multivariate methods compared to parametric statistics
- Reduced ability to control for covariates
- Less familiar to many readers of research
- Testing different hypotheses than parametric equivalents in some cases
Common Questions About Non-Parametric Tests
Are non-parametric tests less powerful than parametric tests?
Non-parametric tests are generally less powerful than their parametric equivalents when all parametric assumptions are met. However, when assumptions are violated, non-parametric tests can be more powerful because:
- They maintain accurate Type I error rates
- They’re less influenced by outliers and extreme values
- They can detect differences in distribution shape, not just central tendency
For data that moderately violates normality with large sample sizes, parametric tests remain robust due to the Central Limit Theorem. However, for small samples, substantial non-normality, or ordinal data, non-parametric tests often provide better statistical inference.
In practice, the decision should consider both the nature of your data and the specific research question you’re addressing.
Do non-parametric tests test the same hypotheses as parametric tests?
Not exactly. Parametric and non-parametric tests often test different aspects of the data:
Parametric tests typically compare means:
- t-test: difference in means between groups
- ANOVA: differences in means across multiple groups
Non-parametric tests typically compare:
- Medians (in some cases)
- Overall distributions (stochastic dominance)
- Probability that a random value from one group exceeds a random value from another
For example, the Mann-Whitney U test doesn’t directly test for differences in medians (contrary to common belief) unless you assume identical distribution shapes. Instead, it tests whether one distribution is stochastically greater than the other.
This distinction matters for interpretation. A significant non-parametric result does not necessarily indicate a difference in medians, but rather that values in one group tend to be larger than in the other.
How should I report the results of non-parametric tests?
For APA style (7th edition), report:
Mann-Whitney U test:
A Mann-Whitney U test indicated that [variable] was significantly [higher/lower] for [Group 1] (Median = [value]) than for [Group 2] (Median = [value]), U = [value], p = [value], r = [effect size].
Wilcoxon Signed-Rank test:
A Wilcoxon signed-rank test showed that [intervention/condition] elicited a statistically significant change in [variable] (Z = [value], p = [value], r = [effect size]), with median [increasing/decreasing] from [value] to [value].
Kruskal-Wallis test:
A Kruskal-Wallis test showed a statistically significant difference in [variable] between the different [groups/conditions], H([df]) = [value], p = [value], with a mean rank of [value] for [Group 1], [value] for [Group 2], and [value] for [Group 3].
Always include:
- Test name and statistic
- p-value
- Effect size when possible
- Descriptive statistics (typically medians rather than means)
- Relevant degrees of freedom
- Post-hoc test results if applicable
What effect sizes should I report with non-parametric tests?
Common effect size measures for non-parametric tests include:
For Mann-Whitney U or Wilcoxon Rank-Sum test:
- r = Z / √N (where Z is the standardized test statistic and N is the total sample size)
- Small effect: r ≈ 0.1
- Medium effect: r ≈ 0.3
- Large effect: r ≈ 0.5
- Probability of superiority: Probability that a random observation from one group exceeds a random observation from the other
For Wilcoxon Signed-Rank test:
- r = Z / √N (where N is the total number of observations)
- Matched-pairs rank biserial correlation
For Kruskal-Wallis test:
- η²_H = (H − k + 1) / (n − k) (where H is the test statistic, k is the number of groups, and n is the total sample size)
- ε² = H / [(n² − 1) / (n + 1)], equivalently H(n + 1) / (n² − 1)
For Friedman test:
- Kendall’s W (coefficient of concordance), computed as W = χ²_r / (n(k − 1)) (where χ²_r is the Friedman chi-square statistic, n is the number of subjects, and k is the number of conditions)
These effect sizes help quantify the magnitude of observed effects, complementing p-values by indicating practical significance.
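As a sketch of how the formulas above translate into code (the test statistics and sample sizes used here are placeholders, not real results):

```python
# Sketch: effect sizes for non-parametric tests, computed from placeholder statistics.
import math

# r = Z / sqrt(N) for a Mann-Whitney U test, using the normal approximation to get Z
n1, n2 = 20, 22
u = 120.0
mu_u = n1 * n2 / 2
sigma_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
z = (u - mu_u) / sigma_u
r = abs(z) / math.sqrt(n1 + n2)

# Eta-squared and epsilon-squared for a Kruskal-Wallis test
h, k, n = 9.4, 3, 45                       # H statistic, number of groups, total N
eta_sq_h = (h - k + 1) / (n - k)
eps_sq = h / ((n**2 - 1) / (n + 1))

# Kendall's W from a Friedman chi-square
chi2_r, n_subjects, k_conditions = 12.3, 15, 4
kendalls_w = chi2_r / (n_subjects * (k_conditions - 1))

print(f"r = {r:.2f}, eta^2_H = {eta_sq_h:.2f}, epsilon^2 = {eps_sq:.2f}, W = {kendalls_w:.2f}")
```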
Should I just use non-parametric tests all the time to be safe?
While it might seem conservative to use non-parametric tests by default, this approach has important limitations:
Potential drawbacks:
- Lower statistical power when parametric assumptions are met
- More limited ability to handle complex designs (multiple factors, covariates)
- Less precise confidence intervals
- Testing different hypotheses than parametric equivalents in some cases
- Less familiar to many readers and reviewers
A better approach:
- Check if your data reasonably meets parametric assumptions
- Use parametric tests when appropriate, as they often provide more informative results
- Reserve non-parametric tests for cases where assumptions are clearly violated
- Consider transformations to normalize data when possible
- Report the rationale for your test selection
In modern practice, many statisticians recommend using:
- Welch’s t-test (robust to unequal variances) rather than Student’s t-test
- Bootstrapping or permutation tests (computationally intensive but highly flexible)
- Mixed models (can handle various data structures)
These approaches often provide good alternatives that balance robustness and statistical power.