{"id":11688,"date":"2019-12-26T10:01:43","date_gmt":"2019-12-26T08:01:43","guid":{"rendered":"https:\/\/www.datanovia.com\/en\/?post_type=dt_lessons&#038;p=11688"},"modified":"2019-12-26T10:09:12","modified_gmt":"2019-12-26T08:09:12","slug":"independent-t-test-assumptions","status":"publish","type":"dt_lessons","link":"https:\/\/www.datanovia.com\/en\/lessons\/t-test-assumptions\/independent-t-test-assumptions\/","title":{"rendered":"Independent T-Test Assumptions"},"content":{"rendered":"<div id=\"rdoc\">\n<p>This article describes the <strong>independent t-test assumptions<\/strong> and provides examples of R code to check whether the assumptions are met before calculating the t-test. These are also referred to as the <em>two-sample t-test assumptions<\/em>.<\/p>\n<p>The independent samples t-test comes in two different forms:<\/p>\n<ul>\n<li>the standard <em>Student\u2019s t-test<\/em>, which assumes that the variances of the two groups are equal.<\/li>\n<li><em>Welch\u2019s t-test<\/em>, which is less restrictive than the original Student\u2019s test. 
This is the test where you do not assume that the variance is the same in the two groups, which results in fractional degrees of freedom.<\/li>\n<\/ul>\n<div class=\"warning\">\n<p>The two methods give very similar results unless both the group sizes and the standard deviations are very different.<\/p>\n<\/div>\n<p>Contents:<\/p>\n<div id=\"TOC\">\n<ul>\n<li><a href=\"#assumptions\">Assumptions<\/a><\/li>\n<li><a href=\"#check-independent-t-test-assumptions-in-r\">Check independent t-test assumptions in R<\/a>\n<ul>\n<li><a href=\"#prerequisites\">Prerequisites<\/a><\/li>\n<li><a href=\"#demo-data\">Demo data<\/a><\/li>\n<li><a href=\"#identify-outliers\">Identify outliers<\/a><\/li>\n<li><a href=\"#check-normality-by-groups\">Check normality by groups<\/a><\/li>\n<li><a href=\"#check-the-equality-of-variances\">Check the equality of variances<\/a><\/li>\n<\/ul>\n<\/li>\n<li><a href=\"#related-article\">Related article<\/a><\/li>\n<\/ul>\n<\/div>\n<div class='dt-sc-hr-invisible-medium  '><\/div>\n<div class='dt-sc-ico-content type1'><div class='custom-icon' ><a href='https:\/\/www.datanovia.com\/en\/product\/practical-statistics-in-r-for-comparing-groups-numerical-variables\/' target='_blank'><span class='fa fa-book'><\/span><\/a><\/div><h4><a href='https:\/\/www.datanovia.com\/en\/product\/practical-statistics-in-r-for-comparing-groups-numerical-variables\/' target='_blank'> Related Book <\/a><\/h4>Practical Statistics in R II - Comparing Groups: Numerical Variables<\/div>\n<div class='dt-sc-hr-invisible-medium  '><\/div>\n<div id=\"assumptions\" class=\"section level2\">\n<h2>Assumptions<\/h2>\n<p>The two-sample independent t-test assumes the following characteristics about the data:<\/p>\n<ul>\n<li><strong>Independence of the observations<\/strong>. Each subject should belong to only one group. 
There is no relationship between the observations in each group.<\/li>\n<li><strong>No significant outliers<\/strong> in the two groups.<\/li>\n<li><strong>Normality<\/strong>. The data for each group should be approximately normally distributed.<\/li>\n<li><strong>Homogeneity of variances<\/strong>. The variance of the outcome variable should be equal in each group. Recall that the Welch t-test does not make this assumption.<\/li>\n<\/ul>\n<p>In this section, we\u2019ll perform some preliminary tests to check whether these assumptions are met.<\/p>\n<\/div>\n<div id=\"check-independent-t-test-assumptions-in-r\" class=\"section level2\">\n<h2>Check independent t-test assumptions in R<\/h2>\n<div id=\"prerequisites\" class=\"section level3\">\n<h3>Prerequisites<\/h3>\n<p>Make sure you have installed the following R packages:<\/p>\n<ul>\n<li><code>tidyverse<\/code> for data manipulation and visualization<\/li>\n<li><code>ggpubr<\/code> for easily creating publication-ready plots<\/li>\n<li><code>rstatix<\/code>, which provides pipe-friendly R functions for easy statistical analyses<\/li>\n<li><code>datarium<\/code>, which contains the data sets required for this chapter<\/li>\n<\/ul>\n<p>Start by loading the following required packages:<\/p>\n<pre class=\"r\"><code>library(tidyverse)\r\nlibrary(ggpubr)\r\nlibrary(rstatix)<\/code><\/pre>\n<\/div>\n<div id=\"demo-data\" class=\"section level3\">\n<h3>Demo data<\/h3>\n<p>Demo dataset: <code>genderweight<\/code> [in datarium package] containing the weight of 40 individuals (20 women and 20 men).<\/p>\n<p>Load the data and show some random rows by group:<\/p>\n<pre class=\"r\"><code># Load the data\r\ndata(\"genderweight\", package = \"datarium\")\r\n# Show a sample of the data by group\r\nset.seed(123)\r\ngenderweight %&gt;% sample_n_by(group, size = 2)<\/code><\/pre>\n<pre><code>## # A tibble: 4 x 3\r\n##   id    group weight\r\n##   &lt;fct&gt; &lt;fct&gt;  &lt;dbl&gt;\r\n## 1 6     F       65.0\r\n## 2 15    F       65.9\r\n## 3 
29    M       88.9\r\n## 4 37    M       77.0<\/code><\/pre>\n<\/div>\n<div id=\"identify-outliers\" class=\"section level3\">\n<h3>Identify outliers<\/h3>\n<p>Outliers can be easily identified using boxplot methods, implemented in the R function <code>identify_outliers()<\/code> [rstatix package].<\/p>\n<pre class=\"r\"><code>genderweight %&gt;%\r\n  group_by(group) %&gt;%\r\n  identify_outliers(weight)<\/code><\/pre>\n<pre><code>## # A tibble: 2 x 5\r\n##   group id    weight is.outlier is.extreme\r\n##   &lt;fct&gt; &lt;fct&gt;  &lt;dbl&gt; &lt;lgl&gt;      &lt;lgl&gt;     \r\n## 1 F     20      68.8 TRUE       FALSE     \r\n## 2 M     31      95.1 TRUE       FALSE<\/code><\/pre>\n<div class=\"success\">\n<p>There were no extreme outliers.<\/p>\n<\/div>\n<div class=\"warning\">\n<p>Note that extreme outliers can be due to data entry errors, measurement errors, or genuinely unusual values.<\/p>\n<p>You can include the outlier in the analysis anyway if you do not believe the result will be substantially affected. This can be evaluated by comparing the result of the t-test with and without the outlier.<\/p>\n<p>It\u2019s also possible to keep the outliers in the data and perform a Wilcoxon test or a robust t-test using the WRS2 package.<\/p>\n<\/div>\n<\/div>\n<div id=\"check-normality-by-groups\" class=\"section level3\">\n<h3>Check normality by groups<\/h3>\n<p>The normality assumption can be checked by computing the Shapiro-Wilk test for each group. 
If the data are consistent with a normal distribution, the p-value should be greater than 0.05.<\/p>\n<pre class=\"r\"><code>genderweight %&gt;%\r\n  group_by(group) %&gt;%\r\n  shapiro_test(weight)<\/code><\/pre>\n<pre><code>## # A tibble: 2 x 4\r\n##   group variable statistic     p\r\n##   &lt;fct&gt; &lt;chr&gt;        &lt;dbl&gt; &lt;dbl&gt;\r\n## 1 F     weight       0.938 0.224\r\n## 2 M     weight       0.986 0.989<\/code><\/pre>\n<div class=\"success\">\n<p>From the output, both p-values are greater than the significance level of 0.05, indicating that the distribution of the data is not significantly different from the normal distribution. In other words, we can assume normality.<\/p>\n<\/div>\n<p>You can also create QQ plots for each group. A QQ plot draws the correlation between the sample data and the normal distribution.<\/p>\n<pre class=\"r\"><code>ggqqplot(genderweight, x = \"weight\", facet.by = \"group\")<\/code><\/pre>\n<p><img decoding=\"async\" src=\"https:\/\/www.datanovia.com\/en\/wp-content\/uploads\/dn-tutorials\/r-statistics-2-comparing-groups-means\/figures\/082-independent-t-test-assumptions-qqplot-1.png\" width=\"480\" \/><\/p>\n<div class=\"success\">\n<p>All the points fall approximately along the (45-degree) reference line, for each group. So we can assume normality of the data.<\/p>\n<\/div>\n<div class=\"warning\">\n<p>Note that if your sample size is greater than 50, the normal QQ plot is preferred because at larger sample sizes the Shapiro-Wilk test becomes very sensitive even to a minor deviation from normality.<\/p>\n<p>If the data are not normally distributed, it\u2019s recommended to use the non-parametric two-sample Wilcoxon test.<\/p>\n<\/div>\n<\/div>\n<div id=\"check-the-equality-of-variances\" class=\"section level3\">\n<h3>Check the equality of variances<\/h3>\n<p>This can be done using Levene\u2019s test. 
If the variances of groups are equal, the p-value should be greater than 0.05.<\/p>\n<pre class=\"r\"><code>genderweight %&gt;% levene_test(weight ~ group)<\/code><\/pre>\n<pre><code>## # A tibble: 1 x 4\r\n##     df1   df2 statistic      p\r\n##   &lt;int&gt; &lt;int&gt;     &lt;dbl&gt;  &lt;dbl&gt;\r\n## 1     1    38      6.12 0.0180<\/code><\/pre>\n<div class=\"success\">\n<p>The p-value of the Levene\u2019s test is significant, suggesting that there is a significant difference between the variances of the two groups. Therefore, we\u2019ll use the Welch t-test, which doesn\u2019t assume the equality of the two variances.<\/p>\n<\/div>\n<\/div>\n<\/div>\n<div id=\"related-article\" class=\"section level2\">\n<h2>Related article<\/h2>\n<p><a href=\"\/?p=10861\">T-test in R<\/a><\/p>\n<\/div>\n<\/div>\n<p><!--end rdoc--><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Describes the independent t-test assumptions and provides examples of R code to check whether the assumptions are met before calculating the t-test.<\/p>\n","protected":false},"author":1,"featured_media":9152,"parent":11667,"menu_order":82,"comment_status":"open","ping_status":"closed","template":"","class_list":["post-11688","dt_lessons","type-dt_lessons","status-publish","has-post-thumbnail","hentry"],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v25.2 - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>Independent T-Test Assumptions : The Best Tutorial to Read - Datanovia<\/title>\n<meta name=\"description\" content=\"Describes the independent t-test assumptions and provides examples of R code to check whether the assumptions are met before calculating the t-test.\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/www.datanovia.com\/en\/lessons\/t-test-assumptions\/independent-t-test-assumptions\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" 
\/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Independent T-Test Assumptions : The Best Tutorial to Read - Datanovia\" \/>\n<meta property=\"og:description\" content=\"Describes the independent t-test assumptions and provides examples of R code to check whether the assumptions are met before calculating the t-test.\" \/>\n<meta property=\"og:url\" content=\"https:\/\/www.datanovia.com\/en\/lessons\/t-test-assumptions\/independent-t-test-assumptions\/\" \/>\n<meta property=\"og:site_name\" content=\"Datanovia\" \/>\n<meta property=\"article:modified_time\" content=\"2019-12-26T08:09:12+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/www.datanovia.com\/en\/wp-content\/uploads\/2019\/05\/P1040265.JPG.jpg\" \/>\n\t<meta property=\"og:image:width\" content=\"1024\" \/>\n\t<meta property=\"og:image:height\" content=\"512\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/jpeg\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:label1\" content=\"Est. 
reading time\" \/>\n\t<meta name=\"twitter:data1\" content=\"4 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"WebPage\",\"@id\":\"https:\/\/www.datanovia.com\/en\/lessons\/t-test-assumptions\/independent-t-test-assumptions\/\",\"url\":\"https:\/\/www.datanovia.com\/en\/lessons\/t-test-assumptions\/independent-t-test-assumptions\/\",\"name\":\"Independent T-Test Assumptions : The Best Tutorial to Read - Datanovia\",\"isPartOf\":{\"@id\":\"https:\/\/www.datanovia.com\/en\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\/\/www.datanovia.com\/en\/lessons\/t-test-assumptions\/independent-t-test-assumptions\/#primaryimage\"},\"image\":{\"@id\":\"https:\/\/www.datanovia.com\/en\/lessons\/t-test-assumptions\/independent-t-test-assumptions\/#primaryimage\"},\"thumbnailUrl\":\"https:\/\/www.datanovia.com\/en\/wp-content\/uploads\/2019\/05\/P1040265.JPG.jpg\",\"datePublished\":\"2019-12-26T08:01:43+00:00\",\"dateModified\":\"2019-12-26T08:09:12+00:00\",\"description\":\"Describes the independent t-test assumptions and provides examples of R code to check whether the assumptions are met before calculating the 
t-test.\",\"breadcrumb\":{\"@id\":\"https:\/\/www.datanovia.com\/en\/lessons\/t-test-assumptions\/independent-t-test-assumptions\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/www.datanovia.com\/en\/lessons\/t-test-assumptions\/independent-t-test-assumptions\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/www.datanovia.com\/en\/lessons\/t-test-assumptions\/independent-t-test-assumptions\/#primaryimage\",\"url\":\"https:\/\/www.datanovia.com\/en\/wp-content\/uploads\/2019\/05\/P1040265.JPG.jpg\",\"contentUrl\":\"https:\/\/www.datanovia.com\/en\/wp-content\/uploads\/2019\/05\/P1040265.JPG.jpg\",\"width\":1024,\"height\":512},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/www.datanovia.com\/en\/lessons\/t-test-assumptions\/independent-t-test-assumptions\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\/\/www.datanovia.com\/en\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Lessons\",\"item\":\"https:\/\/www.datanovia.com\/en\/lessons\/\"},{\"@type\":\"ListItem\",\"position\":3,\"name\":\"T-Test Assumptions\",\"item\":\"https:\/\/www.datanovia.com\/en\/lessons\/t-test-assumptions\/\"},{\"@type\":\"ListItem\",\"position\":4,\"name\":\"Independent T-Test Assumptions\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/www.datanovia.com\/en\/#website\",\"url\":\"https:\/\/www.datanovia.com\/en\/\",\"name\":\"Datanovia\",\"description\":\"Data Mining and Statistics for Decision 
Support\",\"publisher\":{\"@id\":\"https:\/\/www.datanovia.com\/en\/#organization\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/www.datanovia.com\/en\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"},{\"@type\":\"Organization\",\"@id\":\"https:\/\/www.datanovia.com\/en\/#organization\",\"name\":\"Datanovia\",\"url\":\"https:\/\/www.datanovia.com\/en\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/www.datanovia.com\/en\/#\/schema\/logo\/image\/\",\"url\":\"https:\/\/www.datanovia.com\/en\/wp-content\/uploads\/2018\/09\/datanovia-logo.png\",\"contentUrl\":\"https:\/\/www.datanovia.com\/en\/wp-content\/uploads\/2018\/09\/datanovia-logo.png\",\"width\":98,\"height\":99,\"caption\":\"Datanovia\"},\"image\":{\"@id\":\"https:\/\/www.datanovia.com\/en\/#\/schema\/logo\/image\/\"}}]}<\/script>\n<!-- \/ Yoast SEO plugin. 
-->","yoast_head_json":{"title":"Independent T-Test Assumptions : The Best Tutorial to Read - Datanovia","description":"Describes the independent t-test assumptions and provides examples of R code to check whether the assumptions are met before calculating the t-test.","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/www.datanovia.com\/en\/lessons\/t-test-assumptions\/independent-t-test-assumptions\/","og_locale":"en_US","og_type":"article","og_title":"Independent T-Test Assumptions : The Best Tutorial to Read - Datanovia","og_description":"Describes the independent t-test assumptions and provides examples of R code to check whether the assumptions are met before calculating the t-test.","og_url":"https:\/\/www.datanovia.com\/en\/lessons\/t-test-assumptions\/independent-t-test-assumptions\/","og_site_name":"Datanovia","article_modified_time":"2019-12-26T08:09:12+00:00","og_image":[{"width":1024,"height":512,"url":"https:\/\/www.datanovia.com\/en\/wp-content\/uploads\/2019\/05\/P1040265.JPG.jpg","type":"image\/jpeg"}],"twitter_card":"summary_large_image","twitter_misc":{"Est. 
reading time":"4 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"WebPage","@id":"https:\/\/www.datanovia.com\/en\/lessons\/t-test-assumptions\/independent-t-test-assumptions\/","url":"https:\/\/www.datanovia.com\/en\/lessons\/t-test-assumptions\/independent-t-test-assumptions\/","name":"Independent T-Test Assumptions : The Best Tutorial to Read - Datanovia","isPartOf":{"@id":"https:\/\/www.datanovia.com\/en\/#website"},"primaryImageOfPage":{"@id":"https:\/\/www.datanovia.com\/en\/lessons\/t-test-assumptions\/independent-t-test-assumptions\/#primaryimage"},"image":{"@id":"https:\/\/www.datanovia.com\/en\/lessons\/t-test-assumptions\/independent-t-test-assumptions\/#primaryimage"},"thumbnailUrl":"https:\/\/www.datanovia.com\/en\/wp-content\/uploads\/2019\/05\/P1040265.JPG.jpg","datePublished":"2019-12-26T08:01:43+00:00","dateModified":"2019-12-26T08:09:12+00:00","description":"Describes the independent t-test assumptions and provides examples of R code to check whether the assumptions are met before calculating the 
t-test.","breadcrumb":{"@id":"https:\/\/www.datanovia.com\/en\/lessons\/t-test-assumptions\/independent-t-test-assumptions\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/www.datanovia.com\/en\/lessons\/t-test-assumptions\/independent-t-test-assumptions\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.datanovia.com\/en\/lessons\/t-test-assumptions\/independent-t-test-assumptions\/#primaryimage","url":"https:\/\/www.datanovia.com\/en\/wp-content\/uploads\/2019\/05\/P1040265.JPG.jpg","contentUrl":"https:\/\/www.datanovia.com\/en\/wp-content\/uploads\/2019\/05\/P1040265.JPG.jpg","width":1024,"height":512},{"@type":"BreadcrumbList","@id":"https:\/\/www.datanovia.com\/en\/lessons\/t-test-assumptions\/independent-t-test-assumptions\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/www.datanovia.com\/en\/"},{"@type":"ListItem","position":2,"name":"Lessons","item":"https:\/\/www.datanovia.com\/en\/lessons\/"},{"@type":"ListItem","position":3,"name":"T-Test Assumptions","item":"https:\/\/www.datanovia.com\/en\/lessons\/t-test-assumptions\/"},{"@type":"ListItem","position":4,"name":"Independent T-Test Assumptions"}]},{"@type":"WebSite","@id":"https:\/\/www.datanovia.com\/en\/#website","url":"https:\/\/www.datanovia.com\/en\/","name":"Datanovia","description":"Data Mining and Statistics for Decision 
Support","publisher":{"@id":"https:\/\/www.datanovia.com\/en\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/www.datanovia.com\/en\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Organization","@id":"https:\/\/www.datanovia.com\/en\/#organization","name":"Datanovia","url":"https:\/\/www.datanovia.com\/en\/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.datanovia.com\/en\/#\/schema\/logo\/image\/","url":"https:\/\/www.datanovia.com\/en\/wp-content\/uploads\/2018\/09\/datanovia-logo.png","contentUrl":"https:\/\/www.datanovia.com\/en\/wp-content\/uploads\/2018\/09\/datanovia-logo.png","width":98,"height":99,"caption":"Datanovia"},"image":{"@id":"https:\/\/www.datanovia.com\/en\/#\/schema\/logo\/image\/"}}]}},"multi-rating":{"mr_rating_results":[]},"_links":{"self":[{"href":"https:\/\/www.datanovia.com\/en\/wp-json\/wp\/v2\/dt_lessons\/11688","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.datanovia.com\/en\/wp-json\/wp\/v2\/dt_lessons"}],"about":[{"href":"https:\/\/www.datanovia.com\/en\/wp-json\/wp\/v2\/types\/dt_lessons"}],"author":[{"embeddable":true,"href":"https:\/\/www.datanovia.com\/en\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.datanovia.com\/en\/wp-json\/wp\/v2\/comments?post=11688"}],"version-history":[{"count":0,"href":"https:\/\/www.datanovia.com\/en\/wp-json\/wp\/v2\/dt_lessons\/11688\/revisions"}],"up":[{"embeddable":true,"href":"https:\/\/www.datanovia.com\/en\/wp-json\/wp\/v2\/dt_lessons\/11667"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.datanovia.com\/en\/wp-json\/wp\/v2\/media\/9152"}],"wp:attachment":[{"href":"https:\/\/www.datanovia.com\/en\/wp-json\/wp\/v2\/media?parent=11688"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}