{"id":11650,"date":"2019-12-25T12:24:26","date_gmt":"2019-12-25T10:24:26","guid":{"rendered":"https:\/\/www.datanovia.com\/en\/?post_type=dt_lessons&#038;p=11650"},"modified":"2020-05-22T11:10:53","modified_gmt":"2020-05-22T10:10:53","slug":"unpaired-t-test","status":"publish","type":"dt_lessons","link":"https:\/\/www.datanovia.com\/en\/lessons\/types-of-t-test\/unpaired-t-test\/","title":{"rendered":"Unpaired T-Test"},"content":{"rendered":"<p>&nbsp;<\/p>\n<div id=\"rdoc\">\n<p>The <strong>unpaired t-test<\/strong> is used to compare the means of two independent groups. It\u2019s also known as:<\/p>\n<ul>\n<li><em>independent samples t-test<\/em>,<\/li>\n<li><em>independent t-test<\/em>,<\/li>\n<li><em>2 sample t test<\/em>,<\/li>\n<li><em>two sample t-test<\/em>,<\/li>\n<li><em>independent-measures t-test<\/em>,<\/li>\n<li><em>independent groups t test<\/em>,<\/li>\n<li><em>unpaired student t test<\/em>,<\/li>\n<li><em>between-subjects t-test<\/em> and<\/li>\n<li><em>Student\u2019s t-test<\/em>.<\/li>\n<\/ul>\n<p>For example, you might want to compare the average weights of individuals grouped by gender: male and female groups, which are two unrelated\/independent groups.<\/p>\n<div class=\"block\">\n<p>As a rule of thumb, your study should have six or more participants in each group in order to proceed with an unpaired t-test, but ideally you would have more. An independent-samples t-test will run with fewer than six participants, but generalizing the results to a larger population will be more difficult.<\/p>\n<\/div>\n<p>The independent samples t-test comes in two different forms:<\/p>\n<ul>\n<li>the standard <em>Student\u2019s t-test<\/em>, which assumes that the variances of the two groups are equal.<\/li>\n<li>the <em>Welch\u2019s t-test<\/em>, which is less restrictive compared to the original Student\u2019s test. 
This is the test where you do not assume that the variance is the same in the two groups, which results in fractional degrees of freedom.<\/li>\n<\/ul>\n<div class=\"warning\">\n<p>By default, R computes the Welch t-test, which is the safer one. The two methods give very similar results unless both the group sizes and the standard deviations are very different.<\/p>\n<\/div>\n<p>In this article, you will learn how to:<\/p>\n<ul>\n<li><em>Compute the independent samples t-test in R<\/em>. The pipe-friendly function <code>t_test()<\/code> [rstatix package] will be used.<\/li>\n<li><em>Check the independent-samples t-test assumptions<\/em><\/li>\n<li><em>Calculate and report the independent samples t-test effect size<\/em> using <em>Cohen\u2019s d<\/em>. The <code>d<\/code> statistic redefines the difference in means as the number of standard deviations that separates those means. T-test conventional effect sizes, proposed by Cohen, are: 0.2 (small effect), 0.5 (moderate effect) and 0.8 (large effect) <span class=\"citation\">(Cohen 1988)<\/span>.<\/li>\n<\/ul>\n<p>Contents:<\/p>\n<div id=\"TOC\">\n<ul>\n<li><a href=\"#prerequisites\">Prerequisites<\/a><\/li>\n<li><a href=\"#research-questions\">Research questions<\/a><\/li>\n<li><a href=\"#statistical-hypotheses\">Statistical hypotheses<\/a><\/li>\n<li><a href=\"#formula\">Formula<\/a><\/li>\n<li><a href=\"#demo-data\">Demo data<\/a><\/li>\n<li><a href=\"#summary-statistics\">Summary statistics<\/a><\/li>\n<li><a href=\"#visualization\">Visualization<\/a><\/li>\n<li><a href=\"#assumptions-and-preleminary-tests\">Assumptions and preliminary tests<\/a>\n<ul>\n<li><a href=\"#identify-outliers\">Identify outliers<\/a><\/li>\n<li><a href=\"#check-normality-by-groups\">Check normality by groups<\/a><\/li>\n<li><a href=\"#check-the-equality-of-variances\">Check the equality of variances<\/a><\/li>\n<\/ul>\n<\/li>\n<li><a href=\"#computation\">Computation<\/a><\/li>\n<li><a href=\"#effect-size\">Effect 
size<\/a>\n<ul>\n<li><a href=\"#cohens-d-for-student-t-test\">Cohen\u2019s d for Student t-test<\/a><\/li>\n<li><a href=\"#cohens-d-for-welch-t-test\">Cohen\u2019s d for Welch t-test<\/a><\/li>\n<\/ul>\n<\/li>\n<li><a href=\"#report\">Report<\/a><\/li>\n<li><a href=\"#summary\">Summary<\/a><\/li>\n<li><a href=\"#references\">References<\/a><\/li>\n<\/ul>\n<\/div>\n<div id=\"prerequisites\" class=\"section level2\">\n<h2>Prerequisites<\/h2>\n<p>Make sure you have installed the following R packages:<\/p>\n<ul>\n<li><code>tidyverse<\/code> for data manipulation and visualization<\/li>\n<li><code>ggpubr<\/code> for easily creating publication-ready plots<\/li>\n<li><code>rstatix<\/code> for pipe-friendly R functions that simplify statistical analyses<\/li>\n<li><code>datarium<\/code> for the data sets required in this chapter<\/li>\n<\/ul>\n<p>Start by loading the following required packages:<\/p>\n<pre class=\"r\"><code>library(tidyverse)\r\nlibrary(ggpubr)\r\nlibrary(rstatix)<\/code><\/pre>\n<\/div>\n<div id=\"research-questions\" class=\"section level2\">\n<h2>Research questions<\/h2>\n<p>Typical research questions are:<\/p>\n<ol style=\"list-style-type: decimal;\">\n<li>whether the mean of group A (<span class=\"math inline\">\\(m_A\\)<\/span>) is equal to the mean of group B (<span class=\"math inline\">\\(m_B\\)<\/span>)?<\/li>\n<li>whether the mean of group A (<span class=\"math 
inline\">\\(m_A\\)<\/span>) is less than the mean of group B (<span class=\"math inline\">\\(m_B\\)<\/span>)?<\/li>\n<li>whether the mean of group A (<span class=\"math inline\">\\(m_A\\)<\/span>) is greater than the mean of group B (<span class=\"math inline\">\\(m_B\\)<\/span>)?<\/li>\n<\/ol>\n<\/div>\n<div id=\"statistical-hypotheses\" class=\"section level2\">\n<h2>Statistical hypotheses<\/h2>\n<p>In statistics, we can define the corresponding null hypotheses (<span class=\"math inline\">\\(H_0\\)<\/span>) as follows:<\/p>\n<ol style=\"list-style-type: decimal;\">\n<li><span class=\"math inline\">\\(H_0: m_A = m_B\\)<\/span><\/li>\n<li><span class=\"math inline\">\\(H_0: m_A \\geq m_B\\)<\/span><\/li>\n<li><span class=\"math inline\">\\(H_0: m_A \\leq m_B\\)<\/span><\/li>\n<\/ol>\n<p>The corresponding alternative hypotheses (<span class=\"math inline\">\\(H_a\\)<\/span>) are as follows:<\/p>\n<ol style=\"list-style-type: decimal;\">\n<li><span class=\"math inline\">\\(H_a: m_A \\ne m_B\\)<\/span> (different)<\/li>\n<li><span class=\"math inline\">\\(H_a: m_A &lt; m_B\\)<\/span> (less)<\/li>\n<li><span class=\"math inline\">\\(H_a: m_A &gt; m_B\\)<\/span> (greater)<\/li>\n<\/ol>\n<div class=\"notice\">\n<p>Note that:<\/p>\n<ul>\n<li>Hypotheses 1) are called two-tailed tests<\/li>\n<li>Hypotheses 2) and 3) are called one-tailed tests<\/li>\n<\/ul>\n<\/div>\n<\/div>\n<div id=\"formula\" class=\"section level2\">\n<h2>Formula<\/h2>\n<p>The independent samples t-test comes in two different forms, Student\u2019s t-test and Welch\u2019s t-test.<\/p>\n<p>The classical Student\u2019s t-test is more restrictive. It assumes that the two groups have the same population variance.<\/p>\n<ol style=\"list-style-type: decimal;\">\n<li><strong>Classical two independent samples t-test<\/strong> (Student t-test). 
If the variances of the two groups are equal (<strong>homoscedasticity<\/strong>), the t-test value, comparing the two samples (A and B), can be calculated as follows.<\/li>\n<\/ol>\n<p><span class=\"math display\">\\[<br \/>\nt = \\frac{m_A - m_B}{\\sqrt{ \\frac{S^2}{n_A} + \\frac{S^2}{n_B} }}<br \/>\n\\]<\/span><\/p>\n<p>where,<\/p>\n<ul>\n<li><span class=\"math inline\">\\(m_A\\)<\/span> and <span class=\"math inline\">\\(m_B\\)<\/span> represent the mean values of groups A and B, respectively.<\/li>\n<li><span class=\"math inline\">\\(n_A\\)<\/span> and <span class=\"math inline\">\\(n_B\\)<\/span> represent the sizes of groups A and B, respectively.<\/li>\n<li><span class=\"math inline\">\\(S^2\\)<\/span> is an estimator of the pooled variance of the two groups. It can be calculated as follows:<\/li>\n<\/ul>\n<p><span class=\"math display\">\\[<br \/>\nS^2 = \\frac{\\sum{(x-m_A)^2}+\\sum{(x-m_B)^2}}{n_A+n_B-2}<br \/>\n\\]<\/span><\/p>\n<p>with degrees of freedom (df): <span class=\"math inline\">\\(df = n_A + n_B - 2\\)<\/span>.<\/p>\n<ol style=\"list-style-type: decimal;\" start=\"2\">\n<li><strong>Welch t-statistic<\/strong>. If the variances of the two groups being compared are different (<strong>heteroscedasticity<\/strong>), it\u2019s possible to use the Welch t-test, which is an adaptation of the Student t-test. The Welch t-statistic is calculated as follows:<\/li>\n<\/ol>\n<p><span class=\"math display\">\\[<br \/>\nt = \\frac{m_A - m_B}{\\sqrt{ \\frac{S_A^2}{n_A} + \\frac{S_B^2}{n_B} }}<br \/>\n\\]<\/span><\/p>\n<p>where <span class=\"math inline\">\\(S_A\\)<\/span> and <span class=\"math inline\">\\(S_B\\)<\/span> are the standard deviations of the two groups A and B, respectively.<\/p>\n<p>Unlike the classic Student\u2019s t-test, the Welch t-test formula involves the variance of each of the two groups (<span class=\"math inline\">\\(S_A^2\\)<\/span> and <span class=\"math inline\">\\(S_B^2\\)<\/span>) being compared. 
In other words, it does not use the pooled variance <span class=\"math inline\">\\(S^2\\)<\/span>.<\/p>\n<p>The <strong>degrees of freedom<\/strong> of the <strong>Welch t-test<\/strong> are estimated as follows:<\/p>\n<p><span class=\"math display\">\\[<br \/>\ndf = (\\frac{S_A^2}{n_A}+ \\frac{S_B^2}{n_B})^2 \/ (\\frac{S_A^4}{n_A^2(n_A-1)} + \\frac{S_B^4}{n_B^2(n_B-1)} )<br \/>\n\\]<\/span><\/p>\n<div class=\"success\">\n<p>A p-value can be computed for the corresponding absolute value of the t-statistic (|t|).<\/p>\n<p>If the p-value is less than or equal to the significance level 0.05, we can reject the null hypothesis and accept the alternative hypothesis. In other words, we can conclude that the mean values of groups A and B are significantly different.<\/p>\n<\/div>\n<div class=\"warning\">\n<p>Note that the Welch t-test is considered the safer one. Usually, the results of the <strong>classical Student\u2019s t-test<\/strong> and the <strong>Welch t-test<\/strong> are very similar unless both the group sizes and the standard deviations are very different.<\/p>\n<\/div>\n<\/div>\n<div id=\"demo-data\" class=\"section level2\">\n<h2>Demo data<\/h2>\n<p>Demo dataset: <code>genderweight<\/code> [in datarium package] containing the weight of 40 individuals (20 women and 20 men).<\/p>\n<p>Load the data and show some random rows by groups:<\/p>\n<pre class=\"r\"><code># Load the data\r\ndata(\"genderweight\", package = \"datarium\")\r\n# Show a sample of the data by group\r\nset.seed(123)\r\ngenderweight %&gt;% sample_n_by(group, size = 2)<\/code><\/pre>\n<pre><code>## # A tibble: 4 x 3\r\n##   id    group weight\r\n##   &lt;fct&gt; &lt;fct&gt;  &lt;dbl&gt;\r\n## 1 6     F       65.0\r\n## 2 15    F       65.9\r\n## 3 29    M       88.9\r\n## 4 37    M       77.0<\/code><\/pre>\n<\/div>\n<div id=\"summary-statistics\" class=\"section level2\">\n<h2>Summary statistics<\/h2>\n<p>Compute some summary statistics by groups: mean and sd (standard deviation)<\/p>\n<pre 
class=\"r\"><code>genderweight %&gt;%\r\n  group_by(group) %&gt;%\r\n  get_summary_stats(weight, type = \"mean_sd\")<\/code><\/pre>\n<pre><code>## # A tibble: 2 x 5\r\n##   group variable     n  mean    sd\r\n##   &lt;fct&gt; &lt;chr&gt;    &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt;\r\n## 1 F     weight      20  63.5  2.03\r\n## 2 M     weight      20  85.8  4.35<\/code><\/pre>\n<\/div>\n<div id=\"visualization\" class=\"section level2\">\n<h2>Visualization<\/h2>\n<p>Visualize the data using box plots. Plot weight by groups.<\/p>\n<pre class=\"r\"><code>bxp &lt;- ggboxplot(\r\n  genderweight, x = \"group\", y = \"weight\", \r\n  ylab = \"Weight\", xlab = \"Groups\", add = \"jitter\"\r\n  )\r\nbxp<\/code><\/pre>\n<p><img decoding=\"async\" src=\"https:\/\/www.datanovia.com\/en\/wp-content\/uploads\/dn-tutorials\/r-statistics-2-comparing-groups-means\/figures\/071-unpaired-t-test-box-plot-1.png\" width=\"364.8\" \/><\/p>\n<\/div>\n<div id=\"assumptions-and-preleminary-tests\" class=\"section level2\">\n<h2>Assumptions and preliminary tests<\/h2>\n<p>The two-sample independent t-test assumes the following characteristics about the data:<\/p>\n<ul>\n<li><strong>Independence of the observations<\/strong>. Each subject should belong to only one group. There is no relationship between the observations in each group.<\/li>\n<li><strong>No significant outliers<\/strong> in the two groups.<\/li>\n<li><strong>Normality<\/strong>. The data for each group should be approximately normally distributed.<\/li>\n<li><strong>Homogeneity of variances<\/strong>. 
The variance of the outcome variable should be equal in each group. This assumption applies to the Student t-test; the Welch t-test does not require it.<\/li>\n<\/ul>\n<p>In this section, we\u2019ll perform some preliminary tests to check whether these assumptions are met.<\/p>\n<div id=\"identify-outliers\" class=\"section level3\">\n<h3>Identify outliers<\/h3>\n<p>Outliers can be easily identified using boxplot methods, implemented in the R function <code>identify_outliers()<\/code> [rstatix package].<\/p>\n<pre class=\"r\"><code>genderweight %&gt;%\r\n  group_by(group) %&gt;%\r\n  identify_outliers(weight)<\/code><\/pre>\n<pre><code>## # A tibble: 2 x 5\r\n##   group id    weight is.outlier is.extreme\r\n##   &lt;fct&gt; &lt;fct&gt;  &lt;dbl&gt; &lt;lgl&gt;      &lt;lgl&gt;     \r\n## 1 F     20      68.8 TRUE       FALSE     \r\n## 2 M     31      95.1 TRUE       FALSE<\/code><\/pre>\n<div class=\"success\">\n<p>There were no extreme outliers.<\/p>\n<\/div>\n<div class=\"warning\">\n<p>Note that extreme outliers can be due to data entry errors, measurement errors or genuinely unusual values.<\/p>\n<p>You can include the outlier in the analysis anyway if you do not believe the result will be substantially affected. This can be evaluated by comparing the result of the t-test with and without the outlier.<\/p>\n<p>It\u2019s also possible to keep the outliers in the data and perform a Wilcoxon test or a robust t-test using the WRS2 package.<\/p>\n<\/div>\n<\/div>\n<div id=\"check-normality-by-groups\" class=\"section level3\">\n<h3>Check normality by groups<\/h3>\n<p>The normality assumption can be checked by computing the Shapiro-Wilk test for each group. 
If the data are normally distributed, the p-value should be greater than 0.05.<\/p>\n<pre class=\"r\"><code>genderweight %&gt;%\r\n  group_by(group) %&gt;%\r\n  shapiro_test(weight)<\/code><\/pre>\n<pre><code>## # A tibble: 2 x 4\r\n##   group variable statistic     p\r\n##   &lt;fct&gt; &lt;chr&gt;        &lt;dbl&gt; &lt;dbl&gt;\r\n## 1 F     weight       0.938 0.224\r\n## 2 M     weight       0.986 0.989<\/code><\/pre>\n<div class=\"success\">\n<p>From the output, the two p-values are greater than the significance level 0.05, indicating that the distribution of the data in each group is not significantly different from the normal distribution. In other words, we can assume normality.<\/p>\n<\/div>\n<p>You can also create QQ plots for each group. A QQ plot draws the correlation between the given data and the normal distribution.<\/p>\n<pre class=\"r\"><code>ggqqplot(genderweight, x = \"weight\", facet.by = \"group\")<\/code><\/pre>\n<p><img decoding=\"async\" src=\"https:\/\/www.datanovia.com\/en\/wp-content\/uploads\/dn-tutorials\/r-statistics-2-comparing-groups-means\/figures\/071-unpaired-t-test-qqplot-1.png\" width=\"480\" \/><\/p>\n<div class=\"success\">\n<p>All the points fall approximately along the (45-degree) reference line, for each group. So we can assume normality of the data.<\/p>\n<\/div>\n<div class=\"warning\">\n<p>Note that, if your sample size is greater than 50, the normal QQ plot is preferred because at larger sample sizes the Shapiro-Wilk test becomes very sensitive even to a minor deviation from normality.<\/p>\n<p>If the data are not normally distributed, it\u2019s recommended to use the non-parametric two-sample Wilcoxon test.<\/p>\n<\/div>\n<\/div>\n<div id=\"check-the-equality-of-variances\" class=\"section level3\">\n<h3>Check the equality of variances<\/h3>\n<p>This can be done using Levene\u2019s test. 
If the variances of the groups are equal, the p-value should be greater than 0.05.<\/p>\n<pre class=\"r\"><code>genderweight %&gt;% levene_test(weight ~ group)<\/code><\/pre>\n<pre><code>## # A tibble: 1 x 4\r\n##     df1   df2 statistic      p\r\n##   &lt;int&gt; &lt;int&gt;     &lt;dbl&gt;  &lt;dbl&gt;\r\n## 1     1    38      6.12 0.0180<\/code><\/pre>\n<div class=\"success\">\n<p>The p-value of Levene\u2019s test is significant (p = 0.018), suggesting that there is a significant difference between the variances of the two groups. Therefore, we\u2019ll use the Welch t-test, which doesn\u2019t assume the equality of the two variances.<\/p>\n<\/div>\n<\/div>\n<\/div>\n<div id=\"computation\" class=\"section level2\">\n<h2>Computation<\/h2>\n<p>We want to know whether the average weights are different between groups.<\/p>\n<p>We\u2019ll use the pipe-friendly <code>t_test()<\/code> function [rstatix package], a wrapper around the R base function <code>t.test()<\/code>.<\/p>\n<p>Recall that, by default, R computes the Welch t-test, which is the safer one. This is the test where you do not assume that the variance is the same in the two groups, which results in fractional degrees of freedom.<\/p>\n<pre class=\"r\"><code>stat.test &lt;- genderweight %&gt;% \r\n  t_test(weight ~ group) %&gt;%\r\n  add_significance()\r\nstat.test<\/code><\/pre>\n<pre><code>## # A tibble: 1 x 9\r\n##   .y.    
group1 group2    n1    n2 statistic    df        p p.signif\r\n##   &lt;chr&gt;  &lt;chr&gt;  &lt;chr&gt;  &lt;int&gt; &lt;int&gt;     &lt;dbl&gt; &lt;dbl&gt;    &lt;dbl&gt; &lt;chr&gt;   \r\n## 1 weight F      M         20    20     -20.8  26.9 4.30e-18 ****<\/code><\/pre>\n<p>If you want to assume the equality of variances (Student t-test), specify the option <code>var.equal = TRUE<\/code>:<\/p>\n<pre class=\"r\"><code>stat.test2 &lt;- genderweight %&gt;%\r\n  t_test(weight ~ group, var.equal = TRUE) %&gt;%\r\n  add_significance()\r\nstat.test2<\/code><\/pre>\n<p>The results above show the following components:<\/p>\n<ul>\n<li><code>.y.<\/code>: the y variable used in the test.<\/li>\n<li><code>group1,group2<\/code>: the compared groups in the pairwise tests.<\/li>\n<li><code>statistic<\/code>: Test statistic used to compute the p-value.<\/li>\n<li><code>df<\/code>: degrees of freedom.<\/li>\n<li><code>p<\/code>: p-value.<\/li>\n<\/ul>\n<div class=\"warning\">\n<p>Note that you can obtain a detailed result by specifying the option <code>detailed = TRUE<\/code>.<\/p>\n<\/div>\n<p>To compute a one-tailed two-sample t-test, specify the option <code>alternative<\/code> as follows.<\/p>\n<ul>\n<li>If you want to test whether the average women\u2019s weight (group 1) is less than the average men\u2019s weight (group 2), type this:<\/li>\n<\/ul>\n<pre class=\"r\"><code>genderweight %&gt;% \r\n  t_test(weight ~ group, alternative = \"less\")<\/code><\/pre>\n<ul>\n<li>Or, if you want to test whether the average women\u2019s weight (group 1) is greater than the average men\u2019s weight (group 2), type this:<\/li>\n<\/ul>\n<pre class=\"r\"><code>genderweight %&gt;% \r\n  t_test(weight ~ group, alternative = \"greater\")<\/code><\/pre>\n<\/div>\n<div id=\"effect-size\" class=\"section level2\">\n<h2>Effect size<\/h2>\n<div id=\"cohens-d-for-student-t-test\" class=\"section level3\">\n<h3>Cohen\u2019s d for Student t-test<\/h3>\n<p>There are multiple versions 
of Cohen\u2019s d for Student t-test. The most commonly used version of the Student t-test effect size, comparing two groups (A and B), is calculated by dividing the mean difference between the groups by the pooled standard deviation.<\/p>\n<p><strong>Cohen\u2019s d formula<\/strong>:<\/p>\n<p><span class=\"math display\">\\[<br \/>\nd = \\frac{m_A - m_B}{SD_{pooled}}<br \/>\n\\]<\/span><\/p>\n<p>where,<\/p>\n<ul>\n<li><span class=\"math inline\">\\(m_A\\)<\/span> and <span class=\"math inline\">\\(m_B\\)<\/span> represent the mean values of groups A and B, respectively.<\/li>\n<li><span class=\"math inline\">\\(n_A\\)<\/span> and <span class=\"math inline\">\\(n_B\\)<\/span> represent the sizes of groups A and B, respectively.<\/li>\n<li><span class=\"math inline\">\\(SD_{pooled}\\)<\/span> is an estimator of the pooled standard deviation of the two groups. It can be calculated as follows:<br \/>\n<span class=\"math display\">\\[<br \/>\nSD_{pooled} = \\sqrt{\\frac{\\sum{(x-m_A)^2}+\\sum{(x-m_B)^2}}{n_A+n_B-2}}<br \/>\n\\]<\/span><\/li>\n<\/ul>\n<p><strong>Calculation<\/strong>. If the option <code>var.equal = TRUE<\/code> is specified, then the pooled SD is used when computing Cohen\u2019s d.<\/p>\n<pre class=\"r\"><code>genderweight %&gt;%  cohens_d(weight ~ group, var.equal = TRUE)<\/code><\/pre>\n<pre><code>## # A tibble: 1 x 7\r\n##   .y.    group1 group2 effsize    n1    n2 magnitude\r\n## * &lt;chr&gt;  &lt;chr&gt;  &lt;chr&gt;    &lt;dbl&gt; &lt;int&gt; &lt;int&gt; &lt;ord&gt;    \r\n## 1 weight F      M        -6.57    20    20 large<\/code><\/pre>\n<div class=\"success\">\n<p>There is a large effect size, d = 6.57 (in absolute value).<\/p>\n<\/div>\n<p>Note that, for small sample sizes (&lt; 50), Cohen\u2019s d tends to over-inflate results. There exists a <strong>Hedges\u2019 corrected version of Cohen\u2019s d<\/strong> <span class=\"citation\">(Hedges and Olkin 1985)<\/span>, which reduces effect sizes for small samples by a few percentage points. 
The correction is introduced by multiplying the usual value of d by <code>(N-3)\/(N-2.25)<\/code> (for unpaired t-test) and by <code>(n1-2)\/(n1-1.25)<\/code> for paired t-test; where N is the total size of the two groups being compared <code>(N = n1 + n2)<\/code>.<\/p>\n<\/div>\n<div id=\"cohens-d-for-welch-t-test\" class=\"section level3\">\n<h3>Cohen\u2019s d for Welch t-test<\/h3>\n<p>The Welch test is a variant of the t-test used when the equality of variances can\u2019t be assumed. The effect size can be computed by dividing the mean difference between the groups by the \u201caveraged\u201d standard deviation.<\/p>\n<p><strong>Cohen\u2019s d formula<\/strong>:<\/p>\n<p><span class=\"math display\">\\[<br \/>\nd = \\frac{m_A - m_B}{\\sqrt{(Var_A + Var_B)\/2}}<br \/>\n\\]<\/span><\/p>\n<p>where,<\/p>\n<ul>\n<li><span class=\"math inline\">\\(m_A\\)<\/span> and <span class=\"math inline\">\\(m_B\\)<\/span> represent the mean values of groups A and B, respectively.<\/li>\n<li><span class=\"math inline\">\\(Var_A\\)<\/span> and <span class=\"math inline\">\\(Var_B\\)<\/span> are the variances of groups A and B, respectively.<\/li>\n<\/ul>\n<p><strong>Calculation<\/strong>:<\/p>\n<pre class=\"r\"><code>genderweight %&gt;% cohens_d(weight ~ group, var.equal = FALSE)<\/code><\/pre>\n<pre><code>## # A tibble: 1 x 7\r\n##   .y.    group1 group2 effsize    n1    n2 magnitude\r\n## * &lt;chr&gt;  &lt;chr&gt;  &lt;chr&gt;    &lt;dbl&gt; &lt;int&gt; &lt;int&gt; &lt;ord&gt;    \r\n## 1 weight F      M        -6.57    20    20 large<\/code><\/pre>\n<div class=\"warning\">\n<p>Note that, when group sizes are equal, the pooled and the \u201caveraged\u201d standard deviations coincide, so Cohen\u2019s d is identical for the standard Student and Welch t-tests, as in this example (20 observations per group).<\/p>\n<\/div>\n<\/div>\n<\/div>\n<div id=\"report\" class=\"section level2\">\n<h2>Report<\/h2>\n<p>We could report the result as follows:<\/p>\n<p>The mean weight in the female group was 63.5 (SD = 2.03), whereas the mean in the male group was 85.8 (SD = 4.35). 
A Welch two-sample t-test showed that the difference was statistically significant, t(26.9) = -20.8, p &lt; 0.0001, d = 6.57; where t(26.9) is shorthand notation for a Welch t-statistic that has 26.9 degrees of freedom.<\/p>\n<pre class=\"r\"><code>stat.test &lt;- stat.test %&gt;% add_xy_position(x = \"group\")\r\nbxp + \r\n  stat_pvalue_manual(stat.test, tip.length = 0) +\r\n  labs(subtitle = get_test_label(stat.test, detailed = TRUE))<\/code><\/pre>\n<p><img decoding=\"async\" src=\"https:\/\/www.datanovia.com\/en\/wp-content\/uploads\/dn-tutorials\/r-statistics-2-comparing-groups-means\/figures\/071-unpaired-t-test-two-sample-box-plot-with-p-values-1.png\" width=\"412.8\" \/><\/p>\n<\/div>\n<div id=\"summary\" class=\"section level2\">\n<h2>Summary<\/h2>\n<p>This article describes the formula and the basics of the unpaired t-test, also known as the independent t-test. R code examples are provided for checking the assumptions, computing the test and the effect size, and interpreting and reporting the results.<\/p>\n<\/div>\n<div id=\"references\" class=\"section level2 unnumbered\">\n<h2>References<\/h2>\n<div id=\"refs\" class=\"references\">\n<div id=\"ref-cohen1998\">\n<p>Cohen, J. 1988. <em>Statistical Power Analysis for the Behavioral Sciences<\/em>. 2nd ed. Hillsdale, NJ: Lawrence Erlbaum Associates.<\/p>\n<\/div>\n<div id=\"ref-hedges1985\">\n<p>Hedges, Larry, and Ingram Olkin. 1985. \u201cStatistical Methods in Meta-Analysis.\u201d In <em>Stat Med<\/em>. Vol. 20. doi:<a href=\"https:\/\/doi.org\/10.2307\/1164953\">10.2307\/1164953<\/a>.<\/p>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<p><!--end rdoc--><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Describes the unpaired t-test, which is used to compare the mean of two independent groups. You will learn the formula, assumptions, calculation, visualization, effect size measure using the Cohen&#8217;s d, interpretation and reporting in R. 
The Student&#8217;s t-test and the Welch t-test are described.<\/p>\n","protected":false},"author":1,"featured_media":8915,"parent":11648,"menu_order":71,"comment_status":"open","ping_status":"closed","template":"","class_list":["post-11650","dt_lessons","type-dt_lessons","status-publish","has-post-thumbnail","hentry"],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v25.2 - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>Unpaired T-Test : Excellent Complete Reference You Will Love - Datanovia<\/title>\n<meta name=\"description\" content=\"Describes the unpaired t-test, which is used to compare the mean of two independent groups. You will learn the formula, assumptions, calculation, visualization, effect size measure using the Cohen&#039;s d, interpretation and reporting in R. The Student&#039;s t-test and the Welch t-test are described.\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/www.datanovia.com\/en\/lessons\/types-of-t-test\/unpaired-t-test\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Unpaired T-Test : Excellent Complete Reference You Will Love - Datanovia\" \/>\n<meta property=\"og:description\" content=\"Describes the unpaired t-test, which is used to compare the mean of two independent groups. You will learn the formula, assumptions, calculation, visualization, effect size measure using the Cohen&#039;s d, interpretation and reporting in R. 
The Student&#039;s t-test and the Welch t-test are described.\" \/>\n<meta property=\"og:url\" content=\"https:\/\/www.datanovia.com\/en\/lessons\/types-of-t-test\/unpaired-t-test\/\" \/>\n<meta property=\"og:site_name\" content=\"Datanovia\" \/>\n<meta property=\"article:modified_time\" content=\"2020-05-22T10:10:53+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/www.datanovia.com\/en\/wp-content\/uploads\/2019\/05\/vanneau.e.peronne..oisillon.jpg\" \/>\n\t<meta property=\"og:image:width\" content=\"1024\" \/>\n\t<meta property=\"og:image:height\" content=\"512\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/jpeg\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:label1\" content=\"Est. reading time\" \/>\n\t<meta name=\"twitter:data1\" content=\"13 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"WebPage\",\"@id\":\"https:\/\/www.datanovia.com\/en\/lessons\/types-of-t-test\/unpaired-t-test\/\",\"url\":\"https:\/\/www.datanovia.com\/en\/lessons\/types-of-t-test\/unpaired-t-test\/\",\"name\":\"Unpaired T-Test : Excellent Complete Reference You Will Love - Datanovia\",\"isPartOf\":{\"@id\":\"https:\/\/www.datanovia.com\/en\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\/\/www.datanovia.com\/en\/lessons\/types-of-t-test\/unpaired-t-test\/#primaryimage\"},\"image\":{\"@id\":\"https:\/\/www.datanovia.com\/en\/lessons\/types-of-t-test\/unpaired-t-test\/#primaryimage\"},\"thumbnailUrl\":\"https:\/\/www.datanovia.com\/en\/wp-content\/uploads\/2019\/05\/vanneau.e.peronne..oisillon.jpg\",\"datePublished\":\"2019-12-25T10:24:26+00:00\",\"dateModified\":\"2020-05-22T10:10:53+00:00\",\"description\":\"Describes the unpaired t-test, which is used to compare the mean of two independent groups. 
You will learn the formula, assumptions, calculation, visualization, effect size measure using the Cohen's d, interpretation and reporting in R. The Student's t-test and the Welch t-test are described.\",\"breadcrumb\":{\"@id\":\"https:\/\/www.datanovia.com\/en\/lessons\/types-of-t-test\/unpaired-t-test\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/www.datanovia.com\/en\/lessons\/types-of-t-test\/unpaired-t-test\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/www.datanovia.com\/en\/lessons\/types-of-t-test\/unpaired-t-test\/#primaryimage\",\"url\":\"https:\/\/www.datanovia.com\/en\/wp-content\/uploads\/2019\/05\/vanneau.e.peronne..oisillon.jpg\",\"contentUrl\":\"https:\/\/www.datanovia.com\/en\/wp-content\/uploads\/2019\/05\/vanneau.e.peronne..oisillon.jpg\",\"width\":1024,\"height\":512},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/www.datanovia.com\/en\/lessons\/types-of-t-test\/unpaired-t-test\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\/\/www.datanovia.com\/en\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Lessons\",\"item\":\"https:\/\/www.datanovia.com\/en\/lessons\/\"},{\"@type\":\"ListItem\",\"position\":3,\"name\":\"Types of T-Test\",\"item\":\"https:\/\/www.datanovia.com\/en\/lessons\/types-of-t-test\/\"},{\"@type\":\"ListItem\",\"position\":4,\"name\":\"Unpaired T-Test\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/www.datanovia.com\/en\/#website\",\"url\":\"https:\/\/www.datanovia.com\/en\/\",\"name\":\"Datanovia\",\"description\":\"Data Mining and Statistics for Decision 
Support\",\"publisher\":{\"@id\":\"https:\/\/www.datanovia.com\/en\/#organization\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/www.datanovia.com\/en\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"},{\"@type\":\"Organization\",\"@id\":\"https:\/\/www.datanovia.com\/en\/#organization\",\"name\":\"Datanovia\",\"url\":\"https:\/\/www.datanovia.com\/en\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/www.datanovia.com\/en\/#\/schema\/logo\/image\/\",\"url\":\"https:\/\/www.datanovia.com\/en\/wp-content\/uploads\/2018\/09\/datanovia-logo.png\",\"contentUrl\":\"https:\/\/www.datanovia.com\/en\/wp-content\/uploads\/2018\/09\/datanovia-logo.png\",\"width\":98,\"height\":99,\"caption\":\"Datanovia\"},\"image\":{\"@id\":\"https:\/\/www.datanovia.com\/en\/#\/schema\/logo\/image\/\"}}]}<\/script>\n<!-- \/ Yoast SEO plugin. -->","yoast_head_json":{"title":"Unpaired T-Test : Excellent Complete Reference You Will Love - Datanovia","description":"Describes the unpaired t-test, which is used to compare the mean of two independent groups. You will learn the formula, assumptions, calculation, visualization, effect size measure using the Cohen's d, interpretation and reporting in R. The Student's t-test and the Welch t-test are described.","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/www.datanovia.com\/en\/lessons\/types-of-t-test\/unpaired-t-test\/","og_locale":"en_US","og_type":"article","og_title":"Unpaired T-Test : Excellent Complete Reference You Will Love - Datanovia","og_description":"Describes the unpaired t-test, which is used to compare the mean of two independent groups. 
You will learn the formula, assumptions, calculation, visualization, effect size measure using the Cohen's d, interpretation and reporting in R. The Student's t-test and the Welch t-test are described.","og_url":"https:\/\/www.datanovia.com\/en\/lessons\/types-of-t-test\/unpaired-t-test\/","og_site_name":"Datanovia","article_modified_time":"2020-05-22T10:10:53+00:00","og_image":[{"width":1024,"height":512,"url":"https:\/\/www.datanovia.com\/en\/wp-content\/uploads\/2019\/05\/vanneau.e.peronne..oisillon.jpg","type":"image\/jpeg"}],"twitter_card":"summary_large_image","twitter_misc":{"Est. reading time":"13 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"WebPage","@id":"https:\/\/www.datanovia.com\/en\/lessons\/types-of-t-test\/unpaired-t-test\/","url":"https:\/\/www.datanovia.com\/en\/lessons\/types-of-t-test\/unpaired-t-test\/","name":"Unpaired T-Test : Excellent Complete Reference You Will Love - Datanovia","isPartOf":{"@id":"https:\/\/www.datanovia.com\/en\/#website"},"primaryImageOfPage":{"@id":"https:\/\/www.datanovia.com\/en\/lessons\/types-of-t-test\/unpaired-t-test\/#primaryimage"},"image":{"@id":"https:\/\/www.datanovia.com\/en\/lessons\/types-of-t-test\/unpaired-t-test\/#primaryimage"},"thumbnailUrl":"https:\/\/www.datanovia.com\/en\/wp-content\/uploads\/2019\/05\/vanneau.e.peronne..oisillon.jpg","datePublished":"2019-12-25T10:24:26+00:00","dateModified":"2020-05-22T10:10:53+00:00","description":"Describes the unpaired t-test, which is used to compare the mean of two independent groups. You will learn the formula, assumptions, calculation, visualization, effect size measure using the Cohen's d, interpretation and reporting in R. 
The Student's t-test and the Welch t-test are described.","breadcrumb":{"@id":"https:\/\/www.datanovia.com\/en\/lessons\/types-of-t-test\/unpaired-t-test\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/www.datanovia.com\/en\/lessons\/types-of-t-test\/unpaired-t-test\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.datanovia.com\/en\/lessons\/types-of-t-test\/unpaired-t-test\/#primaryimage","url":"https:\/\/www.datanovia.com\/en\/wp-content\/uploads\/2019\/05\/vanneau.e.peronne..oisillon.jpg","contentUrl":"https:\/\/www.datanovia.com\/en\/wp-content\/uploads\/2019\/05\/vanneau.e.peronne..oisillon.jpg","width":1024,"height":512},{"@type":"BreadcrumbList","@id":"https:\/\/www.datanovia.com\/en\/lessons\/types-of-t-test\/unpaired-t-test\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/www.datanovia.com\/en\/"},{"@type":"ListItem","position":2,"name":"Lessons","item":"https:\/\/www.datanovia.com\/en\/lessons\/"},{"@type":"ListItem","position":3,"name":"Types of T-Test","item":"https:\/\/www.datanovia.com\/en\/lessons\/types-of-t-test\/"},{"@type":"ListItem","position":4,"name":"Unpaired T-Test"}]},{"@type":"WebSite","@id":"https:\/\/www.datanovia.com\/en\/#website","url":"https:\/\/www.datanovia.com\/en\/","name":"Datanovia","description":"Data Mining and Statistics for Decision 
Support","publisher":{"@id":"https:\/\/www.datanovia.com\/en\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/www.datanovia.com\/en\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Organization","@id":"https:\/\/www.datanovia.com\/en\/#organization","name":"Datanovia","url":"https:\/\/www.datanovia.com\/en\/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.datanovia.com\/en\/#\/schema\/logo\/image\/","url":"https:\/\/www.datanovia.com\/en\/wp-content\/uploads\/2018\/09\/datanovia-logo.png","contentUrl":"https:\/\/www.datanovia.com\/en\/wp-content\/uploads\/2018\/09\/datanovia-logo.png","width":98,"height":99,"caption":"Datanovia"},"image":{"@id":"https:\/\/www.datanovia.com\/en\/#\/schema\/logo\/image\/"}}]}},"multi-rating":{"mr_rating_results":[]},"_links":{"self":[{"href":"https:\/\/www.datanovia.com\/en\/wp-json\/wp\/v2\/dt_lessons\/11650","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.datanovia.com\/en\/wp-json\/wp\/v2\/dt_lessons"}],"about":[{"href":"https:\/\/www.datanovia.com\/en\/wp-json\/wp\/v2\/types\/dt_lessons"}],"author":[{"embeddable":true,"href":"https:\/\/www.datanovia.com\/en\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.datanovia.com\/en\/wp-json\/wp\/v2\/comments?post=11650"}],"version-history":[{"count":2,"href":"https:\/\/www.datanovia.com\/en\/wp-json\/wp\/v2\/dt_lessons\/11650\/revisions"}],"predecessor-version":[{"id":16361,"href":"https:\/\/www.datanovia.com\/en\/wp-json\/wp\/v2\/dt_lessons\/11650\/revisions\/16361"}],"up":[{"embeddable":true,"href":"https:\/\/www.datanovia.com\/en\/wp-json\/wp\/v2\/dt_lessons\/11648"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.datanovia.com\/en\/wp-json\/wp\/v2\/media\/8915"}],"wp:attachment":[{"href":"https:\/\/www.datanov
ia.com\/en\/wp-json\/wp\/v2\/media?parent=11650"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}