{"id":8315,"date":"2018-12-30T14:44:49","date_gmt":"2018-12-30T12:44:49","guid":{"rendered":"https:\/\/www.datanovia.com\/en\/?post_type=dt_lessons&#038;p=8315"},"modified":"2019-11-17T21:43:21","modified_gmt":"2019-11-17T19:43:21","slug":"ggplot-boxplot","status":"publish","type":"dt_lessons","link":"https:\/\/www.datanovia.com\/en\/lessons\/ggplot-boxplot\/","title":{"rendered":"GGPlot Boxplot"},"content":{"rendered":"<div id=\"rdoc\">\n<p><strong>Boxplots<\/strong> (or <strong>box plots<\/strong>) are used to visualize the distribution of a grouped continuous variable through its quartiles.<\/p>\n<p>Box plots take up less space than histograms and density plots, which is useful when comparing distributions across many groups.<\/p>\n<p>Visualizing data using boxplots makes it possible to:<\/p>\n<ul>\n<li>Inspect the key values of the data, including the median and the first and third quartiles (the mean can be overlaid as an extra point)<\/li>\n<li>Identify potential outliers in the data<\/li>\n<li>See whether the data is tightly grouped, symmetrical or skewed<\/li>\n<\/ul>\n<p>This article describes how to create and customize <strong>boxplots<\/strong> using the <strong>ggplot2<\/strong> R package.<\/p>\n<p>Contents:<\/p>\n<div id=\"TOC\">\n<ul>\n<li><a href=\"#key-r-functions\">Key R functions<\/a><\/li>\n<li><a href=\"#data-preparation\">Data preparation<\/a><\/li>\n<li><a href=\"#loading-required-r-package\">Loading required R package<\/a><\/li>\n<li><a href=\"#basic-boxplots\">Basic boxplots<\/a><\/li>\n<li><a href=\"#change-boxplot-colors-by-groups\">Change boxplot colors by groups:<\/a><\/li>\n<li><a href=\"#create-a-boxplot-with-multiple-groups\">Create a boxplot with multiple groups<\/a><\/li>\n<li><a href=\"#multiple-panel-boxplots\">Multiple panel boxplots<\/a><\/li>\n<li><a href=\"#conclusion\">Conclusion<\/a><\/li>\n<\/ul>\n<\/div>\n<div class='dt-sc-hr-invisible-medium  '><\/div>\n<div id=\"key-r-functions\" class=\"section level2\">\n<h2>Key R functions<\/h2>\n<ul>\n<li>Key R function: <code>geom_boxplot()<\/code> [ggplot2 package]<\/li>\n<li>Key arguments to customize the plot:\n<ul>\n<li><code>width<\/code>: the width of the box plot<\/li>\n<li><code>notch<\/code>: logical. If TRUE, creates a <strong>notched boxplot<\/strong>. The notch displays a confidence interval around the median, normally computed as <code>median +\/- 1.58*IQR\/sqrt(n)<\/code>. Notches are used to compare groups; if the notches of two boxes do not overlap, this is strong evidence that the medians differ.<\/li>\n<li><code>color<\/code>, <code>size<\/code>, <code>linetype<\/code>: border line color, size and type (in ggplot2 3.4.0 and later, line width is set with <code>linewidth<\/code> rather than <code>size<\/code>)<\/li>\n<li><code>fill<\/code>: fill color of the box<\/li>\n<li><code>outlier.colour<\/code>, <code>outlier.shape<\/code>, <code>outlier.size<\/code>: the color, shape and size of outlying points<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<\/div>\n<div id=\"data-preparation\" class=\"section level2\">\n<h2>Data preparation<\/h2>\n<ul>\n<li>Demo dataset: <code>ToothGrowth<\/code>\n<ul>\n<li>Continuous variable: <code>len<\/code> (tooth length). Used on the y-axis.<\/li>\n<li>Grouping variable: <code>dose<\/code> (dose levels of vitamin C: 0.5, 1, and 2 mg\/day). 
Used on the x-axis.<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<p>First, convert the variable <code>dose<\/code> from a numeric variable to a factor (discrete) variable:<\/p>\n<pre class=\"r\"><code>data(\"ToothGrowth\")\r\nToothGrowth$dose &lt;- as.factor(ToothGrowth$dose)\r\nhead(ToothGrowth, 4)<\/code><\/pre>\n<pre><code>##    len supp dose\r\n## 1  4.2   VC  0.5\r\n## 2 11.5   VC  0.5\r\n## 3  7.3   VC  0.5\r\n## 4  5.8   VC  0.5<\/code><\/pre>\n<\/div>\n<div id=\"loading-required-r-package\" class=\"section level2\">\n<h2>Loading required R package<\/h2>\n<p>Load the ggplot2 package and set the default theme to <code>theme_classic()<\/code> with the legend at the top of the plot:<\/p>\n<pre class=\"r\"><code>library(ggplot2)\r\ntheme_set(\r\n  theme_classic() +\r\n    theme(legend.position = \"top\")\r\n  )<\/code><\/pre>\n<\/div>\n<div id=\"basic-boxplots\" class=\"section level2\">\n<h2>Basic boxplots<\/h2>\n<p>We start by initializing a plot object named <code>e<\/code>, then we\u2019ll add layers:<\/p>\n<pre class=\"r\"><code># Default plot\r\ne &lt;- ggplot(ToothGrowth, aes(x = dose, y = len))\r\ne + geom_boxplot()\r\n\r\n# Notched box plot with mean points\r\n# (ggplot2 3.3.0 and later use fun; older versions used fun.y)\r\ne + geom_boxplot(notch = TRUE, fill = \"lightgray\") +\r\n  stat_summary(fun = mean, geom = \"point\",\r\n               shape = 18, size = 2.5, color = \"#FC4E07\")<\/code><\/pre>\n<p><img decoding=\"async\" src=\"https:\/\/www.datanovia.com\/en\/wp-content\/uploads\/dn-tutorials\/ggplot2\/figures\/005-ggplot-boxplot-geom_boxplot-create-basic-boxplots-1.png\" width=\"240\" \/><img decoding=\"async\" src=\"https:\/\/www.datanovia.com\/en\/wp-content\/uploads\/dn-tutorials\/ggplot2\/figures\/005-ggplot-boxplot-geom_boxplot-create-basic-boxplots-2.png\" width=\"240\" \/><\/p>\n<p>Note that you can use the function <code>scale_x_discrete()<\/code> to:<\/p>\n<ul>\n<li>choose which items to display: for example c(\u201c0.5\u201d, \u201c2\u201d),<\/li>\n<li>change the order of items: for example from c(\u201c0.5\u201d, 
\u201c1\u201d, \u201c2\u201d) to c(\u201c2\u201d, \u201c0.5\u201d, \u201c1\u201d)<\/li>\n<\/ul>\n<p>For example, type this:<\/p>\n<pre class=\"r\"><code># Choose which items to display: group \"0.5\" and \"2\"\r\ne + geom_boxplot() + \r\n  scale_x_discrete(limits=c(\"0.5\", \"2\"))\r\n\r\n# Change the default order of items\r\ne + geom_boxplot() +\r\n  scale_x_discrete(limits=c(\"2\", \"0.5\", \"1\"))<\/code><\/pre>\n<p><img decoding=\"async\" src=\"https:\/\/www.datanovia.com\/en\/wp-content\/uploads\/dn-tutorials\/ggplot2\/figures\/005-ggplot-boxplot-scale_x_discre_boxplot-change-group-order-1.png\" width=\"192\" \/><img decoding=\"async\" src=\"https:\/\/www.datanovia.com\/en\/wp-content\/uploads\/dn-tutorials\/ggplot2\/figures\/005-ggplot-boxplot-scale_x_discre_boxplot-change-group-order-2.png\" width=\"192\" \/><\/p>\n<\/div>\n<div id=\"change-boxplot-colors-by-groups\" class=\"section level2\">\n<h2>Change boxplot colors by groups:<\/h2>\n<p>The following R code will change the boxplot line and fill color. 
The functions <code>scale_color_manual()<\/code> and <code>scale_fill_manual()<\/code> are used to specify custom colors for each group.<\/p>\n<pre class=\"r\"><code># Color by group (dose)\r\ne + geom_boxplot(aes(color = dose)) +\r\n  scale_color_manual(values = c(\"#00AFBB\", \"#E7B800\", \"#FC4E07\"))\r\n\r\n# Change fill color by group (dose)\r\ne + geom_boxplot(aes(fill = dose)) +\r\n  scale_fill_manual(values = c(\"#00AFBB\", \"#E7B800\", \"#FC4E07\"))<\/code><\/pre>\n<p><img decoding=\"async\" src=\"https:\/\/www.datanovia.com\/en\/wp-content\/uploads\/dn-tutorials\/ggplot2\/figures\/005-ggplot-boxplot-geom_boxplot-color-by-groups-1.png\" width=\"288\" \/><img decoding=\"async\" src=\"https:\/\/www.datanovia.com\/en\/wp-content\/uploads\/dn-tutorials\/ggplot2\/figures\/005-ggplot-boxplot-geom_boxplot-color-by-groups-2.png\" width=\"288\" \/><\/p>\n<\/div>\n<div id=\"create-a-boxplot-with-multiple-groups\" class=\"section level2\">\n<h2>Create a boxplot with multiple groups<\/h2>\n<p>Two different grouping variables are used: <code>dose<\/code> on the x-axis and <code>supp<\/code> as the fill color (legend variable).<\/p>\n<p>The space between the grouped box plots is adjusted using the function <code>position_dodge()<\/code>.<\/p>\n<pre class=\"r\"><code>e2 &lt;- e +\r\n  geom_boxplot(aes(fill = supp), position = position_dodge(0.9)) +\r\n  scale_fill_manual(values = c(\"#999999\", \"#E69F00\"))\r\ne2<\/code><\/pre>\n<p><img decoding=\"async\" src=\"https:\/\/www.datanovia.com\/en\/wp-content\/uploads\/dn-tutorials\/ggplot2\/figures\/005-ggplot-boxplot-boxplot-multiple-groups-1.png\" width=\"288\" \/><\/p>\n<\/div>\n<div id=\"multiple-panel-boxplots\" class=\"section level2\">\n<h2>Multiple panel boxplots<\/h2>\n<p>You can split the plot into multiple panels using the function <code>facet_wrap()<\/code>:<\/p>\n<pre class=\"r\"><code>e2 + facet_wrap(~supp)<\/code><\/pre>\n<p><img decoding=\"async\" 
src=\"https:\/\/www.datanovia.com\/en\/wp-content\/uploads\/dn-tutorials\/ggplot2\/figures\/005-ggplot-boxplot-multiple-panel-boxplot-1.png\" width=\"576\" \/><\/p>\n<\/div>\n<div id=\"conclusion\" class=\"section level2\">\n<h2>Conclusion<\/h2>\n<p>This article describes how to create a boxplot using the ggplot2 package.<\/p>\n<\/div>\n<\/div>\n<p><!--end rdoc--><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Boxplots are used to visualize the distribution of a grouped continuous variable through their quartiles. You will learn how to create and customize boxplots using the ggplot2 R package.<\/p>\n","protected":false},"author":1,"featured_media":7879,"parent":0,"menu_order":4,"comment_status":"open","ping_status":"closed","template":"","class_list":["post-8315","dt_lessons","type-dt_lessons","status-publish","has-post-thumbnail","hentry"],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v25.2 - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>GGPlot Boxplot Best Reference - Datanovia<\/title>\n<meta name=\"description\" content=\"Boxplots are used to visualize the distribution of a grouped continuous variable through their quartiles. You will learn how to create and customize boxplots using the ggplot2 R package.\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/www.datanovia.com\/en\/lessons\/ggplot-boxplot\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"GGPlot Boxplot Best Reference - Datanovia\" \/>\n<meta property=\"og:description\" content=\"Boxplots are used to visualize the distribution of a grouped continuous variable through their quartiles. 
You will learn how to create and customize boxplots using the ggplot2 R package.\" \/>\n<meta property=\"og:url\" content=\"https:\/\/www.datanovia.com\/en\/lessons\/ggplot-boxplot\/\" \/>\n<meta property=\"og:site_name\" content=\"Datanovia\" \/>\n<meta property=\"article:modified_time\" content=\"2019-11-17T19:43:21+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/www.datanovia.com\/en\/wp-content\/uploads\/2018\/10\/IMG_7119.jpg\" \/>\n\t<meta property=\"og:image:width\" content=\"1024\" \/>\n\t<meta property=\"og:image:height\" content=\"512\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/jpeg\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:label1\" content=\"Est. reading time\" \/>\n\t<meta name=\"twitter:data1\" content=\"3 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"WebPage\",\"@id\":\"https:\/\/www.datanovia.com\/en\/lessons\/ggplot-boxplot\/\",\"url\":\"https:\/\/www.datanovia.com\/en\/lessons\/ggplot-boxplot\/\",\"name\":\"GGPlot Boxplot Best Reference - Datanovia\",\"isPartOf\":{\"@id\":\"https:\/\/www.datanovia.com\/en\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\/\/www.datanovia.com\/en\/lessons\/ggplot-boxplot\/#primaryimage\"},\"image\":{\"@id\":\"https:\/\/www.datanovia.com\/en\/lessons\/ggplot-boxplot\/#primaryimage\"},\"thumbnailUrl\":\"https:\/\/www.datanovia.com\/en\/wp-content\/uploads\/2018\/10\/IMG_7119.jpg\",\"datePublished\":\"2018-12-30T12:44:49+00:00\",\"dateModified\":\"2019-11-17T19:43:21+00:00\",\"description\":\"Boxplots are used to visualize the distribution of a grouped continuous variable through their quartiles. 
You will learn how to create and customize boxplots using the ggplot2 R package.\",\"breadcrumb\":{\"@id\":\"https:\/\/www.datanovia.com\/en\/lessons\/ggplot-boxplot\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/www.datanovia.com\/en\/lessons\/ggplot-boxplot\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/www.datanovia.com\/en\/lessons\/ggplot-boxplot\/#primaryimage\",\"url\":\"https:\/\/www.datanovia.com\/en\/wp-content\/uploads\/2018\/10\/IMG_7119.jpg\",\"contentUrl\":\"https:\/\/www.datanovia.com\/en\/wp-content\/uploads\/2018\/10\/IMG_7119.jpg\",\"width\":1024,\"height\":512},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/www.datanovia.com\/en\/lessons\/ggplot-boxplot\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\/\/www.datanovia.com\/en\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Lessons\",\"item\":\"https:\/\/www.datanovia.com\/en\/lessons\/\"},{\"@type\":\"ListItem\",\"position\":3,\"name\":\"GGPlot Boxplot\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/www.datanovia.com\/en\/#website\",\"url\":\"https:\/\/www.datanovia.com\/en\/\",\"name\":\"Datanovia\",\"description\":\"Data Mining and Statistics for Decision 
Support\",\"publisher\":{\"@id\":\"https:\/\/www.datanovia.com\/en\/#organization\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/www.datanovia.com\/en\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"},{\"@type\":\"Organization\",\"@id\":\"https:\/\/www.datanovia.com\/en\/#organization\",\"name\":\"Datanovia\",\"url\":\"https:\/\/www.datanovia.com\/en\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/www.datanovia.com\/en\/#\/schema\/logo\/image\/\",\"url\":\"https:\/\/www.datanovia.com\/en\/wp-content\/uploads\/2018\/09\/datanovia-logo.png\",\"contentUrl\":\"https:\/\/www.datanovia.com\/en\/wp-content\/uploads\/2018\/09\/datanovia-logo.png\",\"width\":98,\"height\":99,\"caption\":\"Datanovia\"},\"image\":{\"@id\":\"https:\/\/www.datanovia.com\/en\/#\/schema\/logo\/image\/\"}}]}<\/script>\n<!-- \/ Yoast SEO plugin. -->","yoast_head_json":{"title":"GGPlot Boxplot Best Reference - Datanovia","description":"Boxplots are used to visualize the distribution of a grouped continuous variable through their quartiles. You will learn how to create and customize boxplots using the ggplot2 R package.","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/www.datanovia.com\/en\/lessons\/ggplot-boxplot\/","og_locale":"en_US","og_type":"article","og_title":"GGPlot Boxplot Best Reference - Datanovia","og_description":"Boxplots are used to visualize the distribution of a grouped continuous variable through their quartiles. 
You will learn how to create and customize boxplots using the ggplot2 R package.","og_url":"https:\/\/www.datanovia.com\/en\/lessons\/ggplot-boxplot\/","og_site_name":"Datanovia","article_modified_time":"2019-11-17T19:43:21+00:00","og_image":[{"width":1024,"height":512,"url":"https:\/\/www.datanovia.com\/en\/wp-content\/uploads\/2018\/10\/IMG_7119.jpg","type":"image\/jpeg"}],"twitter_card":"summary_large_image","twitter_misc":{"Est. reading time":"3 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"WebPage","@id":"https:\/\/www.datanovia.com\/en\/lessons\/ggplot-boxplot\/","url":"https:\/\/www.datanovia.com\/en\/lessons\/ggplot-boxplot\/","name":"GGPlot Boxplot Best Reference - Datanovia","isPartOf":{"@id":"https:\/\/www.datanovia.com\/en\/#website"},"primaryImageOfPage":{"@id":"https:\/\/www.datanovia.com\/en\/lessons\/ggplot-boxplot\/#primaryimage"},"image":{"@id":"https:\/\/www.datanovia.com\/en\/lessons\/ggplot-boxplot\/#primaryimage"},"thumbnailUrl":"https:\/\/www.datanovia.com\/en\/wp-content\/uploads\/2018\/10\/IMG_7119.jpg","datePublished":"2018-12-30T12:44:49+00:00","dateModified":"2019-11-17T19:43:21+00:00","description":"Boxplots are used to visualize the distribution of a grouped continuous variable through their quartiles. 
You will learn how to create and customize boxplots using the ggplot2 R package.","breadcrumb":{"@id":"https:\/\/www.datanovia.com\/en\/lessons\/ggplot-boxplot\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/www.datanovia.com\/en\/lessons\/ggplot-boxplot\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.datanovia.com\/en\/lessons\/ggplot-boxplot\/#primaryimage","url":"https:\/\/www.datanovia.com\/en\/wp-content\/uploads\/2018\/10\/IMG_7119.jpg","contentUrl":"https:\/\/www.datanovia.com\/en\/wp-content\/uploads\/2018\/10\/IMG_7119.jpg","width":1024,"height":512},{"@type":"BreadcrumbList","@id":"https:\/\/www.datanovia.com\/en\/lessons\/ggplot-boxplot\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/www.datanovia.com\/en\/"},{"@type":"ListItem","position":2,"name":"Lessons","item":"https:\/\/www.datanovia.com\/en\/lessons\/"},{"@type":"ListItem","position":3,"name":"GGPlot Boxplot"}]},{"@type":"WebSite","@id":"https:\/\/www.datanovia.com\/en\/#website","url":"https:\/\/www.datanovia.com\/en\/","name":"Datanovia","description":"Data Mining and Statistics for Decision 
Support","publisher":{"@id":"https:\/\/www.datanovia.com\/en\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/www.datanovia.com\/en\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Organization","@id":"https:\/\/www.datanovia.com\/en\/#organization","name":"Datanovia","url":"https:\/\/www.datanovia.com\/en\/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.datanovia.com\/en\/#\/schema\/logo\/image\/","url":"https:\/\/www.datanovia.com\/en\/wp-content\/uploads\/2018\/09\/datanovia-logo.png","contentUrl":"https:\/\/www.datanovia.com\/en\/wp-content\/uploads\/2018\/09\/datanovia-logo.png","width":98,"height":99,"caption":"Datanovia"},"image":{"@id":"https:\/\/www.datanovia.com\/en\/#\/schema\/logo\/image\/"}}]}},"multi-rating":{"mr_rating_results":[]},"_links":{"self":[{"href":"https:\/\/www.datanovia.com\/en\/wp-json\/wp\/v2\/dt_lessons\/8315","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.datanovia.com\/en\/wp-json\/wp\/v2\/dt_lessons"}],"about":[{"href":"https:\/\/www.datanovia.com\/en\/wp-json\/wp\/v2\/types\/dt_lessons"}],"author":[{"embeddable":true,"href":"https:\/\/www.datanovia.com\/en\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.datanovia.com\/en\/wp-json\/wp\/v2\/comments?post=8315"}],"version-history":[{"count":0,"href":"https:\/\/www.datanovia.com\/en\/wp-json\/wp\/v2\/dt_lessons\/8315\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.datanovia.com\/en\/wp-json\/wp\/v2\/media\/7879"}],"wp:attachment":[{"href":"https:\/\/www.datanovia.com\/en\/wp-json\/wp\/v2\/media?parent=8315"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}