{"id":8078,"date":"2018-10-25T02:04:02","date_gmt":"2018-10-25T00:04:02","guid":{"rendered":"https:\/\/www.datanovia.com\/en\/?post_type=dt_lessons&#038;p=8078"},"modified":"2018-11-05T08:24:22","modified_gmt":"2018-11-05T06:24:22","slug":"hierarchical-k-means-clustering-optimize-clusters","status":"publish","type":"dt_lessons","link":"https:\/\/www.datanovia.com\/en\/lessons\/hierarchical-k-means-clustering-optimize-clusters\/","title":{"rendered":"Hierarchical K-Means Clustering: Optimize Clusters"},"content":{"rendered":"<div id=\"rdoc\">\n<p><a href=\"https:\/\/www.datanovia.com\/en\/lessons\/k-means-clustering-in-r-algorith-and-practical-examples\/\">K-means<\/a> is one of the most popular clustering algorithms. However, it has some limitations: it requires the user to specify the number of clusters in advance, and it selects the initial centroids randomly. The final k-means clustering solution is very sensitive to this random selection of initial cluster centers, so the result might be (slightly) different each time you compute k-means.<\/p>\n<div class=\"block\">\n<p>In this chapter, we describe a hybrid method, named <strong>hierarchical k-means clustering<\/strong> (hkmeans), for improving k-means results.<\/p>\n<\/div>\n<p>Contents:<\/p>\n<div id=\"TOC\">\n<ul>\n<li><a href=\"#algorithm\">Algorithm<\/a><\/li>\n<li><a href=\"#r-code\">R code<\/a><\/li>\n<li><a href=\"#summary\">Summary<\/a><\/li>\n<\/ul>\n<\/div>\n<div class='dt-sc-hr-invisible-medium  '><\/div>\n<div class='dt-sc-ico-content type1'><div class='custom-icon' ><a href='https:\/\/www.datanovia.com\/en\/product\/practical-guide-to-cluster-analysis-in-r\/' target='_blank'><span class='fa fa-book'><\/span><\/a><\/div><h4><a href='https:\/\/www.datanovia.com\/en\/product\/practical-guide-to-cluster-analysis-in-r\/' target='_blank'> Related Book <\/a><\/h4>Practical Guide to Cluster Analysis in R<\/div>\n<div class='dt-sc-hr-invisible-medium  '><\/div>\n<div id=\"algorithm\" class=\"section 
level2\">\n<h2>Algorithm<\/h2>\n<p>The algorithm is summarized as follows:<\/p>\n<ol style=\"list-style-type: decimal;\">\n<li>Compute hierarchical clustering and cut the tree into k clusters<\/li>\n<li>Compute the center (i.e., the mean) of each cluster<\/li>\n<li>Compute k-means by using the set of cluster centers (defined in step 2) as the initial cluster centers<\/li>\n<\/ol>\n<div class=\"notice\">\n<p>Note that the k-means algorithm will improve the initial partitioning generated at step 1 of the algorithm. Hence, the initial partitioning can be slightly different from the final partitioning obtained in step 3.<\/p>\n<\/div>\n<\/div>\n<div id=\"r-code\" class=\"section level2\">\n<h2>R code<\/h2>\n<p>The R function <em>hkmeans<\/em>() [in the <em>factoextra<\/em> package] provides an easy way to compute hierarchical k-means clustering. The format of the result is similar to the one returned by the standard kmeans() function (see the chapter on k-means clustering).<\/p>\n<p>To install factoextra, type this: <em>install.packages(\"factoextra\")<\/em>.<\/p>\n<p>We\u2019ll use the USArrests data set, and we start by standardizing the data:<\/p>\n<pre class=\"r\"><code># Standardize the variables\r\ndf &lt;- scale(USArrests)<\/code><\/pre>\n<pre class=\"r\"><code># Compute hierarchical k-means clustering\r\nlibrary(factoextra)\r\nres.hk &lt;- hkmeans(df, 4)\r\n# Elements returned by hkmeans()\r\nnames(res.hk)<\/code><\/pre>\n<pre><code>##  [1] \"cluster\"      \"centers\"      \"totss\"        \"withinss\"    \r\n##  [5] \"tot.withinss\" \"betweenss\"    \"size\"         \"iter\"        \r\n##  [9] \"ifault\"       \"data\"         \"hclust\"<\/code><\/pre>\n<p>To print all the results, type this:<\/p>\n<pre class=\"r\"><code># Print the results\r\nres.hk<\/code><\/pre>\n<pre class=\"r\"><code># Visualize the tree\r\nfviz_dend(res.hk, cex = 0.6, palette = \"jco\",\r\n          rect = TRUE, rect_border = \"jco\", rect_fill = TRUE)<\/code><\/pre>\n<p><img decoding=\"async\" 
src=\"https:\/\/www.datanovia.com\/en\/wp-content\/uploads\/dn-tutorials\/005-advanced-clustering\/figures\/020-hierarchical-k-means-clustering-hierarchical-k-means-clustering-1.png\" width=\"518.4\" \/><\/p>\n<pre class=\"r\"><code># Visualize the hkmeans final clusters\r\nfviz_cluster(res.hk, palette = \"jco\", repel = TRUE,\r\n             ggtheme = theme_classic())<\/code><\/pre>\n<p><img decoding=\"async\" src=\"https:\/\/www.datanovia.com\/en\/wp-content\/uploads\/dn-tutorials\/005-advanced-clustering\/figures\/020-hierarchical-k-means-clustering-hierarchical-k-means-clustering-2.png\" width=\"518.4\" \/><\/p>\n<\/div>\n<div id=\"summary\" class=\"section level2\">\n<h2>Summary<\/h2>\n<p>We described the hybrid <strong>hierarchical k-means clustering<\/strong> method, which improves k-means results by using the cluster centers obtained from hierarchical clustering as the initial centers for k-means.<\/p>\n<\/div>\n<\/div>\n<p><!--end rdoc--><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Hierarchical k-means clustering is a hybrid approach for improving k-means results. In this article, you will learn how to compute hierarchical k-means clustering in R.<\/p>\n","protected":false},"author":1,"featured_media":7884,"parent":0,"menu_order":0,"comment_status":"open","ping_status":"closed","template":"","class_list":["post-8078","dt_lessons","type-dt_lessons","status-publish","has-post-thumbnail","hentry"],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v25.2 - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>Hierarchical K-Means Clustering: Optimize Clusters - Datanovia<\/title>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/www.datanovia.com\/en\/lessons\/hierarchical-k-means-clustering-optimize-clusters\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Hierarchical K-Means Clustering: Optimize Clusters - Datanovia\" \/>\n<meta 
property=\"og:description\" content=\"The hierarchical k-means clustering is an hybrid approach for improving k-means results. In this article, you will learn how to compute hierarchical k-means clustering in R\" \/>\n<meta property=\"og:url\" content=\"https:\/\/www.datanovia.com\/en\/lessons\/hierarchical-k-means-clustering-optimize-clusters\/\" \/>\n<meta property=\"og:site_name\" content=\"Datanovia\" \/>\n<meta property=\"article:modified_time\" content=\"2018-11-05T06:24:22+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/www.datanovia.com\/en\/wp-content\/uploads\/2018\/10\/IMG_4989-1.jpg\" \/>\n\t<meta property=\"og:image:width\" content=\"1024\" \/>\n\t<meta property=\"og:image:height\" content=\"512\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/jpeg\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:label1\" content=\"Est. reading time\" \/>\n\t<meta name=\"twitter:data1\" content=\"2 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"WebPage\",\"@id\":\"https:\/\/www.datanovia.com\/en\/lessons\/hierarchical-k-means-clustering-optimize-clusters\/\",\"url\":\"https:\/\/www.datanovia.com\/en\/lessons\/hierarchical-k-means-clustering-optimize-clusters\/\",\"name\":\"Hierarchical K-Means Clustering: Optimize Clusters - 
Datanovia\",\"isPartOf\":{\"@id\":\"https:\/\/www.datanovia.com\/en\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\/\/www.datanovia.com\/en\/lessons\/hierarchical-k-means-clustering-optimize-clusters\/#primaryimage\"},\"image\":{\"@id\":\"https:\/\/www.datanovia.com\/en\/lessons\/hierarchical-k-means-clustering-optimize-clusters\/#primaryimage\"},\"thumbnailUrl\":\"https:\/\/www.datanovia.com\/en\/wp-content\/uploads\/2018\/10\/IMG_4989-1.jpg\",\"datePublished\":\"2018-10-25T00:04:02+00:00\",\"dateModified\":\"2018-11-05T06:24:22+00:00\",\"breadcrumb\":{\"@id\":\"https:\/\/www.datanovia.com\/en\/lessons\/hierarchical-k-means-clustering-optimize-clusters\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/www.datanovia.com\/en\/lessons\/hierarchical-k-means-clustering-optimize-clusters\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/www.datanovia.com\/en\/lessons\/hierarchical-k-means-clustering-optimize-clusters\/#primaryimage\",\"url\":\"https:\/\/www.datanovia.com\/en\/wp-content\/uploads\/2018\/10\/IMG_4989-1.jpg\",\"contentUrl\":\"https:\/\/www.datanovia.com\/en\/wp-content\/uploads\/2018\/10\/IMG_4989-1.jpg\",\"width\":1024,\"height\":512},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/www.datanovia.com\/en\/lessons\/hierarchical-k-means-clustering-optimize-clusters\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\/\/www.datanovia.com\/en\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Lessons\",\"item\":\"https:\/\/www.datanovia.com\/en\/lessons\/\"},{\"@type\":\"ListItem\",\"position\":3,\"name\":\"Hierarchical K-Means Clustering: Optimize Clusters\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/www.datanovia.com\/en\/#website\",\"url\":\"https:\/\/www.datanovia.com\/en\/\",\"name\":\"Datanovia\",\"description\":\"Data Mining and Statistics for Decision 
Support\",\"publisher\":{\"@id\":\"https:\/\/www.datanovia.com\/en\/#organization\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/www.datanovia.com\/en\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"},{\"@type\":\"Organization\",\"@id\":\"https:\/\/www.datanovia.com\/en\/#organization\",\"name\":\"Datanovia\",\"url\":\"https:\/\/www.datanovia.com\/en\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/www.datanovia.com\/en\/#\/schema\/logo\/image\/\",\"url\":\"https:\/\/www.datanovia.com\/en\/wp-content\/uploads\/2018\/09\/datanovia-logo.png\",\"contentUrl\":\"https:\/\/www.datanovia.com\/en\/wp-content\/uploads\/2018\/09\/datanovia-logo.png\",\"width\":98,\"height\":99,\"caption\":\"Datanovia\"},\"image\":{\"@id\":\"https:\/\/www.datanovia.com\/en\/#\/schema\/logo\/image\/\"}}]}<\/script>\n<!-- \/ Yoast SEO plugin. -->","yoast_head_json":{"title":"Hierarchical K-Means Clustering: Optimize Clusters - Datanovia","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/www.datanovia.com\/en\/lessons\/hierarchical-k-means-clustering-optimize-clusters\/","og_locale":"en_US","og_type":"article","og_title":"Hierarchical K-Means Clustering: Optimize Clusters - Datanovia","og_description":"The hierarchical k-means clustering is an hybrid approach for improving k-means results. 
In this article, you will learn how to compute hierarchical k-means clustering in R","og_url":"https:\/\/www.datanovia.com\/en\/lessons\/hierarchical-k-means-clustering-optimize-clusters\/","og_site_name":"Datanovia","article_modified_time":"2018-11-05T06:24:22+00:00","og_image":[{"width":1024,"height":512,"url":"https:\/\/www.datanovia.com\/en\/wp-content\/uploads\/2018\/10\/IMG_4989-1.jpg","type":"image\/jpeg"}],"twitter_card":"summary_large_image","twitter_misc":{"Est. reading time":"2 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"WebPage","@id":"https:\/\/www.datanovia.com\/en\/lessons\/hierarchical-k-means-clustering-optimize-clusters\/","url":"https:\/\/www.datanovia.com\/en\/lessons\/hierarchical-k-means-clustering-optimize-clusters\/","name":"Hierarchical K-Means Clustering: Optimize Clusters - Datanovia","isPartOf":{"@id":"https:\/\/www.datanovia.com\/en\/#website"},"primaryImageOfPage":{"@id":"https:\/\/www.datanovia.com\/en\/lessons\/hierarchical-k-means-clustering-optimize-clusters\/#primaryimage"},"image":{"@id":"https:\/\/www.datanovia.com\/en\/lessons\/hierarchical-k-means-clustering-optimize-clusters\/#primaryimage"},"thumbnailUrl":"https:\/\/www.datanovia.com\/en\/wp-content\/uploads\/2018\/10\/IMG_4989-1.jpg","datePublished":"2018-10-25T00:04:02+00:00","dateModified":"2018-11-05T06:24:22+00:00","breadcrumb":{"@id":"https:\/\/www.datanovia.com\/en\/lessons\/hierarchical-k-means-clustering-optimize-clusters\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/www.datanovia.com\/en\/lessons\/hierarchical-k-means-clustering-optimize-clusters\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.datanovia.com\/en\/lessons\/hierarchical-k-means-clustering-optimize-clusters\/#primaryimage","url":"https:\/\/www.datanovia.com\/en\/wp-content\/uploads\/2018\/10\/IMG_4989-1.jpg","contentUrl":"https:\/\/www.datanovia.com\/en\/wp-content\/uploads\/2018\/10\/IMG_4989-1.jpg
","width":1024,"height":512},{"@type":"BreadcrumbList","@id":"https:\/\/www.datanovia.com\/en\/lessons\/hierarchical-k-means-clustering-optimize-clusters\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/www.datanovia.com\/en\/"},{"@type":"ListItem","position":2,"name":"Lessons","item":"https:\/\/www.datanovia.com\/en\/lessons\/"},{"@type":"ListItem","position":3,"name":"Hierarchical K-Means Clustering: Optimize Clusters"}]},{"@type":"WebSite","@id":"https:\/\/www.datanovia.com\/en\/#website","url":"https:\/\/www.datanovia.com\/en\/","name":"Datanovia","description":"Data Mining and Statistics for Decision Support","publisher":{"@id":"https:\/\/www.datanovia.com\/en\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/www.datanovia.com\/en\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Organization","@id":"https:\/\/www.datanovia.com\/en\/#organization","name":"Datanovia","url":"https:\/\/www.datanovia.com\/en\/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.datanovia.com\/en\/#\/schema\/logo\/image\/","url":"https:\/\/www.datanovia.com\/en\/wp-content\/uploads\/2018\/09\/datanovia-logo.png","contentUrl":"https:\/\/www.datanovia.com\/en\/wp-content\/uploads\/2018\/09\/datanovia-logo.png","width":98,"height":99,"caption":"Datanovia"},"image":{"@id":"https:\/\/www.datanovia.com\/en\/#\/schema\/logo\/image\/"}}]}},"multi-rating":{"mr_rating_results":[]},"_links":{"self":[{"href":"https:\/\/www.datanovia.com\/en\/wp-json\/wp\/v2\/dt_lessons\/8078","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.datanovia.com\/en\/wp-json\/wp\/v2\/dt_lessons"}],"about":[{"href":"https:\/\/www.datanovia.com\/en\/wp-json\/wp\/v2\/types\/dt_lessons"}],"author":[{"embeddable":true,"href":"https:\/\/www.datanovia.com
\/en\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.datanovia.com\/en\/wp-json\/wp\/v2\/comments?post=8078"}],"version-history":[{"count":1,"href":"https:\/\/www.datanovia.com\/en\/wp-json\/wp\/v2\/dt_lessons\/8078\/revisions"}],"predecessor-version":[{"id":8136,"href":"https:\/\/www.datanovia.com\/en\/wp-json\/wp\/v2\/dt_lessons\/8078\/revisions\/8136"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.datanovia.com\/en\/wp-json\/wp\/v2\/media\/7884"}],"wp:attachment":[{"href":"https:\/\/www.datanovia.com\/en\/wp-json\/wp\/v2\/media?parent=8078"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}