{"id":8079,"date":"2018-10-25T02:07:59","date_gmt":"2018-10-25T00:07:59","guid":{"rendered":"https:\/\/www.datanovia.com\/en\/?post_type=dt_lessons&#038;p=8079"},"modified":"2018-10-25T02:07:59","modified_gmt":"2018-10-25T00:07:59","slug":"fuzzy-clustering-essentials","status":"publish","type":"dt_lessons","link":"https:\/\/www.datanovia.com\/en\/lessons\/fuzzy-clustering-essentials\/","title":{"rendered":"Fuzzy Clustering Essentials"},"content":{"rendered":"<div id=\"rdoc\">\n<p><strong>Fuzzy clustering<\/strong> is a form of soft clustering, in which each element can belong to every cluster with some degree of membership. In other words, each element has a set of membership coefficients, one per cluster, that express the degree to which it belongs to that cluster.<\/p>\n<p>This differs from k-means and k-medoids clustering, where each object is assigned to exactly one cluster. K-means and k-medoids clustering are known as hard or non-fuzzy clustering.<\/p>\n<p>In fuzzy clustering, points close to the center of a cluster may belong to it to a higher degree than points at its edge. The degree to which an element belongs to a given cluster is a numerical value between 0 and 1.<\/p>\n<p>The <strong>fuzzy c-means<\/strong> (FCM) algorithm is one of the most widely used fuzzy clustering algorithms. 
The centroid of a cluster is calculated as the mean of all points, weighted by their degree of belonging to the cluster: <span class=\"math inline\">\\(c_j = \\frac{\\sum_i u_{ij}^m x_i}{\\sum_i u_{ij}^m}\\)<\/span>, where <span class=\"math inline\">\\(u_{ij}\\)<\/span> is the degree to which point <span class=\"math inline\">\\(x_i\\)<\/span> belongs to cluster <span class=\"math inline\">\\(j\\)<\/span> and <span class=\"math inline\">\\(m &gt; 1\\)<\/span> is the fuzziness exponent.<\/p>\n<div class=\"block\">\n<p>\nIn this article, we\u2019ll describe how to compute fuzzy clustering using the R software.\n<\/p>\n<\/div>\n<div class='dt-sc-hr-invisible-medium  '><\/div>\n<div class='dt-sc-ico-content type1'><div class='custom-icon' ><a href='https:\/\/www.datanovia.com\/en\/product\/practical-guide-to-cluster-analysis-in-r\/' target='_blank'><span class='fa fa-book'><\/span><\/a><\/div><h4><a href='https:\/\/www.datanovia.com\/en\/product\/practical-guide-to-cluster-analysis-in-r\/' target='_blank'> Related Book <\/a><\/h4>Practical Guide to Cluster Analysis in R<\/div>\n<div class='dt-sc-hr-invisible-medium  '><\/div>\n<div id=\"required-r-packages\" class=\"section level2\">\n<h2>Required R packages<\/h2>\n<p>We\u2019ll use the following R packages: 1) <em>cluster<\/em> for computing fuzzy clustering and 2) <em>factoextra<\/em> for visualizing clusters.<\/p>\n<\/div>\n<div id=\"computing-fuzzy-clustering\" class=\"section level2\">\n<h2>Computing fuzzy clustering<\/h2>\n<p>The function <em>fanny<\/em>() [<em>cluster<\/em> R package] can be used to compute fuzzy clustering. <strong>FANNY<\/strong> stands for <strong>fuzzy analysis clustering<\/strong>. 
A simplified format is:<\/p>\n<pre class=\"r\"><code>fanny(x, k, metric = &quot;euclidean&quot;, stand = FALSE)<\/code><\/pre>\n<div class=\"block\">\n<ul>\n<li>\n<strong>x<\/strong>: A data matrix or data frame or dissimilarity matrix\n<\/li>\n<li>\n<strong>k<\/strong>: The desired number of clusters to be generated\n<\/li>\n<li>\n<strong>metric<\/strong>: Metric for calculating dissimilarities between observations\n<\/li>\n<li>\n<strong>stand<\/strong>: If TRUE, variables are standardized before calculating the dissimilarities\n<\/li>\n<\/ul>\n<\/div>\n<p>The function <em>fanny<\/em>() returns an object including the following components:<\/p>\n<ul>\n<li><strong>membership<\/strong>: matrix containing the degree to which each observation belongs to a given cluster. Column names are the clusters and rows are observations<\/li>\n<li><strong>coeff<\/strong>: Dunn\u2019s partition coefficient F(k) of the clustering, where k is the number of clusters. F(k) is the sum of all squared membership coefficients, divided by the number of observations. Its value is between 1\/k and 1. The normalized form of the coefficient is also given. It is defined as <span class=\"math inline\">\\((F(k) - 1\/k) \/ (1 - 1\/k)\\)<\/span>, and ranges between 0 and 1. 
A low value of Dunn\u2019s coefficient indicates a very fuzzy clustering, whereas a value close to 1 indicates a near-crisp clustering.<\/li>\n<li><strong>clustering<\/strong>: the clustering vector containing the nearest crisp grouping of observations<\/li>\n<\/ul>\n<p>For example, the R code below applies fuzzy clustering to the USArrests data set:<\/p>\n<pre class=\"r\"><code>library(cluster)\r\ndf &lt;- scale(USArrests)     # Standardize the data\r\nres.fanny &lt;- fanny(df, 2)  # Compute fuzzy clustering with k = 2<\/code><\/pre>\n<p>The different components can be extracted using the code below:<\/p>\n<pre class=\"r\"><code>head(res.fanny$membership, 3) # Membership coefficients<\/code><\/pre>\n<pre><code>##          [,1]  [,2]\r\n## Alabama 0.664 0.336\r\n## Alaska  0.610 0.390\r\n## Arizona 0.686 0.314<\/code><\/pre>\n<pre class=\"r\"><code>res.fanny$coeff # Dunn&#39;s partition coefficient<\/code><\/pre>\n<pre><code>## dunn_coeff normalized \r\n##      0.555      0.109<\/code><\/pre>\n<pre class=\"r\"><code>head(res.fanny$clustering) # Observation groups<\/code><\/pre>\n<pre><code>##    Alabama     Alaska    Arizona   Arkansas California   Colorado \r\n##          1          1          1          2          1          1<\/code><\/pre>\n<p>To visualize observation groups, use the function <em>fviz_cluster<\/em>() [<em>factoextra<\/em> package]:<\/p>\n<pre class=\"r\"><code>library(factoextra)\r\nfviz_cluster(res.fanny, ellipse.type = &quot;norm&quot;, repel = TRUE,\r\n             palette = &quot;jco&quot;, ggtheme = theme_minimal(),\r\n             legend = &quot;right&quot;)<\/code><\/pre>\n<p><img decoding=\"async\" src=\"https:\/\/www.datanovia.com\/en\/wp-content\/uploads\/dn-tutorials\/005-advanced-clustering\/figures\/021-fuzzy-clustering-visualize-1.png\" width=\"518.4\" \/><\/p>\n<p>To evaluate the goodness of the clustering results, plot the silhouette coefficient as follows:<\/p>\n<pre class=\"r\"><code>fviz_silhouette(res.fanny, palette = 
&quot;jco&quot;,\r\n                ggtheme = theme_minimal())<\/code><\/pre>\n<pre><code>##   cluster size ave.sil.width\r\n## 1       1   22          0.32\r\n## 2       2   28          0.44<\/code><\/pre>\n<p><img decoding=\"async\" src=\"https:\/\/www.datanovia.com\/en\/wp-content\/uploads\/dn-tutorials\/005-advanced-clustering\/figures\/021-fuzzy-clustering-silhouette-1.png\" width=\"518.4\" \/><\/p>\n<\/div>\n<div id=\"summary\" class=\"section level2\">\n<h2>Summary<\/h2>\n<p>Fuzzy clustering is an alternative to k-means clustering in which each data point has a membership coefficient for each cluster. Here, we demonstrated how to compute and visualize fuzzy clustering using a combination of the <em>cluster<\/em> and <em>factoextra<\/em> R packages.<\/p>\n<\/div>\n<\/div>\n<p><!--end rdoc--><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Fuzzy clustering is also known as soft method. Standard clustering (K-means, PAM) approaches produce partitions, in which each observation belongs to only one cluster. This is known as hard clustering. In Fuzzy clustering, items can be a member of more than one cluster. Each item has a set of membership coefficients corresponding to the degree of being in a given cluster. 
In this article, we\u2019ll describe how to compute fuzzy clustering using the R software.<\/p>\n","protected":false},"author":1,"featured_media":7917,"parent":0,"menu_order":0,"comment_status":"open","ping_status":"closed","template":"","class_list":["post-8079","dt_lessons","type-dt_lessons","status-publish","has-post-thumbnail","hentry"],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v25.2 - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>Fuzzy Clustering Essentials - Datanovia<\/title>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/www.datanovia.com\/en\/lessons\/fuzzy-clustering-essentials\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Fuzzy Clustering Essentials - Datanovia\" \/>\n<meta property=\"og:description\" content=\"Fuzzy clustering is also known as soft method. Standard clustering (K-means, PAM) approaches produce partitions, in which each observation belongs to only one cluster. This is known as hard clustering. In Fuzzy clustering, items can be a member of more than one cluster. Each item has a set of membership coefficients corresponding to the degree of being in a given cluster. 
In this article, we\u2019ll describe how to compute fuzzy clustering using the R software.\" \/>\n<meta property=\"og:url\" content=\"https:\/\/www.datanovia.com\/en\/lessons\/fuzzy-clustering-essentials\/\" \/>\n<meta property=\"og:site_name\" content=\"Datanovia\" \/>\n<meta property=\"og:image\" content=\"https:\/\/www.datanovia.com\/en\/wp-content\/uploads\/2018\/10\/IMG_0789.jpg\" \/>\n\t<meta property=\"og:image:width\" content=\"1024\" \/>\n\t<meta property=\"og:image:height\" content=\"512\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/jpeg\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:label1\" content=\"Est. reading time\" \/>\n\t<meta name=\"twitter:data1\" content=\"3 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"WebPage\",\"@id\":\"https:\/\/www.datanovia.com\/en\/lessons\/fuzzy-clustering-essentials\/\",\"url\":\"https:\/\/www.datanovia.com\/en\/lessons\/fuzzy-clustering-essentials\/\",\"name\":\"Fuzzy Clustering Essentials - 
Datanovia\",\"isPartOf\":{\"@id\":\"https:\/\/www.datanovia.com\/en\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\/\/www.datanovia.com\/en\/lessons\/fuzzy-clustering-essentials\/#primaryimage\"},\"image\":{\"@id\":\"https:\/\/www.datanovia.com\/en\/lessons\/fuzzy-clustering-essentials\/#primaryimage\"},\"thumbnailUrl\":\"https:\/\/www.datanovia.com\/en\/wp-content\/uploads\/2018\/10\/IMG_0789.jpg\",\"datePublished\":\"2018-10-25T00:07:59+00:00\",\"breadcrumb\":{\"@id\":\"https:\/\/www.datanovia.com\/en\/lessons\/fuzzy-clustering-essentials\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/www.datanovia.com\/en\/lessons\/fuzzy-clustering-essentials\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/www.datanovia.com\/en\/lessons\/fuzzy-clustering-essentials\/#primaryimage\",\"url\":\"https:\/\/www.datanovia.com\/en\/wp-content\/uploads\/2018\/10\/IMG_0789.jpg\",\"contentUrl\":\"https:\/\/www.datanovia.com\/en\/wp-content\/uploads\/2018\/10\/IMG_0789.jpg\",\"width\":1024,\"height\":512},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/www.datanovia.com\/en\/lessons\/fuzzy-clustering-essentials\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\/\/www.datanovia.com\/en\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Lessons\",\"item\":\"https:\/\/www.datanovia.com\/en\/lessons\/\"},{\"@type\":\"ListItem\",\"position\":3,\"name\":\"Fuzzy Clustering Essentials\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/www.datanovia.com\/en\/#website\",\"url\":\"https:\/\/www.datanovia.com\/en\/\",\"name\":\"Datanovia\",\"description\":\"Data Mining and Statistics for Decision 
Support\",\"publisher\":{\"@id\":\"https:\/\/www.datanovia.com\/en\/#organization\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/www.datanovia.com\/en\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"},{\"@type\":\"Organization\",\"@id\":\"https:\/\/www.datanovia.com\/en\/#organization\",\"name\":\"Datanovia\",\"url\":\"https:\/\/www.datanovia.com\/en\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/www.datanovia.com\/en\/#\/schema\/logo\/image\/\",\"url\":\"https:\/\/www.datanovia.com\/en\/wp-content\/uploads\/2018\/09\/datanovia-logo.png\",\"contentUrl\":\"https:\/\/www.datanovia.com\/en\/wp-content\/uploads\/2018\/09\/datanovia-logo.png\",\"width\":98,\"height\":99,\"caption\":\"Datanovia\"},\"image\":{\"@id\":\"https:\/\/www.datanovia.com\/en\/#\/schema\/logo\/image\/\"}}]}<\/script>\n<!-- \/ Yoast SEO plugin. -->","yoast_head_json":{"title":"Fuzzy Clustering Essentials - Datanovia","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/www.datanovia.com\/en\/lessons\/fuzzy-clustering-essentials\/","og_locale":"en_US","og_type":"article","og_title":"Fuzzy Clustering Essentials - Datanovia","og_description":"Fuzzy clustering is also known as soft method. Standard clustering (K-means, PAM) approaches produce partitions, in which each observation belongs to only one cluster. This is known as hard clustering. In Fuzzy clustering, items can be a member of more than one cluster. Each item has a set of membership coefficients corresponding to the degree of being in a given cluster. 
In this article, we\u2019ll describe how to compute fuzzy clustering using the R software.","og_url":"https:\/\/www.datanovia.com\/en\/lessons\/fuzzy-clustering-essentials\/","og_site_name":"Datanovia","og_image":[{"width":1024,"height":512,"url":"https:\/\/www.datanovia.com\/en\/wp-content\/uploads\/2018\/10\/IMG_0789.jpg","type":"image\/jpeg"}],"twitter_card":"summary_large_image","twitter_misc":{"Est. reading time":"3 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"WebPage","@id":"https:\/\/www.datanovia.com\/en\/lessons\/fuzzy-clustering-essentials\/","url":"https:\/\/www.datanovia.com\/en\/lessons\/fuzzy-clustering-essentials\/","name":"Fuzzy Clustering Essentials - Datanovia","isPartOf":{"@id":"https:\/\/www.datanovia.com\/en\/#website"},"primaryImageOfPage":{"@id":"https:\/\/www.datanovia.com\/en\/lessons\/fuzzy-clustering-essentials\/#primaryimage"},"image":{"@id":"https:\/\/www.datanovia.com\/en\/lessons\/fuzzy-clustering-essentials\/#primaryimage"},"thumbnailUrl":"https:\/\/www.datanovia.com\/en\/wp-content\/uploads\/2018\/10\/IMG_0789.jpg","datePublished":"2018-10-25T00:07:59+00:00","breadcrumb":{"@id":"https:\/\/www.datanovia.com\/en\/lessons\/fuzzy-clustering-essentials\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/www.datanovia.com\/en\/lessons\/fuzzy-clustering-essentials\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.datanovia.com\/en\/lessons\/fuzzy-clustering-essentials\/#primaryimage","url":"https:\/\/www.datanovia.com\/en\/wp-content\/uploads\/2018\/10\/IMG_0789.jpg","contentUrl":"https:\/\/www.datanovia.com\/en\/wp-content\/uploads\/2018\/10\/IMG_0789.jpg","width":1024,"height":512},{"@type":"BreadcrumbList","@id":"https:\/\/www.datanovia.com\/en\/lessons\/fuzzy-clustering-essentials\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/www.datanovia.com\/en\/"},{"@type":"ListItem","position":2,"name"
:"Lessons","item":"https:\/\/www.datanovia.com\/en\/lessons\/"},{"@type":"ListItem","position":3,"name":"Fuzzy Clustering Essentials"}]},{"@type":"WebSite","@id":"https:\/\/www.datanovia.com\/en\/#website","url":"https:\/\/www.datanovia.com\/en\/","name":"Datanovia","description":"Data Mining and Statistics for Decision Support","publisher":{"@id":"https:\/\/www.datanovia.com\/en\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/www.datanovia.com\/en\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Organization","@id":"https:\/\/www.datanovia.com\/en\/#organization","name":"Datanovia","url":"https:\/\/www.datanovia.com\/en\/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.datanovia.com\/en\/#\/schema\/logo\/image\/","url":"https:\/\/www.datanovia.com\/en\/wp-content\/uploads\/2018\/09\/datanovia-logo.png","contentUrl":"https:\/\/www.datanovia.com\/en\/wp-content\/uploads\/2018\/09\/datanovia-logo.png","width":98,"height":99,"caption":"Datanovia"},"image":{"@id":"https:\/\/www.datanovia.com\/en\/#\/schema\/logo\/image\/"}}]}},"multi-rating":{"mr_rating_results":[]},"_links":{"self":[{"href":"https:\/\/www.datanovia.com\/en\/wp-json\/wp\/v2\/dt_lessons\/8079","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.datanovia.com\/en\/wp-json\/wp\/v2\/dt_lessons"}],"about":[{"href":"https:\/\/www.datanovia.com\/en\/wp-json\/wp\/v2\/types\/dt_lessons"}],"author":[{"embeddable":true,"href":"https:\/\/www.datanovia.com\/en\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.datanovia.com\/en\/wp-json\/wp\/v2\/comments?post=8079"}],"version-history":[{"count":0,"href":"https:\/\/www.datanovia.com\/en\/wp-json\/wp\/v2\/dt_lessons\/8079\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.datanovia.com\/
en\/wp-json\/wp\/v2\/media\/7917"}],"wp:attachment":[{"href":"https:\/\/www.datanovia.com\/en\/wp-json\/wp\/v2\/media?parent=8079"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}