{"id":8081,"date":"2018-10-25T02:21:31","date_gmt":"2018-10-25T00:21:31","guid":{"rendered":"https:\/\/www.datanovia.com\/en\/?post_type=dt_lessons&#038;p=8081"},"modified":"2018-10-25T02:25:09","modified_gmt":"2018-10-25T00:25:09","slug":"dbscan-density-based-clustering-essentials","status":"publish","type":"dt_lessons","link":"https:\/\/www.datanovia.com\/en\/lessons\/dbscan-density-based-clustering-essentials\/","title":{"rendered":"DBSCAN: Density-Based Clustering Essentials"},"content":{"rendered":"<div id=\"rdoc\">\n<p><strong>DBSCAN<\/strong> (<strong>Density-Based Spatial Clustering of Applications with Noise<\/strong>) is a <strong>density-based clustering<\/strong> algorithm <span class=\"citation\">(Ester et al. 1996)<\/span> that can be used to identify clusters of any shape in a data set containing noise and outliers.<\/p>\n<p>The basic idea behind the density-based clustering approach derives from the way humans intuitively identify clusters. For instance, by looking at the figure below, one can easily identify four clusters along with several points of noise, because of the differences in the density of points.<\/p>\n<p>Clusters are dense regions in the data space, separated by regions of lower point density. The DBSCAN algorithm is based on this intuitive notion of \u201cclusters\u201d and \u201cnoise\u201d. The key idea is that, for each point of a cluster, the neighborhood of a given radius has to contain at least a minimum number of points.<\/p>\n<p><img decoding=\"async\" src=\"https:\/\/www.datanovia.com\/en\/wp-content\/uploads\/dn-tutorials\/005-advanced-clustering\/images\/dbscan-idea.png\" alt=\"DBSCAN idea\" \/> (From Ester et al. 
1996)<\/p>\n<div class=\"block\">\n<p>In this chapter, we\u2019ll describe the DBSCAN algorithm and demonstrate how to compute DBSCAN using the <em>fpc<\/em> R package.<\/p>\n<\/div>\n<p>Contents:<\/p>\n<div id=\"TOC\">\n<ul>\n<li><a href=\"#why-dbscan\">Why DBSCAN?<\/a><\/li>\n<li><a href=\"#algorithm\">Algorithm<\/a><\/li>\n<li><a href=\"#advantages\">Advantages<\/a><\/li>\n<li><a href=\"#parameter-estimation\">Parameter estimation<\/a><\/li>\n<li><a href=\"#computing-dbscan\">Computing DBSCAN<\/a><\/li>\n<li><a href=\"#method-for-determining-the-optimal-eps-value\">Method for determining the optimal eps value<\/a><\/li>\n<li><a href=\"#cluster-predictions-with-dbscan-algorithm\">Cluster predictions with DBSCAN algorithm<\/a><\/li>\n<li><a href=\"#references\">References<\/a><\/li>\n<\/ul>\n<\/div>\n<div class='dt-sc-hr-invisible-medium  '><\/div>\n<div class='dt-sc-ico-content type1'><div class='custom-icon' ><a href='https:\/\/www.datanovia.com\/en\/product\/practical-guide-to-cluster-analysis-in-r\/' target='_blank'><span class='fa fa-book'><\/span><\/a><\/div><h4><a href='https:\/\/www.datanovia.com\/en\/product\/practical-guide-to-cluster-analysis-in-r\/' target='_blank'> Related Book <\/a><\/h4>Practical Guide to Cluster Analysis in R<\/div>\n<div class='dt-sc-hr-invisible-medium  '><\/div>\n<div id=\"why-dbscan\" class=\"section level2\">\n<h2>Why DBSCAN?<\/h2>\n<p>Partitioning methods (K-means, PAM clustering) and hierarchical clustering are suitable for finding spherical-shaped or convex clusters. In other words, they work well only for compact and well-separated clusters. 
Moreover, they are also severely affected by the presence of noise and outliers in the data.<\/p>\n<p>Unfortunately, real-life data can contain: i) clusters of arbitrary shape, such as those shown in the figure below (oval, linear and \u201cS\u201d-shaped clusters); ii) many outliers and noise.<\/p>\n<p>The figure below shows a data set containing nonconvex clusters and outliers\/noise. The simulated data set <em>multishapes<\/em> [in <em>factoextra<\/em> package] is used.<\/p>\n<p><img decoding=\"async\" src=\"https:\/\/www.datanovia.com\/en\/wp-content\/uploads\/dn-tutorials\/005-advanced-clustering\/figures\/023-dbscan-density-based-clustering-data-dbscan-1.png\" width=\"336\" \/><\/p>\n<p>The plot above contains 5 clusters and outliers, including:<\/p>\n<ul>\n<li>2 oval clusters<\/li>\n<li>2 linear clusters<\/li>\n<li>1 compact cluster<\/li>\n<\/ul>\n<p>Given such data, the k-means algorithm has difficulty identifying these clusters with arbitrary shapes. To illustrate this situation, the following R code runs the k-means algorithm on the multishapes data set. 
The function <em>fviz_cluster<\/em>() [<em>factoextra<\/em> package] is used to visualize the clusters.<\/p>\n<p>First, install factoextra with install.packages(\"factoextra\"); then compute and visualize k-means clustering using the data set multishapes:<\/p>\n<pre class=\"r\"><code>library(factoextra)\r\n\r\n# Load the data and keep the two coordinate columns\r\ndata(\"multishapes\")\r\ndf &lt;- multishapes[, 1:2]\r\n\r\n# Compute k-means with 5 clusters\r\nset.seed(123)\r\nkm.res &lt;- kmeans(df, 5, nstart = 25)\r\n\r\n# Visualize the clusters\r\nfviz_cluster(km.res, df, geom = \"point\",\r\n             ellipse = FALSE, show.clust.cent = FALSE,\r\n             palette = \"jco\", ggtheme = theme_classic())<\/code><\/pre>\n<p><img decoding=\"async\" src=\"https:\/\/www.datanovia.com\/en\/wp-content\/uploads\/dn-tutorials\/005-advanced-clustering\/figures\/023-dbscan-density-based-clustering-k-means-multishapes-1.png\" width=\"336\" \/><\/p>\n<div class=\"success\">\n<p>We know there are five clusters in the data, but it can be seen that the k-means method fails to identify them correctly.<\/p>\n<\/div>\n<\/div>\n<div id=\"algorithm\" class=\"section level2\">\n<h2>Algorithm<\/h2>\n<p>The goal is to identify dense regions, which can be measured by the number of objects close to a given point.<\/p>\n<p>Two important parameters are required for DBSCAN: <em>epsilon<\/em> (\u201ceps\u201d) and <em>minimum points<\/em> (\u201cMinPts\u201d). The parameter <em>eps<\/em> defines the radius of the neighborhood around a point x, which is called the <span class=\"math inline\">\\(\\epsilon\\)<\/span>-neighborhood of x. The parameter <em>MinPts<\/em> is the minimum number of neighbors within the \u201ceps\u201d radius.<\/p>\n<p>Any point x in the data set with a neighbor count greater than or equal to <em>MinPts<\/em> is marked as a <em>core point<\/em>. We say that x is a <em>border point<\/em> if the number of its neighbors is less than MinPts, but it belongs to the <span class=\"math inline\">\\(\\epsilon\\)<\/span>-neighborhood of some core point z. 
Finally, if a point is neither a core nor a border point, then it is called a noise point or an outlier.<\/p>\n<p>The figure below shows the different types of points (core, border and outlier points) using MinPts = 6. Here x is a core point because <span class=\"math inline\">\\(neighbours_\\epsilon(x) = 6\\)<\/span>, and y is a border point because <span class=\"math inline\">\\(neighbours_\\epsilon(y) &lt; MinPts\\)<\/span>, but it belongs to the <span class=\"math inline\">\\(\\epsilon\\)<\/span>-neighborhood of the core point x. Finally, z is a noise point.<\/p>\n<p><img decoding=\"async\" src=\"https:\/\/www.datanovia.com\/en\/wp-content\/uploads\/dn-tutorials\/005-advanced-clustering\/images\/dbscan-principle.png\" alt=\"DBSCAN principle\" \/><\/p>\n<p>We start by defining 3 terms required for understanding the DBSCAN algorithm:<\/p>\n<ul>\n<li><em>Directly density reachable<\/em>: A point \u201cA\u201d is directly density reachable from another point \u201cB\u201d if: i) \u201cA\u201d is in the <span class=\"math inline\">\\(\\epsilon\\)<\/span>-neighborhood of \u201cB\u201d and ii) \u201cB\u201d is a core point.<\/li>\n<li><em>Density reachable<\/em>: A point \u201cA\u201d is density reachable from \u201cB\u201d if there is a chain of core points leading from \u201cB\u201d to \u201cA\u201d.<\/li>\n<li><em>Density connected<\/em>: Two points \u201cA\u201d and \u201cB\u201d are density connected if there is a core point \u201cC\u201d such that both \u201cA\u201d and \u201cB\u201d are density reachable from \u201cC\u201d.<\/li>\n<\/ul>\n<p>A density-based cluster is defined as a group of density connected points. The density-based clustering algorithm (DBSCAN) works as follows:<\/p>\n<div class=\"block\">\n<ol style=\"list-style-type: decimal;\">\n<li>For each point <span class=\"math inline\"><em>x<\/em><sub><em>i<\/em><\/sub><\/span>, compute the distance between <span class=\"math inline\"><em>x<\/em><sub><em>i<\/em><\/sub><\/span> and the other points. 
Find all neighbor points within distance <em>eps<\/em> of the starting point (<span class=\"math inline\"><em>x<\/em><sub><em>i<\/em><\/sub><\/span>). Each point with a neighbor count greater than or equal to <em>MinPts<\/em> is marked as a <em>core point<\/em> and as <em>visited<\/em>.<\/li>\n<li>For each <em>core point<\/em>, if it\u2019s not already assigned to a cluster, create a new cluster. Recursively find all its density connected points and assign them to the same cluster as the core point.<\/li>\n<li>Iterate through the remaining unvisited points in the data set.<\/li>\n<\/ol>\n<p>Points that do not belong to any cluster are treated as outliers or noise.<\/p>\n<\/div>\n<\/div>\n<div id=\"advantages\" class=\"section level2\">\n<h2>Advantages<\/h2>\n<ol style=\"list-style-type: decimal;\">\n<li>Unlike K-means, DBSCAN does not require the user to specify the number of clusters to be generated.<\/li>\n<li>DBSCAN can find clusters of any shape. A cluster doesn\u2019t have to be circular.<\/li>\n<li>DBSCAN can identify outliers.<\/li>\n<\/ol>\n<\/div>\n<div id=\"parameter-estimation\" class=\"section level2\">\n<h2>Parameter estimation<\/h2>\n<ul>\n<li>MinPts: The larger the data set, the larger the value of MinPts should be. MinPts must be at least 3.<\/li>\n<li><span class=\"math inline\">\\(\\epsilon\\)<\/span>: The value for <span class=\"math inline\">\\(\\epsilon\\)<\/span> can then be chosen by using a k-distance graph, plotting the distance to the k = MinPts nearest neighbor. Good values of <span class=\"math inline\">\\(\\epsilon\\)<\/span> are where this plot shows a strong bend.<\/li>\n<\/ul>\n<\/div>\n<div id=\"computing-dbscan\" class=\"section level2\">\n<h2>Computing DBSCAN<\/h2>\n<p>Here, we\u2019ll use the R package <em>fpc<\/em> to compute DBSCAN. 
It\u2019s also possible to use the package <em>dbscan<\/em>, which provides a faster re-implementation of the DBSCAN algorithm than the fpc package.<\/p>\n<p>We\u2019ll also use the <em>factoextra<\/em> package for visualizing clusters.<\/p>\n<p>First, install the packages as follows:<\/p>\n<pre class=\"r\"><code>install.packages(\"fpc\")\r\ninstall.packages(\"dbscan\")\r\ninstall.packages(\"factoextra\")<\/code><\/pre>\n<p>The R code below computes and visualizes DBSCAN using the multishapes data set [factoextra R package]:<\/p>\n<pre class=\"r\"><code># Load the data\r\ndata(\"multishapes\", package = \"factoextra\")\r\ndf &lt;- multishapes[, 1:2]\r\n\r\n# Compute DBSCAN using fpc package\r\nlibrary(\"fpc\")\r\nset.seed(123)\r\ndb &lt;- fpc::dbscan(df, eps = 0.15, MinPts = 5)\r\n\r\n# Plot DBSCAN results\r\nlibrary(\"factoextra\")\r\nfviz_cluster(db, data = df, stand = FALSE,\r\n             ellipse = FALSE, show.clust.cent = FALSE,\r\n             geom = \"point\", palette = \"jco\", ggtheme = theme_classic())<\/code><\/pre>\n<p><img decoding=\"async\" src=\"https:\/\/www.datanovia.com\/en\/wp-content\/uploads\/dn-tutorials\/005-advanced-clustering\/figures\/023-dbscan-density-based-clustering-density-based-clustering-1.png\" width=\"336\" \/><\/p>\n<div class=\"notice\">\n<p>Note that the function <em>fviz_cluster<\/em>() uses different point symbols for core points (i.e., seed points) and border points. Black points correspond to outliers. 
You can experiment with <em>eps<\/em> and <em>MinPts<\/em> to change the cluster configuration.<\/p>\n<\/div>\n<div class=\"success\">\n<p>It can be seen that DBSCAN performs better than the k-means algorithm on this data set and identifies the correct set of clusters.<\/p>\n<\/div>\n<p>The result of the <em>fpc::dbscan<\/em>() function can be displayed as follows:<\/p>\n<pre class=\"r\"><code>print(db)<\/code><\/pre>\n<pre><code>## dbscan Pts=1100 MinPts=5 eps=0.15\r\n##         0   1   2   3  4  5\r\n## border 31  24   1   5  7  1\r\n## seed    0 386 404  99 92 50\r\n## total  31 410 405 104 99 51<\/code><\/pre>\n<p>In the table above, the column names are cluster numbers. Cluster 0 corresponds to outliers (black points in the DBSCAN plot). The function <em>print.dbscan<\/em>() reports, for each cluster, the number of seed points and border points.<\/p>\n<pre class=\"r\"><code># Cluster membership. Noise\/outlier observations are coded as 0\r\n# A random subset is shown\r\ndb$cluster[sample(1:1089, 20)]<\/code><\/pre>\n<pre><code>##  [1] 1 3 2 4 3 1 2 4 2 2 2 2 2 2 1 4 1 1 1 0<\/code><\/pre>\n<p>The DBSCAN algorithm requires users to specify the <em>eps<\/em> value and the <em>MinPts<\/em> parameter. In the R code above, we used <em>eps = 0.15<\/em> and <em>MinPts = 5<\/em>. One limitation of DBSCAN is that it is sensitive to the choice of <span class=\"math inline\">\\(\\epsilon\\)<\/span>, in particular if clusters have different densities. If <span class=\"math inline\">\\(\\epsilon\\)<\/span> is too small, sparser clusters will be classified as noise. If <span class=\"math inline\">\\(\\epsilon\\)<\/span> is too large, denser clusters may be merged together. 
This implies that, if there are clusters with different local densities, then a single <span class=\"math inline\">\\(\\epsilon\\)<\/span> value may not suffice.<\/p>\n<p>A natural question is:<\/p>\n<div class=\"block\">\n<p>How do we define the optimal value of <span class=\"math inline\">\\(\\epsilon\\)<\/span>?<\/p>\n<\/div>\n<\/div>\n<div id=\"method-for-determining-the-optimal-eps-value\" class=\"section level2\">\n<h2>Method for determining the optimal eps value<\/h2>\n<p>The method proposed here consists of computing the k-nearest neighbor distances in a matrix of points.<\/p>\n<p>The idea is to calculate the distance from every point to its k nearest neighbors. The value of k is specified by the user and corresponds to <em>MinPts<\/em>.<\/p>\n<p>Next, these k-distances are plotted in ascending order. The aim is to determine the \u201cknee\u201d, which corresponds to the optimal <em>eps<\/em> parameter.<\/p>\n<p>A knee corresponds to a threshold where a sharp change occurs along the k-distance curve.<\/p>\n<p>The function <em>kNNdistplot<\/em>() [in <em>dbscan<\/em> package] can be used to draw the k-distance plot:<\/p>\n<pre class=\"r\"><code># Draw the k-distance plot with k = MinPts\r\ndbscan::kNNdistplot(df, k = 5)\r\n# Mark the knee with a dashed horizontal line\r\nabline(h = 0.15, lty = 2)<\/code><\/pre>\n<p><img decoding=\"async\" src=\"https:\/\/www.datanovia.com\/en\/wp-content\/uploads\/dn-tutorials\/005-advanced-clustering\/figures\/023-dbscan-density-based-clustering-k-nearest-neighbor-distance-1.png\" width=\"384\" \/><\/p>\n<div class=\"success\">\n<p>It can be seen that the optimal <em>eps<\/em> value is around a distance of 0.15.<\/p>\n<\/div>\n<\/div>\n<div id=\"cluster-predictions-with-dbscan-algorithm\" class=\"section level2\">\n<h2>Cluster predictions with DBSCAN algorithm<\/h2>\n<p>The function <em>predict.dbscan(object, data, newdata)<\/em> [in <em>fpc<\/em> package] can be used to predict the clusters for the points in <em>newdata<\/em>. 
For more details, read the documentation (<em>?predict.dbscan<\/em>).<\/p>\n<\/div>\n<div id=\"references\" class=\"section level2 unnumbered\">\n<h2>References<\/h2>\n<div id=\"refs\" class=\"references\">\n<div id=\"ref-ester1996\">\n<p>Ester, Martin, Hans-Peter Kriegel, J\u00f6rg Sander, and Xiaowei Xu. 1996. \u201cA Density-Based Algorithm for Discovering Clusters in Large Spatial Databases with Noise.\u201d In, 226\u201331. AAAI Press.<\/p>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<p><!--end rdoc--><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Density-based clustering (DBSCAN) is a clustering method introduced in Ester et al. (1996). It can find clusters of different shapes and sizes in data containing noise and outliers. In this chapter, we\u2019ll describe the DBSCAN algorithm and demonstrate how to compute DBSCAN using the fpc R package.<\/p>\n","protected":false},"author":1,"featured_media":7973,"parent":0,"menu_order":0,"comment_status":"open","ping_status":"closed","template":"","class_list":["post-8081","dt_lessons","type-dt_lessons","status-publish","has-post-thumbnail","hentry"],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v25.2 - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>DBSCAN: Density-Based Clustering Essentials - Datanovia<\/title>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/www.datanovia.com\/en\/lessons\/dbscan-density-based-clustering-essentials\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"DBSCAN: Density-Based Clustering Essentials - Datanovia\" \/>\n<meta property=\"og:description\" content=\"Density-based clustering (DBSCAN) is a clustering method introduced in Ester et al. (1996). 
It can find out clusters of different shapes and sizes from data containing noise and outliers. In this chapter, we\u2019ll describe the DBSCAN algorithm and demonstrate how to compute DBSCAN using the fpc R package.\" \/>\n<meta property=\"og:url\" content=\"https:\/\/www.datanovia.com\/en\/lessons\/dbscan-density-based-clustering-essentials\/\" \/>\n<meta property=\"og:site_name\" content=\"Datanovia\" \/>\n<meta property=\"article:modified_time\" content=\"2018-10-25T00:25:09+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/www.datanovia.com\/en\/wp-content\/uploads\/2018\/10\/P1020588.JPG.jpg\" \/>\n\t<meta property=\"og:image:width\" content=\"1024\" \/>\n\t<meta property=\"og:image:height\" content=\"512\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/jpeg\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:label1\" content=\"Est. reading time\" \/>\n\t<meta name=\"twitter:data1\" content=\"8 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"WebPage\",\"@id\":\"https:\/\/www.datanovia.com\/en\/lessons\/dbscan-density-based-clustering-essentials\/\",\"url\":\"https:\/\/www.datanovia.com\/en\/lessons\/dbscan-density-based-clustering-essentials\/\",\"name\":\"DBSCAN: Density-Based Clustering Essentials - 
Datanovia\",\"isPartOf\":{\"@id\":\"https:\/\/www.datanovia.com\/en\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\/\/www.datanovia.com\/en\/lessons\/dbscan-density-based-clustering-essentials\/#primaryimage\"},\"image\":{\"@id\":\"https:\/\/www.datanovia.com\/en\/lessons\/dbscan-density-based-clustering-essentials\/#primaryimage\"},\"thumbnailUrl\":\"https:\/\/www.datanovia.com\/en\/wp-content\/uploads\/2018\/10\/P1020588.JPG.jpg\",\"datePublished\":\"2018-10-25T00:21:31+00:00\",\"dateModified\":\"2018-10-25T00:25:09+00:00\",\"breadcrumb\":{\"@id\":\"https:\/\/www.datanovia.com\/en\/lessons\/dbscan-density-based-clustering-essentials\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/www.datanovia.com\/en\/lessons\/dbscan-density-based-clustering-essentials\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/www.datanovia.com\/en\/lessons\/dbscan-density-based-clustering-essentials\/#primaryimage\",\"url\":\"https:\/\/www.datanovia.com\/en\/wp-content\/uploads\/2018\/10\/P1020588.JPG.jpg\",\"contentUrl\":\"https:\/\/www.datanovia.com\/en\/wp-content\/uploads\/2018\/10\/P1020588.JPG.jpg\",\"width\":1024,\"height\":512},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/www.datanovia.com\/en\/lessons\/dbscan-density-based-clustering-essentials\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\/\/www.datanovia.com\/en\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Lessons\",\"item\":\"https:\/\/www.datanovia.com\/en\/lessons\/\"},{\"@type\":\"ListItem\",\"position\":3,\"name\":\"DBSCAN: Density-Based Clustering Essentials\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/www.datanovia.com\/en\/#website\",\"url\":\"https:\/\/www.datanovia.com\/en\/\",\"name\":\"Datanovia\",\"description\":\"Data Mining and Statistics for Decision 
Support\",\"publisher\":{\"@id\":\"https:\/\/www.datanovia.com\/en\/#organization\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/www.datanovia.com\/en\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"},{\"@type\":\"Organization\",\"@id\":\"https:\/\/www.datanovia.com\/en\/#organization\",\"name\":\"Datanovia\",\"url\":\"https:\/\/www.datanovia.com\/en\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/www.datanovia.com\/en\/#\/schema\/logo\/image\/\",\"url\":\"https:\/\/www.datanovia.com\/en\/wp-content\/uploads\/2018\/09\/datanovia-logo.png\",\"contentUrl\":\"https:\/\/www.datanovia.com\/en\/wp-content\/uploads\/2018\/09\/datanovia-logo.png\",\"width\":98,\"height\":99,\"caption\":\"Datanovia\"},\"image\":{\"@id\":\"https:\/\/www.datanovia.com\/en\/#\/schema\/logo\/image\/\"}}]}<\/script>\n<!-- \/ Yoast SEO plugin. -->","yoast_head_json":{"title":"DBSCAN: Density-Based Clustering Essentials - Datanovia","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/www.datanovia.com\/en\/lessons\/dbscan-density-based-clustering-essentials\/","og_locale":"en_US","og_type":"article","og_title":"DBSCAN: Density-Based Clustering Essentials - Datanovia","og_description":"The density-based clustering (DBSCAN is a partitioning method that has been introduced in Ester et al. (1996). It can find out clusters of different shapes and sizes from data containing noise and outliers. 
In this chapter, we\u2019ll describe the DBSCAN algorithm and demonstrate how to compute DBSCAN using the fpc R package.","og_url":"https:\/\/www.datanovia.com\/en\/lessons\/dbscan-density-based-clustering-essentials\/","og_site_name":"Datanovia","article_modified_time":"2018-10-25T00:25:09+00:00","og_image":[{"width":1024,"height":512,"url":"https:\/\/www.datanovia.com\/en\/wp-content\/uploads\/2018\/10\/P1020588.JPG.jpg","type":"image\/jpeg"}],"twitter_card":"summary_large_image","twitter_misc":{"Est. reading time":"8 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"WebPage","@id":"https:\/\/www.datanovia.com\/en\/lessons\/dbscan-density-based-clustering-essentials\/","url":"https:\/\/www.datanovia.com\/en\/lessons\/dbscan-density-based-clustering-essentials\/","name":"DBSCAN: Density-Based Clustering Essentials - Datanovia","isPartOf":{"@id":"https:\/\/www.datanovia.com\/en\/#website"},"primaryImageOfPage":{"@id":"https:\/\/www.datanovia.com\/en\/lessons\/dbscan-density-based-clustering-essentials\/#primaryimage"},"image":{"@id":"https:\/\/www.datanovia.com\/en\/lessons\/dbscan-density-based-clustering-essentials\/#primaryimage"},"thumbnailUrl":"https:\/\/www.datanovia.com\/en\/wp-content\/uploads\/2018\/10\/P1020588.JPG.jpg","datePublished":"2018-10-25T00:21:31+00:00","dateModified":"2018-10-25T00:25:09+00:00","breadcrumb":{"@id":"https:\/\/www.datanovia.com\/en\/lessons\/dbscan-density-based-clustering-essentials\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/www.datanovia.com\/en\/lessons\/dbscan-density-based-clustering-essentials\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.datanovia.com\/en\/lessons\/dbscan-density-based-clustering-essentials\/#primaryimage","url":"https:\/\/www.datanovia.com\/en\/wp-content\/uploads\/2018\/10\/P1020588.JPG.jpg","contentUrl":"https:\/\/www.datanovia.com\/en\/wp-content\/uploads\/2018\/10\/P1020588.JPG.jpg","width":1024,"he
ight":512},{"@type":"BreadcrumbList","@id":"https:\/\/www.datanovia.com\/en\/lessons\/dbscan-density-based-clustering-essentials\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/www.datanovia.com\/en\/"},{"@type":"ListItem","position":2,"name":"Lessons","item":"https:\/\/www.datanovia.com\/en\/lessons\/"},{"@type":"ListItem","position":3,"name":"DBSCAN: Density-Based Clustering Essentials"}]},{"@type":"WebSite","@id":"https:\/\/www.datanovia.com\/en\/#website","url":"https:\/\/www.datanovia.com\/en\/","name":"Datanovia","description":"Data Mining and Statistics for Decision Support","publisher":{"@id":"https:\/\/www.datanovia.com\/en\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/www.datanovia.com\/en\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Organization","@id":"https:\/\/www.datanovia.com\/en\/#organization","name":"Datanovia","url":"https:\/\/www.datanovia.com\/en\/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.datanovia.com\/en\/#\/schema\/logo\/image\/","url":"https:\/\/www.datanovia.com\/en\/wp-content\/uploads\/2018\/09\/datanovia-logo.png","contentUrl":"https:\/\/www.datanovia.com\/en\/wp-content\/uploads\/2018\/09\/datanovia-logo.png","width":98,"height":99,"caption":"Datanovia"},"image":{"@id":"https:\/\/www.datanovia.com\/en\/#\/schema\/logo\/image\/"}}]}},"multi-rating":{"mr_rating_results":[]},"_links":{"self":[{"href":"https:\/\/www.datanovia.com\/en\/wp-json\/wp\/v2\/dt_lessons\/8081","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.datanovia.com\/en\/wp-json\/wp\/v2\/dt_lessons"}],"about":[{"href":"https:\/\/www.datanovia.com\/en\/wp-json\/wp\/v2\/types\/dt_lessons"}],"author":[{"embeddable":true,"href":"https:\/\/www.datanovia.com\/en\/wp-json\/wp\/v2\/users\/1"
}],"replies":[{"embeddable":true,"href":"https:\/\/www.datanovia.com\/en\/wp-json\/wp\/v2\/comments?post=8081"}],"version-history":[{"count":1,"href":"https:\/\/www.datanovia.com\/en\/wp-json\/wp\/v2\/dt_lessons\/8081\/revisions"}],"predecessor-version":[{"id":8083,"href":"https:\/\/www.datanovia.com\/en\/wp-json\/wp\/v2\/dt_lessons\/8081\/revisions\/8083"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.datanovia.com\/en\/wp-json\/wp\/v2\/media\/7973"}],"wp:attachment":[{"href":"https:\/\/www.datanovia.com\/en\/wp-json\/wp\/v2\/media?parent=8081"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}