Data Clustering Basics

Data clustering comprises data mining methods for identifying groups of similar objects in multivariate data sets collected from fields such as marketing, biomedicine, and geospatial analysis.

Similarity between observations (or individuals) is defined using an inter-observation distance measure, such as the Euclidean distance or a correlation-based distance.
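As a minimal sketch of these two kinds of distance, the base R code below computes a Euclidean and a correlation-based distance matrix. The built-in USArrests data set and the choice of Pearson correlation are assumptions made for illustration; they are not prescribed by the course.

```r
# Built-in USArrests data (an illustrative choice); standardize the
# variables so that no single variable dominates the distance.
df <- scale(USArrests)

# Euclidean distance between rows (observations)
d_euc <- dist(df, method = "euclidean")

# Correlation-based distance: 1 - Pearson correlation between observations.
# cor() correlates columns, so transpose to put observations in columns.
d_cor <- as.dist(1 - cor(t(df), method = "pearson"))

# Inspect the distances among the first three observations
round(as.matrix(d_euc)[1:3, 1:3], 2)
round(as.matrix(d_cor)[1:3, 1:3], 2)
```

With a correlation-based distance, two observations are considered close when their profiles rise and fall together, even if their values are far apart in Euclidean terms.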

There are different types of data clustering techniques, including:

  • Partitioning clustering approaches, which subdivide the data into a set of k groups. One of the most popular partitioning methods is k-means clustering.
  • Hierarchical clustering approaches, which identify groups in the data without subdividing it into a pre-specified number of groups; instead, they build a tree of nested clusters (a dendrogram).
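The two families above can be sketched in base R as follows. The data set (USArrests), the number of groups (k = 4), the seed, and the Ward linkage are all assumptions chosen for illustration, not recommendations from the course.

```r
# Standardized example data (USArrests is an illustrative choice)
df <- scale(USArrests)

# Partitioning: k-means with k = 4 (assumed); nstart = 25 tries
# 25 random starts and keeps the best solution.
set.seed(123)
km <- kmeans(df, centers = 4, nstart = 25)
table(km$cluster)            # cluster sizes

# Hierarchical: build the full tree first, no k required
hc <- hclust(dist(df), method = "ward.D2")

# Cut the tree afterwards to obtain a chosen number of groups
groups <- cutree(hc, k = 4)
table(groups)

# plot(hc) draws the dendrogram
```

Note the difference in workflow: k-means needs k before it runs, whereas the hierarchical tree is built once and can be cut at any level afterwards.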

This course presents the basics of cluster analysis in R. You will learn:

  • Data preparation and essential R packages for cluster analysis
  • Clustering distance measures essentials
  • Quick-start R code to perform k-means and hierarchical clustering.
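For the data preparation step, a typical sketch in base R looks like the following. It assumes a data frame with observations in rows and numeric variables in columns; the USArrests data set is used purely as a stand-in.

```r
# Load an example data set (an assumption; substitute your own data)
data("USArrests")

# Drop rows that contain missing values
df <- na.omit(USArrests)

# Standardize each variable to mean 0 and standard deviation 1,
# so that variables measured on different scales are comparable
df <- scale(df)

head(df, 3)
```

Standardization matters because distance measures such as the Euclidean distance are sensitive to the units of each variable.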

Related Book

Practical Guide to Cluster Analysis in R

