This chapter describes a cluster analysis example using R software. We provide quick-start R code to compute and visualize K-means and hierarchical clustering. In this article, we describe the common distance measures used to compute a distance matrix for cluster analysis. We also provide R code for computing and visualizing distances. This chapter introduces how to prepare your data for cluster analysis and describes the essential R packages for cluster analysis.
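As a rough illustration of the workflow summarized above (data preparation, distance matrix, K-means, hierarchical clustering), here is a minimal sketch using only base R. The built-in `USArrests` data set, the choice of `k = 4`, the Euclidean distance, and the Ward linkage are assumptions made for the example; they are not taken from the chapters themselves.

```r
# Assumption: the built-in USArrests data set stands in for "your data".
df <- na.omit(USArrests)   # data preparation: drop rows with missing values
df <- scale(df)            # standardize each variable (mean 0, sd 1)

# Distance matrix; Euclidean is one common choice among several
d <- dist(df, method = "euclidean")

# K-means clustering; k = 4 is an arbitrary choice for this sketch
set.seed(123)              # k-means starts from random centers
km <- kmeans(df, centers = 4, nstart = 25)

# Hierarchical clustering with Ward linkage, cut into 4 groups
hc <- hclust(d, method = "ward.D2")
groups <- cutree(hc, k = 4)

# Visualize the dendrogram only when a graphics device makes sense
if (interactive()) {
  plot(hc, cex = 0.6)
  rect.hclust(hc, k = 4)   # outline the 4 groups on the dendrogram
}
```

`km$cluster` and `groups` each hold one cluster label per observation, so `table(km$cluster, groups)` is a quick way to compare the two partitions.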
Compute Summary Statistics in R
This tutorial introduces how to easily compute statistical summaries in R using the dplyr package. You will learn how to compute summary statistics for ungrouped data, as well as for data grouped by one or more variables. This tutorial describes how to compute and add new variables to a data frame in R. You will learn how to rename the columns of a data frame in R. This tutorial describes how to reorder the rows of a data table by the value of one or more variables. You will learn how to easily sort data frame rows in ascending and descending order. You will learn how to identify and remove duplicate data using base R and dplyr functions. This tutorial describes how to subset or extract data frame rows based on certain criteria. Additionally, we'll describe how to select a random number or fraction of rows. You will also learn how to remove rows with missing values in a given column. You will learn how to select data frame columns by name and position. We'll also show how to remove columns from a data frame.
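The operations listed above can be sketched in a few lines of dplyr. The data frame `df` and its columns (`group`, `height`, `weight`) are hypothetical and exist only for illustration; the sketch also assumes dplyr 1.0.0 or later, which provides `slice_sample()`.

```r
library(dplyr)

# Hypothetical example data; rows 3 and 5 are exact duplicates
df <- data.frame(
  group  = c("a", "a", "b", "b", "b"),
  height = c(150, 160, 170, NA, 170),
  weight = c(50, 60, 70, 80, 70)
)

# Summary statistics: ungrouped, then grouped by one variable
overall  <- df %>% summarise(n = n(), mean_weight = mean(weight))
by_group <- df %>%
  group_by(group) %>%
  summarise(n = n(), mean_weight = mean(weight), .groups = "drop")

# Compute and add a new variable
with_bmi <- df %>% mutate(bmi = weight / (height / 100)^2)

# Rename a column: new_name = old_name
renamed <- df %>% rename(grp = group)

# Sort rows in ascending and descending order
asc <- df %>% arrange(weight)
dsc <- df %>% arrange(desc(weight))

# Identify duplicates (base R) and remove them (dplyr)
dup_base <- duplicated(df)
uniq     <- df %>% distinct()

# Subset rows by criteria, drop missing values, take random samples
heavy_b    <- df %>% filter(weight > 55, group == "b")
complete_h <- df %>% filter(!is.na(height))
set.seed(1)
samp_n <- df %>% slice_sample(n = 2)       # fixed number of rows
samp_p <- df %>% slice_sample(prop = 0.4)  # fraction of rows

# Select columns by name or position, and remove a column
by_name <- df %>% select(group, weight)
by_pos  <- df %>% select(1, 3)
dropped <- df %>% select(-height)
```

Each step starts from the original `df` rather than chaining on the previous result, so any single line can be run on its own.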