Lesson Archives

  1. Stripcharts are also known as one-dimensional scatter plots. They are a suitable alternative to box plots when sample sizes are small. This article describes how to create and customize stripcharts using the ggplot2 R package.
  2. A dot plot is used to visualize the distribution of the data. This chart creates stacked dots, where each dot represents one observation. Summary statistics are usually added to dot plots to indicate, for example, the median of the data and the interquartile range. This article describes how to create and customize dot plots using the ggplot2 R package.
  3. A violin plot is used to visualize the distribution of the data and its probability density. This chart combines a box plot with a density plot that is rotated and placed on each side to display the shape of the distribution. A violin plot shows more information than a box plot. For example, in a violin plot, you can see whether the distribution of the data is bimodal or multimodal. This article describes how to create and customize violin plots using the ggplot2 R package.
  4. Boxplots are used to visualize the distribution of a grouped continuous variable through its quartiles. You will learn how to create and customize boxplots using the ggplot2 R package.
  5. A scatter plot is used to display the relationship between two continuous variables x and y. This article describes how to create scatter plots in R using the ggplot2 package. You will learn how to: 1) Color points by groups; 2) Create bubble charts; 3) Add a regression line to a scatter plot.
  6. This article presents the basics of ggplot2, including the key ggplot graphic functions. You will learn how to build a ggplot piece by piece, as well as how to customize and export the plot.
  7. Density-based clustering (DBSCAN) is a partitioning method introduced in Ester et al. (1996). It can find clusters of different shapes and sizes in data containing noise and outliers. In this chapter, we’ll describe the DBSCAN algorithm and demonstrate how to compute DBSCAN using the fpc R package.
  8. In model-based clustering, the data are viewed as coming from a distribution that is a mixture of two or more clusters. The method finds the best fit of models to the data and estimates the number of clusters. In this chapter, we illustrate model-based clustering using the R package mclust.