Blog

We provide practical tutorials on data mining, visualization and statistics for decision making.

Kappa Coefficient Interpretation

This article describes how to interpret the kappa coefficient, which is used to assess inter-rater reliability (agreement). In most applications, the magnitude of kappa is of more interest than its statistical significance. The following classifications have been...
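
As a rough illustration of the quantity being interpreted, here is a minimal sketch that computes Cohen's kappa by hand in base R. The 2x2 agreement table and the names `ratings` and `kappa_hat` are made up for this example and do not come from the article.

```r
# Hypothetical agreement table: rows = rater A, columns = rater B
ratings <- matrix(
  c(20, 5, 10, 15), nrow = 2,
  dimnames = list(raterA = c("yes", "no"), raterB = c("yes", "no"))
)

n  <- sum(ratings)
po <- sum(diag(ratings)) / n                           # observed agreement
pe <- sum(rowSums(ratings) * colSums(ratings)) / n^2   # agreement expected by chance

kappa_hat <- (po - pe) / (1 - pe)                      # Cohen's kappa
kappa_hat  # about 0.4 for this table
```

Here the raters agree on 70% of cases while 50% agreement would be expected by chance, which is why kappa is well below the raw agreement rate.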

Clustering using Correlation as Distance Measures in R

Different distance measures are available for clustering analysis. This article describes how to perform clustering in R using correlation as the distance metric. Contents: Prerequisites; Demo data; Draw heatmaps using pheatmap; Draw heatmaps using gplots; Summary; See also. Prerequisites: The following R packages will be...
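
The general idea can be sketched in base R, without the heatmap packages named above. This is only an assumption about the approach: it uses the built-in `mtcars` data (not necessarily the article's demo data) and turns a Pearson correlation matrix into a distance with `1 - cor`.

```r
# Standardize the built-in mtcars data so variables are comparable
x <- scale(mtcars)

# cor() correlates columns, so transpose to get correlations between rows (cars);
# 1 - correlation is 0 for perfectly correlated rows and 2 for anti-correlated rows
d <- as.dist(1 - cor(t(x), method = "pearson"))

# Hierarchical clustering on the correlation-based distance
hc <- hclust(d, method = "complete")
plot(hc, cex = 0.6)
```

Swapping `method = "pearson"` for `"spearman"` gives a rank-based variant of the same distance.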

Extract Text from PDF in R

This article describes how to extract text from a PDF in R using the pdftools package. Contents: Installation; Load the package; Extract the PDF text content; Render the PDF pages as images; Summary. Installation: For Mac OS X and Windows, you can use the following code...
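
A minimal sketch of the text-extraction step, assuming pdftools is already installed. To stay self-contained it first writes a throwaway PDF with base R; the file and its contents are invented for this example.

```r
library(pdftools)

# Create a small one-page PDF with base R graphics (hypothetical test file)
pdf_file <- tempfile(fileext = ".pdf")
pdf(pdf_file)
plot.new()
text(0.5, 0.5, "Hello from R")
dev.off()

# pdf_text() returns a character vector with one element per page
pages <- pdf_text(pdf_file)
length(pages)
cat(pages[1])
```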

Display a Beautiful Summary Statistics in R using Skimr Package

This article describes how to quickly display summary statistics using the R package skimr. skimr handles different data types and returns a skim_df object which can be included in a tidyverse pipeline or displayed nicely for the human reader. Key features of skimr: Provides...
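
For orientation, a minimal sketch of the basic call, assuming skimr is installed and using the built-in `iris` data (the article's own example data may differ):

```r
library(skimr)

# skim() summarizes every column, grouped by data type
summary_tbl <- skim(iris)

class(summary_tbl)  # includes "skim_df"
summary_tbl         # prints the formatted summary
```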

How to Easily Manipulate Files and Directories in R

This article presents the fs R package, which provides a cross-platform, uniform interface to file system operations. fs functions are divided into four main categories: path_ for manipulating and constructing paths, file_ for files, dir_ for directories, and link_ for links. Contents: Prerequisites; Some Key...
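
A short sketch touching three of the four function families, assuming fs is installed. The directory and file names are made up, and everything is created under `tempdir()` so nothing outside the R session is affected. The link_ family is left out here because symbolic links may need extra permissions on some systems.

```r
library(fs)

# path_*: build a path in a platform-independent way
base_dir <- path(tempdir(), "fs-demo")

# dir_*: create the directory
dir_create(base_dir)

# file_*: create an empty file and check that it exists
notes <- path(base_dir, "notes.txt")
file_create(notes)
file_exists(notes)

# List the directory contents and inspect the file extension
dir_ls(base_dir)
path_ext(notes)
```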

gghighlight: Easy Way to Highlight a GGPlot in R

This article presents how to easily highlight a ggplot using the gghighlight package. Contents: Prerequisites; Line plot; Histogram; Scatter plot; Bar plot. Prerequisites: Load required packages and set the default ggplot2 theme to theme_bw(): library(tidyverse); library(gghighlight); theme_set(theme_bw()). Line plot: Basic line plot: p <-...
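
A minimal scatter-plot sketch of the idea, assuming ggplot2 and gghighlight are installed. The data (`mtcars`) and the threshold are chosen for illustration only and are not taken from the article.

```r
library(ggplot2)
library(gghighlight)

# Points satisfying the predicate keep their color; the rest are greyed out
p <- ggplot(mtcars, aes(wt, mpg)) +
  geom_point() +
  gghighlight(mpg > 25)

p
```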

gganimate: How to Create Plots with Beautiful Animation in R

This article describes how to create animations in R using the gganimate R package. gganimate is an extension of the ggplot2 package for creating animated ggplots. It provides a range of new functionality that can be added to the plot object in order to...
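
As a sketch of how such functionality is added to a plot object, the example below appends a transition to an ordinary ggplot. It assumes ggplot2 and gganimate are installed and uses the built-in `iris` data, which may not match the article's example. Rendering is left commented out because it needs a renderer package such as gifski.

```r
library(ggplot2)
library(gganimate)

# A static scatter plot plus a transition that steps through the species
anim <- ggplot(iris, aes(Petal.Width, Petal.Length)) +
  geom_point() +
  transition_states(Species, transition_length = 2, state_length = 1)

# animate(anim)  # uncomment to render the animation
```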

Simple Interactive Framework for Exploring Data in R

This article describes how to easily create an interactive web framework for exploring data in R using the datadigest package. This tool provides a concise summary of every variable in a data frame and includes interactive features such as real-time filters, grouping, and highlighting. This...

How to Plot One Variable against Multiple Others

This article shows how to visualize one numeric variable against multiple others. Prerequisites: Load the required R package and set the default theme to theme_bw(): library(tidyverse); theme_set(theme_bw() + theme(legend.position = "top")). Data preparation: Demo data: head(iris, 3) ## Sepal.Length Sepal.Width Petal.Length Petal.Width...
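
One common way to do this, sketched under the assumption that ggplot2 and tidyr are available: reshape the other variables into long format and facet on the variable name. The choice of `Sepal.Length` as the target and the names `iris_long`, `variable`, and `value` are illustrative; the article may use a different reshaping function.

```r
library(ggplot2)
library(tidyr)

# Stack the three other numeric columns into variable/value pairs
iris_long <- pivot_longer(
  iris,
  cols = c(Sepal.Width, Petal.Length, Petal.Width),
  names_to = "variable",
  values_to = "value"
)

# One panel per variable, each plotted against Sepal.Length
p_one <- ggplot(iris_long, aes(value, Sepal.Length, color = Species)) +
  geom_point() +
  facet_wrap(~variable, scales = "free_x")

p_one
```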

How to Plot All Variables in a Dataset

You will learn how to plot all variables in a data frame using the ggplot2 R package. Prerequisites: Load the required R package and set the default theme to theme_minimal(): library(tidyverse); theme_set(theme_minimal() + theme(legend.position = "top")). Data preparation: Demo data: head(iris, 3)...
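
A minimal sketch of one possible approach, assuming ggplot2 and tidyr are available: stack every numeric column of `iris` and draw one density panel per variable. It covers the numeric columns only, and the names `iris_all`, `variable`, and `value` are hypothetical.

```r
library(ggplot2)
library(tidyr)

# Stack all numeric columns, keeping Species as the grouping column
iris_all <- pivot_longer(
  iris,
  cols = -Species,
  names_to = "variable",
  values_to = "value"
)

# One density panel per variable, colored by species
p_all <- ggplot(iris_all, aes(value, fill = Species)) +
  geom_density(alpha = 0.5) +
  facet_wrap(~variable, scales = "free")

p_all
```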