Reorder Data Frame Rows in R

This tutorial describes how to reorder (i.e., sort) the rows of a data table by the values of one or more columns (i.e., variables).

You will learn how to easily:

  • Sort data frame rows in ascending order (from low to high) using the R function arrange() [dplyr package]
  • Sort rows in descending order (from high to low) using arrange() in combination with the function desc() [dplyr package]

Required packages

Load the tidyverse packages, which include dplyr:

library(tidyverse)

Demo dataset

We’ll use the R built-in iris data set, which we start by converting into a tibble data frame (tbl_df) for easier data analysis.

my_data <- as_tibble(iris)
my_data
## # A tibble: 150 x 5
##   Sepal.Length Sepal.Width Petal.Length Petal.Width Species
##          <dbl>       <dbl>        <dbl>       <dbl> <fct>  
## 1          5.1         3.5          1.4         0.2 setosa 
## 2          4.9         3            1.4         0.2 setosa 
## 3          4.7         3.2          1.3         0.2 setosa 
## 4          4.6         3.1          1.5         0.2 setosa 
## 5          5           3.6          1.4         0.2 setosa 
## 6          5.4         3.9          1.7         0.4 setosa 
## # ... with 144 more rows

Arrange rows

The dplyr function arrange() can be used to reorder (or sort) rows by one or more variables.

  • Reorder rows by Sepal.Length in ascending order
my_data %>% arrange(Sepal.Length)
## # A tibble: 150 x 5
##   Sepal.Length Sepal.Width Petal.Length Petal.Width Species
##          <dbl>       <dbl>        <dbl>       <dbl> <fct>  
## 1          4.3         3            1.1         0.1 setosa 
## 2          4.4         2.9          1.4         0.2 setosa 
## 3          4.4         3            1.3         0.2 setosa 
## 4          4.4         3.2          1.3         0.2 setosa 
## 5          4.5         2.3          1.3         0.3 setosa 
## 6          4.6         3.1          1.5         0.2 setosa 
## # ... with 144 more rows
  • Reorder rows by Sepal.Length in descending order. Use the function desc():
my_data %>% arrange(desc(Sepal.Length))
## # A tibble: 150 x 5
##   Sepal.Length Sepal.Width Petal.Length Petal.Width Species  
##          <dbl>       <dbl>        <dbl>       <dbl> <fct>    
## 1          7.9         3.8          6.4         2   virginica
## 2          7.7         3.8          6.7         2.2 virginica
## 3          7.7         2.6          6.9         2.3 virginica
## 4          7.7         2.8          6.7         2   virginica
## 5          7.7         3            6.1         2.3 virginica
## 6          7.6         3            6.6         2.1 virginica
## # ... with 144 more rows

Instead of using the function desc(), you can prefix a numeric sorting variable with a minus sign to indicate descending order, as follows.

arrange(my_data, -Sepal.Length)
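
The minus-sign shortcut relies on arithmetic negation, so it suits numeric columns; for a factor or character column, desc() is the safer choice. The following is a minimal sketch of that difference, assuming only that dplyr is installed (the object names by_desc, by_minus and by_species are made up for illustration):

```r
library(dplyr)

my_data <- as_tibble(iris)

# For a numeric column, desc() and the minus sign yield the same row order
by_desc  <- my_data %>% arrange(desc(Sepal.Length))
by_minus <- my_data %>% arrange(-Sepal.Length)
all.equal(by_desc, by_minus)

# desc() also works on a factor column such as Species,
# where arithmetic negation is not meaningful
by_species <- my_data %>% arrange(desc(Species))
head(by_species)
```
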
  • Reorder rows by multiple variables: Sepal.Length and Sepal.Width. Rows are sorted by the first variable, and ties are broken by the second.
my_data %>% arrange(Sepal.Length, Sepal.Width)
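
Sorting directions can also be mixed across variables. As an illustrative sketch (the object name sorted is hypothetical), the code below sorts by Sepal.Length in ascending order and breaks ties by Sepal.Width in descending order:

```r
library(dplyr)

my_data <- as_tibble(iris)

# Ascending Sepal.Length; within equal Sepal.Length values,
# rows are ordered by descending Sepal.Width
sorted <- my_data %>% arrange(Sepal.Length, desc(Sepal.Width))
head(sorted)
```

Each argument to arrange() carries its own direction, so desc() only needs to wrap the variables that should run from high to low.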

If the data contain missing values (NA), arrange() always places them at the end, whether the sort is ascending or descending.
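
The iris data set has no missing values, so here is a small sketch on a made-up tibble (the names df, id and score are hypothetical) showing where the missing value lands. The last call is one possible way to move missing values to the top instead, by sorting on is.na() first:

```r
library(dplyr)

# Toy data with one missing value (illustration only)
df <- tibble(
  id    = c("a", "b", "c", "d"),
  score = c(2.5, NA, 1.0, 3.7)
)

# Ascending sort: the NA row comes last
asc <- df %>% arrange(score)

# Descending sort: the NA row still comes last
dsc <- df %>% arrange(desc(score))

# Possible workaround: sort on is.na(score) first so that NA rows lead
na_first <- df %>% arrange(desc(is.na(score)), score)

asc
dsc
na_first
```
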

Summary

In this article, we described how to sort data frame rows in ascending and descending order using the functions arrange() and desc() [dplyr package].

Teacher
Alboukadel Kassambara
Role : Founder of Datanovia