# Frequency Distribution Table with a gt Package

Constructing Frequency and Relative Frequency Distribution Tables with gt and dplyr Packages.

Alier Ëë Reng https://www.quanfistatistics.io/
02-27-2019

The purpose of this article is to demonstrate how to construct both frequency and relative frequency distribution tables with the gt and dplyr packages. I will use the Boston housing dataset from the MASS package (specifically the median value of owner-occupied homes in $1000s).

### Definition

A frequency distribution is a table that shows classes or intervals of data entries with a count of the number of entries in each class. The frequency f of a class is the number of data entries in that class.

The relative frequency of a class is the portion or percentage of the data that falls in that class. To find the relative frequency of a class, divide the frequency f by the sample size n:

$$\text{Relative frequency} = \frac{\text{Class frequency}}{\text{Sample size}} = \frac{f}{n}$$

For example, the first class in the table in this article contains 22 of the 506 data entries, so its relative frequency is 22/506 ≈ 0.043, or 4.3%.

The cumulative frequency of a class is the sum of the frequency for that class and all previous classes. The cumulative frequency of the last class is equal to the sample size n.

### Load the packages

```r
library(MASS)   # Boston housing data
library(gt)     # Table construction and styling
library(dplyr)  # rename(), add_row(), and the %>% pipe

data(Boston)  # Load the Boston housing data from the MASS package.

# Create classes of data using Sturges' formula; cut() already returns a factor.
housing_data <- cut(Boston$medv, breaks = nclass.Sturges(Boston$medv))

house_data <- as.data.frame(table(housing_data)) %>%  # Count the entries in each class.
  transform(
    relative    = round(prop.table(Freq), digits = 3),       # Relative frequencies, three decimal places.
    Percentages = round(prop.table(Freq), digits = 3) * 100  # The same proportions as percentages.
  )

# Written against current gt releases, where cell_fill(), cell_text(), and
# cells_body() replace the earlier cells_styles() and cells_data() helpers.
gt_data <- house_data %>%
  rename(Classes = housing_data, `Rel. Freq` = relative) %>%  # Rename the columns.
  add_row(Classes = "Total", Freq = sum(.$Freq),               # Add the row of totals.
          `Rel. Freq` = sum(.$`Rel. Freq`), Percentages = sum(.$Percentages)) %>%
  gt() %>%                                                      # Initiate the gt table.
  tab_header(
    title = "Frequency and Relative Frequency Distribution Table(s)",
    subtitle = "Constructed with gt and dplyr Packages"         # Add the title and subtitle.
  ) %>%
  tab_options(heading.background.color = "darkgreen",           # Dark green heading background.
              column_labels.background.color = "grey",          # Grey column-label background.
              table.width = "100%") %>%                         # Maximize table width (100%).
  tab_style(
    style = list(
      cell_fill(color = "black"),                   # Make the background of the last row black.
      cell_text(weight = "bold", color = "white")   # Bold, white text.
    ),
    # Match the totals row by its label only; comparing the summed
    # proportions with == is fragile in floating-point arithmetic.
    locations = cells_body(columns = everything(), rows = Classes == "Total")
  ) %>%
  cols_align(align = "center", columns = everything()) %>%      # Center column values.
  tab_source_note(
    source_note = "From https://www.rengdatascience.io; by Alier Ëë Reng, 02/26/2019"
  )

gt_data  # Print the distribution table.
```
Frequency and Relative Frequency Distribution Table(s)

Constructed with gt and dplyr Packages

| Classes | Freq | Rel. Freq | Percentages |
|:---:|:---:|:---:|:---:|
| (4.96,9.5] | 22 | 0.043 | 4.3 |
| (9.5,14] | 55 | 0.109 | 10.9 |
| (14,18.5] | 85 | 0.168 | 16.8 |
| (18.5,23] | 154 | 0.304 | 30.4 |
| (23,27.5] | 84 | 0.166 | 16.6 |
| (27.5,32] | 39 | 0.077 | 7.7 |
| (32,36.5] | 29 | 0.057 | 5.7 |
| (36.5,41] | 7 | 0.014 | 1.4 |
| (41,45.5] | 10 | 0.020 | 2.0 |
| (45.5,50] | 21 | 0.042 | 4.2 |
| Total | 506 | 1.000 | 100.0 |

From https://www.rengdatascience.io; by Alier Ëë Reng, 02/26/2019
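
The definition at the start of this article also describes the cumulative frequency, which the table above does not display. As a minimal sketch (not part of the original table code), the snippet below recomputes the classes with base R and adds a running total. The names `n`, `k`, `f`, `classes`, and `freq_table` are illustrative choices, and the code assumes the MASS package is installed.

```r
library(MASS)  # Assumed installed; provides the Boston data set.

data(Boston)
n <- length(Boston$medv)          # Sample size.
k <- nclass.Sturges(Boston$medv)  # Sturges' rule: ceiling(log2(n) + 1) classes.

# Bin medv into k equal-width classes and count the entries in each class.
classes <- cut(Boston$medv, breaks = k)
f <- as.vector(table(classes))

freq_table <- data.frame(
  Classes  = levels(classes),
  Freq     = f,
  Rel.Freq = round(f / n, digits = 3),  # Relative frequency f / n.
  Cum.Freq = cumsum(f)                  # This class plus all previous classes.
)

print(freq_table)
```

The last value of `Cum.Freq` should equal the sample size n, which gives a quick check that every observation landed in one of the classes.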

### Citation

Reng (2019, Feb. 27). Reng Data Science Institute: Frequency Distribution Table with a gt Package. Retrieved from https://www.rengdatascience.io/posts/2019-02-27-a-frequency-distribution-with-a-gt-package/
@misc{reng2019frequency,
}