# Chapter 9 Frequencies

## 9.1 Intro

Frequency tables are normally used to inspect the distribution of categorical (dichotomous, nominal or ordinal) variables.
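As a minimal sketch of what a frequency table contains, the base R functions table() and prop.table() can be applied to a categorical vector. The vector education below is a small made-up example used only for illustration; it is not part of the pp15 dataset.

```r
# Hypothetical toy data (an assumption for illustration, not from pp15)
education <- factor(c("Applied", "Theoretical", "Theoretical",
                      "Not studying", "Applied", "Theoretical"))

# Absolute frequencies: number of observations in each category
counts <- table(education)

# Relative frequencies, expressed as percentages of all observations
percentages <- round(100 * prop.table(counts), 1)

print(counts)
print(percentages)
```

The packages discussed later in this chapter report the same kind of information, usually with extra columns such as cumulative percentages.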

### 9.1.1 Example dataset

This example uses the Rosetta Stats example dataset “pp15” (see Chapter 1 for information about the datasets and Chapter 3 for an explanation of how to load datasets).

### 9.1.2 Variable(s)

From this dataset, this example uses the variable currentEducation_cat.

## 9.3 Input: R

Many packages can be used to create a frequency table. This chapter presents only two of them. Other packages include (but are not limited to):

• summarytools

• Deducer

• janitor

• questionr

• sjmisc

If you read an SPSS dataset into R, consider using the “frq” command from the “sjmisc” package. It presents both values and value labels (similar to SPSS output).

Note: To use the following commands, it is necessary to install and load the packages first (see section 2.2.2). The example dataset is stored under the name dat (see section 3).

### 9.3.1 rosetta package

Use the following command (this requires the rosetta package to be installed, see section 2.2.2, and the example dataset to be stored under name dat, see section 3):

rosetta::freq(dat$currentEducation_cat);

To also order a barchart, use:

rosetta::freq(dat$currentEducation_cat, plot=TRUE);

To order frequencies for multiple variables simultaneously, use:

rosetta::frequencies(dat$currentEducation_cat, dat$prevEducation_cat);
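For comparison, the same idea of tabulating several variables in one go can be sketched in base R by applying table() to each selected column. The data frame toy below is a hypothetical stand-in for dat, and this sketch only produces raw counts; it is not meant to reproduce the layout of the rosetta output.

```r
# Hypothetical toy data frame standing in for dat (not the pp15 data)
toy <- data.frame(
  currentEducation_cat = c("Applied", "Theoretical", "Theoretical", "Practical"),
  prevEducation_cat    = c("Practical", "Practical", "Theoretical", "Applied")
)

# Apply table() to each selected column; the result is a named list
# with one frequency table per variable
freqTables <- lapply(toy[c("currentEducation_cat", "prevEducation_cat")], table)

print(freqTables)
```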

### 9.3.2 descr and kableExtra packages

The descr package is used to run the “descriptive statistics” for the variable (in this case, a frequency table). By default, the freq command in the descr package will also create a basic bar graph. The kableExtra package can be combined with many packages to create aesthetically pleasing tables.

kableExtra::kable_styling(
  knitr::kable(as.data.frame(descr::freq(dat$currentEducation_cat)), booktabs=TRUE, digits=2));

In words:

1. From the descr package, use the freq command for the currentEducation_cat variable from the dataset: descr::freq(dat$currentEducation_cat).
2. To produce aesthetically pleasing output, we are going to create a kable and style it with the kableExtra package. Kables are built from data frames, so we first need to turn this frequency output into a data frame: as.data.frame().
3. Now, let’s call the kable function from the knitr package: knitr::kable().
4. And add some stylistic elements: booktabs=TRUE, which in LaTeX (PDF) output draws the table rules with the LaTeX booktabs package, and digits=2, which rounds numbers to 2 decimal places.
5. Lastly, let’s add kable_styling to make the kable aesthetically pleasing: kableExtra::kable_styling().
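The nested call can be hard to read, so here is a sketch of the same five steps written out with intermediate objects. It makes a few assumptions that are not in the command above: a hypothetical toy factor replaces dat$currentEducation_cat, plot = FALSE is passed to suppress the default bar graph, and format = "html" is set explicitly so that kable_styling() receives a format it can style outside of a knitr document. The code only runs if all three packages are installed.

```r
# Hypothetical toy factor standing in for dat$currentEducation_cat
education <- factor(c("Applied", "Theoretical", "Theoretical", "Not studying"))

# Only run the pipeline if all three packages are installed
pkgs <- c("descr", "knitr", "kableExtra")
if (all(vapply(pkgs, requireNamespace, logical(1), quietly = TRUE))) {
  # Step 1: frequency table (plot = FALSE is an assumption: no bar graph)
  freqTable <- descr::freq(education, plot = FALSE)

  # Step 2: convert the frequency output to a data frame
  freqDf <- as.data.frame(freqTable)

  # Steps 3 and 4: build the kable, rounding to 2 decimal places
  # (format = "html" is an assumption made for this sketch)
  kbl <- knitr::kable(freqDf, format = "html", digits = 2)

  # Step 5: apply the kableExtra styling; print(styled) would show the result
  styled <- kableExtra::kable_styling(kbl)
}
```

Splitting the call this way makes it possible to inspect freqDf before styling, which can help when the table does not look as expected.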

## 9.4 Input: SPSS GUI

First activate the dat dataset (see 2.3.1).

## 9.5 Input: SPSS Syntax

Use the following command (this requires the dat dataset to be the active dataset, see 2.3.1):

FREQ VARIABLES=currentEducation_cat.

To also order a barchart, use:

FREQ VARIABLES=currentEducation_cat
/BARCHART FREQ.

To order frequencies for multiple variables simultaneously, use:

FREQ VARIABLES=currentEducation_cat prevEducation_cat.

## 9.7 Output: R

### 9.7.1 rosetta package

##                    Frequencies Perc.Total Perc.Valid Cumulative
## Applied                    108       13.0       13.0       13.0
## Declined to answer           7        0.8        0.8       13.9
## Not studying               166       20.0       20.0       33.9
## Practical                   28        3.4        3.4       37.3
## Theoretical                520       62.7       62.7      100.0
## Total valid                829      100.0      100.0
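To see how the percentage and cumulative columns relate to the frequencies, the sketch below recomputes them in base R from the counts shown in the output. The object names are chosen here for illustration. In this output the Perc.Total and Perc.Valid columns are identical, which is what one would expect when no values are missing, so only one percentage column is recomputed.

```r
# Counts copied from the frequency table above
counts <- c(Applied = 108, `Declined to answer` = 7, `Not studying` = 166,
            Practical = 28, Theoretical = 520)

# Total number of valid observations (the "Total valid" row)
total <- sum(counts)

# Percentage of valid observations in each category
percValid <- 100 * counts / total

# Cumulative percentages: accumulate first, round only for display
cumulative <- cumsum(percValid)

result <- data.frame(
  Frequencies = counts,
  Perc.Valid  = round(percValid, 1),
  Cumulative  = round(cumulative, 1)
)
print(result)
```

Accumulating the unrounded percentages explains why the second cumulative value is 13.9 rather than 13.0 + 0.8 = 13.8.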

## 9.9 Read more

If you would like more background on this topic, you can read more in these sources:

### References

Navarro, Danielle. 2018. Learning Statistics with R. 0.6 ed. New South Wales, Australia. https://learningstatisticswithr.com/.