Chapter 5 Aggregating data: sum
A common task is aggregating multiple variables (columns in a dataset) into one new variable (column). For example, you may want to compute the sum of the items of a questionnaire.
Note that when creating new variable names, it is important to follow the conventions for variable names (see the section on file and variable name conventions).
5.1.1 Example dataset
This example uses the Rosetta Stats example dataset “pp15” (see Chapter 1 for information about the datasets and Chapter 3 for an explanation of how to load datasets).
5.2 Input: jamovi
In the “Data” tab, click the “Compute” button as shown in Figure 5.1.
Type the new variable name in the text field at the top, labelled “COMPUTED VARIABLE”. Then click the function button, marked \(f_x\), and select the SUM function from the box labelled “Functions”. In the box labelled “Variables”, double-click each variable you want to include in the sum, typing a comma between consecutive variable names, as shown in Figure 5.2.
Alternatively, you can type the function name and list of variables directly without using the function (\(f_x\)) dialog as shown in Figure 5.3.
5.3 Input: R
In R, there are roughly three approaches. Many analyses can be done with base R without installing additional packages. The rosetta package accompanies this book and aims to provide output similar to jamovi and SPSS with simple commands. Finally, the tidyverse is a popular collection of packages that try to work together consistently but implement a different underlying logic than base R (and so, the
5.3.1 R: base R
# Sum the five general attitude items within each row (one sum per participant)
dat$highdose_attitude <-
  rowSums(
    dat[ , c('highDose_AttGeneral_good',
             'highDose_AttGeneral_prettig',
             'highDose_AttGeneral_slim',
             'highDose_AttGeneral_gezond',
             'highDose_AttGeneral_spannend')]
  );
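To see what such a row-wise sum does, and how it behaves when an item is missing, here is a minimal sketch on made-up data. The data frame `toy` and its columns `item1` to `item3` are hypothetical and not part of the pp15 dataset; only the base R function `rowSums()` and its `na.rm` argument are used.

```r
# Hypothetical toy data: three questionnaire items for four participants
toy <- data.frame(
  item1 = c(1, 2, 3, 4),
  item2 = c(2, 2, 5, NA),
  item3 = c(3, 1, 4, 2)
);

# Names of the columns to aggregate (assumed names, for illustration only)
items <- c('item1', 'item2', 'item3');

# rowSums() adds the selected columns within each row,
# yielding one sum per participant
toy$item_sum <- rowSums(toy[, items]);

# By default, one missing item makes the whole sum missing;
# na.rm = TRUE sums only the items that are available
toy$item_sum_available <- rowSums(toy[, items], na.rm = TRUE);

print(toy);
```

In this toy data, the fourth participant has a missing value on `item2`, so `item_sum` is missing for that row, while `item_sum_available` adds only the two observed items. Which behaviour is appropriate depends on how missing responses should be treated in your questionnaire.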
5.4 Input: SPSS
For SPSS, there are two approaches: using the Graphical User Interface (GUI) or specifying an analysis script, which in SPSS is called “syntax”.
5.4.1 SPSS: GUI
First activate the dat dataset (see 2.4.1).