Chapter 4 Aggregating data: mean
4.1 Intro
A common task is aggregating multiple variables (columns in a dataset) into one new variable (column). For example, you may want to compute the average score on the items of a questionnaire.
Note that when creating new variable names, it is important to follow the conventions for variable names (see the section on file and variable name conventions).
4.1.1 Example dataset
This example uses the Rosetta Stats example dataset “pp15” (see Chapter 1 for information about the datasets and Chapter 3 for an explanation of how to load datasets).
4.1.2 Variable(s)
From this dataset, this example uses the variables highDose_AttGeneral_good, highDose_AttGeneral_prettig, highDose_AttGeneral_slim, highDose_AttGeneral_gezond, and highDose_AttGeneral_spannend.
We will aggregate these into the variable highDose_attitude (note that this variable already exists in the dataset, and that the existing variable is also the mean of those five variables).
4.2 Input: jamovi
In the “Data” tab, click the “Compute” button as shown in Figure 4.1.

Figure 4.1: Aggregating in jamovi: opening Compute menu
Type the new variable name in the text field at the top, labelled “COMPUTED VARIABLE”. Then click the function button, marked \(f_x\), select the MEAN function from the box labelled “Functions”, and double-click each variable you want to include in the mean in the box labelled “Variables”, typing a comma between the variable names, as shown in Figure 4.2.

Figure 4.2: Aggregating in jamovi: using the function menu to specify a computation
Alternatively, you can type the function name and the list of variables directly, without using the function (\(f_x\)) dialog, as shown in Figure 4.3.

Figure 4.3: Aggregating in jamovi: directly typing in a computation
If you want to allow missing values, you can specify the ignore_missing=1 argument. In that case, you would type:
MEAN(highDose_AttGeneral_good, highDose_AttGeneral_prettig,
highDose_AttGeneral_slim, highDose_AttGeneral_gezond,
highDose_AttGeneral_spannend, ignore_missing=1)
It is not yet possible to specify the number of valid values that is required; either no missing values are allowed at all, or any number of missing values is accepted.
4.3 Input: R
In R, there are roughly three approaches. Many analyses can be done with base R without installing additional packages. The rosetta package accompanies this book and aims to provide output similar to jamovi and SPSS with simple commands. Finally, the tidyverse is a popular collection of packages that try to work together consistently but implement a different underlying logic than base R (and so, than the rosetta package).
4.3.1 R: base R
### Select the five item columns and store their row-wise mean
dat$highdose_attitude <-
  rowMeans(
    dat[, c(
      'highDose_AttGeneral_good',
      'highDose_AttGeneral_prettig',
      'highDose_AttGeneral_slim',
      'highDose_AttGeneral_gezond',
      'highDose_AttGeneral_spannend'
    )]
  );
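By default, rowMeans returns a missing value (NA) for every row in which at least one of the selected variables is missing, whereas passing na.rm = TRUE averages over whatever values are present; these two behaviours correspond to the two options jamovi offers. The following minimal sketch illustrates the difference. It uses a small data frame called toy, with made-up values, instead of the “pp15” dataset; the names toy, items, strictMean and lenientMean are only used for this illustration.

```r
### Made-up values for three participants (not taken from "pp15")
toy <- data.frame(
  highDose_AttGeneral_good     = c(4, 5, NA),
  highDose_AttGeneral_prettig  = c(3, NA, NA),
  highDose_AttGeneral_slim     = c(2, 4, NA),
  highDose_AttGeneral_gezond   = c(1, 2, 2),
  highDose_AttGeneral_spannend = c(5, 5, 4)
);

items <- c(
  'highDose_AttGeneral_good',
  'highDose_AttGeneral_prettig',
  'highDose_AttGeneral_slim',
  'highDose_AttGeneral_gezond',
  'highDose_AttGeneral_spannend'
);

### Default: any missing value in a row yields NA for that row
strictMean <- rowMeans(toy[, items]);

### na.rm = TRUE: average the values that are present in each row
lenientMean <- rowMeans(toy[, items], na.rm = TRUE);

print(strictMean);   ### 3 NA NA
print(lenientMean);  ### 3 4 3
```

Note that with na.rm = TRUE, the third participant receives a mean that is based on only two of the five items.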
4.3.2 R: Rosetta
dat$highdose_attitude <-
  rosetta::means(
    data = dat,
    'highDose_AttGeneral_good',
    'highDose_AttGeneral_prettig',
    'highDose_AttGeneral_slim',
    'highDose_AttGeneral_gezond',
    'highDose_AttGeneral_spannend'
  );
To indicate that a certain number of values must be valid (i.e. “non-missing”), the argument requiredValidValues can be passed. For example, to require four valid values (instead of requiring only one valid value, the default), use:
dat$highdose_attitude <-
  rosetta::means(
    data = dat,
    'highDose_AttGeneral_good',
    'highDose_AttGeneral_prettig',
    'highDose_AttGeneral_slim',
    'highDose_AttGeneral_gezond',
    'highDose_AttGeneral_spannend',
    requiredValidValues = 4
  );
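To make clear what requiring a minimum number of valid values means, the same rule can be sketched in base R: compute the mean over the values that are present, and then set the result to missing for every row with too few valid values. This is only an illustration of the logic, not the way the rosetta package implements it. The helper meanWithMinValid and the data frame toy (with made-up values) are hypothetical names introduced for this example.

```r
### Hypothetical helper: row-wise mean that requires at least
### minValid non-missing values per row
meanWithMinValid <- function(data, items, minValid) {
  values <- data[, items, drop = FALSE];
  nValid <- rowSums(!is.na(values));          ### valid values per row
  means  <- rowMeans(values, na.rm = TRUE);   ### mean of the valid values
  means[nValid < minValid] <- NA;             ### too few valid values
  return(means);
}

### Made-up values for three participants (not taken from "pp15")
toy <- data.frame(
  highDose_AttGeneral_good     = c(4, 5, NA),
  highDose_AttGeneral_prettig  = c(3, NA, NA),
  highDose_AttGeneral_slim     = c(2, 4, NA),
  highDose_AttGeneral_gezond   = c(1, 2, 2),
  highDose_AttGeneral_spannend = c(5, 5, 4)
);

toy$highdose_attitude <-
  meanWithMinValid(toy, names(toy), minValid = 4);

print(toy$highdose_attitude);  ### 3 4 NA
```

The first participant has five valid values and the second has four, so both receive a mean; the third has only two valid values and therefore receives a missing value.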
4.4 Input: SPSS
For SPSS, there are two approaches: using the Graphical User Interface (GUI) or specifying an analysis script, which in SPSS is called “syntax”.
4.4.2 SPSS: Syntax
COMPUTE highdose_attitude =
MEAN(
highDose_AttGeneral_good,
highDose_AttGeneral_prettig,
highDose_AttGeneral_slim,
highDose_AttGeneral_gezond,
highDose_AttGeneral_spannend
).
To indicate that a certain number of values must be valid (i.e. “non-missing”), the function name MEAN can be followed by a period and the number of required valid values. For example, to require four valid values, use:
COMPUTE highdose_attitude =
MEAN.4(
highDose_AttGeneral_good,
highDose_AttGeneral_prettig,
highDose_AttGeneral_slim,
highDose_AttGeneral_gezond,
highDose_AttGeneral_spannend
).