# Chapter 4 Aggregating data: mean

## 4.1 Intro

A common task is aggregating multiple variables (columns in a dataset) into one new variable (column). For example, you may want to compute the average score on the items of a questionnaire.

Note that when creating new variables, it is important to follow the conventions for variable names (see the section on file and variable name conventions).

### 4.1.1 Example dataset

This example uses the Rosetta Stats example dataset “pp15” (see Chapter 1 for information about the datasets and Chapter 3 for an explanation of how to load datasets).

### 4.1.2 Variable(s)

From this dataset, this example uses variables highDose_AttGeneral_good, highDose_AttGeneral_prettig, highDose_AttGeneral_slim, highDose_AttGeneral_gezond & highDose_AttGeneral_spannend.

We will aggregate these into the variable highDose_attitude (note that this variable already exists in the dataset, and that existing variable is also the mean of those five variables).

## 4.2 Input: jamovi

In the “Data” tab, click the “Compute” button as shown in Figure 4.1.

Type the new variable name in the text field at the top, labelled “COMPUTED VARIABLE”. Then click the function button, marked $f_x$, and select the MEAN function from the box labelled “Functions”. In the box labelled “Variables”, double-click each variable you want to include in the mean, typing a comma between variable names as shown in Figure 4.1.

Alternatively, you can type the function name and list of variables directly without using the function ($f_x$) dialog as shown in Figure 4.2.

If you want the mean to be computed even when some of the values are missing, you can specify the ignore_missing=1 argument. In that case, you would type:

MEAN(highDose_AttGeneral_good, highDose_AttGeneral_prettig,
highDose_AttGeneral_slim, highDose_AttGeneral_gezond,
highDose_AttGeneral_spannend, ignore_missing=1)

It is not yet possible to specify the number of valid values that is required: either no missing values are allowed at all, or any number of missing values is accepted.

## 4.3 Input: R

In R, there are roughly three approaches. Many analyses can be done with base R without installing additional packages. The rosetta package accompanies this book and aims to provide output similar to jamovi and SPSS with simple commands. Finally, the tidyverse is a popular collection of packages that try to work together consistently but implement a different underlying logic than base R (and therefore than the rosetta package).

### 4.3.1 R: base R

dat$highdose_attitude <-
rowMeans(
dat[,
c(
'highDose_AttGeneral_good',
'highDose_AttGeneral_prettig',
'highDose_AttGeneral_slim',
'highDose_AttGeneral_gezond',
'highDose_AttGeneral_spannend'
)
]
);

### 4.3.2 R: Rosetta

dat$highdose_attitude <-
rosetta::means(
data = dat,
'highDose_AttGeneral_good',
'highDose_AttGeneral_prettig',
'highDose_AttGeneral_slim',
'highDose_AttGeneral_gezond',
'highDose_AttGeneral_spannend'
);

To indicate that a certain number of values must be valid (i.e. “non-missing”), the argument requiredValidValues can be passed. For example, to require four valid values (instead of requiring only one valid value, the default), use:

dat$highdose_attitude <-
rosetta::means(
data = dat,
'highDose_AttGeneral_good',
'highDose_AttGeneral_prettig',
'highDose_AttGeneral_slim',
'highDose_AttGeneral_gezond',
'highDose_AttGeneral_spannend',
requiredValidValues = 4
);
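For comparison, a minimum number of valid values can also be enforced with base R alone. The following is a minimal sketch, not the implementation used by the rosetta package: the toy data values are made up, and the helper names `items`, `nValid` and `rowMean` are hypothetical and chosen only for this illustration.

```r
# Names of the five items to aggregate.
items <- c(
  'highDose_AttGeneral_good',
  'highDose_AttGeneral_prettig',
  'highDose_AttGeneral_slim',
  'highDose_AttGeneral_gezond',
  'highDose_AttGeneral_spannend'
);

# Toy data standing in for the real dataset (values are made up).
dat <- data.frame(
  highDose_AttGeneral_good     = c(1, 2, NA, NA),
  highDose_AttGeneral_prettig  = c(2, 3, 4, NA),
  highDose_AttGeneral_slim     = c(3, NA, 4, NA),
  highDose_AttGeneral_gezond   = c(4, 5, NA, 2),
  highDose_AttGeneral_spannend = c(5, 6, 4, NA)
);

# Count the non-missing item scores in each row.
nValid <- rowSums(!is.na(dat[, items]));

# Mean of the available item scores, ignoring missing values.
rowMean <- rowMeans(dat[, items], na.rm = TRUE);

# Keep the mean only for rows with at least four valid values;
# all other rows are set to missing.
dat$highdose_attitude <- ifelse(nValid >= 4, rowMean, NA_real_);
```

In this toy data, the first two rows have five and four valid values and receive means of 3 and 4; the last two rows have fewer than four valid values and are set to NA. Note that `rowMeans()` without `na.rm = TRUE` returns NA for every row that contains at least one missing value.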

## 4.4 Input: SPSS

For SPSS, there are two approaches: using the Graphical User Interface (GUI) or specifying an analysis script, which in SPSS is called “syntax”.

### 4.4.1 SPSS: GUI

First activate the dat dataset (see 2.4.1).

### 4.4.2 SPSS: Syntax

COMPUTE highdose_attitude =
MEAN(
highDose_AttGeneral_good,
highDose_AttGeneral_prettig,
highDose_AttGeneral_slim,
highDose_AttGeneral_gezond,
highDose_AttGeneral_spannend
).

To indicate that a certain number of values must be valid (i.e. “non-missing”), the function name MEAN can be followed by a period and the number of required valid values. For example, to require four valid values, use:

COMPUTE highdose_attitude =
MEAN.4(
highDose_AttGeneral_good,
highDose_AttGeneral_prettig,
highDose_AttGeneral_slim,
highDose_AttGeneral_gezond,
highDose_AttGeneral_spannend
).

## 4.5 Output

Aggregating variables is not an analysis, and as such, does not produce output. You can inspect the newly created variable to ensure it has been created properly.
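One way to inspect the new variable in R is sketched below. It uses made-up toy data in place of the real dataset, and the name `matchesExisting` is hypothetical. Because R is case-sensitive, `highdose_attitude` is a new column, distinct from the existing `highDose_attitude`, so the two can be compared directly.

```r
# Names of the five items that were aggregated.
items <- c(
  'highDose_AttGeneral_good',
  'highDose_AttGeneral_prettig',
  'highDose_AttGeneral_slim',
  'highDose_AttGeneral_gezond',
  'highDose_AttGeneral_spannend'
);

# Toy data (values are made up); highDose_attitude stands in for
# the mean that already exists in the dataset.
dat <- data.frame(
  highDose_AttGeneral_good     = c(1, 4, 7),
  highDose_AttGeneral_prettig  = c(2, 4, 6),
  highDose_AttGeneral_slim     = c(3, 4, 5),
  highDose_AttGeneral_gezond   = c(4, 4, 4),
  highDose_AttGeneral_spannend = c(5, 4, 1),
  highDose_attitude            = c(3, 4, 4.6)
);

# Create the new variable as in the base R approach.
dat$highdose_attitude <- rowMeans(dat[, items]);

# The minimum and maximum should fall within the range of the item scale.
summary(dat$highdose_attitude);

# View the first rows of the items next to the new variable.
head(dat[, c(items, 'highdose_attitude')]);

# Check the new variable against the existing one
# (all.equal allows for tiny rounding differences).
matchesExisting <- all.equal(
  dat$highdose_attitude,
  dat$highDose_attitude,
  check.attributes = FALSE
);
```

If the new variable was created properly, `matchesExisting` is `TRUE`; otherwise it contains a description of the differences.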