Chapter 4 Cognitive interviews

Cognitive interviews are a method to study cognitive validity. The ROCK supports coding of notes or transcripts from cognitive interviews. This chapter introduces this functionality, and also aims to provide some guidance when conducting cognitive interviews.

4.1 Cognitive validity

Cognitive validity refers to whether participants interpret the procedures and stimuli of a measurement instrument or manipulation as intended.

Examples of manipulations are elements of behavior change interventions (e.g., the application of a behavior change principle such as modeling or persuasive communication), ingredients of psychotherapy (e.g., an approach to help people with reattribution, or protocol elements designed to foster trust in the therapist-client relationship), or manipulations as used in psychological experimental studies (e.g., stimuli such as sound fragments that should induce a feeling of stress, a procedure designed to temporarily increase participants’ self-esteem, or a set-up designed to produce the smell of apple pie).

Examples of measurement instruments are questionnaires (e.g., an experiential attitude scale designed to measure people’s feelings towards hand washing, a survey aiming to map out people’s perceptions of traffic safety, or personality indices) or response latency tasks (e.g., the Implicit Association Test).

Both manipulations and measurement instruments consist of procedures and stimuli, and measurement instruments also specify how to register participants’ responses. For the manipulations and measurement instruments to be valid for a given population in a given context, usually a first requirement is that these constituent procedures, stimuli, and response registrations are interpreted as intended. Cognitive interviews are a method to study whether this is the case.

4.2 Response models

When thinking about participants’ interpretation of manipulations or measurement instruments, the concept of response models can be helpful. We define a response model here as the intended process leading to the registration of a participant’s response, starting with their exposure to the stimuli and procedures that, together with the response registration procedure, constitute an item. An item is a distinct element of a measurement instrument or manipulation: these typically (but not necessarily) consist of multiple items. Although the logic underlying response models, as a means to specify what happens between somebody perceiving stimuli and producing a response, applies to manipulations as well as to measurement instruments, in the remainder of this chapter we’ll discuss measurement instruments, as they explicitly contain a procedure for registering responses.

A response model describes how the construct that an item is designed to assess causes the variation in the item’s scores (if the item performs as it’s supposed to, i.e., if the item is valid; see Borsboom, Mellenbergh, and van Heerden (2004) and Borsboom et al. (2009) for more background and for how this approach contrasts with Kane’s (2013) argument-based approach). It describes how more fundamental psychological processes are invoked: for example, how the relevant reflective and/or reflexive, cognitive and/or affective, deliberate and/or automatic constructs, such as mechanisms, processes, or representations, ultimately produce the response that the item registers.

Note that the type of response model can differ as a function of one’s ontological and epistemological perspective. From more constructivist perspectives, the response model may involve shared construction of meaning; perspectives tending towards realism might lean more heavily on attention or memory processes; and if one entertains an operationalist perspective, response models might be exceedingly rudimentary (though admittedly, researchers with that perspective would probably not engage in cognitive interviews in the first place).

4.3 Response processes

Participants’ response processes are descriptions of what happens as they perceive, interpret, and process an item and ultimately produce the response that is registered by the item’s response registration procedure. Ideally, these response processes closely reflect the item’s response model.

The form of a response process is typically quite different from the form of the response model. The former is often derived from participants’ verbal descriptions that express the results of introspective efforts, whereas the latter is often a description of the involved theoretical constructs and mechanisms (see the previous section). For example, the latter can contain automatic or unconscious processes, which would, almost literally by definition, be unavailable to introspection. Therefore, complete overlap between response processes and the desired response models may be impossible.

4.4 Selecting response model parts

Not all parts of the response models for all items can therefore be verified using cognitive interviews; other methods may have to be invoked, such as experiments in which items are manipulated to verify parts of an item’s response model. This means you first have to decide how each part of the response model can be verified, and then, for the cognitive interviews, select those parts of each item’s response model that do lend themselves to verification with cognitive interviews.

Once you have made this selection, you have a list of response model parts for each item. Often, the response models for a set of items that belong to the same measurement instrument will overlap. To illustrate this, we will give two examples of common situations. First, sometimes you assume that two items are so-called “parallel items.” These are items that you assume measure the exact same thing in the exact same way, assuming a reflective measurement model. Such items are, for all practical purposes, interchangeable, and in such cases, their response models will be identical.

Second, sometimes you have two items that measure very different things. For example, one item may be designed to measure somebody’s income, and a second item to measure somebody’s education. These can both be part of a measurement instrument for socio-economic status that assumes a formative measurement model. The response models for these two items will be quite different.

Once you have made these decisions, you should have the following:

  • for each item, a response model;
  • for each part of each response model, a decision as to whether you think it’s feasible to study whether those parts of people’s response processes are consistent with the corresponding parts of the item’s response model;
  • for each part of each response model, in which other response models it occurs.
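To keep track of these three pieces of information, it can help to store them in a simple data structure. A minimal sketch in R (the item names, part names, and structure are invented for illustration; this is not a ROCK data structure):

```r
### For each item: its response model parts, and whether each part
### can feasibly be verified in a cognitive interview.
responseModels <- list(
  item_windows = list(
    parts = c("visualise_front_wall", "count_windows", "report_number"),
    verifiableByInterview = c(TRUE, TRUE, FALSE)
  ),
  item_income = list(
    parts = c("recall_income", "report_number"),
    verifiableByInterview = c(TRUE, FALSE)
  )
);

### Parts that occur in both response models:
intersect(responseModels$item_windows$parts,
          responseModels$item_income$parts);
```

Keeping the shared parts explicit like this makes it easy to see which parts only need one entry in your coding scheme later on.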

You need one more piece of information before you can move towards specification of the coding schemes and the preparation of prompts: the response process spectrum.

4.5 The response process spectrum

For every part of each response model you want to verify, specify the corresponding response process spectrum. This is a list of all possible alternatives to the response model, as far as you can think of them. For example, if your response model contains a step “the person visualises the front wall of their house,” for example in an item measuring the number of windows in somebody’s house, a potential response process spectrum could be:

  • the person visualises the front wall of their house (i.e. the response model)
  • the person visualises the back wall of their house
  • the person visualises a side wall of their house
  • the person does not visualise anything

During the actual cognitive interview, you will likely discover that people’s response processes deviate from the response model in ways you couldn’t imagine beforehand, and that’s ok. The main purpose of this step is to help you get an idea of the kinds of things you’ll want to spot in each part of each response model.

Once you have produced the response process spectrum for all response model parts of all (unique) response models, you can start compiling your coding scheme.
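To make this concrete, the response process spectrum from the window example can be written down as a named vector, where the names are draft code identifiers (the identifiers and the exact wording are our own invention):

```r
### Response process spectrum for the "number of windows" item,
### with invented draft code identifiers as names.
windowSpectrum <- c(
  visualise_front  = "visualises the front wall of their house",
  visualise_back   = "visualises the back wall of their house",
  visualise_side   = "visualises a side wall of their house",
  no_visualisation = "does not visualise anything"
);

### The names double as a first draft of the coding scheme:
names(windowSpectrum);
```

Writing the spectrum down in this form means the step to a coding scheme is small: the names are already valid code identifiers.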

4.6 Coding schemes

Based on the response process spectrums for each part of each response model, you can now produce the codes that you will use to code the notes (or perhaps transcripts) of your cognitive interviews. These codes will be the “glasses” through which you will see your results: although you can always add new codes during the coding phase, in general, it is easy to miss things for which you did not prepare a code.

For each part of each response model, think of a brief code that tells you how people’s response processes look. The coding scheme can be hierarchical: you can have “sub-codes” or “child codes” to organize your codes. For every code, designate a code identifier: a unique string of characters consisting only of lower case letters (a-z), upper case letters (A-Z), digits (0-9), and underscores (_), and always starting with a letter. If you have hierarchical codes, you can indicate the hierarchy using a greater-than sign (>).

Examples of valid codes are:

  • memory>recall
  • attention
  • forgot>early_childhood
  • forgot>late_childhood
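These rules can be expressed as a regular expression. A small sketch in base R (this checker is our own illustration, not a ROCK function) to verify candidate code identifiers:

```r
### One identifier: a letter followed by letters, digits, or underscores.
idRegex <- "[A-Za-z][A-Za-z0-9_]*";

### A full (possibly hierarchical) code: identifiers separated by '>'.
codeRegex <- paste0("^", idRegex, "(>", idRegex, ")*$");

isValidCode <- function(x) {
  grepl(codeRegex, x);
}

isValidCode("memory>recall");          ### TRUE
isValidCode("forgot>early_childhood"); ### TRUE
isValidCode("2nd_attempt");            ### FALSE: starts with a digit
```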

Finally, once you have your coding scheme, you can craft your prompts.

4.7 Preparing prompts

Prompts are specific things you can ask participants during the cognitive interview designed to elicit expressions that are informative as to specific parts of their response processes. Designing these is relatively straightforward once you have your coding scheme: you have to think about which questions you can ask that are likely to lead to answers that you can then code with specific codes pertaining to specific parts of the response process.

Of course, if you manage to formulate prompts that can cover multiple parts of participants’ response processes, that’s more efficient. Therefore, asking open-ended questions (e.g., “why did you provide that answer?”) is a popular approach. However, sometimes closed questions can be very efficient to quickly check whether people did a given thing (e.g., “to arrive at this estimate, did you visualise the front wall of your house?”).

The product of this step will be a list of prompts, sorted in the order in which the items will be presented to participants. You may want to assign unique identifiers to the prompts (e.g., numbers, letters, or a combination of these) to structure the notes you take during the cognitive interview, or even enter your notes into a file that already contains the prompts.
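For example, such a notes file could look like the fragment below. The double square brackets are how codes are appended to utterances in ROCK-formatted sources; the prompt identifiers (“P1,” “P2”) and the note text itself are invented for illustration:

```
P1: To arrive at this estimate, did you visualise the front wall of your house?
  Said they pictured the back of the house instead, where their desk is. [[visualise_back]]

P2: Why did you provide that answer?
  Counted the windows room by room from memory. [[memory>recall]]
```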

4.8 Common cognitive interview coding schemes

A number of commonly used cognitive interview coding schemes exist (some are listed below). Using these has advantages and drawbacks. A salient advantage is that if you use an existing coding scheme, you don’t have to map out the response models and response process spectrums for each item. This saves you a lot of time, and potentially frustration and uncertainty if you don’t know much about the relevant response models. Another advantage is that using an existing coding scheme facilitates comparison of item performance (and so, measurement instrument performance) across different cognitive interview studies.

A big disadvantage is that if you use an existing coding scheme, you don’t have to map out the response models and response process spectrums for each item. Those exercises force you to think long and hard about the assumptions underlying each item’s validity (and so, the validity of your measurement instrument), and skipping them makes it more likely that you miss problems. A second disadvantage is that it is harder to see how to improve the items, since the results of your cognitive interview will be very generic; they will not point to specific parts of the response models.

Whether you use existing coding schemes, develop your own, use existing schemes with more detailed codes added relating to your specific response models, or even have multiple coders code using different coding schemes is ultimately a subjective, scientific, and pragmatic consideration. As with all decisions you take in any scientific endeavour, the most important thing is to clearly, comprehensively, and transparently document your decision and the underlying justification.

4.8.1 Existing coding schemes

allSchemes <- rock::codingSchemes_get_all();

for (i in names(allSchemes)) {
  cat("\n\n**", allSchemes[[i]]$label, "**\n\n",
      "\n`", names(allSchemes[[i]]$codingInstructions),
      "`\n  : ", allSchemes[[i]]$codingInstructions, "\n");
}
Peterson, Peterson & Powell

  • Is the item wording, terminology, and structure clear and easy to understand?
  • Has the respondent ever formed an attitude about the topic? Does the respondent have the necessary knowledge to answer the question? Are the mental calculations or long-term memory retrieval requirements too great?
  • Is the question too sensitive to yield an honest response? Is the question relevant to the respondent? Is the answer likely to be a constant?
  • Is the desired response available and/or accurately reflected in the response options? Are the response options clear?
  • Do all of the items combined adequately represent the construct? Are there items that do not belong?

Levine, Fowler & Brown

  • Items with unclear or ambiguous terms, or that respondents failed to understand consistently.
  • Items for which respondents lacked the information needed to answer the question.
  • Items measuring constructs that are inapplicable to many respondents (e.g., items that made assumptions).
  • Items that failed to measure the intended construct.
  • Items making discriminations that are too subtle for many respondents.
  • Several other general issues associated with the development of a questionnaire.

Willis, 1999

  • Problems with intent or meaning of a question.
  • Likely not to know or have trouble remembering information.
  • Problems with assumptions or underlying logic.
  • Problems with the response categories.
  • Sensitive nature or wording/bias.
  • Problems with introductions, instructions, or explanations.
  • Problems with lay-out or formatting.

See Peterson, Peterson, and Powell (2017) and Woolley, Bowen, and Bowen (2006).


Borsboom, Denny, Angélique O. J. Cramer, Rogier A. Kievit, Annemarie Zand Scholten, and Sanja Franić. 2009. “The End of Construct Validity.” In The Concept of Validity: Revisions, New Directions, and Applications, 135–70. IAP Information Age Publishing.
Borsboom, Denny, Gideon J. Mellenbergh, and Jaap van Heerden. 2004. “The Concept of Validity.” Psychological Review 111 (4): 1061–71.
Kane, Michael. 2013. “The Argument-Based Approach to Validation.” Edited by Matthew Burns. School Psychology Review 42 (4): 448–57.
Peterson, Christina Hamme, N. Andrew Peterson, and Kristen Gilmore Powell. 2017. “Cognitive Interviewing for Item Development: Validity Evidence Based on Content and Response Processes.” Measurement and Evaluation in Counseling and Development 50 (4): 217–23.
Woolley, Michael E., Gary L. Bowen, and Natasha K. Bowen. 2006. “The Development and Evaluation of Procedures to Assess Child Self-Report Item Validity.” Educational and Psychological Measurement 66 (4): 687–700.