Chapter 12 The rock R package
The rock R package implements the ROCK standard for qualitative data analysis. It is an extension to R, which started out as a statistical programming language. R is not only open source, but also has a flexible infrastructure allowing easy extension with user-contributed packages. As a result, R is quickly becoming a multipurpose scientific toolkit, and one of its tools is the rock package.
When using R, most people use RStudio, a so-called integrated development environment. It has many features that make using R much more user-friendly and efficient. In this book, where we refer to using R, we actually mean using R through RStudio. Both R and RStudio are Free/Libre Open Source Software (FLOSS) solutions. This means that they are free to download and install in perpetuity.
12.1 Downloading and installing R and RStudio
Because RStudio makes using R considerably more user-friendly (and prettier), we will always use R through RStudio in this book.
R can be downloaded from https://cloud.r-project.org/: click the “Download R for …” link that matches your operating system, and follow the instructions to download the right version. You don’t have to start R - it just needs to be installed on your system. RStudio will normally find it on its own.
RStudio can be downloaded from https://www.rstudio.com/products/rstudio/download/. Once it is installed, you can start it, in which case you should see something similar to what is shown in Figure 12.1.
R itself lives in the bottom-left pane, the console. Here, you can interact directly with R. You can open R scripts in the top-left pane: these are text files with the commands you want R to execute. The top-right pane contains the Environment tab, which shows all loaded datasets and variables; the History tab, which shows the commands you used; and the Connections and Build tabs, which you will not need. The bottom-right pane contains a Files tab, showing files on your computer; a Plots tab, which shows plots you created; a Packages tab, which shows the packages you have installed; a Help tab, which shows help pages about specific functions; and a Viewer tab, which can show HTML content that was generated in R.
12.2 Downloading and installing the rock package
The rock package can be installed by going to the console (the bottom-left pane) and typing:
install.packages("rock");
This will connect to the Comprehensive R Archive Network (CRAN) and download and install the rock package. If you feel adventurous, you can instead install one of two more recent versions: the most current production version or the development (‘dev’) version. The most current production version will generally be as stable as the versions on CRAN, and will contain more features. This version will contain all features discussed in this book. The dev version contains work on new features, which also means that it may contain bugs.
To conveniently install the most recent production and dev versions, you can use another package called remotes. You can install it using this command:
install.packages("remotes");
Then, to install the most up-to-date production version, use:
remotes::install_gitlab("r-packages/rock");
And to install the current dev version, use:
remotes::install_gitlab("r-packages/rock@dev");
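After installing, it can be useful to confirm which version ended up in your R library. The following is a minimal sketch using only base R functions (requireNamespace and utils::packageVersion); the variable name rockInstalled is just an illustration. It prints a message rather than failing if the rock package is not installed yet.

```r
# Check whether the rock package is installed and, if so, report its version.
rockInstalled <- requireNamespace("rock", quietly = TRUE);

if (rockInstalled) {
  message("rock version: ", as.character(utils::packageVersion("rock")));
} else {
  message("rock is not installed yet.");
}
```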
More information about the rock package can be found at its so-called pkgdown website, which is located at http://r-packages.gitlab.io/rock.
12.3 rock functions
12.3.1 clean_source and clean_sources
Sometimes, sources are a bit messy. In such cases, it can be efficient to preprocess them and perform some search and replace actions. This can be done for one or multiple source files using clean_source (for one file) and clean_sources (for multiple files; it basically just calls clean_source for each file).
For example, a researcher will often want every sentence, as transcribed, to be on its own line (as lines correspond to utterances). In fact, this is the basic function of clean_source: by default, if used without other arguments, it tries to (more or less smartly) split a transcript such that each transcribed sentence (as marked by a period (.), a question mark (?), an exclamation mark (!), or an ellipsis (…)) ends up on its own line. Before doing this, clean_source replaces all occurrences of exactly two consecutive periods (..) with one period, all occurrences of four or more consecutive periods with three periods, and all occurrences of three or more newlines (\n) with two newlines.
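To make these default replacements concrete, here is a rough illustration in base R. It only mimics the steps described above with gsub and is not how the rock package implements them internally; the example text, the variable names, and the regular expressions are all assumptions made for this sketch, and the ellipsis is treated as three consecutive periods.

```r
# Illustration only: imitates the described replacements with base R.
messy <- "I was late.. Really late..... Why?\n\n\n\nNo idea!";

# Exactly two consecutive periods become one period
tidied <- gsub("(?<!\\.)\\.\\.(?!\\.)", ".", messy, perl = TRUE);
# Four or more consecutive periods become three periods
tidied <- gsub("\\.{4,}", "...", tidied);
# Three or more newlines become two newlines
tidied <- gsub("\n{3,}", "\n\n", tidied);
# A sentence ending (three periods, or one of . ? !) followed by spaces
# is followed by a newline instead, so each sentence gets its own line
tidied <- gsub("(\\.{3}|[.?!]) +", "\\1\n", tidied, perl = TRUE);

cat(tidied);
```

With this input, each of the four sentences ends up on its own line, and a single empty line remains before the last one.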
But this function can also be used to perform additional (or other) replacements. For example, imagine that a transcriber used a dash at the beginning of a line, followed by a space, to indicate when a person starts talking, like this:
- Something said by one speaker
- Something said by another speaker
To easily group all utterances by the same person together, it would be convenient if this were expressed in the source file in a way that fits with ROCK’s conventions. That sequence of characters (actually a newline character (\n) followed by a dash (-) followed by a whitespace character (\s)) can be converted into the section break ‘---turn-of-talk---’ with this command:
rock::clean_source(
  input = "
- Something said by one speaker
- Something said by another speaker
",
  replacementsPre = list(c("\\n-\\s", "\n---turn-of-talk---\n")));
This will change that bit of transcript into:
---turn-of-talk---
Something said by one speaker
---turn-of-talk---
Something said by another speaker
(You can copy-paste the command above into R and test this, assuming you have the rock package installed. Note that by default, R doesn’t print newline characters as newline characters. To show newline characters as newlines, wrap the command in the cat command.)
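The difference between R's default printing and cat can be seen without the rock package. In this small sketch, the variable transcript is just an example string:

```r
# A character value containing one newline character
transcript <- "---turn-of-talk---\nSomething said by one speaker";

print(transcript);  # shows the newline as the escape sequence \n, on one line
cat(transcript);    # writes the text with an actual line break
```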
To keep the default replacements and add your own, specify the additional ones in the extraReplacements argument instead of replacementsPre (or replacementsPost). For clean_source, the first argument (input) can be either a character vector (like in the example above) or a path to a file, in which case the file's contents will be read. If the second argument (outputFile) is specified, the result is saved to that file; if not, it is returned (and printed by R).
12.3.1.1 A word of caution
If you use this function to clean one or more transcripts, make sure that whenever you edit the outputFile, you save it under another name! Otherwise, rerunning the script to clean the transcripts will overwrite your edits. By default, the rock option “preventOverwriting” is set to TRUE, so a file that already exists on disk is never overwritten. You can change this behavior for one function by specifying preventOverwriting=FALSE as a function argument. You can also change this for all functions by changing the option, with the following command:
rock::opts$set(preventOverwriting=FALSE);
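The idea behind such an overwrite guard can be sketched in base R. The function writeIfAllowed below is a hypothetical helper written for this illustration; it is not part of the rock package and only imitates the behavior described above, where an existing file is left alone unless preventOverwriting is FALSE.

```r
# Hypothetical helper (not part of rock): only write to a file that
# already exists if preventOverwriting is FALSE.
writeIfAllowed <- function(text, outputFile, preventOverwriting = TRUE) {
  if (file.exists(outputFile) && preventOverwriting) {
    message("File '", outputFile, "' already exists; not overwriting it.");
    return(invisible(FALSE));
  }
  writeLines(text, outputFile);
  return(invisible(TRUE));
}

tmpFile <- tempfile(fileext = ".rock");
writeIfAllowed("first version", tmpFile);    # file does not exist yet: written
writeIfAllowed("second version", tmpFile);   # file exists: left untouched
readLines(tmpFile);                          # still contains the first version
```

Calling writeIfAllowed with preventOverwriting = FALSE would replace the file's contents, which is exactly the situation in which manual edits can get lost.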
12.4 rock options and defaults
Although the behavior of the rock functions is mostly controlled by specifying the relevant arguments, the rock package also has many options that you can just specify once and that will then be used by all rock functions. This saves you from repeating the same arguments every time you call the rock functions.
Almost all of these options have default settings. Many of these defaults implement the ROCK standard, while some relate to project-level settings. We’ll first discuss the project-level options, and then the options that implement the ROCK standard.