--- title: "Recurrence Quantification Analysis with crqa — a guided tour of v2.1.0" output: html_vignette: toc: true toc_depth: 3 vignette: > %\VignetteIndexEntry{Recurrence Quantification Analysis with crqa — a guided tour of v2.1.0} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r setup, include = FALSE} knitr::opts_chunk$set( collapse = TRUE, comment = "#>", eval = TRUE ) library(crqa) ``` ## Overview `crqa` (Coco & Dale 2014; Coco et al. 2021) is the primary R package for **recurrence quantification analysis (RQA)** on behavioural, physiological, and linguistic time series. It provides: - Auto-recurrence (`method = "rqa"`), cross-recurrence (`"crqa"`), and multidimensional cross-recurrence (`"mdcrqa"`) with the full set of standard RQA measures. - Diagonal recurrence profiles (`drpfromts`) for leader–follower analysis. - Windowed and piecewise analyses (`wincrqa`, `windowdrp`, `piecewiseRQA`). - Parameter optimisation (`optimizeParam`). - Visualisation (`plot_rp`). ### Performance in context (v2.1.0) Version 2.1.0 replaces the legacy `spdiags`-based pipeline with a **fused** Rcpp inner loop — meaning distance computation, threshold comparison, and line-statistic collection are merged into a single pass that never materialises the N×N distance matrix — and adds OpenMP parallelism on top. Figure 1 plots wall time vs. N on the Rössler-system benchmark of Marwan & Kraemer (2023), the standard reference for RQA package comparisons (m = 3, τ = 6, Euclidean norm, dense ~2 % recurrence). ```{r benchmark-fig, echo = FALSE, out.width = "100%", fig.cap = "Figure 1. Wall time per call for crqa v2.1.0 vs. other RQA implementations on the Rössler-system benchmark (Marwan & Kraemer 2023). Filled markers: measured on this hardware (4-core Ryzen-class CPU). Open markers: reference levels from Marwan & Kraemer 2023 Fig 3A. crqa v2.1.0 uses the fused C++ kernel with OpenMP enabled (default); thread count = all available cores. AccRQA 1.0.1 (Adámek et al. 2026) measured single-threaded — their library also offers OpenMP and CUDA paths not benchmarked here. The aRQA path is a fundamentally different algorithm (phase-space histogram) with O(N) memory and scales to N = 200 000 in seconds."} knitr::include_graphics("figure3_speed_memory.png") ``` The figure summarises crqa's competitive position: - **`crqa v2.0.7 (legacy)`** scales as O(N²) in both time and memory and runs out of memory at N ≈ 12 k on typical hardware. - **`crqa v2.1.0 (fused + OpenMP)`** — the current release — is 5–15× faster than v2.0.7 and comparable to pyunicorn (Cython, Python) on CPU. The fused C++ kernel parallelises the distance-and-threshold loop via OpenMP (default: all available cores). At moderate N the OpenMP gain over serial is ~1.3–1.9× on 4 cores; typical sparser analyses (RR ≲ 0.5 %) approach near-linear scaling with cores. - **`crqa v2.1.0 (method = "aRQA")`** is part of this package and scales to N = 200 000 in seconds with O(N) memory — a regime where all exact O(N²) methods run out of memory. - **AccRQA 1.0.1 (CPU, Adámek et al. 2026)** is a high-performance C/C++ library with R, Python, and C/C++ bindings that uses novel OpenMP and CUDA algorithms. Measured single-threaded on this hardware it is ~3–6× faster than crqa v2.1.0 on this auto-RQA benchmark. It is worth noting that the AccRQA paper (Adámek et al. 2026) benchmarked against crqa ≤ 2.0.7 (the legacy O(N²) path); crqa v2.1.0's fused kernel substantially narrows that gap. AccRQA does not implement cross-recurrence (CRQA), multidimensional (mdcrqa), windowed analysis, categorical data, Theiler windows, parameter optimisation, or visualisation, which remain exclusive to crqa. The two libraries are complementary: AccRQA is a GPU-aware kernel library for vanilla auto-RQA on long single time series; crqa is the R-native analytical toolbox for the full range of recurrence analyses used in behavioural, cognitive and linguistic research. By default, crqa uses **all available CPU cores** transparently — you do not need to configure anything. The only situation in which this matters is if you are running many `crqa()` calls in parallel yourself (for example via the `workers` argument of `wincrqa()`), in which case see the "Parallelism: advanced control" section below. --- ## Installation ```{r install, eval = FALSE} install.packages("crqa") # CRAN release # devtools::install_github("morenococo/crqa") # development version ``` --- ## Core concepts and the `crqa()` function All analyses go through `crqa()` or one of its wrappers. The key arguments: | Argument | Role | Default | |---|---|---| | `ts1`, `ts2` | Time series (vectors or matrices) | — | | `delay` | Embedding delay τ | 1 | | `embed` | Embedding dimension m | 1 | | `radius` | Recurrence threshold ε | 0.001 | | `rescale` | Rescale distance matrix (0–4) | 0 | | `normalize` | Normalize series (0 = none, 1 = unit, 2 = z) | 0 | | `tw` | Theiler window | 0 | | `mindiagline` | Min diagonal-line length | 2 | | `minvertline` | Min vertical-line length | 2 | | `side` | `"both"`, `"upper"`, or `"lower"` | `"both"` | | `method` | `"rqa"`, `"crqa"`, `"mdcrqa"`, `"aRQA"` | `"rqa"` | | `whiteline` | Compute white-line statistics? | `FALSE` | | `rr_denom` | RR denominator: `"full"` (v2.0.7 compat.) or `"valid"` | `"full"` | | `workers` | Parallel workers (wrappers only) | 1 | ### Return value `crqa()` returns a named list: - `RR` — recurrence rate (%) - `DET` — determinism (%) - `NRLINE` — number of diagonal lines ≥ `mindiagline` - `maxL` — longest diagonal line - `L` — mean diagonal-line length - `ENTR` — entropy of the diagonal-line length distribution - `rENTR` — normalised entropy - `LAM` — laminarity (%) - `TT` — trapping time (mean vertical-line length) - `max_vertlength` — longest vertical line - `wmean`, `wmax`, `wENTR` — white-line stats (if `whiteline = TRUE`) - `RP` — the recurrence plot as a sparse matrix (`Matrix` class) --- ## Example 1 — Auto-recurrence on categorical data A nursery rhyme ("The Wheels on the Bus") encoded as a word vector. With categorical data set `radius` small so only identical tokens recur. ```{r rqa-cat} data(crqa) ans_rqa <- crqa(text, text, delay = 1, embed = 1, rescale = 0, radius = 0.0001, normalize = 0, mindiagline = 2, minvertline = 2, tw = 1, whiteline = FALSE, recpt = FALSE, side = "both", method = "rqa", metric = "euclidean", datatype = "categorical") # Print all scalar measures (exclude the RP itself) print(ans_rqa[!names(ans_rqa) %in% "RP"]) ``` ### Visualising the recurrence plot ```{r rqa-plot, fig.width = 5, fig.height = 5, out.width = "60%"} plot_rp(ans_rqa$RP, title = "Auto-RP: Wheels on the Bus", xlabel = "Word index", ylabel = "Word index") ``` --- ## Example 2 — Cross-recurrence on categorical data Eye-tracking data: narrator and listener looking at six screen locations during a joint description task. Cross-recurrence quantifies how temporally coupled their gaze patterns are. ```{r crqa-cat} listener <- eyemovement$listener narrator <- eyemovement$narrator ans_crqa <- crqa(narrator, listener, delay = 1, embed = 1, rescale = 0, radius = 0.01, normalize = 0, mindiagline = 2, minvertline = 2, tw = 0, whiteline = FALSE, recpt = FALSE, side = "both", method = "crqa", metric = "euclidean", datatype = "categorical") cat(sprintf("RR = %.2f%% DET = %.2f%% LAM = %.2f%%\n", ans_crqa$RR, ans_crqa$DET, ans_crqa$LAM)) ``` ### White-line statistics (new in v2.1.0) White vertical lines are gaps between recurrent stretches in the same column. Their distribution approximates the inter-recurrence time distribution; `wENTR` is the recurrence-time entropy (Faure & Korn 2004). Enabling `whiteline = TRUE` costs essentially nothing — the scan runs in the same sparse-index pass as the black-line statistics. ```{r whiteline} ans_wl <- crqa(narrator, narrator, delay = 1, embed = 1, rescale = 0, radius = 0.001, normalize = 0, mindiagline = 2, minvertline = 2, tw = 0, whiteline = TRUE, recpt = FALSE, side = "both", method = "rqa", metric = "euclidean", datatype = "categorical") cat(sprintf("wmean = %.2f wmax = %d wENTR = %.3f\n", ans_wl$wmean, ans_wl$wmax, ans_wl$wENTR)) ``` --- ## Example 3 — Multidimensional cross-recurrence (mdCRQA) Hand-movement data from a joint LEGO construction task: two participants' wrist positions in two spatial dimensions. ```{r mdcrqa} P1 <- cbind(handmovement$P1_TT_d, handmovement$P1_TT_n) P2 <- cbind(handmovement$P2_TT_d, handmovement$P2_TT_n) ans_md <- crqa(P1, P2, delay = 5, embed = 2, rescale = 0, radius = 0.3, normalize = 0, mindiagline = 2, minvertline = 2, tw = 0, whiteline = FALSE, recpt = FALSE, side = "both", method = "mdcrqa", metric = "euclidean", datatype = "continuous") cat(sprintf("RR = %.2f%% DET = %.2f%% LAM = %.2f%%\n", ans_md$RR, ans_md$DET, ans_md$LAM)) ``` --- ## Example 4 — Diagonal recurrence profile (leader–follower) `drpfromts()` extracts the diagonal cross-recurrence profile: for each lag δ, it reports the mean recurrence along the diagonal at that offset. The peak lag identifies the temporal lead of one series over the other. ```{r drp, fig.width = 6, fig.height = 4, out.width = "75%"} res <- drpfromts(narrator, listener, windowsize = 100, radius = 0.001, delay = 1, embed = 1, rescale = 0, normalize = 0, mindiagline = 2, minvertline = 2, tw = 0, whiteline = FALSE, recpt = FALSE, side = "both", method = "crqa", metric = "euclidean", datatype = "categorical") timecourse <- seq_along(res$profile) - ceiling(length(res$profile) / 2) plot(timecourse, res$profile * 100, type = "l", lwd = 2, xlab = "Lag (samples)", ylab = "Recurrence rate (%)", main = "Diagonal cross-recurrence profile") abline(v = res$maxlag - ceiling(length(res$profile) / 2), col = "red", lty = 2) ``` --- ## Example 5 — Windowed RQA (time-varying dynamics) `wincrqa()` slides a window over the time series and returns one row of RQA measures per window — useful for tracking how coupling evolves. ```{r wincrqa, fig.width = 6, fig.height = 4, out.width = "75%"} win_res <- wincrqa(narrator, listener, windowstep = 20, windowsize = 80, delay = 1, embed = 1, radius = 0.001, rescale = 0, normalize = 0, mindiagline = 2, minvertline = 2, tw = 0, whiteline = FALSE, recpt = FALSE, side = "both", method = "crqa", metric = "euclidean", datatype = "categorical", trend = FALSE, workers = 1L) plot(win_res$win, win_res$RR, type = "l", lwd = 2, xlab = "Window", ylab = "RR (%)", main = "Windowed RR (narrator-listener gaze coupling)") ``` For very long series with many windows, parallel execution is available via `workers > 1`: ```{r wincrqa-parallel, eval = FALSE} win_par <- wincrqa(narrator, listener, windowstep = 20, windowsize = 80, delay = 1, embed = 1, radius = 0.001, workers = 4L) # spread windows across 4 CPU cores ``` Worker spawn has a fixed overhead of a few seconds; use `workers > 1` only when individual windows are themselves expensive (long series or many windows). See the "Parallelism: advanced control" section below for the one configuration step needed when combining `workers > 1` with crqa's built-in OpenMP parallelism. --- ## Example 6 — Approximative RQA for very long series (new in v2.1.0) For N > ~10 000, the exact methods run out of memory (O(N²) peak RAM). `method = "aRQA"` implements the phase-space histogram approach of Schultz, Spiegel, Marwan & Albayrak (2015, *Phys. Lett. A* 379:997) and runs in O(N) memory. The method was developed and validated in the literature for **auto-recurrence** on a single series (see example below); the implementation also accepts two distinct series (cross-RQA) and matrix inputs (multidimensional), but the approximation has only been benchmarked for the auto-RQA case. It is available directly via `crqa()`: ```{r arqa} # Simulate 5000 points of AR(1) data as a stand-in for a long series set.seed(42) x <- as.numeric(arima.sim(model = list(ar = 0.7), n = 5000)) res_arqa <- crqa(x, x, delay = 1, embed = 3, radius = 0.5, normalize = 2, mindiagline = 2, method = "aRQA") cat(sprintf("aRQA: RR = %.3f%% DET = %.2f%% L = %.2f\n", res_arqa$RR, res_arqa$DET, res_arqa$L)) ``` **When to use aRQA:** for stochastic or weakly-deterministic data with N > 10 000 where exact computation is infeasible. The approximation error on DET is a few percentage points for stochastic data and larger for strongly deterministic systems (see the help page `?aRQA` for details). For comparison at moderate N: ```{r arqa-compare, eval = FALSE} # Exact crqa for the same series at N = 1000 (comparable parameter set) x_short <- x[1:1000] res_exact <- crqa(x_short, x_short, delay = 1, embed = 3, radius = 0.5, normalize = 2, mindiagline = 2, method = "rqa") res_approx <- crqa(x_short, x_short, delay = 1, embed = 3, radius = 0.5, normalize = 2, mindiagline = 2, method = "aRQA") cat(sprintf("Exact: RR = %.3f%% DET = %.2f%%\n", res_exact$RR, res_exact$DET)) cat(sprintf("aRQA: RR = %.3f%% DET = %.2f%%\n", res_approx$RR, res_approx$DET)) ``` --- ## Example 7 — Parameter optimisation `optimizeParam()` uses average mutual information (AMI) for delay, false nearest neighbours (FNN) for embedding dimension, and binary search for the radius that yields a target recurrence rate. ```{r optim, eval = FALSE} data(crqa) P1 <- cbind(handmovement$P1_TT_d, handmovement$P1_TT_n) P2 <- cbind(handmovement$P2_TT_d, handmovement$P2_TT_n) par <- list(method = "mdcrqa", metric = "euclidean", maxlag = 20, radiusspan = 100, normalize = 0, rescale = 4, mindiagline = 10, minvertline = 10, tw = 0, whiteline = FALSE, recpt = FALSE, side = "both", datatype = "continuous", fnnpercent = NA, typeami = NA, nbins = 50, criterion = "firstBelow", threshold = 1, maxEmb = 20, numSamples = 500, Rtol = 10, Atol = 2) results <- optimizeParam(P1, P2, par, min.rec = 2, max.rec = 5) print(unlist(results)) ``` --- ## Note on the `rr_denom` argument (v2.1.0) Version 2.1.0 introduces the `rr_denom` argument with two options: - `"full"` (default): the denominator is N×M regardless of blanking. Reproduces the historical convention (Marwan et al. 2007, *Phys. Rep.* 438:237) and the behaviour of crqa ≤ 2.0.7. This is the default so that upgrading from v2.0.7 does not silently change any user's results. - `"valid"`: the denominator counts only cells that survived the Theiler window and side mask — the fraction of *analysable* pairs that are recurrent. Internally consistent with how DET and ENTR are computed. Recommended when reporting RR alongside DET/ENTR or when comparing analyses that use different Theiler windows. When `tw == 0` and `side == "both"` (the most common case) both conventions give identical results. ```{r rr-denom} r_valid <- crqa(narrator, listener, delay = 1, embed = 1, rescale = 0, radius = 0.001, normalize = 0, mindiagline = 2, minvertline = 2, tw = 2, side = "both", method = "crqa", metric = "euclidean", datatype = "categorical", rr_denom = "valid") r_full <- crqa(narrator, listener, delay = 1, embed = 1, rescale = 0, radius = 0.001, normalize = 0, mindiagline = 2, minvertline = 2, tw = 2, side = "both", method = "crqa", metric = "euclidean", datatype = "categorical", rr_denom = "full") cat(sprintf("rr_denom='valid': RR = %.4f%%\n", r_valid$RR)) cat(sprintf("rr_denom='full': RR = %.4f%% (v2.0.7 convention)\n", r_full$RR)) ``` --- ## Parallelism: advanced control Most users can ignore this section — by default, `crqa()` uses all available CPU cores and that is the right choice almost all the time. There is exactly **one** situation where you may need to take action: when you are spawning multiple R workers yourself (e.g. via `wincrqa(..., workers = 4)`) and each worker is also calling `crqa()`. In that case the inner OpenMP parallelism multiplies with the outer worker count and can overload the CPU. To avoid this, set `OMP_NUM_THREADS=1` in the shell *before* launching R, so each worker runs serially and the parallelism comes entirely from `workers`: ```bash $ OMP_NUM_THREADS=1 R # then run wincrqa(..., workers = 4) inside R ``` Setting it via `Sys.setenv()` from inside R is too late on most systems — the OpenMP thread count is cached the first time `crqa()` is called. To run crqa strictly single-threaded for reproducibility, use the same mechanism: launch R with `OMP_NUM_THREADS=1`. --- ## References Adámek, K., Novotný, J., Marwan, N., & Pánis, R. (2026). AccRQA library: accelerating recurrence quantification analysis. *European Physical Journal Special Topics*. https://doi.org/10.1140/epjs/s11734-026-02338-3 Coco, M. I., & Dale, R. (2014). Cross-recurrence quantification analysis of categorical and continuous time series: An R package. *Frontiers in Psychology*, **5**, 510. Coco, M. I., et al. (2021). crqa: Recurrence quantification analysis for categorical and continuous time series in R. *The R Journal*, **13**, 83–97. Faure, P., & Korn, H. (2004). A new method to estimate the Kolmogorov entropy from recurrence plots. *Physics Letters A*, **332**, 329–339. Marwan, N., Romano, M. C., Thiel, M., & Kurths, J. (2007). Recurrence plots for the analysis of complex systems. *Physics Reports*, **438**, 237–329. Marwan, N., & Kraemer, K. H. (2023). Trends in recurrence analysis of dynamical systems. *European Physical Journal Special Topics*, **232**, 5–27. Schultz, D., Spiegel, S., Marwan, N., & Albayrak, S. (2015). Approximation of diagonal line based measures in recurrence quantification analysis. *Physics Letters A*, **379**, 997–1011. Wallot, S. (2018). Multidimensional cross-recurrence quantification analysis (MdCRQA). *Multivariate Behavioral Research*, 1–19.