Generalizing intra-individual dynamics

26 Feb 2016

^{Hamaker, E. L., & Grasman, R. P. P. P. (2014). To center or not to center? Investigating inertia with a multilevel autoregressive model. Frontiers in Psychology, 5, 1492. http://doi.org/10.3389/fpsyg.2014.01492}

Experience Sampling Method (ESM)

The Dynamics of Psychology

Psychological constructs can be conceptualized as dynamical systems, featuring complex emergent behavior:
- Correlated responses
- Stable "traits"
- Phase transitions
- Individual Differences
These systems can be portrayed as networks

Outline

Introduction: Vector Auto-regression
Problem 1: \(n=1\) and limited observations
- Graphical VAR
Problem 2: \(n>1\)
- Multi-level VAR
Problem 3: Measurement error
- State-space models
Even more problems

Vector Auto-regression

Vector Auto-Regression (VAR)

Regress a vector of variables from a single subject on the previous time point
Assume multivariate normality
Assume errors are correlated
This leads to three things to estimate:
- A vector of intercepts
- A matrix encoding a temporal network
- A matrix encoding a contemporaneous network
Can be estimated using lm() or any least squares regression method

For a single subject: \[ \begin{aligned} \pmb{y}_t &= \pmb{\tau} + \pmb{B} \pmb{y}_{t-1} + \pmb{\varepsilon}_t \\ \pmb{\varepsilon}_t &\sim N\left( \pmb{0}, \pmb{\Theta} \right). \end{aligned} \] \(\pmb{B}\) encodes a directed temporal network and \(\pmb{\Theta}\) an undirected contemporaneous network
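As a rough illustration of the least squares route mentioned above, the following sketch simulates a small VAR(1) process and recovers \(\pmb{\tau}\) and \(\pmb{B}\) by regressing each variable on an intercept and all variables at the previous time point. The function names (`simulate_var1`, `fit_var1`, `solve_linear_system`) are made up for this example, and the simulation assumes independent errors (a diagonal \(\pmb{\Theta}\)) purely for simplicity; it is not the estimation code used in the talk.

```python
import random


def simulate_var1(tau, B, n_obs, noise_sd=1.0, seed=1):
    """Simulate y_t = tau + B y_{t-1} + e_t with independent normal errors (assumption)."""
    rng = random.Random(seed)
    k = len(tau)
    y = [0.0] * k
    series = []
    for _ in range(n_obs):
        # The right-hand side uses the previous y; y is only rebound afterwards.
        y = [tau[i] + sum(B[i][j] * y[j] for j in range(k)) + rng.gauss(0.0, noise_sd)
             for i in range(k)]
        series.append(y)
    return series


def solve_linear_system(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                factor = M[r][col] / M[col][col]
                M[r] = [a - factor * c for a, c in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]


def fit_var1(series):
    """Least squares estimate of tau and B, one regression per outcome variable."""
    k = len(series[0])
    X = [[1.0] + prev for prev in series[:-1]]   # intercept plus lagged values
    Y = series[1:]                               # current values
    XtX = [[sum(x[a] * x[b] for x in X) for b in range(k + 1)] for a in range(k + 1)]
    tau_hat, B_hat = [], []
    for i in range(k):
        Xty = [sum(x[a] * y[i] for x, y in zip(X, Y)) for a in range(k + 1)]
        coef = solve_linear_system(XtX, Xty)
        tau_hat.append(coef[0])      # intercept of variable i
        B_hat.append(coef[1:])       # incoming temporal edges of variable i
    return tau_hat, B_hat


# Usage: two variables, 5000 time points.
true_B = [[0.5, 0.2], [0.0, 0.3]]
data = simulate_var1([0.0, 0.0], true_B, n_obs=5000)
tau_hat, B_hat = fit_var1(data)
```

Each row of `B_hat` holds the incoming temporal edges of one node. The covariance of the regression residuals would then give an estimate of \(\pmb{\Theta}\).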

Vector Auto-Regression (VAR)

With \(\pmb{K} = \pmb{\Theta}^{-1}\) encoding partial correlation coefficients
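For reference, the standard link between a precision matrix and partial correlations can be written out as follows. The symbol \(\kappa_{ij}\) for the elements of \(\pmb{K}\) is notation introduced here for illustration: \[ \rho_{ij \mid \text{rest}} = -\frac{\kappa_{ij}}{\sqrt{\kappa_{ii}\,\kappa_{jj}}}, \qquad i \neq j. \] A zero off-diagonal element of \(\pmb{K}\) thus corresponds to a missing edge in the contemporaneous network.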

Problem 1: \(n=1\) and limited observations

Networks in Clinical Practice

Measure a patient over a short time, estimate network structures and use these in clinical practice
Naturally an \(n=1\) problem
- Different estimation periods
- Different nodes
Naturally a limited data problem
- You can't measure a patient 10 times per day
- You can't measure a patient for months

Only model temporal effects between consecutive measurements
- Lag-1
Assume both the temporal and contemporaneous effects are sparse
- Only a relatively small number of edges in both networks
To do this, we use the graphical VAR model (Wild et al. 2010)
- Estimation via LASSO regularization, using extended BIC to select optimal tuning parameter (Rothman, Levina, and Zhu 2010; Abegaz and Wit 2013).
We implemented these methods in the R package graphicalVAR (cran.r-project.org/package=graphicalVAR)

Empirical Example

Data collected by Date C. Van der Veen, in collaboration with Harriette Riese and Renske Kroeze.

Patient suffering from panic disorder and depressive symptoms
- Perfectionist
Measured over a period of two weeks
Five times per day
Items were chosen after intake together with therapist

Feeling worthless interacts with feeling helpless

Feeling stressed interacts with feeling the need to do things

Central node: Feeling sad

Cycle of enjoyment, feeling sad, feeling worthless and being active

Having to do things leads to letting important things pass

Conclusion

Graphical VAR can be used to estimate network structures on limited data
Simulation study is still work in progress
- Simulated networks might not resemble clinical patient networks
Around 50 observations work well for 10-node networks

Problem 2: \(n>1\)

Multi-level VAR

Each subject is assumed to have their own temporal VAR model
- Contemporaneous model equal across people
VAR parameters come from a distribution
- Fixed effect
- Random effect

Multi-level VAR

Adding superscript \(p\) for subject. Level 1 model: \[ \begin{aligned} \pmb{y}^{(p)}_t &= \pmb{\tau}^{(p)} + \pmb{B}^{(p)} \pmb{y}^{(p)}_{t-1} + \pmb{\varepsilon}^{(p)}_t \\ \pmb{\varepsilon}^{(p)}_t &\sim N\left( \pmb{0}, \pmb{\Theta} \right). \end{aligned} \]

Level 2 model: \[ \begin{bmatrix} \pmb{\tau}^{(p)} \\ \mathrm{Vec}\left(\pmb{B}^{(p)}\right) \end{bmatrix} \sim N\left( \pmb{\gamma}, \pmb{\Omega} \right). \] \(\pmb{\gamma}\) encodes fixed effects and \(\pmb{\Omega}\) the distribution of random effects.
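To make the Level 2 model concrete, the sketch below draws one subject's parameter vector around the fixed effects and unpacks it into \(\pmb{\tau}^{(p)}\) and \(\pmb{B}^{(p)}\). It assumes a diagonal \(\pmb{\Omega}\) (uncorrelated random effects) to stay within the standard library, and it assumes \(\mathrm{Vec}\) stacks the columns of \(\pmb{B}^{(p)}\). The function names and all numeric values are hypothetical.

```python
import random


def draw_subject_parameters(gamma, omega_sd, rng):
    """Draw [tau; Vec(B)] for one subject from N(gamma, diag(omega_sd^2)).

    A diagonal Omega is an assumption of this sketch; a full Omega would
    also allow the random effects to be correlated.
    """
    return [rng.gauss(g, s) for g, s in zip(gamma, omega_sd)]


def unpack_parameters(params, k):
    """Split a parameter vector into tau (length k) and B (k x k).

    Vec(B) is taken to stack the columns of B, so element (i, j) of B
    sits at position j * k + i of the vectorized part.
    """
    tau = params[:k]
    vec_b = params[k:]
    B = [[vec_b[j * k + i] for j in range(k)] for i in range(k)]
    return tau, B


# Usage: two nodes; gamma = [tau_1, tau_2, B_11, B_21, B_12, B_22].
rng = random.Random(42)
gamma = [0.0, 1.0, 0.5, 0.1, 0.2, 0.3]   # fixed effects (made-up values)
omega_sd = [0.5, 0.5, 0.1, 0.1, 0.1, 0.1]  # random-effect SDs (made-up values)
subjects = [unpack_parameters(draw_subject_parameters(gamma, omega_sd, rng), 2)
            for _ in range(5)]
```

Every entry of `subjects` is one individual network: the same structure as the fixed-effect network, but with subject-specific edge weights.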

Each parameter has a distribution

Individual networks

Random Effects

Fixed effects

Fixed Effects

Individual differences

Parameter Variance-covariance Matrix

Individual differences

Stability

Connectivity

Network changes with mean

Frequentist Estimation

Multi-variate multi-level regression estimation is complicated and not yet well implemented in open source software
The lme4 package implements univariate multi-level regression
- ^{Douglas Bates, Martin Maechler, Ben Bolker, Steve Walker (2015). Fitting Linear Mixed-Effects Models Using lme4. Journal of Statistical Software, 67(1), 1-48. doi:10.18637/jss.v067.i01}
- lmer function
A multi-level VAR model can be estimated by sequentially estimating univariate models
- Estimate all incoming edges per node
- Used by Bringmann et al. (2013)

Frequentist Estimation

Sequential estimation:
- Needs to integrate out a high-dimensional distribution over parameters
- Only feasible for up to ~6 nodes
- Does not estimate all parameter covariances
  - Not all parameters together in the same model
Orthogonal estimation
- Alternatively, parameter covariances can be fixed to zero
- Fast, and works for high dimensions (e.g., 20 nodes)
- But does not return any parameter correlations

Sequential Frequentist Estimation

Orthogonal Frequentist Estimation

Bayesian Estimation

Multi-level models can naturally be formulated as a Bayesian model
Estimable in open-source software such as OpenBUGS, JAGS and Stan
Requires prior distributions to be specified
- Normal priors for fixed effects
- Inverse-Wishart priors for residual covariance structures and parameter covariances
Flat priors are preferred to model prior ignorance

Prior specification

Schuurman, Grasman and Hamaker (in press) show that an Inverse-Wishart prior can be highly informative and thus problematic
They propose a two-step procedure of specifying the scale parameter
- Fit model using lmer
- Obtain prior guess of the parameter standard deviations
- Put these in a diagonal matrix to use as scale matrix
The degrees of freedom can be set to the number of rows or columns in the covariance matrix

^{Schuurman, N. K., Grasman, R. P. P. P., & Hamaker, E. L. (in press). A Comparison of Inverse-Wishart Prior Specifications for Covariance Matrices in Multilevel Autoregressive Models. Multivariate Behavioral Research.}

Bayesian

The number of correlations to estimate grows fast with the number of nodes in the network
- For \(P\) nodes there are \(P + P^2\) random effects per subject, so \(\pmb{\Omega}\) has \((P + P^2)^2 = P^4 + 2P^3 + P^2\) elements
While in theory estimable, practically high-dimensional models will take a long time
Again, models can be estimated sequentially or orthogonally
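The growth in the size of the random-effects covariance matrix can be checked with a few lines of arithmetic. The sketch below counts, for \(P\) nodes, the random effects per subject (\(P\) intercepts plus \(P^2\) temporal coefficients), the total number of elements of \(\pmb{\Omega}\), and the number of unique elements once symmetry is taken into account. The function name is hypothetical.

```python
def random_effect_counts(n_nodes):
    """Count random effects and covariance elements for a multi-level VAR(1)."""
    n_random = n_nodes + n_nodes ** 2           # intercepts + temporal coefficients
    return {
        "random_effects": n_random,
        "omega_elements": n_random ** 2,        # equals P^4 + 2P^3 + P^2
        "unique_omega_elements": n_random * (n_random + 1) // 2,  # variances + covariances
    }


# Usage: compare a small and a large network.
for p in (6, 20):
    print(p, random_effect_counts(p))
```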

Sequential Bayesian Estimation

Orthogonal Bayesian Estimation

Simulation study

Problem 3: Measurement Error

Measurement error

So far, we have assumed no measurement error on the VAR process
However, in psychology measurement error is a dominant problem
Is within-person variability due to genuine within-person dynamics or simply due to white noise?

^{Schuurman, N. K., Houtveen, J. H., & Hamaker, E. L. (2015). Incorporating measurement error in n = 1 psychological autoregressive modeling. Frontiers in Psychology, 6, 1038.}
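A univariate example shows why ignoring measurement error matters. Assume a latent AR(1) process with autoregressive coefficient `phi` and innovation variance `psi`, observed with independent white noise of variance `theta`. These three scalar names are chosen for this sketch only. The lag-1 autocorrelation of the observed series is then the latent coefficient scaled down by the reliability of the measurements:

```python
def observed_lag1_autocorrelation(phi, psi, theta):
    """Lag-1 autocorrelation of y_t = eta_t + e_t, where eta_t is a stationary AR(1).

    phi:   autoregressive coefficient of the latent process (|phi| < 1)
    psi:   innovation variance of the latent process
    theta: variance of the white-noise measurement error
    """
    if not -1.0 < phi < 1.0:
        raise ValueError("phi must lie strictly between -1 and 1")
    latent_variance = psi / (1.0 - phi ** 2)                  # stationary variance of eta
    reliability = latent_variance / (latent_variance + theta)
    return phi * reliability


# Usage: the same latent dynamics, with and without measurement error.
without_error = observed_lag1_autocorrelation(0.5, 1.0, 0.0)
with_error = observed_lag1_autocorrelation(0.5, 1.0, 1.0)
```

With `theta` equal to zero the observed autocorrelation equals `phi`; any positive `theta` pulls it toward zero, so a model without measurement error would underestimate inertia.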

State-space model

Schuurman, Houtveen and Hamaker (2015) suggest using Bayesian estimation of a white noise or state-space model
In a state-space model, we assume a latent underlying VAR process measured through some measurement model plus (correlated) measurement error
Each observed variable can be represented by its own latent variable, or the observed variables can be seen as indicators of a few latent variables (e.g., personality traits)
Added complexity: we need to explicitly model days

Adding superscript \(d\) for days. Level 1 model for the observed variables: \[ \begin{aligned} \pmb{y}^{(p,d)}_{t} &= \pmb{\tau} + \pmb{\Lambda} \pmb{\eta}_t^{(p,d)} + \pmb{\varepsilon}_{t}^{(p,d)} \\ \pmb{\varepsilon}_{t}^{(p,d)} &\sim N(\pmb{0}, \pmb{\Theta}) \end{aligned} \] \(\pmb{\Lambda}\) encodes the measurement model and \(\pmb{\Theta}\) now encodes the variance-covariance of the measurement error. At the latent level we model a VAR process: \[ \begin{aligned} \pmb{\eta}_t^{(p,d)} &= \pmb{\alpha}^{(p)} + \pmb{B}^{(p)} \pmb{\eta}_{t-1}^{(p,d)} + \pmb{\zeta}_t^{(p,d)} \\ \pmb{\zeta}_{t}^{(p,d)} &\sim N(\pmb{0}, \pmb{\Psi}). \end{aligned} \] \(\pmb{\Psi}\) encodes the contemporaneous relationships.

Level 2 model: \[ \begin{bmatrix} \pmb{\alpha}^{(p)} \\ \mathrm{Vec}\left(\pmb{B}^{(p)}\right) \end{bmatrix} \sim N\left( \pmb{\gamma}, \pmb{\Omega} \right). \] \(\pmb{\gamma}\) encodes fixed effects and \(\pmb{\Omega}\) the distribution of random effects.
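The following sketch generates data from the state-space model above for a single subject and a single day: a latent VAR(1) process produces \(\pmb{\eta}_t\), and the observed scores are obtained through the loading matrix \(\pmb{\Lambda}\) plus measurement error. For simplicity it assumes diagonal \(\pmb{\Psi}\) and \(\pmb{\Theta}\) (independent innovations and independent measurement errors), which is more restrictive than the model on the slide. The function name and all numeric values are made up.

```python
import random


def simulate_state_space(tau, Lambda, alpha, B, n_obs,
                         psi_sd=1.0, theta_sd=1.0, seed=1):
    """Simulate latent states eta_t and observed scores y_t for one subject-day.

    eta_t = alpha + B eta_{t-1} + zeta_t   (latent VAR process)
    y_t   = tau + Lambda eta_t + eps_t     (measurement model)
    Independent errors are an assumption of this sketch.
    """
    rng = random.Random(seed)
    n_latent = len(alpha)
    n_observed = len(tau)
    eta = [0.0] * n_latent
    latent, observed = [], []
    for _ in range(n_obs):
        eta = [alpha[i]
               + sum(B[i][j] * eta[j] for j in range(n_latent))
               + rng.gauss(0.0, psi_sd)
               for i in range(n_latent)]
        y = [tau[r]
             + sum(Lambda[r][i] * eta[i] for i in range(n_latent))
             + rng.gauss(0.0, theta_sd)
             for r in range(n_observed)]
        latent.append(eta)
        observed.append(y)
    return latent, observed


# Usage: one latent variable measured by three indicators, five beeps in a day.
latent, observed = simulate_state_space(
    tau=[0.0, 0.0, 0.0],
    Lambda=[[1.0], [0.8], [0.6]],
    alpha=[0.0],
    B=[[0.4]],
    n_obs=5,
)
```

Restarting `eta` at the beginning of every call mirrors the need to model days explicitly: the first measurement of a day is not regressed on the last measurement of the previous day.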

Not shown: correlations between \(\left\{ \varepsilon_1, \varepsilon_2, \varepsilon_3 \right\}\) and \(\left\{ \varepsilon_4, \varepsilon_5, \varepsilon_6 \right\}\).

Two datasets
- Original: 26 subjects, 51 measurements on average, 1323 total observations
- Replication: 65 subjects, 35.5 measurements on average, 2309 total observations
16 indicators of neuroticism, extroversion, conscientiousness
Orthogonal Bayesian estimation (uncorrelated random effects)
Very preliminary results
- I ran the analysis yesterday!

Unsolved issues

Multi-level estimation of contemporaneous effects
High-dimensional estimation of random effect correlations
Day-processes
…

Software in development

mlVAR
- Multi-level vector autoregression
- Will contain state space soon
- https://github.com/SachaEpskamp/mlVAR
murmur
- Multivariate multi-level regression
- Also does VAR
- Contains Bayesian and lmer estimation
- https://github.com/SachaEpskamp/murmur

Thank you for your attention!

References

Abegaz, Fentaw, and Ernst Wit. 2013. “Sparse Time Series Chain Graphical Models for Reconstructing Genetic Networks.” Biostatistics. Biometrika Trust, kxt005.

Rothman, Adam J, Elizaveta Levina, and Ji Zhu. 2010. “Sparse Multivariate Regression with Covariance Estimation.” Journal of Computational and Graphical Statistics 19 (4). Taylor & Francis: 947–62.

Wild, Beate, Michael Eichler, Hans-Christoph Friederich, Mechthild Hartmann, Stephan Zipfel, and Wolfgang Herzog. 2010. “A Graphical Vector Autoregressive Modelling Approach to the Analysis of Electronic Diary Data.” BMC Medical Research Methodology 10 (1). BioMed Central Ltd: 28.