Local independence

  • Are "Get angry easily" and "Get irritated easily" locally independent given Neuroticism?
  • Are "Don't talk a lot" and "Find it difficult to approach others" locally independent given Extroversion?
  • Are "Am indifferent to the feelings of others" and "Inquire about others' well-being" locally independent given Agreeableness?
  • Are "Avoid difficult reading material" and "Will not probe deeply into a subject" locally independent given Openness to Experience?
  • Are "Do things in a half-way manner" and "Waste my time" locally independent given Conscientiousness?

Network Psychometrics Ecosystem

Outline

  • Part 1: Markov Random Fields
    • Gaussian Graphical Model
    • Ising Model
    • LASSO regularization
  • Part 2: Residual Interaction Modeling
    • Combining factor analysis and network modeling
    • rim demonstration
  • Part 3: Longitudinal analysis
    • Single person: GraphicalVAR
    • Multiple persons: mlVAR

Markov Random Fields

Networks

  • A network is a set of nodes connected by a set of edges
  • Nodes represent variables
  • Edges can be directed or undirected and represent interactions
  • Color and width indicate the strength and sign of an edge (Epskamp et al. 2012)

Directed network

Undirected network

Weighted network

Markov Random Fields

  • A pairwise Markov Random Field (MRF) is an undirected network
  • Two nodes are connected if they are not independent conditional on all other nodes.
  • More importantly, two nodes are NOT connected if they are independent conditioned on all other nodes:
  • \(X_i \!\perp\!\!\!\perp X_j \mid \boldsymbol{X}^{-(i,j)} = \boldsymbol{x}^{-(i,j)} \iff (i,j) \not\in E\)
  • A node separates two nodes if it is on all paths from one node to the other
  • Assumption: only pairwise effects
  • No equivalent models!
    • The saturated model is simply a fully connected network
  • Naturally cyclic!

  • \(B\) separates \(A\) and \(C\)
  • \(A \!\perp\!\!\!\perp C \mid B\)

  • Worrying and fatigue separate Insomnia and Concentration

Predictive Effects

If this model is the generating model, does:

  • \(A\) predict \(B\)?
    • Yes!
  • \(B\) predict \(A\)?
    • Yes!
  • \(A\) predict \(B\) just as well as \(B\) predict \(A\)?
    • Using linear or logistic regression, yes!

# Generate data (binary):
A <- sample(c(0,1), 10000, replace = TRUE)
B <- 1 * (runif(10000) < ifelse(A==1, 0.8, 0.2))

# Predict A from B (logistic regression):
AonB <- glm(A ~ B, family = "binomial")
coef(AonB)
## (Intercept)           B 
##   -1.369453    2.757481
# Predict B from A (logistic regression):
BonA <- glm(B ~ A, family = "binomial")
coef(BonA)
## (Intercept)           A 
##   -1.363489    2.757481
  • The logistic regression parameters are equal!

Predictive Effects

  • \(A\) predicts \(B\) and \(B\) predicts \(A\)

Predictive Effects

If this model is the generating model, does:

  • \(A\) predict \(C\) or vice versa?
    • Yes, they are correlated
  • \(A\) predict \(C\) or vice versa when also taking \(B\) into account?
    • No!
  • In a multiple (logistic) regression, \(C\) should not predict \(A\) when \(B\) is also taken as predictor

# Generate data (Gaussian):
A <- rnorm(10000)
B <- A + rnorm(10000)
C <- B + 2*rnorm(10000)

# Predict A from C:
AonC <- lm(A ~ C)
summary(AonC)
## 
## Call:
## lm(formula = A ~ C)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -3.2389 -0.6162  0.0059  0.6132  3.3662 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept) -0.003797   0.009093  -0.418    0.676    
## C            0.170163   0.003742  45.469   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.9092 on 9998 degrees of freedom
## Multiple R-squared:  0.1713, Adjusted R-squared:  0.1713 
## F-statistic:  2067 on 1 and 9998 DF,  p-value: < 2.2e-16

# Predict A from B and C:
AonBC <- lm(A ~ B + C)
summary(AonBC)
## 
## Call:
## lm(formula = A ~ B + C)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -2.66254 -0.47831 -0.00956  0.47492  2.53815 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept) -0.005440   0.007056  -0.771    0.441    
## B            0.494626   0.006086  81.269   <2e-16 ***
## C            0.003341   0.003556   0.939    0.348    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.7056 on 9997 degrees of freedom
## Multiple R-squared:  0.501,  Adjusted R-squared:  0.5009 
## F-statistic:  5019 on 2 and 9997 DF,  p-value: < 2.2e-16

  • \(A\) predicts \(B\) better than \(C\) predicts \(B\)
  • The relationship between \(A\) and \(C\) is mediated by \(B\)

Conditional Independence

  • An MRF cannot represent the exact independence relationships implied by a collider structure
    • Three edges are needed instead of two
  • However, exogenous variables are commonly modeled to be correlated anyway

  • If we could condition on \(\eta\)

  • If we could not condition on \(\eta\)
  • Equivalent models
    • Data generated as a cluster of interacting components can fit a factor model perfectly!

The Ising Model

The Ising Model Probability distribution

\[ \Pr\left( \boldsymbol{X} = \boldsymbol{x} \right) = \frac{1}{Z} \exp \left( \sum_i \tau_i x_i + \sum_{<ij>} \omega_{ij} x_i x_j \right) \]

  • All \(X\) variables can typically take the values \(-1\) and \(1\)
  • \(\tau_i\) is called the threshold parameter and denotes the tendency for node \(i\) to be in some state
  • \(\omega_{ij}\) is called the network parameter and denotes the preference for nodes \(i\) and \(j\) to be in the same state
    • Edge weights
  • \(Z\) is a normalizing constant (partition function) and takes the sum over all possible configurations of \(\pmb{X}\):
    • \(Z = \sum_{\boldsymbol{x}} \exp \left( \sum_i \tau_i x_i + \sum_{<ij>} \omega_{ij} x_i x_j \right)\)

In matrix form the Ising Model probability distribution is proportional to: \[ \Pr\left(\pmb{X} = \pmb{x}\right) \propto \exp\left( \pmb{\tau}^\top \pmb{x} + \frac{1}{2} \pmb{x}^\top\pmb{\Omega}\pmb{x} \right) \]

  • \(\pmb{x}\) is a vector of binary variables (\(-1\) or \(1\))
  • \(\pmb{\tau}\) is a vector containing threshold parameters
  • \(\pmb{\Omega}\) is a matrix containing the network parameters and an arbitrary diagonal
    • Weights matrix that encodes a network
      • \(\omega_{ij} = 0\) means that there is no edge between nodes \(i\) and \(j\)
      • Positive weights are comparable in strength to negative weights

\[ \boldsymbol{\Omega} = \begin{bmatrix} 0 & \omega_{12} & 0\\ \omega_{12} & 0 & \omega_{23}\\ 0 & \omega_{23} & 0\\ \end{bmatrix}, \boldsymbol{\tau} = \begin{bmatrix} \tau_1 \\ \tau_2 \\ \tau_3 \end{bmatrix} \]

\[ \boldsymbol{\Omega} = \begin{bmatrix} 0 & 0.5 & 0\\ 0.5 & 0 & 0.5\\ 0 & 0.5 & 0\\ \end{bmatrix}, \boldsymbol{\tau} = \begin{bmatrix} -0.1 \\ -0.1 \\ -0.1 \end{bmatrix} \]

Conditional Distribution for the Ising Model

The conditional distribution for node \(i\) given that we observe all other nodes is: \[ \Pr\left(X_i = x_i \mid \pmb{X}^{-(i)} = \pmb{x}^{-(i)} \right) \propto \exp\left( \left(\tau_i + \sum_{j,j\not=i} \omega_{ij} x_j \right) x_i\right) \]

  • This is a multiple logistic regression model!
  • The most common model to predict the value of one binary variable (dependent variable) given a set of other variables (independent variables)
  • The Ising model is a combination of predictive models. Edges represent how strongly one node predicts another
    • A path means that the predictive strength of one node on another is mediated

Ising Model Estimation

\[ \Pr(X_1 = 1) \propto \exp\left( \tau_1 + \beta_{12} x_2 + \beta_{13} x_3 + \beta_{14} x_4 \right) \]

Ising Model Estimation

\[ \Pr(X_2 = 1) \propto \exp\left( \tau_2 + \beta_{21} x_1 + \beta_{23} x_3 + \beta_{24} x_4 \right) \]

Ising Model Estimation

\[ \Pr(X_3 = 1) \propto \exp\left( \tau_3 + \beta_{31} x_1 + \beta_{32} x_2 + \beta_{34} x_4 \right) \]

Ising Model Estimation

\[ \Pr(X_4 = 1) \propto \exp\left( \tau_4 + \beta_{41} x_1 + \beta_{42} x_2 + \beta_{43} x_3 \right) \]

Ising Model Estimation

\[ \omega_{ij} = \beta_{ij} = \beta_{ji} \]
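A minimal sketch of this nodewise strategy (my own illustration; it uses the IsingSampler package, which also appears later in these slides, and averages \(\beta_{ij}\) and \(\beta_{ji}\), one common way to obtain a symmetric estimate):

# Generate data from a known 3-node Ising model (0/1 responses):
library("IsingSampler")
Omega <- matrix(c(0, 0.5, 0,
                  0.5, 0, 0.5,
                  0, 0.5, 0), 3, 3)
tau <- rep(-0.1, 3)
set.seed(1)
Data <- IsingSampler(1000, Omega, tau)

# One logistic regression per node, collecting the slopes:
p <- ncol(Data)
B <- matrix(0, p, p)
for (i in 1:p) {
  fit <- glm(Data[, i] ~ Data[, -i], family = "binomial")
  B[i, -i] <- coef(fit)[-1]
}

# Symmetrize: in the population beta_ij = beta_ji = omega_ij,
# in a sample they differ slightly, so average them:
round((B + t(B)) / 2, 2)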

Continuous Data

If \(\pmb{x}\) is not binary but assumed Gaussian we can use a multivariate Gaussian distribution: \[ f\left( \pmb{X} = \pmb{x} \right) = \frac{1}{\sqrt{(2\pi)^k|\boldsymbol\Sigma|}} \exp\left(-\frac{1}{2}({\boldsymbol x}-{\boldsymbol\mu})^{\top}{\boldsymbol\Sigma}^{-1}({\boldsymbol x}-{\boldsymbol\mu}) \right) \]

  • \(\pmb{\mu}\) is a vector that encodes the means
  • \(\boldsymbol{\Sigma}\) is the variance-covariance matrix
  • Now we can rearrange:
    • \(f\left( \pmb{X} = \pmb{x} \right) \propto \exp\left(-\frac{1}{2}({\boldsymbol x}-{\boldsymbol\mu})^{\top}{\boldsymbol\Sigma}^{-1}({\boldsymbol x}-{\boldsymbol\mu})\right)\)
    • \(f\left( \pmb{X} = \pmb{x} \right) \propto \exp\left(-\frac{1}{2}( {\boldsymbol x}^\top {\boldsymbol\Sigma}^{-1} {\boldsymbol x} - {\boldsymbol x}^\top {\boldsymbol\Sigma}^{-1} {\boldsymbol \mu} - {\boldsymbol \mu}^\top {\boldsymbol\Sigma}^{-1}{\boldsymbol x} + {\boldsymbol \mu}^\top{\boldsymbol\Sigma}^{-1}{\boldsymbol \mu})\right)\)
    • \(f\left( \pmb{X} = \pmb{x} \right) \propto \exp\left( {\boldsymbol \mu}^\top {\boldsymbol\Sigma}^{-1}{\boldsymbol x} -\frac{1}{2} {\boldsymbol x}^\top {\boldsymbol\Sigma}^{-1} {\boldsymbol x}\right)\)

Gaussian Graphical Model

Reparameterizing \(\pmb{\tau} = \boldsymbol{\Sigma}^{-1}{\boldsymbol \mu}\) and \(\pmb{\Omega} = -\boldsymbol{\Sigma}^{-1}\) we obtain the following expression for the Multivariate Normal Distribution:

\[ f\left(\pmb{x}\right) \propto \exp\left( \pmb{\tau}^\top \pmb{x} + \frac{1}{2} \pmb{x}^\top\pmb{\Omega}\pmb{x} \right) \]

  • Exactly the same form as the Ising Model!!!
  • Except:
    • \(\pmb{x}\) is now continuous
    • The normalizing constant is different
  • The multivariate normal distribution encodes a network
    • This network is called a Gaussian Graphical Model (GGM), Concentration Graph, Gaussian Random Field or Partial Correlation Network.
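
A quick numerical check of this equivalence (my own sketch, using the mvtnorm package): the multivariate normal density and the unnormalized kernel \(\exp\left( \pmb{\tau}^\top \pmb{x} + \frac{1}{2} \pmb{x}^\top\pmb{\Omega}\pmb{x} \right)\) differ only by a constant factor, whatever \(\pmb{x}\) is.

library("mvtnorm")

# A small covariance matrix and mean vector:
Sigma <- matrix(c(1, 0.5, 0.5, 1), 2, 2)
mu <- c(0.2, -0.3)

# Reparameterization used above:
Omega <- -solve(Sigma)
tau <- as.vector(solve(Sigma) %*% mu)

# Density and unnormalized kernel:
f <- function(x) dmvnorm(x, mean = mu, sigma = Sigma)
kernel <- function(x) exp(sum(tau * x) + 0.5 * t(x) %*% Omega %*% x)

# The ratio is the same normalizing constant at any point:
f(c(1, 0)) / kernel(c(1, 0))
f(c(-0.5, 2)) / kernel(c(-0.5, 2))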

Gaussian Graphical Model

  • The inverse of the covariance matrix of the multivariate normal distribution, also called the precision matrix, encodes a network
  • Through mathematical magic, standardized elements of the negative precision matrix equal the partial correlation coefficients between nodes conditioned on all other nodes
    • \(\rho_{ij} = \omega_{ij} / \sqrt{\omega_{ii}\omega_{jj}} = \mathrm{Cor}\left( X_i, X_j \mid \pmb{X}^{-(i,j)} = \pmb{x}^{-(i,j)} \right)\)
  • Typically these partial correlation coefficients are used in the weights matrix of the corresponding network structure
    • There is no edge between nodes \(i\) and \(j\) if \(\rho_{ij} = 0\)
    • Which clearly corresponds to the Markov property that two nodes are then independent conditioned on all other nodes in the network

Conditional Distribution for the Gaussian Graphical Model

If we observe the value of every node except node \(i\) we obtain the following (assuming all variables are standardized):

\[ \begin{aligned} X_i \mid \pmb{X}^{-(i)} = \pmb{x}^{-(i)} \sim N\left( \sum_{j\not=i} \omega_{ij}\, \theta^2_i\, x_j,\; \theta^2_{i} \right) \end{aligned} \]

In which \(\theta^2_{i}\), the residual variance, is the inverse of the \(i\)th diagonal element of the precision matrix.

  • This is a multiple linear regression model!
  • Similar to the Ising model, the Gaussian Graphical Model shows how well nodes predict each other!

Gaussian Graphical Model Estimation

\[ X_1 = \tau_1 + \beta_{12} x_2 + \beta_{13} x_3 + \beta_{14} x_4 + \varepsilon_1 \]

Gaussian Graphical Model Estimation

\[ X_2 = \tau_2 + \beta_{21} x_1 + \beta_{23} x_3 + \beta_{24} x_4 + \varepsilon_2 \]

Gaussian Graphical Model Estimation

\[ X_3 = \tau_3 + \beta_{31} x_1 + \beta_{32} x_2 + \beta_{34} x_4 + \varepsilon_3 \]

Gaussian Graphical Model Estimation

\[ X_4 = \tau_4 + \beta_{41} x_1 + \beta_{42} x_2 + \beta_{43} x_3 + \varepsilon_4 \]

Gaussian Graphical Model Estimation

\[ \rho_{ij} = \beta_{ij} \sqrt{\frac{\mathrm{Var}\left(\varepsilon_j\right)}{\mathrm{Var}\left(\varepsilon_i\right)}} = \beta_{ji} \sqrt{\frac{\mathrm{Var}\left(\varepsilon_i\right)}{\mathrm{Var}\left(\varepsilon_j\right)}} \]

Gaussian Graphical Model Estimation

\[ \rho_{ij} = -\frac{ \kappa_{ij} }{\sqrt{ \kappa_{ii} \kappa_{jj} }} \hspace{35pt} \boldsymbol{K} = \boldsymbol{\Sigma}^{-1} \]

In which \(\boldsymbol{K}\) is the precision matrix; the inverse of the variance-covariance matrix.
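
As a small numerical sketch (my own addition) that both routes give the same partial correlations, compare the precision-matrix formula with the nodewise-regression formula on simulated data:

# Simulate a chain A - B - C of Gaussian variables:
set.seed(1)
A <- rnorm(10000); B <- A + rnorm(10000); C <- B + rnorm(10000)
Data <- cbind(A, B, C)

# Route 1: partial correlations from the precision matrix
K <- solve(cov(Data))
pcors <- -K / sqrt(outer(diag(K), diag(K)))
diag(pcors) <- 1
round(pcors, 2)

# Route 2: the A-B edge from two nodewise regressions
AonBC <- lm(A ~ B + C)
BonAC <- lm(B ~ A + C)
round(coef(AonBC)["B"] * sd(residuals(BonAC)) / sd(residuals(AonBC)), 2)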

What if the data are not Gaussian?

  • Continuous but not Gaussian
    • Use the "nonparanormal transformation" (Zhao et al. 2012) and compute GGM
  • All nodes are ordinal
    • Compute polychoric correlations (cor_auto() in qgraph) and compute GGM
    • Binarize values and compute Ising model using IsingFit (Borkulo et al. 2014)
  • Some nodes are ordinal
    • Transform continuous nodes using nonparanormal
    • Compute polychoric and polyserial correlations (cor_auto() in qgraph)
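
As a rough sketch of these options (my own illustration; huge.npn() is in the huge package and cor_auto() in qgraph, and the listwise deletion is only to keep the example simple):

library("huge")
library("qgraph")
library("psych")
data(bfi)
items <- na.omit(bfi[, 1:25])

# Ordinal items: polychoric correlations, then a (regularized) GGM:
ordCors <- cor_auto(items)
qgraph(ordCors, graph = "glasso", sampleSize = nrow(items), layout = "spring")

# Continuous but non-Gaussian data: nonparanormal transformation first
# (applied to the same items here purely as an illustration):
npnData <- huge.npn(as.matrix(items))
qgraph(cor(npnData), graph = "glasso", sampleSize = nrow(items), layout = "spring")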

Model Selection

  • We are specifically interested in identifying which partial correlations are zero
    • These elements correspond to missing edges in the graph
  • We could identify zeroes by significance testing or stepwise model-selection, but…
    • Problem of multiple comparison
    • Intractable model space
    • \(p\)-values and NHST are evil
  • A well-accepted approach for joint model selection and parameter estimation is the least absolute shrinkage and selection operator (LASSO)
    • Penalized maximum likelihood estimation
    • For the Gaussian Graphical Model, a variant called the Graphical Lasso (glasso) exists that applies LASSO directly to the inverse covariance matrix
  • LASSO also controls for inflated standard errors when there is limited data!

LASSO Estimation

\[ Y = \beta_1 x_1 + \beta_2 x_2 + \varepsilon_Y \]

##       [,1] [,2] [,3] [,4]
##  [1,]    5    4    3    5
##  [2,]    5    5    4    5
##  [3,]    2    4    5    7
##  [4,]    6    7    7    4
##  [5,]    4    4    5    6
##  [6,]    4    2    4    4
##  [7,]    3    4    4    5
##  [8,]    4    3    2    6
##  [9,]    7    6    4    3
## [10,]    5    3    4    5
## [11,]    5    4    3    6
## [12,]    4    5    5    5
## [13,]    4    5    5    5
## [14,]    8    9    7    4
## [15,]    3    5    4    6
## [16,]    2    3    3    7
## [17,]    8    8    8    2
## [18,]    6    4    6    3
## [19,]    2    2    1    9
## [20,]    1    2    1    9

\[ \lambda = 0 \]

\[ \lambda = 0.1 \]

\[ \lambda = 0.25 \]

\[ \lambda = 0.3 \]

\[ \lambda = 0.4 \]

\[ \lambda = 0.5 \]

\[ \lambda = 1 \]

\[ \lambda = 2 \]

\[ \lambda = 3 \]

\[ \lambda = 4 \]
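
The frames above correspond to increasingly strong penalties. The same shrinkage can be seen in a plain regression setting (a sketch with simulated data and the glmnet package; neither is part of the original example):

library("glmnet")

# Simulate the two-predictor regression from the formula above:
set.seed(1)
x1 <- rnorm(100); x2 <- 0.5 * x1 + rnorm(100)
Y <- 0.4 * x1 + 0.2 * x2 + rnorm(100)

# LASSO path: coefficients for a sequence of penalty values lambda:
fit <- glmnet(cbind(x1, x2), Y, alpha = 1)

# Larger lambda -> more shrinkage, eventually exact zeros:
round(as.matrix(coef(fit, s = c(0, 0.1, 0.25))), 2)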

How to choose which \(\lambda\) to use?

Find a sparse concentration graph using glasso:

qgraph(cor(Data), graph = "glasso", sampleSize = nrow(Data))
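
With graph = "glasso", qgraph selects \(\lambda\) by minimizing the Extended Bayesian Information Criterion (EBIC), the same criterion used in the empirical examples below. The selection can also be run directly (a sketch; gamma = 0.5 is the default EBIC hyperparameter):

library("qgraph")

# EBIC-based selection of the glasso tuning parameter:
optNetwork <- EBICglasso(cor(Data), n = nrow(Data), gamma = 0.5)
qgraph(optNetwork, layout = "spring")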

Empirical example: personality

I will analyze the BFI dataset from the psych package:

# Load BFI data:
library("psych")
data(bfi)
bfi <- bfi[,1:25]

# Correlation Matrix:
library("qgraph")
bfiCors <- cor_auto(bfi)

# Groups and names objects:
Names <- scan("http://sachaepskamp.com/files/BFIitems.txt",what = "character", sep = "\n")
Groups <- rep(c('A','C','E','N','O'),each=5)

Agreeableness

Am indifferent to the feelings of others.

Inquire about others' well-being.

Know how to comfort others.

Love children.

Make people feel at ease.

Conscientiousness

Am exacting in my work.

Continue until everything is perfect.

Do things according to a plan.

Do things in a half-way manner.

Waste my time.

Extraversion

Don't talk a lot.

Find it difficult to approach others.

Know how to captivate people.

Make friends easily.

Take charge.

Neuroticism

Get angry easily.

Get irritated easily.

Have frequent mood swings.

Often feel blue.

Panic easily.

Openness to Experience

Am full of ideas.

Avoid difficult reading material.

Carry the conversation to a higher level.

Spend time reflecting on things.

Will not probe deeply into a subject.

# GGM without regularization
qgraph(bfiCors, layout = "spring", nodeNames = Names, 
       groups = Groups, legend.cex =0.3, graph = "pcor")

# GGM with LASSO / EBIC 
qgraph(bfiCors, layout = "spring", nodeNames = Names, 
      groups = Groups, legend.cex =0.3, graph = "glasso", 
      sampleSize = nrow(bfi))

# Binary data:
library("bootnet")
bfiBinary <- binarize(bfi, mean)
## Warning in binarize(bfi, mean): Splitting data by mean

# Ising model without regularization
library("IsingSampler")
res <- EstimateIsing(bfiBinary, method = "uni")

qgraph(res$graph, layout = "spring", nodeNames = Names, 
       groups = Groups, legend.cex = 0.3, labels = colnames(bfi))

# Ising model with LASSO / EBIC
library("IsingFit")
res2 <- IsingFit(bfiBinary, plot = FALSE, progressbar = FALSE)

qgraph(res2$weiadj, layout = "spring", nodeNames = Names, 
       groups = Groups, legend.cex = 0.3)

  • If we could not condition on \(\eta\)
  • What if local independence is violated, but the latent variable model is still true?

Residual Interaction Modeling

In Residual Interaction Modeling (RIM), we use the regular Structural Equation Modeling (SEM) framework: \[ \pmb{\Sigma} = \pmb{\Lambda} \left( \pmb{I} - \pmb{B} \right)^{-1} \pmb{\Psi} \left( \pmb{I} - \pmb{B} \right)^{-1 \top} \pmb{\Lambda}^{\top} + \pmb{\Theta} \] And allow \(\pmb{\Theta}\) (possibly \(\pmb{\Psi}\) as well) to be modeled as a Gaussian Graphical Model

  • \(\pmb{\Theta}\) can be modeled to be dense while retaining positive degrees of freedom!
    • Allows for the estimation of a factor structure when local independence is structurally violated
  • Confirmatory testing is being implemented in the rim package (github.com/SachaEpskamp/rim)
    • Directly allows modeling of partial correlation networks
    • Without latent factors, this allows for confirmatory network fitting

\[ \pmb{\Sigma} = \pmb{\Lambda} \pmb{\Psi} \pmb{\Lambda}^{\top} + \pmb{\Theta} \]

  • Here I focus on exploratory estimation of the exploratory factor analysis model with residual network:
    • \(\pmb{B} = \pmb{0}\)
    • Dense \(\pmb{\Lambda}\)
    • Sparse \(\pmb{\Theta}^{-1}\)

Combining \(\pmb{y}\) and \(\pmb{\eta}\) in the same vector: \[ \pmb{u} = \begin{bmatrix} \pmb{y} \\ \pmb{\eta} \end{bmatrix} \sim N\left( \pmb{0}, \pmb{\Sigma}_{\pmb{u}} \right) \\ \]

And defining: \[ \pmb{A} = \begin{bmatrix} \pmb{0} & \pmb{\Lambda} \\ \pmb{0} & \pmb{0} \end{bmatrix}, \pmb{S} = \begin{bmatrix} \pmb{\Theta} & \pmb{0} \\ \pmb{0} & \pmb{\Psi} \end{bmatrix} \]

The model becomes: \[ \pmb{\Sigma}_{\pmb{u}} = \left( \pmb{I} - \pmb{A} \right)^{-1} \pmb{S} \left( \pmb{I} - \pmb{A} \right)^{-1\top} \]

The corresponding precision matrix is: \[ \pmb{\Omega}_{\pmb{u}} = \pmb{\Sigma}^{-1}_{\pmb{u}} = \left( \pmb{I} - \pmb{A} \right)^{\top} \pmb{S}^{-1} \left( \pmb{I} - \pmb{A} \right) \] \[ = \begin{bmatrix} \pmb{I} & \pmb{0} \\ -\pmb{\Lambda}^\top & \pmb{I} \end{bmatrix} \begin{bmatrix} \pmb{\Theta}^{-1} & \pmb{0} \\ \pmb{0} & \pmb{\Psi}^{-1} \end{bmatrix} \begin{bmatrix} \pmb{I} & -\pmb{\Lambda} \\ \pmb{0}& \pmb{I} \end{bmatrix} \] \[ = \begin{bmatrix} \pmb{\Theta}^{-1} & -\pmb{\Theta}^{-1}\pmb{\Lambda} \\ -\pmb{\Lambda}^\top\pmb{\Theta}^{-1} & \pmb{\Psi}^{-1} + \pmb{\Lambda}^\top\pmb{\Theta}^{-1}\pmb{\Lambda} \end{bmatrix} \] \[ = \begin{bmatrix} \pmb{\Omega}_{\pmb{u}}^{(11)} & \pmb{\Omega}_{\pmb{u}}^{(12)} \\ \pmb{\Omega}_{\pmb{u}}^{(21)} & \pmb{\Omega}_{\pmb{u}}^{(22)} \end{bmatrix} \]

  • \(\pmb{\Omega}_{\pmb{u}}^{(11)}\) is sparse
  • \(\pmb{\Omega}_{\pmb{u}}^{(12)} = \pmb{\Omega}_{\pmb{u}}^{(21)\top}\)

  • \(\pmb{\Omega}_{\pmb{u}}\) is the inverse of a covariance matrix of both observed and unobserved variables
  • Thus, \(\pmb{\Omega}_{\pmb{u}}\) is itself a Gaussian graphical model in which the interactions between observed variables are sparse
  • Estimation of such a model has been worked out by Chandrasekaran, Parrilo, and Willsky (2010)
  • In a series of papers discussing the work of Chandrasekaran, Parrilo, and Willsky, Yuan described a combination of the glasso algorithm and the EM algorithm to similarly estimate this model
    • This algorithm was called the lvglasso
    • Uses the glasso package in R, but was not yet implemented itself in R

Estimating model parameters

  • Run lvglasso to estimate \(\pmb{\Omega}_{\pmb{u}}\)
  • \(\pmb{\Theta}^{-1} = \pmb{\Omega}_{\pmb{u}}^{(11)}\)
  • \(\pmb{\Lambda} = - \pmb{\Theta} \pmb{\Omega}_{\pmb{u}}^{(12)}\)
  • \(\pmb{\Psi} = \left(\pmb{\Omega}_{\pmb{u}}^{(22)} - \pmb{\Lambda}^\top\pmb{\Theta}^{-1}\pmb{\Lambda}\right)^{-1}\)
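
A small numerical sketch of these three steps (my own illustration with made-up matrices, not the rim/lvglasso code itself): build \(\pmb{\Omega}_{\pmb{u}}\) from known parameters and recover them from its blocks.

# Toy model: 4 indicators, 1 latent variable
Lambda <- matrix(c(0.7, 0.6, 0.8, 0.5), 4, 1)
Theta  <- diag(0.5, 4)      # residual (co)variances
Psi    <- matrix(1, 1, 1)   # latent variance

# Precision matrix of (y, eta), following the derivation above:
Omega_u <- rbind(
  cbind(solve(Theta), -solve(Theta) %*% Lambda),
  cbind(-t(Lambda) %*% solve(Theta),
        solve(Psi) + t(Lambda) %*% solve(Theta) %*% Lambda))

# Recover the parameters from the blocks:
O11 <- Omega_u[1:4, 1:4]
O12 <- Omega_u[1:4, 5, drop = FALSE]
O22 <- Omega_u[5, 5, drop = FALSE]
Theta_hat  <- solve(O11)                  # Theta = (Omega_u^(11))^-1
Lambda_hat <- -Theta_hat %*% O12          # Lambda = -Theta Omega_u^(12)
Psi_hat    <- solve(O22 - t(Lambda_hat) %*% solve(Theta_hat) %*% Lambda_hat)
round(Lambda_hat, 2)                      # recovers the original loadings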

library("devtools")
install_github("sachaepskamp/rim")
library("rim")
Res <- EBIClvglasso(bfiCors, nrow(bfi), 5)

plot(Res, "network")

Factor loadings

plot(Res, "loadings", rotation = promax)

Residual interactions

plot(Res, "residpcors", nodeNames = Names,
     groups = Groups, legend.cex =0.3)

Longitudinal Analysis

  • In longitudinal analysis, we measure persons over time
  • Experience sampling method
  • Often a few times per day
  • In psychology, data is limited
    • You can't measure a person too often
    • You can't measure a person too long when assuming a stable structure
  • Solutions:
    • Clinical setting: LASSO
    • Scientific setting: Multi-level
    • Allowing parameters to change over time (work by Laura Bringmann)


Solutions to the Limited Data Problem

  • Only model temporal effects between consecutive measurements
    • Lag-1
  • Assume both the temporal and contemporaneous effects are sparse
    • Relatively few edges in both networks
  • To do this, we use the graphical VAR model (Wild et al. 2010)
    • Estimation via LASSO regularization, using BIC to select optimal tuning parameter (Rothman, Levina, and Zhu 2010; Abegaz and Wit 2013).
  • We implemented these methods in the R package graphicalVAR (cran.r-project.org/package=graphicalVAR)

Graphical VAR

\[ \begin{aligned} \pmb{y}_t &= \pmb{B} \pmb{y}_{t-1} + \pmb{\varepsilon}_t \\ \pmb{\varepsilon}_t &\sim N(\pmb{0}, \pmb{K}^{-1}) \\ \mathrm{Cov}\left( \pmb{\varepsilon}_t, \pmb{\varepsilon}_{t+i} \right) &= \pmb{O} \, \forall i \not= 0 \end{aligned} \]

  • LASSO is used to regularize both \(\pmb{B}\) and \(\pmb{K}\)
  • Afterwards, both matrices are standardized

Empirical Example

Data collected by Date C. Van der Veen, in collaboration with Harriette Riese and Renske Kroeze.

  • Patient suffering from panic disorder and depressive symptoms
    • Perfectionist
  • Measured over a period of two weeks
  • Five times per day
  • Items were chosen after intake together with therapist

Feeling worthless interacts with feeling helpless

Feeling stressed interacts with feeling the need to do things

Central node: Feeling sad

Cycle of enjoyment, feeling sad, feeling worthless and being active

Having had to do things leads to letting important things pass

Fear of panic attack is not connected

Simulation Study

Recommendations

Graphical VAR can be used in a clinical setting with:

  • At least 30 measurements
    • More than two measurements per day for a two-week period
  • At most 10 nodes
    • Unless the number of measurements is high

graphicalVAR

library("graphicalVAR")

# Real matrices:
Kappa <- diag(1,3,3)
Kappa[1,2] <- Kappa[2,1] <- Kappa[2,3] <- Kappa[3,2] <- -0.5
Beta <- diag(0.4,3,3)
Beta[1,3] <- -0.5

# Simulate data:
set.seed(1)
Data <- graphicalVARsim(100,Beta,Kappa)

# Estimate model:
Res <- graphicalVAR(Data, gamma = 0, nLambda = 20, verbose = FALSE)

# Plot results:
plot(Res)

Beta
##      [,1] [,2] [,3]
## [1,]  0.4  0.0 -0.5
## [2,]  0.0  0.4  0.0
## [3,]  0.0  0.0  0.4
round(Res$beta,2)
##      [,1] [,2]  [,3]
## [1,] 0.31 0.00 -0.40
## [2,] 0.00 0.22  0.00
## [3,] 0.00 0.00  0.14

Kappa
##      [,1] [,2] [,3]
## [1,]  1.0 -0.5  0.0
## [2,] -0.5  1.0 -0.5
## [3,]  0.0 -0.5  1.0
round(Res$kappa,2)
##       [,1]  [,2]  [,3]
## [1,]  1.98 -0.79  0.00
## [2,] -0.79  1.71 -0.65
## [3,]  0.00 -0.65  1.37

Multi-level VAR

  • When multiple persons are measured over a period of time, multi-level VAR can be used
  • Estimates fixed effects temporal networks as well as variation due to individual differences
  • Explained by Bringmann et al. (2013)
  • Implemented in mlVAR (https://github.com/SachaEpskamp/mlVAR)

Multi-level VAR

Lag-1 model

Level 1: \[ \pmb{y}_t^{(p)} = \pmb{B}^{(p)} \pmb{y}_ {t-1}^{(p)} + \pmb{\varepsilon}_t^{(p)} \]

Level 2: \[ \begin{aligned} \pmb{\beta}_ {ij}^{(p)} &= b_{ij} + u^{(p)}_{ij} \\ u^{(p)}_{ij} &\sim N(0, \sigma_{ij}) \end{aligned} \]

Multi-level VAR

Lag-2 model

Level 1: \[ \pmb{y}_t^{(p)} = \pmb{B}_1^{(p)} \pmb{y}_ {t-1}^{(p)} + \pmb{B}_2^{(p)} \pmb{y}_ {t-2}^{(p)} + \pmb{\varepsilon}_t^{(p)} \]

Level 2: \[ \begin{aligned} \pmb{\beta}_ {lij}^{(p)} &= b_{lij} + u^{(p)}_{lij} \\ u^{(p)}_{lij} &\sim N(0, \sigma_{lij}) \end{aligned} \]

Multi-level VAR

Treatment effects

Level 1: \[ \pmb{y}_t^{(p)} = \pmb{B}^{(p)} \pmb{y}_ {t-1}^{(p)} + \pmb{\varepsilon}_t^{(p)} \]

Level 2: \[ \begin{aligned} \pmb{\beta}_ {ij}^{(p)} &= b_{1ij} + d^{(p)}_{post} b_{2ij} + d^{(p)}_{treatment} b_{3ij} + u^{(p)}_{ij} \\ u^{(p)}_{ij} &\sim N(0, \sigma_{ij}) \end{aligned} \]

Here patients are measured before and after treatment, in which they either got real or placebo treatment. \(d^{(p)}_{post}\) is a dummy encoding \(0\) for time points before treatment and \(1\) for time points after treatment and \(d^{(p)}_{treatment}\) is a dummy encoding \(1\) if someone received treatment and \(0\) otherwise.

library("mlVAR")
# True L1 structure:
L1 <- matrix(c(
  0.5,0.25,0,
  0,0.5,0.25,
  0,0,0.5),3,3,byrow=TRUE)

# True L2 structure:
L2 <- matrix(c(
  0.1,0,-0.2,
  0,0.1,0,
  0,0,0.1),3,3,byrow=TRUE)

# Error variance:
error <- 0.1

# Number of subjects:
Np <- 10

# Number of measurements per subject:
Nt <- 30

set.seed(1)
# Generate random effects:
L1_RF <- lapply(1:Np, function(x) L1 + rnorm(prod(dim(L1)),0,error))
L2_RF <- lapply(1:Np, function(x) L2 + rnorm(prod(dim(L1)),0,error))

# Generate data:
Data <- do.call(rbind,lapply(1:Np, function(p) {
  subjectData <- simulateVAR(list(L1_RF[[p]], L2_RF[[p]]), 1:2, 100)
  names(subjectData) <- paste0("x",1:3)
  subjectData$ID <- p
  subjectData
}))

# Run analysis:
Res <- mlVAR(Data, 
             vars = c("x1","x2","x3"), 
             idvar = "ID", 
             lags = c(1, 2))

library("qgraph")
layout(t(1:2))
qgraph(t(L1), labels = paste0('x',1:3), layout = "circle", title = "True Lag-1", 
       diag = TRUE, mar = c(8,8,8,8))
plot(Res, "fixed", lag=1, labels = paste0('x',1:3), layout = "circle", 
     title = "Estimated Lag-1", mar = c(8,8,8,8))

layout(t(1:2))
qgraph(t(L2), labels = paste0('x',1:3), layout = "circle", title = "True Lag-2", 
       diag = TRUE, mar = c(8,8,8,8))
plot(Res, "fixed", lag=2, labels = paste0('x',1:3), layout = "circle", 
     title = "Estimated Lag-2", mar = c(8,8,8,8))

Network Psychometrics Ecosystem

Thank you for your attention!

References

Abegaz, Fentaw, and Ernst Wit. 2013. “Sparse Time Series Chain Graphical Models for Reconstructing Genetic Networks.” Biostatistics. Biometrika Trust, kxt005.

Borkulo, Claudia D van, Denny Borsboom, Sacha Epskamp, Tessa F Blanken, Lynn Boschloo, Robert A Schoevers, and Lourens J Waldorp. 2014. “A New Method for Constructing Networks from Binary Data.” Scientific Reports 4. Nature Publishing Group: Article number: 5918.

Bringmann, Laura F, Nathalie Vissers, Marieke Wichers, Nicole Geschwind, Peter Kuppens, Frenk Peeters, Denny Borsboom, and Francis Tuerlinckx. 2013. “A Network Approach to Psychopathology: New Insights into Clinical Longitudinal Data.” PloS One 8 (4). Public Library of Science: e60188.

Epskamp, Sacha, Angélique OJ Cramer, Lourens J Waldorp, Verena D Schmittmann, and Denny Borsboom. 2012. “Qgraph: Network Visualizations of Relationships in Psychometric Data.” Journal of Statistical Software 48 (4): 1–18.

Rothman, Adam J, Elizaveta Levina, and Ji Zhu. 2010. “Sparse Multivariate Regression with Covariance Estimation.” Journal of Computational and Graphical Statistics 19 (4). Taylor & Francis: 947–62.

Wild, Beate, Michael Eichler, Hans-Christoph Friederich, Mechthild Hartmann, Stephan Zipfel, and Wolfgang Herzog. 2010. “A Graphical Vector Autoregressive Modelling Approach to the Analysis of Electronic Diary Data.” BMC Medical Research Methodology 10 (1). BioMed Central Ltd: 28.

Zhao, Tuo, Han Liu, Kathryn Roeder, John Lafferty, and Larry Wasserman. 2012. “The Huge Package for High-Dimensional Undirected Graph Estimation in R.” The Journal of Machine Learning Research 13 (1). JMLR. org: 1059–62.