Disclaimer

  • These sheets contain mathematics to illustrate properties of Markov Random Fields (undirected networks)
    • As well as for reference
  • Without a background in mathematical statistics, some concepts might be hard to follow
    • And that is ok!
    • I will not ask you to reproduce the mathematics
  • The main things you should take from this lecture are:
    • Understand what Markov Random Fields are
    • Understand what independence relations they imply
    • Understand how the edge weights should be interpreted
    • Understand how they are estimated
      • As we will see on Thursday, the R code is simple!
  • Be there on Thursday!

Recap

Causality

[Figure]

Causal structures

[Figure]

[Figure]

What if we don't know the structure?

[Figure]

Are the two nodes independent given any set of other nodes (including the empty set)?

  • Yes! They are independent to begin with!
  • Draw no edge between Easiness of Class and Intelligence

[Figure]

Are the two nodes independent given any set of other nodes (including the empty set)?

  • No!
  • Draw an edge between Easiness of Class and Grade

[Figure]

Are the two nodes independent given any set of other nodes (including the empty set)?

  • No!
  • Draw an edge between Grade and Intelligence

[Figure]

[Figure]

Is the middle node in the set that separates the other two nodes?

  • Yes!
  • Do nothing

[Figure]

Is the middle node in the set that separates the other two nodes?

  • No!
  • Grade is a collider between Easiness of Class and Intelligence

[Figure]

  • Do we now know the direction of the edge between Intelligence and IQ?
    • No!

[Figure]

  • Do we now know the direction of the edge between Grade and Diploma?
    • Yes! Grade was not a collider!

[Figure]

Equivalent models

[Figure]

Outline

  • Testing Directed Networks
    • Structural Equation Modeling
    • Problems with directed networks
  • Undirected networks / Markov Random Fields
  • Undirected Networks as generating structures
  • Important Markov Random Fields
    • Ising Model
    • Gaussian Random Field
  • Examples
    • Radicalization
    • Quality of Life

Testing Directed Networks

Directed Acyclic Graphs

  • A DAG implies a set of independence relations, which can be tested
  • If the data are assumed multivariate Gaussian:
    • Each variable is normally distributed
    • Relationships between variables are linear
  • Then correlations or covariances can be used to test for dependencies, and partial correlations or partial covariances can be used to test for conditional dependencies

[Figure]

  • \(\mathrm{Cov}\left( A, C \right) \not=0\)
  • \(\mathrm{Cov}\left( A, C \mid B\right) =0\)
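A minimal simulated illustration of these two claims (assuming, for concreteness, the chain \(A \rightarrow B \rightarrow C\) with unit-variance noise): the marginal covariance of \(A\) and \(C\) is nonzero, while their covariance given \(B\) vanishes:

set.seed(1)
A <- rnorm(1e5)
B <- A + rnorm(1e5)
C <- B + rnorm(1e5)

cov(A, C)   # clearly nonzero: A and C are marginally dependent

# Conditional covariance given B, via the covariance of regression residuals:
cov(resid(lm(A ~ B)), resid(lm(C ~ B)))   # approximately 0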

Structural Equation Modeling

  • In SEM, the model-implied variance-covariance matrix is compared to the observed variance-covariance matrix
  • If multivariate normality holds, then the Schur complement shows that any partial covariance can be expressed solely in terms of variances and covariances (a numeric check follows this list):
    • \(\mathrm{Cov}\left( Y_i, Y_j \mid X = x \right) = \mathrm{Cov}\left( Y_i, Y_j \right) - \mathrm{Cov}\left( Y_i, X \right) \mathrm{Var}\left( X \right)^{-1} \mathrm{Cov}\left(X, Y_j \right)\)
  • Thus, a specific structure of the covariance matrix also implies a model for all possible partial covariances
    • If the covariance matrix implied by the SEM exactly matches the observed covariance matrix, then the data contain all d-separations that are implied by the causal model
    • In that case, the model could have generated the data!
    • But this does not mean the model is correct
      • Equivalent models could have generated the same data!
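A quick numeric check of the Schur complement identity (a sketch with three simulated Gaussian variables; the formula agrees with the covariance of the residuals from regressing both outcomes on \(X\)):

set.seed(1)
X  <- rnorm(1e5)
Y1 <- 0.5 * X + rnorm(1e5)
Y2 <- 0.3 * X + rnorm(1e5)

# Partial covariance via the Schur complement:
cov(Y1, Y2) - cov(Y1, X) * (1 / var(X)) * cov(X, Y2)

# The same quantity from regression residuals:
cov(resid(lm(Y1 ~ X)), resid(lm(Y2 ~ X)))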

Doosje, B., Loseman, A., & Bos, K. (2013). Determinants of radicalization of Islamic youth in the Netherlands: Personal uncertainty, perceived injustice, and perceived group threat. Journal of Social Issues, 69(3), 586-604.

Equivalent model

MacCallum, R. C., Wegener, D. T., Uchino, B. N., & Fabrigar, L. R. (1993). The problem of equivalent models in applications of covariance structure analysis. Psychological bulletin, 114(1), 185.

More equivalent models

What does pcalg come up with?

Data <- read.csv(text = '
In-group Identification;identification; 4.56; 0.85 
Individual Deprivation;in_dep; 2.39; 0.81 
Collective Deprivation;col_dep; 3.31; 0.92 
Intergroup Anxiety;anxiety; -0.20; 0.17 
Symbolic Threat;sym_threat; 3.46; 0.76 
Realistic Threat;real_threat; 3.10; 0.88 
Personal Emotional Uncertainty;uncertainty; 2.84; 0.67 
Perceived Injustice;injustice; 2.38; 0.68 
Perceived Illegitimacy authorities;illegitemacy; 2.37; 0.02 
Perceived In-group superiority;superiority; 3.26; 0.93 
Distance to Other People;distance; 2.32; 0.66 
Societal Disconnected;disconnected; 2.79; 0.96 
Attitude towards Muslim Violence;muslim_violence; 2.89; 1.06 
Own Violent Intentions;own_violent; 2.08; 0.91
', sep = ";", header = FALSE)
names(Data) <- c("name","var","mean", "sd")

library("lavaan")
corMat <- getCov('
1 -.19 .08  -.25 .42 .07 .08  -.06  -.28 .09  -.17  -.25  -.04  -.07 
1 .49 .36 .23 .50 .21 .50 .25 .12 .17 .21 .12 .09 
1 .11 .54 .62 .26 .38 .21 .31 .18 .09 .20 .10 
1 .01 .15 .19 .21 .35 .08 .22 .26 .18 .14 
1 .64 .21 .24 .07 .39 .01 .04 .17  -.01 
1 .27 .34 .16 .35 .19 .14 .26 .16 
1 .10 .08 .29 .18 .00 .30 .14 
1 .15 .01 .03 .23 .04 .06 
1 .22 .17 .35 .35 .24 
1 .34 .08 .53 .30 
1 .08 .44 .39 
1 .24 .00 
1 .47 
1
',lower=FALSE)
covmat <- cor2cov(corMat, Data$sd)
colnames(covmat) <- rownames(covmat) <- Data$var

library("pcalg")
pc.fit <- pc(suffStat = list(C = cov2cor(covmat), n = 131), # gaussCItest expects a correlation matrix
             indepTest = gaussCItest,
             alpha = 0.05, labels = as.character(Data$name))

library("qgraph")
qgraph(pc.fit, vsize = 15, vsize2 = 5, shape = "ellipse", repulsion = 0.7)

[Figure: graph estimated by pc, drawn with qgraph]

Does it fit?

Mod <-'
muslim_violence ~ own_violent + distance + disconnected + superiority
disconnected ~ identification
superiority ~ sym_threat
sym_threat ~ col_dep + identification
col_dep ~ in_dep
real_threat ~ in_dep + sym_threat
in_dep ~~ injustice
'

fit <- sem(Mod, sample.cov = covmat, sample.nobs = 131)

Does it fit?

round(fitMeasures(fit)[c('chisq','df', 'pvalue','cfi','nfi','rmsea','rmsea.ci.lower','rmsea.ci.upper')], 2)
##          chisq             df         pvalue            cfi            nfi 
##          80.52          39.00           0.00           0.89           0.81 
##          rmsea rmsea.ci.lower rmsea.ci.upper 
##           0.09           0.06           0.12
  • Not really…

Directed Networks

  • Modifying a causal network until it fits can lead to over-fitting
    • Many equivalent models
    • If your goal is to test a specific hypothesis (as was the goal of Doosje et al.), then SEM is a good choice. For more exploratory research, different methods should be used.
  • Causal search algorithms, on the other hand, can lack power
    • In my experience, they rarely return an interpretable model
  • Both methods assume the true model is a DAG!
    • The arrows can be misleading: they imply a specific effect of doing something, even though we only know what happens when we condition on something
  • Equivalent models are also a huge problem for network analysis

  • Would, theoretically, Do( Attitude towards Muslim Violence ) really not influence any of the predictors?

Equivalent Models and Centrality

[Figure]

[Figure]

Markov Random Fields

Markov Random Fields

  • A pairwise Markov Random Field (MRF) is an undirected network
  • Two nodes are connected if they are not independent conditional on all other nodes
  • More importantly, two nodes are NOT connected if they are independent conditional on all other nodes:
  • \(X_i \!\perp\!\!\!\perp X_j \mid \boldsymbol{X}^{-(i,j)} = \boldsymbol{x}^{-(i,j)} \iff (i,j) \not\in E\)
  • A node separates two other nodes if it lies on all paths between them
  • No equivalent models!
    • The saturated model is clearly a fully connected network
  • Naturally cyclic!

[Figure]

  • \(B\) separates \(A\) and \(C\)
  • \(A \!\perp\!\!\!\perp C \mid B\)

[Figure]

  • Worrying and fatigue separate Insomnia and Concentration

Markov Random Fields in Other Sciences

  • MRFs have been used in many scientific disciplines

Interpreting a Markov Random Field

The edges in an MRF can be interpreted in several ways:

  • Predictive effects
  • A representation of conditional independence relationships
  • Pairwise interactions
  • Genuine symmetric relationships between nodes
    • Ising Model

Predictive Effects

[Figure]

If this model is the generating model:

  • Does \(A\) predict \(B\)?
    • Yes!
  • Does \(B\) predict \(A\)?
    • Yes!
  • Does \(A\) predict \(B\) just as well as \(B\) predicts \(A\)?
    • Using linear or logistic regression, yes!

# Generate data (binary):
A <- sample(c(0,1), 10000, replace = TRUE)
B <- 1 * (runif(10000) < ifelse(A==1, 0.8, 0.2))

# Predict A from B (logistic regression):
AonB <- glm(A ~ B, family = "binomial")
coef(AonB)
## (Intercept)           B 
##      -1.443       2.873
# Predict B from A (logistic regression):
BonA <- glm(B ~ A, family = "binomial")
coef(BonA)
## (Intercept)           A 
##      -1.440       2.873
  • The logistic regression parameters are equal!

Predictive Effects

[Figure]

  • \(A\) predicts \(B\) and \(B\) predicts \(A\)

Predictive Effects

[Figure]

If this model is the generating model:

  • Does \(A\) predict \(C\), or vice versa?
    • Yes, they are correlated
  • Does \(A\) predict \(C\), or vice versa, when also taking \(B\) into account?
    • No!
  • In a multiple (logistic) regression, \(C\) should not predict \(A\) when \(B\) is also included as a predictor

# Generate data (Gaussian):
A <- rnorm(10000)
B <- A + rnorm(10000)
C <- B + 2*rnorm(10000)

# Predict A from C:
AonC <- lm(A ~ C)
summary(AonC)
## 
## Call:
## lm(formula = A ~ C)
## 
## Residuals:
##    Min     1Q Median     3Q    Max 
## -3.521 -0.617 -0.012  0.624  3.166 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  0.01060    0.00914    1.16     0.25    
## C            0.17084    0.00375   45.56   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.914 on 9998 degrees of freedom
## Multiple R-squared:  0.172,  Adjusted R-squared:  0.172 
## F-statistic: 2.08e+03 on 1 and 9998 DF,  p-value: <2e-16

# Predict A from B and C:
AonBC <- lm(A ~ B + C)
summary(AonBC)
## 
## Call:
## lm(formula = A ~ B + C)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -2.9409 -0.4774 -0.0027  0.4785  2.9478 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  0.00464    0.00711    0.65     0.51    
## B            0.49992    0.00620   80.69   <2e-16 ***
## C            0.00417    0.00358    1.17     0.24    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.711 on 9997 degrees of freedom
## Multiple R-squared:  0.498,  Adjusted R-squared:  0.498 
## F-statistic: 4.97e+03 on 2 and 9997 DF,  p-value: <2e-16

[Figure]

  • \(A\) predicts \(B\) better than \(C\) predicts \(B\)
  • The relationship between \(A\) and \(C\) is mediated by \(B\)

Conditional Independence Relationships

[Figures: DAG structures and their corresponding MRFs]

  • An MRF cannot represent the exact independence relations implied by a collider structure
    • Three edges are needed instead of two
  • However, exogenous variables are commonly modeled as correlated anyway

[Figure]

[Figure]

  • MRFs can also represent independence relations that DAGs cannot.

[Figure]

[Figure]

  • If we could condition on \(\eta\)

[Figure]

  • If we could not condition on \(\eta\)
  • Equivalent models
    • Data generated as a cluster of interacting components can fit a factor model perfectly!

The Ising Model

The Ising Model

Markov Random Fields as Generating Structure

[Figure]

  • In the Ising model, we could physically hold one node (a magnet) in a given state (\(\mathrm{Do}(A)\)), which causes adjacent nodes to "flip" with the same probability as if we had conditioned on \(A\):
    • \(\Pr\left( B \mid \mathrm{Do}(A) \right) = \Pr\left( B \mid A \right)\)
    • \(\Pr\left( A \mid \mathrm{Do}(B) \right) = \Pr\left( A \mid B \right)\)
  • A symmetric relationship that cannot be represented in a DAG
  • A real relationship that occurs in physics

Parameterizing Markov Random Fields

The Ising Model Probability distribution

\[ \Pr\left( \boldsymbol{X} = \boldsymbol{x} \right) = \frac{1}{Z} \exp \left( \sum_i \tau_i x_i + \sum_{<ij>} \omega_{ij} x_i x_j \right) \]

  • All \(X\) variables can typically take the values \(-1\) and \(1\)
  • \(\tau_i\) is called the threshold parameter and denotes the tendency for node \(i\) to be in some state
  • \(\omega_{ij}\) is called the network parameter and denotes the preference for nodes \(i\) and \(j\) to be in the same state
    • Edge weights
  • \(Z\) is a normalizing constant (partition function) and takes the sum over all possible configurations of \(\pmb{X}\):
    • \(Z = \sum_{\boldsymbol{x}} \exp \left( \sum_i \tau_i x_i + \sum_{<ij>} \omega_{ij} x_i x_j \right)\)

In matrix form the Ising Model probability distribution is proportional to: \[ \Pr\left(\pmb{X} = \pmb{x}\right) \propto \exp\left( \pmb{\tau}^\top \pmb{x} + \frac{1}{2} \pmb{x}^\top\pmb{\Omega}\pmb{x} \right) \]

  • \(\pmb{x}\) is a vector of binary variables (\(-1\) or \(1\))
  • \(\pmb{\tau}\) is a vector containing threshold parameters
  • \(\pmb{\Omega}\) is a matrix containing the network parameters, with an arbitrary diagonal
    • A weights matrix that encodes a network
      • \(\omega_{ij} = 0\) means that there is no edge between nodes \(i\) and \(j\)
      • A positive weight of a given magnitude is comparable in strength to a negative weight of the same magnitude

[Figure]

\[ \boldsymbol{\Omega} = \begin{bmatrix} 0 & \omega_{12} & 0\\ \omega_{12} & 0 & \omega_{23}\\ 0 & \omega_{23} & 0\\ \end{bmatrix}, \boldsymbol{\tau} = \begin{bmatrix} \tau_1 \\ \tau_2 \\ \tau_3 \end{bmatrix} \]

[Figure]

\[ \boldsymbol{\Omega} = \begin{bmatrix} 0 & 0.5 & 0\\ 0.5 & 0 & 0.5\\ 0 & 0.5 & 0\\ \end{bmatrix}, \boldsymbol{\tau} = \begin{bmatrix} -0.1 \\ -0.1 \\ -0.1 \end{bmatrix} \]
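As a small illustration, this weights matrix can be drawn directly with qgraph (already used earlier in these sheets); edge width reflects the absolute weight:

library("qgraph")

Omega <- matrix(c(0,   0.5, 0,
                  0.5, 0,   0.5,
                  0,   0.5, 0), 3, 3)

# Draw the chain X1 - X2 - X3:
qgraph(Omega, labels = c("X1", "X2", "X3"), layout = "circle")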

\[ \Pr\left( \boldsymbol{X} = \boldsymbol{x} \right) = \frac{1}{Z} \exp \left( \sum_i \tau_i x_i + \sum_{<ij>} \omega_{ij} x_i x_j \right) \]

We can compute the unnormalized probability that all nodes are \(1\):

\[ \exp\left( -0.1 - 0.1 - 0.1 + 0.5 + 0.5 \right) = \exp(0.7) \approx 2.0138 \]

  • We will call this the potential for the nodes to be in this state
  • Summing the potential of every possible state gives the normalizing constant \(Z\)
  • Which can then be used to compute the probabilities

What if we observe \(X_2 = 1\)?

[Figure]

  • \(\left(0.2139 + 0.4761\right) \times \left(0.4761 + 0.2139\right) = 0.69 \times 0.69 = 0.4761\)
  • \(\Pr\left(X_1 = 1, X_3 = 1 \mid X_2 = 1 \right) = \Pr\left(X_1 = 1 \mid X_2 = 1 \right) \Pr\left( X_3 = 1 \mid X_2 = 1 \right)\)
  • \(X_1\) and \(X_3\) are conditionally independent given \(X_2\)!
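These numbers can be reproduced by brute-force enumeration of all \(2^3\) states (a short sketch using the \(\pmb{\tau}\) and \(\pmb{\Omega}\) from the example above):

tau   <- c(-0.1, -0.1, -0.1)
Omega <- matrix(c(0,   0.5, 0,
                  0.5, 0,   0.5,
                  0,   0.5, 0), 3, 3)

# All 2^3 configurations of (-1, 1):
states <- as.matrix(expand.grid(x1 = c(-1, 1), x2 = c(-1, 1), x3 = c(-1, 1)))

# Potential of each state and the normalizing constant Z:
potential <- apply(states, 1, function(x) exp(sum(tau * x) + 0.5 * x %*% Omega %*% x))
Z     <- sum(potential)
probs <- potential / Z

# Condition on X2 = 1: the joint probability factorizes
sub <- states[, 2] == 1
p   <- probs[sub] / sum(probs[sub])
sum(p[states[sub, 1] == 1 & states[sub, 3] == 1])            # ~ 0.4761
sum(p[states[sub, 1] == 1]) * sum(p[states[sub, 3] == 1])    # ~ 0.69 * 0.69 = 0.4761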

Conditional Distribution for the Ising Model

The conditional distribution for node \(i\) given that we observe all other nodes is: \[ \Pr\left(X_i = x_i \mid \pmb{X}^{-(i)} = \pmb{x}^{-(i)} \right) \propto \exp\left( \left(\tau_i + \sum_{j,j\not=i} \omega_{ij} x_j \right) x_i\right) \]

  • This is a multiple logistic regression model!
  • Logistic regression is the most common model for predicting a binary variable (dependent variable) from a set of other variables (independent variables)
  • The Ising model is thus a combination of predictive models: edges represent how strongly one node predicts another
    • A path between two nodes means that their predictive strength is mediated by the nodes along the path
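To see this in action, here is a minimal sketch (the 3-node chain with the \(\pmb{\tau}\) and \(\pmb{\Omega}\) used earlier is an arbitrary choice): sample from the Ising model with a simple Gibbs sampler, then fit a logistic regression per node. With \(-1\)/\(1\) coding the conditional logit equals \(2(\tau_i + \sum_{j \neq i} \omega_{ij} x_j)\), so halving the glm coefficients recovers the parameters:

set.seed(1)
tau   <- c(-0.1, -0.1, -0.1)
Omega <- matrix(c(0,   0.5, 0,
                  0.5, 0,   0.5,
                  0,   0.5, 0), 3, 3)

# Gibbs sampler: repeatedly resample each node given the current state of the others
n <- 20000
x <- c(1, 1, 1)
X <- matrix(NA, n, 3)
for (s in 1:n) {
  for (i in 1:3) {
    logit <- 2 * (tau[i] + sum(Omega[i, -i] * x[-i]))
    x[i]  <- ifelse(runif(1) < plogis(logit), 1, -1)
  }
  X[s, ] <- x
}

# Node-wise logistic regression; halved coefficients recover tau_1, omega_12, omega_13
fit <- glm(I((X[, 1] + 1) / 2) ~ X[, 2] + X[, 3], family = "binomial")
coef(fit) / 2   # approximately -0.1, 0.5, 0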

Combination of Logistic models

[Figure]

Ising Model as Combination of Logistic models

[Figure]

\[ \Pr\left( X_1 = 1 \mid \pmb{x}^{-(1)} \right) \propto \exp\left(\tau_1 + \omega_{12} x_2 + \omega_{14} x_4\right) \]

Ising Model as Combination of Logistic models

[Figure]

\[ \Pr\left( X_2 = 1 \mid \pmb{x}^{-(2)} \right) \propto \exp\left(\tau_2 + \omega_{12} x_1 + \omega_{23} x_3\right) \]

Ising Model as Combination of Logistic models

[Figure]

\[ \Pr\left( X_3 = 1 \mid \pmb{x}^{-(3)} \right) \propto \exp\left(\tau_3 + \omega_{23} x_2 + \omega_{34} x_4\right) \]

Ising Model as Combination of Logistic models

[Figure]

\[ \Pr\left( X_4 = 1 \mid \pmb{x}^{-(4)} \right) \propto \exp\left(\tau_4 + \omega_{14} x_1 + \omega_{34} x_3\right) \]

Continuous Data

If \(\pmb{x}\) is not binary but assumed Gaussian we can use a multivariate Gaussian distribution: \[ f\left( \pmb{X} = \pmb{x} \right) = \frac{1}{\sqrt{(2\pi)^k|\boldsymbol\Sigma|}} \exp\left(-\frac{1}{2}({\boldsymbol x}-{\boldsymbol\mu})^{\top}{\boldsymbol\Sigma}^{-1}({\boldsymbol x}-{\boldsymbol\mu}) \right) \]

  • \(\pmb{\mu}\) is a vector that encodes the means
  • \(\boldsymbol{\Sigma}\) is the variance-covariance matrix
  • Now we can rearrange:
    • \(f\left( \pmb{X} = \pmb{x} \right) \propto \exp\left(-\frac{1}{2}({\boldsymbol x}-{\boldsymbol\mu})^{\top}{\boldsymbol\Sigma}^{-1}({\boldsymbol x}-{\boldsymbol\mu})\right)\)
    • \(f\left( \pmb{X} = \pmb{x} \right) \propto \exp\left(-\frac{1}{2}( {\boldsymbol x}^\top {\boldsymbol\Sigma}^{-1} {\boldsymbol x} - {\boldsymbol x}^\top {\boldsymbol\Sigma}^{-1} {\boldsymbol \mu} - {\boldsymbol \mu}^\top {\boldsymbol\Sigma}^{-1}{\boldsymbol x} + {\boldsymbol \mu}^\top{\boldsymbol\Sigma}^{-1}{\boldsymbol \mu})\right)\)
    • \(f\left( \pmb{X} = \pmb{x} \right) \propto \exp\left( {\boldsymbol \mu}^\top {\boldsymbol\Sigma}^{-1}{\boldsymbol x} -\frac{1}{2} {\boldsymbol x}^\top {\boldsymbol\Sigma}^{-1} {\boldsymbol x}\right)\)
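A numeric sanity check of this rearrangement (a sketch using the mvtnorm package; \(\boldsymbol{\Sigma}\) and \(\boldsymbol{\mu}\) are arbitrary choices): the log-density minus the exponent \({\boldsymbol \mu}^\top {\boldsymbol\Sigma}^{-1}{\boldsymbol x} -\frac{1}{2} {\boldsymbol x}^\top {\boldsymbol\Sigma}^{-1} {\boldsymbol x}\) is the same constant for every \(\pmb{x}\), confirming the proportionality:

library("mvtnorm")
set.seed(1)

Sigma <- matrix(c(1,   0.5, 0.2,
                  0.5, 1,   0.4,
                  0.2, 0.4, 1), 3, 3)
mu <- c(1, -1, 0.5)
K  <- solve(Sigma)

x <- rmvnorm(5, mean = mu, sigma = Sigma)   # five arbitrary points
apply(x, 1, function(xi)
  dmvnorm(xi, mean = mu, sigma = Sigma, log = TRUE) -
    (sum((K %*% mu) * xi) - 0.5 * xi %*% K %*% xi))
# all five values are identical: the remainder does not depend on x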

Gaussian Random Field

Reparameterizing \(\pmb{\tau} = \boldsymbol{\Sigma}^{-1} {\boldsymbol \mu}\) and \(\pmb{\Omega} = -\boldsymbol{\Sigma}^{-1}\), we obtain the following expression for the multivariate normal distribution:

\[ f\left(\pmb{x}\right) \propto \exp\left( \pmb{\tau}^\top \pmb{x} + \frac{1}{2} \pmb{x}^\top\pmb{\Omega}\pmb{x} \right) \]

  • Exactly the same form as the Ising Model!!!
  • Except:
    • \(\pmb{x}\) is now continuous
    • The normalizing constant is different
  • The multivariate normal distribution encodes a network
    • This network is called a Gaussian Random Field (GRF), Concentration Graph, Gaussian Graphical Model or Partial Correlation Network.

Gaussian Random Field

  • The inverse of the covariance matrix of the multivariate normal distribution is called the precision matrix; its negative, \(\pmb{\Omega} = -\boldsymbol{\Sigma}^{-1}\), encodes a network
  • Through some mathematical magic, the standardized elements of \(\pmb{\Omega}\) equal the partial correlation coefficients between nodes conditioned on all other nodes (verified in the sketch below):
    • \(\rho_{ij} = \omega_{ij} / \sqrt{\omega_{ii}\omega_{jj}} = \mathrm{Cor}\left( X_i, X_j \mid \pmb{X}^{-(i,j)} = \pmb{x}^{-(i,j)} \right)\)
  • Typically these partial correlation coefficients are used as the weights matrix of the corresponding network structure
    • There is no edge between nodes \(i\) and \(j\) if \(\rho_{ij} = 0\)
    • This corresponds exactly to the Markov property that two unconnected nodes are independent conditioned on all other nodes in the network
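A minimal R sketch of this relation (the covariance matrix below is an arbitrary example): the partial correlation computed from \(\pmb{\Omega}\) matches the correlation of regression residuals in simulated data:

library("mvtnorm")   # for rmvnorm
set.seed(1)

Sigma <- matrix(c(1,   0.5, 0.2,
                  0.5, 1,   0.4,
                  0.2, 0.4, 1), 3, 3)
Omega <- -solve(Sigma)   # negative precision matrix

# Standardize: rho_ij = omega_ij / sqrt(omega_ii * omega_jj)
pcors <- Omega / sqrt(diag(Omega) %o% diag(Omega))
diag(pcors) <- 0
pcors[1, 2]

# Check against residual correlations in a large simulated sample:
X <- rmvnorm(1e5, sigma = Sigma)
cor(resid(lm(X[, 1] ~ X[, 3])), resid(lm(X[, 2] ~ X[, 3])))   # ~ pcors[1, 2]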

Conditional Distribution for the Gaussian Random Field

If we observe the value of every node except node \(i\), we obtain the following (assuming all variables are standardized):

\[ \begin{aligned} X_i \mid \pmb{X}^{-(i)} \sim N\left( \sum_{j\not=i} \frac{\omega_{ij}}{\theta^2_i} x_j, \frac{1}{\theta^2_{i}} \right) \end{aligned} \]

In which \(\theta^2_{i}\) is the \(i\)th diagonal element of the precision matrix; its inverse, \(1/\theta^2_i\), is the residual variance.

  • This is a multiple linear regression model!
  • Similar to the Ising model, the Gaussian Random Field shows how well nodes predict each other!
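A short sketch of this correspondence (reusing the arbitrary example \(\boldsymbol{\Sigma}\) from before): the coefficients of a multiple linear regression of one node on all others approximate \(\omega_{ij}/\theta^2_i\):

library("mvtnorm")
set.seed(1)

Sigma <- matrix(c(1,   0.5, 0.2,
                  0.5, 1,   0.4,
                  0.2, 0.4, 1), 3, 3)
K     <- solve(Sigma)   # precision matrix; diagonal elements are the theta^2_i
Omega <- -K
diag(Omega) <- 0

Omega[1, -1] / K[1, 1]   # implied weights for predicting X1 from X2 and X3

X <- rmvnorm(1e5, sigma = Sigma)
coef(lm(X[, 1] ~ X[, 2] + X[, 3]))[-1]   # approximately the same weights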

Combination of Linear models

[Figure]

Gaussian Random Field as Combination of Linear models

[Figure]

\[ X_1 = \tau_1 + \omega_{12} \theta^{-2}_1 x_2 + \omega_{14} \theta^{-2}_1 x_4 + \varepsilon_1 \]

Gaussian Random Field as Combination of Linear models

[Figure]

\[ X_2 = \tau_2 + \omega_{12} \theta^{-2}_2 x_1 + \omega_{23} \theta^{-2}_2 x_3 + \varepsilon_2 \]

Gaussian Random Field as Combination of Linear models

[Figure]

\[ X_3 = \tau_3 + \omega_{23} \theta^{-2}_3 x_2 + \omega_{34} \theta^{-2}_3 x_4 + \varepsilon_3 \]

Gaussian Random Field as Combination of Linear models

[Figure]

\[ X_4 = \tau_4 + \omega_{34} \theta^{-2}_4 x_3 + \omega_{14} \theta^{-2}_4 x_1 + \varepsilon_4 \]

Markov Random Fields

  • Ising Model
    • All nodes are binary
      • \(-1\) or \(1\)
    • Combination of multiple logistic regression models
  • Gaussian Random Field
    • All nodes assumed normally distributed
    • Graph structure directly related to the inverse variance-covariance matrix
    • Graph usually standardized to partial correlations
    • Combination of multiple linear regression models

Estimating Markov Random Fields

Constructing Markov Random Fields

  • Because the Ising Model is a combination of multiple logistic regression models, and the Gaussian Random Field a combination of multiple linear regression models, we can form MRFs in the following way (a sketch follows this list):

    1. Predict each node from all other nodes in a multiple (logistic) regression
    2. This gives two parameters for each edge in the network. For example:
      • The parameter for \(B\) when predicting \(A\), and the parameter for \(A\) when predicting \(B\)
      • In logistic regression these should be about equal; in linear regression they can be standardized to be about equal
    3. Take the mean of these two parameters as the edge weight
      • If the regression parameters are \(0\), there is no edge!
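A minimal sketch of this recipe for the Gaussian case (unregularized, unlike Thursday's approach; the chain network and all parameter values here are assumptions of the example):

library("MASS")   # for mvrnorm
set.seed(1)

# True network: a chain X1 - X2 - X3 - X4, encoded in a precision matrix
K <- diag(4)
K[cbind(c(1, 2, 2, 3, 3, 4), c(2, 1, 3, 2, 4, 3))] <- -0.3
X <- scale(mvrnorm(5000, mu = rep(0, 4), Sigma = solve(K)))

# Step 1: predict each node from all other nodes
B <- matrix(0, 4, 4)
for (i in 1:4) {
  B[i, -i] <- coef(lm(X[, i] ~ X[, -i]))[-1]
}

# Steps 2 and 3: average the two regression weights per edge into one edge weight
W <- round((B + t(B)) / 2, 2)
W   # nonzero (up to sampling error) only where the true chain has edges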

Advanced MRF estimation

  • The method on the previous slide works, but requires a lot of data!
  • To obtain a simpler, more interpretable, and more stable network, we will apply regularization
    • LASSO
  • In addition, the assumption of multivariate normal data can be relaxed in MRFs
  • Both of these will be discussed on Thursday!

Examples

Radicalization

[Figure]


Quality of Life

By Jolanda Kossakowski

Thursday:

  • Estimating sparse networks
  • Normality assumption
  • Another example
  • Cookbook
    • Most of the code for the course!
  • Be there!