Lec 05 Collective Classification
1. Overview
Question : Given a graph with labels on some nodes, how do we assign labels to all other nodes in the network?
Intuition : Similar nodes are connected
Techniques : Relational Classification, Iterative Classification, Belief Propagation
Explanation of “similar nodes are connected” from a social-science perspective:
Homophily : the tendency of individuals to associate and bond with similar others
Influence : social connection can influence the individual characteristics of a person
2. Motivation for Classification
similar nodes are typically close together or directly connected in the network
classification label of node $v$ may depend on 1) features of $v$; 2) labels of the nodes in $v$’s neighborhood; 3) features of the nodes in $v$’s neighborhood
3. Problem definition
Given :
adjacency matrix $A$ over $n$ nodes
some nodes with labels $Y_i$ and some unlabeled nodes
the features of nodes $F$
Goal :
Predict the labels of the unlabeled nodes, i.e. find $P(Y_v)$ for each unlabeled node $v$ given the features and the network
4. Pipeline of Collective Classification
Hypothesis : Markov assumption, i.e. the label $Y_v$ of node $v$ depends only on the labels of its neighbors: $P(Y_v) = P(Y_v \mid N_v)$
Pipeline :
- Local Classifier : used for initial label assignment
    - predicts a node’s label using only its own features, without network information
- Relational Classifier : captures correlations between nodes
    - labels a node based on the labels and/or features of its neighbors
- Collective Inference : propagates correlations through the network (a generic sketch follows this list)
    - applies the relational classifier to each node iteratively until the inconsistency between neighboring labels is minimized
    - the outcome is affected by the network structure
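Putting the three parts together, a minimal generic sketch of this pipeline (an illustrative assumption, not the lecture’s code: `phi_local` and `phi_relational` stand for whatever classifiers are plugged in):

```python
import numpy as np

def collective_classification(X, A, y, phi_local, phi_relational, num_iters=10):
    """Generic collective-classification loop (sketch).

    X : (n, d) node feature matrix      A : (n, n) adjacency matrix
    y : length-n label array, np.nan where the label is unknown
    phi_local(X) -> initial predictions (local classifier)
    phi_relational(X, A, pred) -> updated predictions (relational classifier)
    """
    labeled = ~np.isnan(y)

    # 1) Local classifier: initial assignment from node features alone
    pred = phi_local(X).astype(float)
    pred[labeled] = y[labeled]

    # 2) + 3) Relational classifier applied round after round (collective inference)
    for _ in range(num_iters):
        pred = phi_relational(X, A, pred)
        pred[labeled] = y[labeled]  # ground-truth labels never change
    return pred
```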
5. Relational Classification
Basic Idea : the class probability $P(Y_v = c)$ of node $v$ is a weighted average of the class probabilities of its neighbors
Steps :
Initialization : set each labeled node $v$ to its ground-truth label $Y_v^*$, and each unlabeled node to $Y_v = 0.5$ (for binary classification)
Update : update all nodes (e.g. in a random order) with the following formula until convergence:
$$
P(Y_v = c) = \frac 1 {\sum_{(v,u)\in E}A_{v,u}}\sum_{(v,u)\in E}A_{v,u}\, P(Y_u=c)
$$
where $A$ is the adjacency matrix, whose entries may carry edge weight or strength information
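A minimal NumPy sketch of this update for a binary label (assumptions beyond the lecture: unlabeled nodes are marked with `np.nan`, updates are synchronous rather than in random node order, and the function name is illustrative):

```python
import numpy as np

def relational_classify(A, y, num_iters=100, tol=1e-6):
    """Probabilistic relational classifier for binary labels.

    A : (n, n) weighted adjacency matrix (symmetric for an undirected graph)
    y : length-n float array; 0.0 / 1.0 for labeled nodes, np.nan for unlabeled
    Returns P(Y_v = 1) for every node.
    """
    labeled = ~np.isnan(y)
    p = np.where(labeled, y, 0.5)             # unlabeled nodes start at 0.5

    for _ in range(num_iters):
        # weighted average of the neighbors' class probabilities
        new_p = (A @ p) / np.maximum(A.sum(axis=1), 1e-12)
        new_p[labeled] = y[labeled]           # ground-truth labels stay fixed
        if np.abs(new_p - p).max() < tol:     # convergence is not guaranteed
            return new_p
        p = new_p
    return p
```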
Challenges :
Convergence is not guaranteed
the model cannot use node feature information
6. Iterative Classification
Main Idea : classify node $v$ using its attributes (features) $f_v$ together with a summary $z_v$ of the labels of its neighbor set $N_v$
Approach :
$\phi_1(f_v)$ : predicts the label of node $v$ based on $f_v$ alone
$\phi_2(f_v,z_v)$ : predicts the label based on both $f_v$ and $z_v$
Possible choices for the summary $z_v$ :
- histogram of the number of each label in $N_v$
- most common label in $N_v$
- number of different labels in $N_v$
Steps :
- Phase 1: classify based on node attributes alone
    - train both classifiers $\phi_1, \phi_2$ on the training set, where all labels are known
- Phase 2: iterate till convergence on the test set (a sketch follows this list)
    - Initialization : set the labels with $\phi_1$
    - Iteration : update $z_v$ based on $Y_u, \forall u \in N_v$, and then update $Y_v = \phi_2(f_v, z_v)$
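A minimal sketch of the two phases (assumptions not in the lecture: logistic regression stands in for $\phi_1,\phi_2$, $z_v$ is the neighbor-label histogram, and the train/test graphs come as separate adjacency matrices):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def label_histogram(A, y, num_classes):
    """z_v: histogram of the (predicted) labels of v's neighbors."""
    Z = np.zeros((A.shape[0], num_classes))
    for c in range(num_classes):
        Z[:, c] = A @ (y == c).astype(float)  # weighted counts if A has weights
    return Z

def iterative_classification(X_train, A_train, y_train,
                             X_test, A_test, num_classes=2, num_iters=10):
    # Phase 1: train phi_1 (features only) and phi_2 (features + summary)
    # on a fully labeled training graph
    phi1 = LogisticRegression().fit(X_train, y_train)
    Z_train = label_histogram(A_train, y_train, num_classes)
    phi2 = LogisticRegression().fit(np.hstack([X_train, Z_train]), y_train)

    # Phase 2: bootstrap the test labels with phi_1, then iterate phi_2
    y_test = phi1.predict(X_test)
    for _ in range(num_iters):
        Z_test = label_histogram(A_test, y_test, num_classes)
        y_new = phi2.predict(np.hstack([X_test, Z_test]))
        if np.array_equal(y_new, y_test):     # convergence is not guaranteed
            break
        y_test = y_new
    return y_test
```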
Challenges :
Convergence is not guaranteed
7. Loopy Belief Propagation
Notation
- Label-label potential matrix $\psi$ : $\psi(Y_i,Y_j)$ is proportional to the probability of node $j$ being in class $Y_j$ given that it has a neighbor $i$ in class $Y_i$
- Prior belief $\phi$ : $\phi_i(Y_i)$ is proportional to the probability of node $i$ being in class $Y_i$
- $m_{i\rightarrow j}(Y_j)$ : $i$’s message / estimate that node $j$ is in class $Y_j$
- $\mathcal L$ : set of all classes / labels
Steps
Initialize all messages to 1
Repeat: each node $i$ sends a message to each neighbor $j$ via
$$
m_{i\rightarrow j}(Y_j) = \sum_{Y_i\in \mathcal L} \psi(Y_i,Y_j)\,\phi_i(Y_i) \prod_{k\in N_i\setminus\{j\}} m_{k\rightarrow i}(Y_i),\quad\forall Y_j \in \mathcal L
$$
Result
After convergence, compute the belief $b_i(Y_i)$, i.e. node $i$’s belief of being in class $Y_i$:
$$
b_i(Y_i) = \phi_i(Y_i)\prod_{j \in N_i}m_{j\rightarrow i}(Y_i),\quad\forall Y_i \in \mathcal L
$$
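A compact NumPy sketch of synchronous loopy BP under these formulas (one assumption beyond the lecture: messages are normalized each round to avoid numerical under/overflow, which does not change the beliefs after their own normalization):

```python
import numpy as np

def loopy_bp(edges, n, psi, phi, num_iters=20):
    """Synchronous loopy belief propagation on an undirected graph.

    edges : list of undirected edges (i, j)
    psi   : (C, C) label-label potential matrix psi[y_i, y_j]
    phi   : (n, C) prior beliefs phi[i, y_i]
    Returns the (n, C) matrix of normalized beliefs b_i(Y_i).
    """
    C = psi.shape[0]
    nbrs = [[] for _ in range(n)]
    msgs = {}                                 # directed messages m[(i, j)][y_j]
    for i, j in edges:
        nbrs[i].append(j)
        nbrs[j].append(i)
        msgs[(i, j)] = np.ones(C)             # initialize all messages to 1
        msgs[(j, i)] = np.ones(C)

    for _ in range(num_iters):
        new_msgs = {}
        for (i, j) in msgs:
            # phi_i times all messages into i, except the one coming from j
            prod = phi[i].copy()
            for k in nbrs[i]:
                if k != j:
                    prod *= msgs[(k, i)]
            m = psi.T @ prod                  # sum over Y_i of psi(Y_i, Y_j) * prod(Y_i)
            new_msgs[(i, j)] = m / m.sum()    # normalization for stability
        msgs = new_msgs

    beliefs = phi.copy()                      # b_i = phi_i * product of incoming messages
    for i in range(n):
        for k in nbrs[i]:
            beliefs[i] *= msgs[(k, i)]
    return beliefs / beliefs.sum(axis=1, keepdims=True)
```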
Advantages
- easy to program and parallelize
- can be applied to any graphical model with any form of potentials (e.g. higher order than pairwise)
Challenges
- convergence is not guaranteed, especially in graphs with many closed loops