Lec 02 Traditional Feature-based Methods for Graphs
1. Node-Level Features
Node Degree
category : importance-based, structure-based
weakness : treats all neighboring nodes equally
Node Centrality
category : importance-based
core : measure how important a node is in the graph
eigenvector centrality
core : v is important if its neighboring nodes are important
formula :
$$
\begin{align}
&c_v = \frac{1}{\lambda} \sum_{u\in N(v)} c_u \\
\Leftrightarrow\ &\lambda c = A c
\end{align}
$$
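A quick numpy sketch (the 4-node adjacency matrix below is a made-up example): power iteration converges to the leading eigenvector of $A$, which is exactly the fixed point of $\lambda c = Ac$.

```python
import numpy as np

# Power iteration: for a connected graph, repeated multiplication by A
# converges to the leading eigenvector, i.e. the eigenvector centrality.
A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)  # toy 4-node graph

c = np.ones(A.shape[0])
for _ in range(100):
    c = A @ c
    c /= np.linalg.norm(c)   # renormalize so c does not blow up

lam = c @ A @ c              # Rayleigh quotient estimates lambda
print(np.round(c, 3), round(lam, 3))
```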
betweenness centrality
core : v is important if v lies on many shortest paths between two nodes
formula :
use $p(s,t)$ to denote the number of shortest paths between $s$ and $t$
use $p(s,t,v)$ to denote the number of shortest paths between $s$ and $t$ that pass through $v$
$$
c_v = \sum_{s\ne v\ne t}\frac{p(s,t,v)}{p(s,t)}
$$
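A sketch using networkx (not part of the lecture notes), on a 5-node path graph where the middle node should score highest:

```python
import networkx as nx

# Unnormalized betweenness: for each v, sum p(s, t, v) / p(s, t)
# over all pairs s != v != t.
G = nx.path_graph(5)  # 0-1-2-3-4
print(nx.betweenness_centrality(G, normalized=False))
# node 2 lies on the most shortest paths, so it gets the largest score
```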
closeness centrality
core : v is important if v has small shortest path lengths to all other nodes
formula :
use $d(u,v)$ to denote the shortest path length between nodes $u$ and $v$
$$
c_v = \frac{1}{\sum_{u\ne v} d(u,v)}
$$
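A direct sketch of the unnormalized formula above, using networkx only for BFS distances:

```python
import networkx as nx

# c_v = 1 / (sum of shortest-path lengths from v to all other nodes)
G = nx.path_graph(5)
for v in G:
    dist = nx.single_source_shortest_path_length(G, v)  # BFS from v
    print(v, 1 / sum(d for u, d in dist.items() if u != v))
```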
Clustering coefficient
category : structure-based
formula :
use $e(N(v))$ to denote the number of edges among the neighbors $N(v)$, and $k_v$ the degree of $v$
$$
c_v = \frac{e(N(v))}{\binom{k_v}{2}}
$$
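A sketch that computes the formula by hand and checks it against networkx's built-in clustering coefficient (node 0 of the karate club graph is an arbitrary choice):

```python
import networkx as nx
from itertools import combinations

G = nx.karate_club_graph()
v = 0
nbrs = list(G[v])
k = len(nbrs)                      # k_v: degree of v
e = sum(1 for u, w in combinations(nbrs, 2) if G.has_edge(u, w))
print(e / (k * (k - 1) / 2))       # e(N(v)) / C(k_v, 2)
print(nx.clustering(G, v))         # networkx agrees
```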
Graphlet Degree Vector
category : structure-based
core : count, for each graphlet type, the number of graphlets that node $v$ participates in; this count vector (the GDV) is $v$'s feature
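A toy sketch of the idea, restricted to graphlets on at most 3 nodes (the full GDV uses larger graphlets and many more orbits; the 4 orbits below are an illustration, not that vector):

```python
import networkx as nx

# 4 orbits for graphlets on <= 3 nodes:
#   o0: edges at v (degree)           o1: v is an endpoint of an open 2-path
#   o2: v is the center of a 2-path   o3: v is a corner of a triangle
def mini_gdv(G, v):
    k = G.degree(v)
    tri = nx.triangles(G, v)                            # triangles containing v
    center = k * (k - 1) // 2 - tri                     # open wedges centered at v
    end = sum(G.degree(u) - 1 for u in G[v]) - 2 * tri  # open 2-paths ending at v
    return [k, end, center, tri]

print(mini_gdv(nx.karate_club_graph(), 0))
```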
2. Link-Level Features
Task of Link Prediction
Given : existing links
Aim : predict new links
Key : design features that describe a pair of nodes
Formulations of link prediction
links missing at random
core : self-supervised
methodology :
- remove links randomly
- predict them
links over time
core : predict future links from the past graph (for networks that evolve over time)
Given : $G[t_0,t_0']$, the graph on edges up to time $t_0'$
Output : a ranked list $L$ of links predicted to appear in $G[t_1,t_1']$
Evaluation : take the top $n$ links in $L$ and count how many of them actually appear in $G[t_1,t_1']$
Methodology :
- for each pair $(x,y)$, compute its score $c(x,y)$
- sort pairs in decreasing order of $c(x,y)$
- select the top $n$ pairs as predicted links (a minimal pipeline is sketched below)
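A minimal version of this pipeline (common-neighbor count stands in for $c(x,y)$; any pairwise score plugs in the same way):

```python
import networkx as nx
from itertools import combinations

G = nx.karate_club_graph()

def score(G, x, y):
    return len(set(G[x]) & set(G[y]))   # c(x, y): common-neighbor count

# rank all non-edges by score and keep the top n as predicted links
candidates = [(x, y) for x, y in combinations(G, 2) if not G.has_edge(x, y)]
ranked = sorted(candidates, key=lambda p: score(G, *p), reverse=True)
print(ranked[:5])
```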
Distance-based Feature
mainly includes : the shortest path distance between the two nodes
use $s(u,v)$, the length of the shortest path between nodes $u$ and $v$, as the feature of the pair $(u,v)$
weakness : does not capture the degree of neighborhood overlap between the two nodes
Local Neighborhood Overlap
common neighbors
$|N(v_1)\cap N(v_2)|$
Jaccard’s coefficient
$\frac{|N(v_1)\cap N(v_2)|}{|N(v_1)\cup N(v_2)|}$
Adamic-Adar index
also encodes the importance of each common neighbor: a low-degree common neighbor contributes more
$\sum_{u\in N(v_1)\cap N(v_2)} \frac 1{\log(k_u)}$
Limitation : all of these metrics are 0 when two nodes have no common neighbors, even though such nodes may still become connected in the future
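The three scores written out with plain sets (networkx also ships nx.jaccard_coefficient and nx.adamic_adar_index; the node pair is arbitrary):

```python
import math
import networkx as nx

G = nx.karate_club_graph()
v1, v2 = 0, 33
N1, N2 = set(G[v1]), set(G[v2])
common = N1 & N2

print(len(common))                                     # common neighbors
print(len(common) / len(N1 | N2))                      # Jaccard's coefficient
print(sum(1 / math.log(G.degree(u)) for u in common))  # Adamic-Adar index
```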
Global Neighborhood Overlap
core : count the paths of all lengths between a given pair of nodes
calculation : use powers of the adjacency matrix, since $(A^l)_{uv}$ counts walks of length $l$ between $u$ and $v$
Katz index’s formulation :
$$
\begin{align}
S_{uv} &= \sum_{l=1}^{\infty} \beta^l (A^l)_{uv} \\
\Leftrightarrow\ S &= (I-\beta A)^{-1} - I,\quad 0< \beta < 1
\end{align}
$$
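A sketch of the closed form (the series only converges when $\beta < 1/\lambda_{\max}(A)$, which is why the code picks $\beta$ from the spectrum rather than hard-coding it):

```python
import numpy as np
import networkx as nx

G = nx.karate_club_graph()
A = nx.to_numpy_array(G)
beta = 0.9 / max(abs(np.linalg.eigvals(A)))  # safely inside the convergence radius

# S = (I - beta*A)^{-1} - I sums beta^l * A^l over all lengths l >= 1
I = np.eye(len(A))
S = np.linalg.inv(I - beta * A) - I
print(round(float(S[0, 33]), 4))             # Katz score for the pair (0, 33)
```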
3. Graph-Level Features
Kernel Methods
Idea : design kernels instead of feature vectors
Introduction :
$K(G,G')$ measures the similarity between $G$ and $G'$
the kernel matrix $\mathbf{K}=\left(K(G,G')\right)_{G,G'}$ must be positive semidefinite
there exists a feature representation $\phi(\cdot)$ such that $K(G,G') = \phi(G)^T \phi(G')$
Core Idea of Graph Kernels
treat a graph like a document: the main idea is a bag-of-words style representation, where the "words" are node-level structures
Graphlet Features
Main Idea : Bags of Graphlets
Given : graph $G$, graphlet list $\mathcal{G}_k = (g_1,g_2,\dots,g_{n_k})$, graphlet count vector $f_G \in \mathbb{R}^{n_k}$, where $(f_G)_i = \#(g_i \subseteq G)$
Output : $K(G,G') = f_G^T f_{G'}$
Problem : if $G$ and $G'$ have different sizes, the raw counts greatly skew the kernel value
Solution : normalize with $h_G = \frac{f_G}{\mathrm{sum}(f_G)}$ and set $K(G,G') = h_G^T h_{G'}$
Limitation : counting graphlets is expensive (enumerating size-$k$ graphlets takes $O(n^k)$ time, and subgraph isomorphism is NP-hard in general)
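Given precomputed count vectors, the kernel itself is just an inner product; the numbers below are made up to show how normalization removes the size skew:

```python
import numpy as np

f_G  = np.array([12.0, 5.0, 3.0, 1.0])     # assumed counts for g_1..g_4 in G
f_Gp = np.array([120.0, 48.0, 31.0, 9.0])  # assumed counts for a much larger G'

print(f_G @ f_Gp)                          # raw kernel: dominated by graph size
h_G, h_Gp = f_G / f_G.sum(), f_Gp / f_Gp.sum()
print(h_G @ h_Gp)                          # normalized kernel K(G, G')
```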
Weisfeiler-Lehman Kernel
Idea : use neighborhood structure to iteratively enrich the node vocabulary ("bag of colors")
Methodology :
Given : a graph $G$ with node set $V$
Init : assign an initial color $c^{(0)}(v)$ to every node $v$
for $K$ steps :
$c^{(k+1)}(v) = \mathrm{HASH}\left(\left\{c^{(k)}(v), \left\{c^{(k)}(u)\right\}_{u\in N(v)}\right\}\right)$
after $K$ steps, count the number of nodes of each color; the kernel $K(G,G')$ is the inner product of the two graphs' color-count vectors
Time Complexity : $O(K|E|)$, linear in the number of edges
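A sketch of one refinement scheme: the HASH is implemented by interning each (own color, sorted neighbor colors) signature into a fresh integer. To compare two graphs with the kernel, the interning table would have to be shared between them.

```python
import networkx as nx
from collections import Counter

def wl_color_counts(G, K=3):
    color = {v: 1 for v in G}  # c^(0): every node starts with the same color
    for _ in range(K):
        # signature = own color plus the multiset of neighbor colors
        sig = {v: (color[v], tuple(sorted(color[u] for u in G[v]))) for v in G}
        table = {s: i for i, s in enumerate(sorted(set(sig.values())))}
        color = {v: table[sig[v]] for v in G}  # HASH(...) -> compact new color
    return Counter(color.values())             # the bag of colors

print(wl_color_counts(nx.karate_club_graph()))
```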