Lec 06 Deep Learning for Graphs
1. Problem Definition
Given a graph $G$:
$V$ : vertex set
$A$ : adjacency matrix
$X \in \mathbb R^{m\times|V|}$ : node feature matrix (each column is a node's $m$-dimensional feature vector $x_v$)
$v$ : a node in vertex set
$N(v)$ : the neighbors of vertex $v$
2. Neighborhood Aggregation
Per-node formulation
$$
\begin{aligned}
h_v^{(0)} &= x_v \\
h_v^{(l+1)} &= \sigma\Big(W_l\sum_{u\in N(v)}\frac{h_u^{(l)}}{|N(v)|}+B_l h_v^{(l)}\Big),\quad \forall l\in\{0,1,\dots,L-1\} \\
z_v &= h_v^{(L)}
\end{aligned}
$$
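To make the update concrete, here is a minimal NumPy sketch of one layer in the per-node form above. The function and variable names are my own, $\sigma$ is taken to be $\tanh$, and the guard for isolated nodes is an added assumption not in the formula:

```python
import numpy as np

def gnn_layer(h, adj, W, B, sigma=np.tanh):
    """One neighborhood-aggregation layer, computed node by node.

    h   : (n, d_in)      current embeddings h^{(l)}, one row per node
    adj : (n, n)         binary adjacency matrix A
    W, B: (d_out, d_in)  weight matrices W_l (neighbors) and B_l (self)
    """
    n, d_in = h.shape
    h_next = np.empty((n, W.shape[0]))
    for v in range(n):
        nbrs = np.flatnonzero(adj[v])                           # N(v)
        agg = h[nbrs].mean(axis=0) if nbrs.size else np.zeros(d_in)
        h_next[v] = sigma(W @ agg + B @ h[v])                   # neighbor term + self term
    return h_next
```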
Matrix formulation
$$
\begin{aligned}
H^{(l)} &= [h_1^{(l)},h_2^{(l)},\dots,h_{|V|}^{(l)}]^T \\
\sum_{u\in N(v)} h_u^{(l)} &= \big(A H^{(l)}\big)_{v,:} \\
D &\in \mathbb R^{|V|\times |V|}\ \text{diagonal, with } D_{v,v}=|N(v)| \\
H^{(l+1)} &= \sigma\big(D^{-1}A H^{(l)} W_l^T + H^{(l)} B_l^T\big)
\end{aligned}
$$
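A sketch of the same layer in matrix form, under the same assumptions as above (row-stacked $H$, NumPy, names my own); its output should match the per-node version:

```python
import numpy as np

def gnn_layer_matrix(H, A, W, B, sigma=np.tanh):
    """Matrix form: H^{(l+1)} = sigma(D^{-1} A H^{(l)} W_l^T + H^{(l)} B_l^T)."""
    deg = A.sum(axis=1, keepdims=True)      # D_{v,v} = |N(v)|
    A_norm = A / np.maximum(deg, 1)         # D^{-1} A, guarding isolated nodes
    return sigma(A_norm @ H @ W.T + H @ B.T)
```

In practice `A` would be stored as a sparse matrix, since the matrix form is what enables efficient sparse matrix multiplication.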
Advantage: the parameters $W_l, B_l$ are shared across all nodes, so the trained model can generate embeddings for previously unseen nodes on the fly (inductive capability).
3. How to train a GNN?
Supervised setting
$$
\min_{\Theta} \mathcal L(y,f(z_v))
$$
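Here $y$ is a node label and $f$ is a prediction head on top of the embedding $z_v$. A minimal sketch with $\mathcal L$ as softmax cross-entropy and $f$ as a linear head `Theta`; both choices are assumptions, not fixed by the notes:

```python
import numpy as np

def supervised_loss(Z, y, Theta):
    """Cross-entropy node-classification loss on final embeddings Z = H^{(L)}.

    Z     : (n, d)  node embeddings z_v
    y     : (n,)    integer class labels
    Theta : (d, c)  a linear classification head standing in for f (assumed)
    """
    logits = Z @ Theta
    logits = logits - logits.max(axis=1, keepdims=True)   # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(y)), y].mean()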
Unsupervised setting
Idea: use the graph structure itself as the supervision signal.
$$
\mathcal L = \sum_{(u,v)}\mathrm{CE}\big(y_{u,v},\mathrm{DEC}(z_u,z_v)\big)
$$
where $y_{u,v}=1$ if nodes $u$ and $v$ are similar (e.g., adjacent) and $\mathrm{DEC}$ is a decoder such as the inner product.
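A sketch of this loss, assuming $y_{u,v}$ comes from the adjacency matrix and $\mathrm{DEC}$ is an inner-product decoder followed by a sigmoid (one common choice; the notes leave the decoder open):

```python
import numpy as np

def unsupervised_loss(Z, A):
    """Binary cross-entropy between graph structure and decoded similarity.

    y_{u,v} = A[u, v] (1 iff u and v are adjacent); DEC is assumed to be
    an inner-product decoder followed by a sigmoid.
    """
    scores = 1.0 / (1.0 + np.exp(-(Z @ Z.T)))   # DEC(z_u, z_v) = sigmoid(z_u . z_v)
    eps = 1e-9
    ce = -(A * np.log(scores + eps) + (1 - A) * np.log(1 - scores + eps))
    return ce.mean()
```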
4. GraphSAGE / GCN
Ideas of GraphSAGE
- a flexible aggregation function plus concatenation with the self embedding (see the sketch after this list)
$$
h_v^{(l+1)} = \sigma\big(\big[\,W_l\cdot \mathrm{AGG}\big(\{h_u^{(l)},\forall u \in N(v)\}\big),\ B_l h_v^{(l)}\,\big]\big)
$$
- $\ell_2$ normalization
$$
h_v^{(l)} \leftarrow \frac{h_v^{(l)}}{\|h_v^{(l)}\|_2}
$$
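A minimal sketch of one GraphSAGE layer combining both ideas, with the mean aggregator filled in (aggregator variants follow below; names, $\tanh$ as $\sigma$, and the isolated-node guard are my assumptions):

```python
import numpy as np

def graphsage_layer(h, adj, W, B, sigma=np.tanh):
    """One GraphSAGE layer: concatenate the aggregated neighborhood with the
    transformed self embedding, then l2-normalize each row (mean AGG shown)."""
    n, d_in = h.shape
    out = np.empty((n, W.shape[0] + B.shape[0]))
    for v in range(n):
        nbrs = np.flatnonzero(adj[v])
        agg = h[nbrs].mean(axis=0) if nbrs.size else np.zeros(d_in)
        out[v] = sigma(np.concatenate([W @ agg, B @ h[v]]))   # [W_l AGG, B_l h_v]
    norms = np.linalg.norm(out, axis=1, keepdims=True)
    return out / np.maximum(norms, 1e-12)                     # l2 normalization
```

Note that concatenation doubles the output width relative to the basic GNN layer, since the neighbor term and self term are kept side by side rather than summed.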
Variants of AGG
Mean
$$
\mathrm{AGG}\big(\{h_u^{(l)},\forall u \in N(v)\}\big) = \sum_{u\in N(v)}\frac{h_u^{(l)}}{|N(v)|}
$$
Pool
$$
\mathrm{AGG}\big(\{h_u^{(l)},\forall u \in N(v)\}\big) = \gamma\big(\{\mathrm{MLP}(h_u^{(l)}),\forall u \in N(v)\}\big)
$$
where $\gamma$ is an element-wise mean or max.
LSTM
$$
\mathrm{AGG}\big(\{h_u^{(l)},\forall u \in N(v)\}\big) = \mathrm{LSTM}\big(\big[h_u^{(l)},\forall u \in \pi(N(v))\big]\big)
$$
where $\pi$ is a random permutation of the neighbors, since an LSTM is not permutation-invariant.
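Sketches of the mean and pool aggregators as drop-in replacements for the `agg` line in `graphsage_layer` above (the single linear + ReLU layer standing in for the MLP is an assumption; the LSTM variant is omitted since it needs a recurrent cell):

```python
import numpy as np

def agg_mean(h_nbrs):
    """Mean aggregator: element-wise average of the neighbor embeddings."""
    return h_nbrs.mean(axis=0)

def agg_pool(h_nbrs, W_mlp, gamma=np.max):
    """Pool aggregator: transform each neighbor with an MLP (assumed here to
    be one linear + ReLU layer), then pool element-wise with gamma."""
    transformed = np.maximum(h_nbrs @ W_mlp.T, 0.0)   # MLP(h_u) for every neighbor
    return gamma(transformed, axis=0)                 # element-wise mean or max
```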
In fact, the aggregation step plays the role of the convolution operation on graphs.