GNN-Notes-LEC-10

Heterogeneous Graph and Knowledge Graph

1. Heterogeneous Graphs

Definition

G = (V, E, R, T)

V : set of nodes, which have their types in set T

E : set of edges, which have their relations type in set R

R : set of relations

T : set of nodes’ type

2. Relation GCN

Formulation

$$
h_v^{(l+1)} = \sigma(\sum_{r \in R}\sum_{u\in N_v^r}\frac1{c_{v,r}} W_r^{(l)}h_u^{(l)}+W_0^{(l)}h_v^{(l)})
$$

Scalability

Block Diagonal Matrices

use B low-dim matrices to present $W_r^{(l)}$, which reduce parameters from $d^{(l)}\times d^{(l+1)}$ to $B\times \frac{d^{(l+1)}}B \times \frac{d^{(l)}}B$

Basis Learning

regard $W_r^{(l)} = \sum_{b=1}^B a_{rb}^{(l)}\cdot V_b^{(l)}$ , where $V_b^{(l)}$ is shared and each relation need only learn ${a_{rb}^{l}}$

Training

  1. Split edges into training message edges, training supervision edges, validation edges and test edges
  2. use RGCN to score the training supervision edges
  3. create negative edges by perturbing the training supervision edges
  4. use RGCN to score the negative edges
  5. max the score of training supervision edges while min the score of negative edges

Evaluation

  1. calculate the score of test edges
  2. calculate the score of negative edges, which produced by perturbing the test edges and not belong to training message edges and training supervision edges
  3. obtain the ranking RK of test edges
  4. calculate metrics (Higher is Better)
    1. Hits K : 1[RK <= K]
    2. Reciprocal Rank : 1 / RK

4. Knowledge Graph Definition

Nodes labeled with Types —— Entities with Types

Edges between Nodes —— Relations between Entities

5. Knowledge Completion Task

Definition

Given an enormous KG, and (head, relation), predict the missing tails

Key Idea of KG Presentation

  • associate entities and relations with shallow embeddings
  • embeddings of (h,r) should be close to embedding of t

Key Question

  • how to embed (h,r)
  • how to define closeness

Relation Pattern

Symmetric : if r(a,b) then r(b,a)

Antisymmetric : if r(a,b) then $\neg$r(b,a)

inverse : if r1(a,b) then r2(b,a)

composition / transitive : if r1(a,b) and r2(b,c) then r3(a,c)

1-to-N : r(a,b1), r(a,b2), …, r(a,bn) is true

TransE

Intuition

for a triple (h, r, t), we hope $h + r \approx t$ if given fact is true else $h + r \ne t$

Scoring function
$$
f_r(h,t) = - |h+r-t|
$$
Update Formulation
$$
\sum_{[(h,l,t),(h’,l,t’)]\in batchT} \nabla [\gamma + d(h+l,t) - d(h’+l,t’)]\
$$
Algorithm

k is the dim of embeddings

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
def TransE(S:set[{h,l,t}],E:set[entities],L:set[link],k:int,):
# Initial
for l in L:
l = uniform(-6/sqrt(k),-6/sqrt(k))
l = l / ||l||
for e in E:
e = uniform(-6/sqrt(k),-6/sqrt(k))
# Loop
while :
for e in E :
e = e / || e ||
batchS = sample(S,batch_size)
batchT = set()
for (h,l,t) in batchS:
h_n,l,t_n = neg_sample(S(h,l,t))
batchT.add([(h,l,t),(h_n,l,t_n)])
# update
update with formulation

expression power

could express

  • antisymmetric
  • inverse
  • composition

but can’t express

  • symmetric
  • 1-to-N

TransR

Intuition

Entities in the Entity Space $\mathbb R^{d}$

Relations in the Relation Space $\mathbb R^k$

Projection matrix $M\in R^{d\times k}$ project entities to the Relation Space

For Entity h and t, we hope the projections of them in the Relation Space satisfy $h_\bot + l \approx t_\bot$

Scoring Function
$$
f_r(h,t) = -|h_\bot +l-t_\bot|
$$
Expression Power

could express:

  • antisymmetric
  • inverse
  • symmetric
  • 1-to-N

can’t express

  • composition / transition

Bilinear Modeling : DistMult

Intuition

min the cosine similarity between h * r and t

Scoring Function
$$
f_r(h,t) = \sum_i h_i \cdot l_i \cdot t_i
$$
Expression Power

could express

  • symmetric
  • 1-to-N

can’t express

  • antisymmetric
  • inverse
  • composition

ComplEx

intuition

embedding entities and relations in Complex vector space

Scoring Function
$$
f_r(h,t) = Re(\sum_i h_i \cdot r_i \cdot \bar t_i)\
t_i = a + b i \Rightarrow \bar t_i = a - b i
$$
Expression Power

could express

  • symmetric
  • antisymmetric
  • inverse
  • 1-to-N

can’t express

  • composition