GNN Study Notes
Notes from the "GNN: From Beginner to Mastery" course
2.4 Struc2Vec (KDD ‘17)
- struc2vec: Learning Node Representations from Structural Identity
Key ideas
- Structural similarity does not depend on hop distance
- neighboring nodes can be structurally different, while far-apart nodes can be structurally similar
- Structural identity as a hierarchical concept
- depth of similarity varies
- Flexible four-step procedure
- the concrete implementation of each step can be varied
Procedure
- Step 1: Structural Similarity
- Defines a measure of structural similarity between two nodes that depends only on the degrees of the nodes and their neighbors (see the sketch after this step)
- Hierarchical measure for structural similarity between two nodes
- R_k(u): set of nodes at distance k from u (ring)
- s(S): ordered degree sequence of set S
- g(D_1, D_2): distance between two ordered degree sequences
- cost of aligning a pair of degrees a, b: max(a, b) / min(a, b) - 1
- optimal alignment found by Dynamic Time Warping (DTW)
- f_k(u,v): structural distance between nodes u and v considering first k rings
- f_k(u,v) = f_{k-1}(u,v) + g(s(R_k(u)), s(R_k(v)))
- g(s(R_k(u)), s(R_k(v))): the distance between u and v on the k-th ring
- f_k(u,v): the accumulated distance between u and v over rings 0 through k
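A minimal sketch of the Step 1 distance, assuming an undirected NetworkX graph G; the helpers ring, degree_seq, and structural_distance are illustrative names, and the DTW here is a plain O(nm) implementation rather than the authors' optimized code.

```python
import networkx as nx

def ring(G, u, k):
    """Nodes at exactly hop distance k from u (the ring R_k(u))."""
    dist = nx.single_source_shortest_path_length(G, u, cutoff=k)
    return [v for v, d in dist.items() if d == k]

def degree_seq(G, nodes):
    """Ordered (ascending) degree sequence s(S) of a node set."""
    return sorted(G.degree(v) for v in nodes)

def pair_cost(a, b):
    """Alignment cost between two degrees: max(a,b)/min(a,b) - 1 (degrees assumed >= 1)."""
    return max(a, b) / min(a, b) - 1

def dtw(seq1, seq2):
    """Plain dynamic time warping between two degree sequences using pair_cost."""
    n, m = len(seq1), len(seq2)
    INF = float("inf")
    D = [[INF] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            c = pair_cost(seq1[i - 1], seq2[j - 1])
            D[i][j] = c + min(D[i - 1][j], D[i][j - 1], D[i - 1][j - 1])
    return D[n][m]

def structural_distance(G, u, v, K):
    """Return [f_0(u,v), ..., f_k(u,v)], accumulating per-ring DTW distances until a ring is empty."""
    f, total = [], 0.0
    for k in range(K + 1):
        Ru, Rv = ring(G, u, k), ring(G, v, k)
        if not Ru or not Rv:
            break
        total += dtw(degree_seq(G, Ru), degree_seq(G, Rv))
        f.append(total)
    return f
```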
- Step 2: Multi-layer Graph
- Each layer is a weighted complete graph
- corresponds to similarity hierarchies
- Edge weights in layer k
- w_k(u,v) = exp{-f_k(u,v)}
- maps the structural distance between u and v (over rings 0 to k) to an edge weight in layer k
- Connect corresponding nodes in adjacent layers
- Each vertex is connected to its corresponding vertex in the layer above and below (layer permitting).
- Edge weights between layers (see the sketch after this step)
- w(u_k, u_{k+1}) = log(\Gamma_k(u) + e) (upward)
- w(u_k, u_{k-1}) = 1 (downward)
- \Gamma_k(u) = \sum_v 1\{w_k(u,v) > \bar{w}_k\}, where \bar{w}_k is the average edge weight in layer k
- \Gamma_k(u): the number of nodes in layer k whose similarity to u exceeds the layer average
- If u has many similar nodes in the current layer, it should move up a layer to obtain a more refined context
- Moving up one layer can only decrease the number of similar nodes
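A sketch of the Step 2 weights, continuing from the structural_distance helper above; build_layers, cross_layer_weights, and the dictionary layout are illustrative, and the all-pairs loop is quadratic (the paper's optimizations are omitted here).

```python
import math
from itertools import combinations

def build_layers(G, K):
    """Intra-layer edge weights w_k(u, v) = exp(-f_k(u, v)) for layers k = 0..K."""
    layers = [dict() for _ in range(K + 1)]          # layer k -> {(u, v): weight}
    for u, v in combinations(G.nodes(), 2):          # unoptimized: all node pairs
        f = structural_distance(G, u, v, K)
        for k, fk in enumerate(f):
            layers[k][(u, v)] = math.exp(-fk)
    return layers

def cross_layer_weights(G, layers):
    """Inter-layer weights: w(u_k, u_{k+1}) = log(Gamma_k(u) + e); w(u_k, u_{k-1}) = 1."""
    up = []
    for wk in layers:
        if not wk:
            break
        avg = sum(wk.values()) / len(wk)             # average edge weight \bar{w}_k in layer k
        gamma = {u: 0 for u in G.nodes()}            # Gamma_k(u): edges of u heavier than the average
        for (u, v), w in wk.items():
            if w > avg:
                gamma[u] += 1
                gamma[v] += 1
        up.append({u: math.log(g + math.e) for u, g in gamma.items()})
    return up                                        # up[k][u] = w(u_k, u_{k+1}); downward weight is 1
```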
- Step 3: Generate Context
- Context generated by a biased random walk (see the sketch after this step)
- walking on multi-layer graph
- Walk in current layer with probability p
- choose neighbor according to edge weight
- RW prefers more similar nodes
- Change layer with probability 1-p
- choose up/down according to edge weight
- the walk prefers the layer where u has fewer similar neighbors (a more refined context)
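A sketch of the Step 3 walk, assuming the layers / up structures from the previous sketch; p is the stay-in-layer probability and the helper names are illustrative.

```python
import random

def neighbors_in_layer(wk, u):
    """Weighted neighbors of u inside one layer (complete graph, so all other nodes)."""
    nbrs = {}
    for (a, b), w in wk.items():
        if a == u:
            nbrs[b] = w
        elif b == u:
            nbrs[a] = w
    return nbrs

def biased_walk(layers, up, start, length, p=0.7):
    """Random walk on the multi-layer graph; only the visited node ids form the context."""
    k, u, walk = 0, start, [start]
    while len(walk) < length:
        if random.random() < p:                      # stay in the current layer
            nbrs = neighbors_in_layer(layers[k], u)
            if not nbrs:                             # u has no edges at this layer; drop down
                k -= 1
                continue
            nodes, weights = zip(*nbrs.items())
            u = random.choices(nodes, weights=weights)[0]   # prefer more similar nodes
            walk.append(u)
        else:                                        # change layer; the node itself does not change
            w_up = up[k].get(u, 0.0) if k + 1 < len(layers) and layers[k + 1] else 0.0
            w_down = 1.0 if k > 0 else 0.0
            if w_up + w_down == 0.0:
                continue
            k += 1 if random.random() < w_up / (w_up + w_down) else -1
    return walk
```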
- Step 4: Learn Representation
- For each node, generate a set of independent and relatively short random walks
- Train a neural network to learn latent representations for the nodes (see the sketch after this step)
- Skip-gram
- Hierarchical Softmax
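A sketch of Step 4, assuming G, layers, up, and biased_walk from the sketches above; gensim's Word2Vec provides skip-gram (sg=1) with hierarchical softmax (hs=1). Parameter names follow gensim 4.x (vector_size), and the walk count, walk length, and dimension are illustrative, not the paper's settings.

```python
from gensim.models import Word2Vec

# Corpus of walks: node ids are cast to strings because gensim treats tokens as words.
# Assumed settings: 10 walks of length 40 per node.
corpus = [[str(n) for n in biased_walk(layers, up, start=v, length=40)]
          for v in G.nodes() for _ in range(10)]

model = Word2Vec(
    corpus,
    vector_size=128,   # embedding dimension
    window=5,          # context window over the walk
    sg=1,              # skip-gram
    hs=1,              # hierarchical softmax
    min_count=0,
    workers=4,
)

embedding = {v: model.wv[str(v)] for v in G.nodes()}  # node -> learned representation
```

The resulting vectors can be fed to any downstream classifier for role-based tasks.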
Conclusion:
- struc2vec helps classification tasks where labels depend on structural roles rather than neighborhood proximity