GNN from Beginner to Expert: Course Notes

2.3 Node2Vec (Code-Application)

node2vec: Scalable Feature Learning for Networks (KDD '16)

import networkx as nx
import numpy as np
import random

G = nx.les_miserables_graph()
print("Nodes: ", G.nodes())
print("Number of nodes: ", G.number_of_nodes())

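from node2vec import Node2Vec

# Walk bias: p is the return parameter, q is the in-out parameter.
# DFS-like walks (q < 1 favors moving away from the previous node)
p = 1.0
q = 0.5
n_cluster = 6
# BFS-like walks (q > 1 keeps the walk close to the previous node)
# p = 1.0
# q = 2.0
# n_cluster = 3

# p and q must be passed to the constructor; otherwise the defaults p=1, q=1 are used
node2vec = Node2Vec(G, dimensions=64, walk_length=30, num_walks=200, workers=4, p=p, q=q)

# Keyword arguments given to fit() are forwarded to gensim's Word2Vec
model = node2vec.fit(window=3, min_count=1, batch_words=4)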
embedding = model.wv.vectors
print("Node Embedding shape: ", embedding.shape)

from sklearn.cluster import KMeans
kmeans = KMeans(n_clusters=n_cluster, random_state=0).fit(embedding)
print("Labels: ", kmeans.labels_)

colors = []
nodes = list(G.nodes())
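# Rows of model.wv.vectors follow the Word2Vec vocabulary order, not G.nodes(),
# so look up each node's row index before reading its cluster label.
for node in nodes:
    idx = model.wv.key_to_index[str(node)]
    colors.append(kmeans.labels_[idx])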

import matplotlib.pyplot as plt
plt.figure(figsize=(10, 10))
position = nx.spring_layout(G)
nx.draw(G, pos=position, node_color=colors, with_labels=True)
plt.savefig("les_miserables.png")

# edge embedding
from node2vec.edges import HadamardEmbedder

edges_embs = HadamardEmbedder(keyed_vectors=model.wv)
edges_kv = edges_embs.as_keyed_vectors()
print("Edge embedding shape: ", edges_kv.vectors.shape)
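
The `# DFS` and `# BFS` settings above refer to node2vec's second-order walk bias. When a walk has just moved from a previous node to the current one, each candidate next node gets an unnormalized weight of 1/p if it is the previous node, 1 if it is also a neighbor of the previous node, and 1/q otherwise. The following is a minimal standard-library sketch of that rule on an unweighted graph, not the `node2vec` package's implementation; the `biased_walk` helper and the toy adjacency dict are hypothetical names introduced only for illustration.

```python
import random


def biased_walk(adj, start, walk_length, p=1.0, q=1.0, rng=random):
    """Generate one second-order (node2vec-style) random walk.

    adj maps each node to the set of its neighbors (unweighted, undirected).
    """
    walk = [start]
    while len(walk) < walk_length:
        cur = walk[-1]
        neighbors = sorted(adj[cur])
        if not neighbors:
            break  # dead end: no neighbor to move to
        if len(walk) == 1:
            # The first step has no previous node, so choose uniformly.
            walk.append(rng.choice(neighbors))
            continue
        prev = walk[-2]
        weights = []
        for nxt in neighbors:
            if nxt == prev:
                weights.append(1.0 / p)  # step back to the previous node
            elif nxt in adj[prev]:
                weights.append(1.0)      # stay at distance 1 from prev
            else:
                weights.append(1.0 / q)  # move to distance 2 from prev
        walk.append(rng.choices(neighbors, weights=weights, k=1)[0])
    return walk


# Hypothetical toy graph: a triangle a-b-c with a tail c-d-e.
adj = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c", "e"},
    "e": {"d"},
}
rng = random.Random(0)
walk = biased_walk(adj, "a", walk_length=10, p=1.0, q=0.5, rng=rng)
print(walk)
```

With q = 0.5 the weight 1/q = 2 makes outward steps twice as likely as staying near the previous node, which is why small q is described as DFS-like and large q as BFS-like.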

Output

Nodes:  ['Napoleon', 'Myriel', 'MlleBaptistine', 'MmeMagloire', 'CountessDeLo', 'Geborand', 'Champtercier', 'Cravatte', 'Count', 'OldMan', 'Valjean', 'Labarre', 'Marguerite', 'MmeDeR', 'Isabeau', 'Gervais', 'Listolier', 'Tholomyes', 'Fameuil', 'Blacheville', 'Favourite', 'Dahlia', 'Zephine', 'Fantine', 'MmeThenardier', 'Thenardier', 'Cosette', 'Javert', 'Fauchelevent', 'Bamatabois', 'Perpetue', 'Simplice', 'Scaufflaire', 'Woman1', 'Judge', 'Champmathieu', 'Brevet', 'Chenildieu', 'Cochepaille', 'Pontmercy', 'Boulatruelle', 'Eponine', 'Anzelma', 'Woman2', 'MotherInnocent', 'Gribier', 'MmeBurgon', 'Jondrette', 'Gavroche', 'Gillenormand', 'Magnon', 'MlleGillenormand', 'MmePontmercy', 'MlleVaubois', 'LtGillenormand', 'Marius', 'BaronessT', 'Mabeuf', 'Enjolras', 'Combeferre', 'Prouvaire', 'Feuilly', 'Courfeyrac', 'Bahorel', 'Bossuet', 'Joly', 'Grantaire', 'MotherPlutarch', 'Gueulemer', 'Babet', 'Claquesous', 'Montparnasse', 'Toussaint', 'Child1', 'Child2', 'Brujon', 'MmeHucheloup']
Number of nodes: 77
Computing transition probabilities: 100%|████████████████████████| 77/77 [00:00<00:00, 3824.65it/s]
Generating walks (CPU: 4): 100%|███████████████████████████████████| 50/50 [00:02<00:00, 18.38it/s]
Generating walks (CPU: 3): 100%|███████████████████████████████████| 50/50 [00:02<00:00, 18.27it/s]
Generating walks (CPU: 1): 100%|███████████████████████████████████| 50/50 [00:02<00:00, 17.88it/s]
Generating walks (CPU: 2): 100%|███████████████████████████████████| 50/50 [00:02<00:00, 17.89it/s]
Node Embedding shape: (77, 64)
Labels: [1 1 4 4 1 5 4 4 4 0 2 1 4 5 4 4 1 5 2 5 0 0 1 0 2 0 0 0 0 5 5 1 3 3 4 4 4
3 3 3 5 3 5 0 4 4 4 1 1 5 1 1 4 0 1 1 2 0 4 1 1 1 1 2 4 2 2 2 2 2 1 1 1 1
1 5 1]
Generating edge features: 100%|██████████████████████████| 3003/3003.0 [00:00<00:00, 297913.74it/s]
Edge embedding shape: (3003, 64)
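
The edge embedding shape (3003, 64) can look surprising for a graph with 77 nodes. The count matches every unordered pair of nodes, including a node paired with itself: 77 * 76 / 2 + 77 = 3003. This suggests the embedder builds a vector for every possible pair, not only for edges present in the graph. A Hadamard edge embedding is the element-wise product of the two node vectors. The sketch below checks the arithmetic with the standard library only; the `hadamard` helper and the small example vectors are made up for illustration and are not taken from the run above.

```python
from itertools import combinations_with_replacement
from math import comb

n_nodes = 77  # number of nodes reported in the output above

# Unordered pairs of node indices, including pairing a node with itself.
pairs = list(combinations_with_replacement(range(n_nodes), 2))
n_pairs = len(pairs)
print(n_pairs)  # 3003

# Same count in closed form: C(77, 2) distinct pairs plus 77 self-pairs.
assert n_pairs == comb(n_nodes, 2) + n_nodes


def hadamard(u, v):
    """Element-wise product of two equal-length vectors."""
    return [a * b for a, b in zip(u, v)]


# Made-up 3-dimensional node vectors, for illustration only.
u = [1.0, 2.0, 3.0]
v = [0.5, -1.0, 2.0]
edge_vec = hadamard(u, v)
print(edge_vec)  # [0.5, -2.0, 6.0]
```

Because the product is taken dimension by dimension, each edge vector has the same length as the node vectors, which is why the second dimension stays at 64.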