GraphSAGE: Inductive Representation Learning on Large Graphs
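Significance of GraphSAGE:
1. One of the most commonly used graph convolutional network models (GCN, GAT, GraphSAGE)
2. Inductive learning
3. Unlike earlier work that learns node embeddings directly, it proposes learning functions such as aggregators
4. Explores several aggregators (mean, pooling, LSTM)
5. A classic baseline for graph representation learning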
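Main structure of the paper:
I. Abstract
The abstract notes how widely graphs are used and states the paper's motivation: inductive learning on graphs. A set of learned functions samples a node's neighbors and aggregates them into a vector representation. The main points are:
1. Proposes an inductive model that can produce representations for unseen nodes and unseen graphs
2. GraphSAGE obtains node features by learning a set of functions
3. Samples and aggregates neighbor features, then concatenates them with the node's own features to form the node representation
4. The GraphSAGE algorithm achieves the best results in both transductive and inductive learning
II. Introduction
The introduction covers the wide use of graphs and observes that earlier work mostly targets static graphs, whereas GraphSAGE handles new nodes and even new graphs. It reviews DeepWalk, Node2vec, GCN, and related algorithms, and states that the proposed algorithm mainly trains aggregate functions.
III. Related Work
Reviews earlier algorithms based on random walks, matrix factorization, and graph convolution.
IV. The GraphSAGE Model
Mainly covers the forward propagation algorithm, the model parameters, and the aggregator architectures.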
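The GraphSAGE algorithm is given as Algorithm 1 in the paper. Its core is the inductive step in lines 4 and 5: line 4 aggregates the information of all sampled neighbors, and line 5 combines the node's own information with the aggregated neighbor information.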
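The two steps can be sketched for a single layer as follows. This is a minimal NumPy illustration, not the paper's implementation: it assumes a mean aggregator, uses every neighbor instead of a sample, and the names `graphsage_layer`, `h`, and `W` are hypothetical.

```python
import numpy as np


def graphsage_layer(h, adj_lists, W):
    """One forward step for every node, following lines 4-5 of Algorithm 1.

    h:         (num_nodes, d_in) array of current node representations
    adj_lists: dict mapping node index -> iterable of neighbor indices
    W:         (d_out, 2 * d_in) weight matrix (hypothetical name)
    """
    num_nodes, d_in = h.shape
    out = np.zeros((num_nodes, W.shape[0]))
    for v in range(num_nodes):
        neighbors = list(adj_lists.get(v, []))
        # Line 4: aggregate the neighbor representations (mean aggregator assumed).
        h_neigh = h[neighbors].mean(axis=0) if neighbors else np.zeros(d_in)
        # Line 5: concatenate self and neighbor vectors, then apply W and ReLU.
        combined = np.concatenate([h[v], h_neigh])
        out[v] = np.maximum(W @ combined, 0.0)
    # Algorithm 1 also rescales each representation to unit L2 norm after the layer.
    norms = np.linalg.norm(out, axis=1, keepdims=True)
    return out / np.maximum(norms, 1e-12)
```

Stacking K such layers lets each node draw on information from its K-hop neighborhood, and because only `W` is learned, the same function can be applied to nodes that were not seen during training.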
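Next, the paper introduces the objective function (Section 3.2). GraphSAGE can be trained with supervision or without it. The unsupervised objective matches that of earlier graph algorithms: if two nodes are closely related in the graph structure, their learned embeddings should also be similar.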
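For reference, the unsupervised objective from Section 3.2 of the paper can be written as below. Here z_u is the embedding of node u, v is a node that co-occurs near u on a fixed-length random walk, σ is the sigmoid function, P_n is a negative sampling distribution, and Q is the number of negative samples.

```latex
J_{\mathcal{G}}(\mathbf{z}_u) =
  -\log\!\big(\sigma(\mathbf{z}_u^{\top}\mathbf{z}_v)\big)
  - Q \cdot \mathbb{E}_{v_n \sim P_n(v)}
    \log\!\big(\sigma(-\mathbf{z}_u^{\top}\mathbf{z}_{v_n})\big)
```

The first term pulls the embeddings of nearby nodes together, and the second pushes the embeddings of randomly drawn negative nodes apart.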
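It then describes several choices of aggregate function, including Mean, LSTM, and Pooling. The appendix of the paper also gives a minibatch version of the algorithm.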
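To make the difference between aggregators concrete, here is a rough NumPy sketch of the mean and max-pooling variants. The function names and the parameters `W_pool` and `b_pool` are illustrative assumptions. The LSTM aggregator is left out because it needs a recurrent cell; the paper applies it to a random permutation of the neighbors since an LSTM is not order-invariant.

```python
import numpy as np


def mean_aggregate(neigh_feats):
    """Element-wise mean over the rows (one row per neighbor)."""
    return neigh_feats.mean(axis=0)


def pool_aggregate(neigh_feats, W_pool, b_pool):
    """Max-pooling aggregator: transform each neighbor, then take the max.

    neigh_feats: (num_neighbors, d_in)
    W_pool:      (d_hidden, d_in)  hypothetical learned weights
    b_pool:      (d_hidden,)       hypothetical learned bias
    """
    # Each neighbor passes through the same one-layer network with ReLU.
    hidden = np.maximum(neigh_feats @ W_pool.T + b_pool, 0.0)
    # The element-wise max makes the result independent of neighbor order.
    return hidden.max(axis=0)
```

Both functions return the same vector for any ordering of the neighbor rows, which is the property an aggregator over an unordered neighborhood needs.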
V. Experiments
Experimental setup, dataset selection, transductive learning experiments, parameter analysis, and an analysis of how different aggregate functions affect the model.
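This part mainly describes the experimental parameters and the datasets, and ends with a comparison of results.
VI. Theoretical Analysis && Conclusion
The conclusion summarizes that the proposed GraphSAGE model is inductive and considers different aggregators when aggregating neighbors. It also discusses several future directions, such as subgraph embedding and neighbor sampling strategies.
Contributions:
1. Inductive learning
2. Exploration of multiple aggregators
3. The paper also provides some theoretical analysis
Key points:
1. Model architecture
2. Sampling of neighbor nodes
3. Batch training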
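Neighbor sampling is what keeps the cost of a batch bounded. A minimal sketch is shown here, assuming the convention that a node with no more than `num_sample` neighbors simply keeps all of them. The function name `sample_neighbors` and the `seed` parameter are hypothetical.

```python
import random


def sample_neighbors(adj_lists, nodes, num_sample, seed=None):
    """Return one set of at most num_sample neighbors per node in `nodes`."""
    rng = random.Random(seed)
    sampled = []
    for node in nodes:
        # random.sample needs a sequence, so the neighbor set is sorted into a list.
        neighbors = sorted(adj_lists[node])
        if len(neighbors) > num_sample:
            sampled.append(set(rng.sample(neighbors, num_sample)))
        else:
            sampled.append(set(neighbors))
    return sampled
```

With a cap of S neighbors per layer and K layers, a batch node touches at most on the order of S^K other nodes, regardless of how large its true neighborhood is.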
Takeaways:
1. The inductive learning approach
2. Discussion of multiple aggregate functions
3. Batch training with sampled neighbors is efficient
4. GCN, GAT, and GraphSAGE are classic baselines
VII. Coding
The code in this walkthrough uses the Cora dataset, which consists mainly of two files. cora.cites lists the edges: each line holds the IDs of two nodes that are connected. cora.content holds each node's features and label: a node ID, followed by binary feature values, followed by the class name.

Example:

cora.cites
35 1033
35 103482
35 103515
35 1050679
35 1103960
35 1103985
35 1109199
35 1112911
35 1113438
35 1113831
35 1114331
35 1117476
35 1119505
35 1119708
35 1120431
35 1123756
35 1125386
35 1127430
35 1127913
.....

cora.content (one row; the 1,433 binary feature values are abbreviated here)
31336 0 0 0 0 0 0 0 0 ... 1 0 0 0 0 0 0 Neural_Networks
......

""" Load and preprocess the data """
import random
import time
from collections import defaultdict

import numpy as np
import torch
import torch.nn as nn
import torch.nn.functional as F
from sklearn.metrics import f1_score
from torch.nn import init


def load_cora():
    num_nodes = 2708
    num_feats = 1433
    feat_data = np.zeros((num_nodes, num_feats))
    labels = np.empty((num_nodes, 1), dtype=np.int64)
    node_map = {}   # node ID in the file -> row index
    label_map = {}  # class name -> integer label
    with open('../cora/cora.content') as fp:
        for i, line in enumerate(fp):
            info = line.strip().split()
            # First column is the node ID, last column is the class name,
            # everything in between is the feature vector.
            feat_data[i, :] = [float(ss) for ss in info[1:-1]]
            node_map[info[0]] = i
            if info[-1] not in label_map:
                label_map[info[-1]] = len(label_map)
            labels[i] = label_map[info[-1]]

    adj_lists = defaultdict(set)
    with open('../cora/cora.cites') as fp:
        for line in fp:
            info = line.strip().split()
            uid = node_map[info[0]]
            target_uid = node_map[info[1]]
            # Store every edge in both directions.
            adj_lists[uid].add(target_uid)
            adj_lists[target_uid].add(uid)
    return feat_data, labels, adj_lists


""" Build the aggregate function """
class MeanAggregator(nn.Module):
    def __init__(self, features, cuda=False, gcn=False):
        super(MeanAggregator, self).__init__()
        self.features = features
        self.cuda = cuda
        self.gcn = gcn

    def forward(self, nodes, to_neighs, num_sample=10):
        if num_sample is not None:
            # Sample num_sample neighbors without replacement; keep all of them
            # when a node has fewer. random.sample needs a sequence, hence list().
            samp_neighs = [set(random.sample(list(to_neigh), num_sample))
                           if len(to_neigh) >= num_sample else to_neigh
                           for to_neigh in to_neighs]
        else:
            samp_neighs = to_neighs
        if self.gcn:
            # Add each node to its own neighbor set (set union; "+" is not defined for sets).
            samp_neighs = [samp_neigh | set([nodes[i]]) for i, samp_neigh in enumerate(samp_neighs)]
        unique_nodes_list = list(set.union(*samp_neighs))
        unique_nodes = {n: i for i, n in enumerate(unique_nodes_list)}
        # mask[i, j] = 1 if unique node j is a sampled neighbor of batch node i.
        mask = torch.zeros(len(samp_neighs), len(unique_nodes))
        column_indices = [unique_nodes[n] for samp_neigh in samp_neighs for n in samp_neigh]
        row_indices = [i for i in range(len(samp_neighs)) for j in range(len(samp_neighs[i]))]
        mask[row_indices, column_indices] = 1
        if self.cuda:
            mask = mask.cuda()
        # Divide each row by its neighbor count so that mask.mm(...) yields the mean.
        num_neigh = mask.sum(1, keepdim=True)
        mask = mask.div(num_neigh)
        if self.cuda:
            embed_matrix = self.features(torch.LongTensor(unique_nodes_list).cuda())
        else:
            embed_matrix = self.features(torch.LongTensor(unique_nodes_list))
        to_feats = mask.mm(embed_matrix)
        return to_feats


""" Combine a node's own features with its aggregated neighbors """
class Encoder(nn.Module):
    """Encodes a node using the 'convolutional' GraphSage approach"""
    def __init__(self, features, feature_dim, embed_dim, adj_lists, aggregator,
                 num_sample=10, base_model=None, gcn=False, cuda=False,
                 feature_transform=False):
        super(Encoder, self).__init__()
        self.features = features
        # Dimension (hidden size) before the transformation
        self.feat_dim = feature_dim
        self.adj_lists = adj_lists
        # Produces the embedding of the aggregated neighbors
        self.aggregator = aggregator
        self.num_sample = num_sample
        if base_model is not None:
            self.base_model = base_model
        self.gcn = gcn
        # Dimension (hidden size) after the transformation
        self.embed_dim = embed_dim
        self.cuda = cuda
        self.aggregator.cuda = cuda
        # W has shape (output dim, input dim).
        # With gcn=False the input is the concatenation "self vector || aggregated
        # neighbor vector", so the input dimension is doubled.
        self.weight = nn.Parameter(
            torch.FloatTensor(embed_dim, self.feat_dim if self.gcn else 2 * self.feat_dim))
        init.xavier_uniform_(self.weight)

    def forward(self, nodes):
        """
        Generates embeddings for a batch of nodes.
        nodes -- list of nodes
        """
        neigh_feats = self.aggregator.forward(
            nodes, [self.adj_lists[int(node)] for node in nodes], self.num_sample)
        if not self.gcn:
            if self.cuda:
                self_feats = self.features(torch.LongTensor(nodes).cuda())
            else:
                self_feats = self.features(torch.LongTensor(nodes))
            # Concatenate self and aggregated neighbor vectors (Algorithm 1, line 5)
            combined = torch.cat([self_feats, neigh_feats], dim=1)
        else:
            # Use only the aggregated neighbor vector, without the node's own features
            combined = neigh_feats
        # Feed into the network: multiply by W (Algorithm 1, line 5)
        combined = F.relu(self.weight.mm(combined.t()))
        # Node embeddings after one GNN layer, shape (embed_dim, num nodes)
        return combined


""" Define the overall model """
class SupervisedGraphSage(nn.Module):
    def __init__(self, num_classes, enc):
        super(SupervisedGraphSage, self).__init__()
        # enc is enc2 below, i.e. the output of two GNN layers
        self.enc = enc
        self.xent = nn.CrossEntropyLoss()
        # Fully connected weights that map embeddings to num_classes scores
        self.weight = nn.Parameter(torch.FloatTensor(num_classes, enc.embed_dim))
        init.xavier_uniform_(self.weight)

    def forward(self, nodes):
        # embeds is the node embedding produced by the two GNN layers
        embeds = self.enc(nodes)
        # Map (nodes, hidden size) to (nodes, num_classes = 7); softmax and
        # cross entropy are applied in the loss
        scores = self.weight.mm(embeds)
        return scores.t()

    def loss(self, nodes, labels):
        # Forward pass
        scores = self.forward(nodes)
        # Cross entropy defined above
        return self.xent(scores, labels.squeeze())


""" Train the model """
def run_cora():
    # Set the random seeds
    np.random.seed(1)
    random.seed(1)
    # Number of nodes in Cora
    num_nodes = 2708
    # Load the Cora dataset:
    # feat_data: features
    # labels: labels
    # adj_lists: adjacency lists, dict (key: node, value: set of neighbors)
    feat_data, labels, adj_lists = load_cora()
    # Input feature matrix X with shape (number of nodes, feature dimension)
    features = nn.Embedding(2708, 1433)
    # Fill X with the data; these parameters are not updated
    features.weight = nn.Parameter(torch.FloatTensor(feat_data), requires_grad=False)
    # features.cuda()

    # Two GNN layers in total
    # First layer: aggregate neighbors with the mean (Algorithm 1, line 4).
    # The original passes cuda=True here, but Encoder overwrites the flag with its own value.
    agg1 = MeanAggregator(features, cuda=False)
    # Combine self and aggregated neighbors and feed them to the network
    # (gcn=True means only the aggregated neighbors are used), Algorithm 1, line 5
    enc1 = Encoder(features, 1433, 128, adj_lists, agg1, gcn=True, cuda=False)
    # Second layer: takes the output of the first layer as its input.
    # .t() transposes, because Encoder returns shape (embed_dim, nodes)
    agg2 = MeanAggregator(lambda nodes: enc1(nodes).t(), cuda=False)
    # enc1.embed_dim = 128, and the output dimension stays 128
    enc2 = Encoder(lambda nodes: enc1(nodes).t(), enc1.embed_dim, 128, adj_lists, agg2,
                   base_model=enc1, gcn=True, cuda=False)
    # Number of neighbors to sample.
    # Note: Encoder reads `num_sample` (singular), so these two assignments do not
    # change the sampling size; the default of 10 stays in effect.
    enc1.num_samples = 5
    enc2.num_samples = 5

    # 7-class classification problem
    # enc2 yields the node embeddings after two GNN layers
    graphsage = SupervisedGraphSage(7, enc2)
    # graphsage.cuda()
    # Shuffle the node order
    rand_indices = np.random.permutation(num_nodes)
    # Split into test, validation, and training sets
    test = rand_indices[:1000]
    val = rand_indices[1000:1500]
    train = list(rand_indices[1500:])

    # SGD optimizer with the given learning rate
    optimizer = torch.optim.SGD(
        filter(lambda p: p.requires_grad, graphsage.parameters()), lr=0.7)
    # Training time of each batch
    times = []
    # Train for 100 batches
    for batch in range(100):
        # Take 256 nodes as one batch
        batch_nodes = train[:256]
        # Shuffle the training set so the next batch is random
        random.shuffle(train)
        start_time = time.time()
        optimizer.zero_grad()
        # Cross entropy loss defined in SupervisedGraphSage
        loss = graphsage.loss(batch_nodes,
                              torch.LongTensor(labels[np.array(batch_nodes)]))
        # Backpropagate and update the parameters
        loss.backward()
        optimizer.step()
        end_time = time.time()
        times.append(end_time - start_time)
        print(batch, loss.data)

    # Validation
    val_output = graphsage.forward(val)
    # Micro F1 score
    print("Validation F1:", f1_score(labels[val], val_output.data.numpy().argmax(axis=1), average="micro"))
    # Average training time per batch
    print("Average batch time:", np.mean(times))


""" Model output """
run_cora()

0 tensor(1.9649)
1 tensor(1.9406)
2 tensor(1.9115)
3 tensor(1.8925)
4 tensor(1.8731)
5 tensor(1.8354)
6 tensor(1.8018)
7 tensor(1.7535)
8 tensor(1.6938)
9 tensor(1.6029)
10 tensor(1.6312)
11 tensor(1.5248)
12 tensor(1.4800)
13 tensor(1.4503)
14 tensor(1.4162)
15 tensor(1.3210)
16 tensor(1.2243)
17 tensor(1.2255)
18 tensor(1.0978)
19 tensor(1.1330)
20 tensor(0.9534)
21 tensor(0.9112)
22 tensor(0.9170)
23 tensor(0.7924)
24 tensor(0.8008)
25 tensor(0.7142)
26 tensor(0.7839)
27 tensor(0.8878)
28 tensor(1.2177)
29 tensor(0.9943)
30 tensor(0.8073)
31 tensor(0.6588)
32 tensor(0.6254)
33 tensor(0.5622)
34 tensor(0.5158)
35 tensor(0.4763)
36 tensor(0.5298)
37 tensor(0.5419)
38 tensor(0.5098)
39 tensor(0.4122)
40 tensor(0.4262)
41 tensor(0.4451)
42 tensor(0.4126)
43 tensor(0.4409)
44 tensor(0.3913)
45 tensor(0.4496)
46 tensor(0.4365)
47 tensor(0.4601)
48 tensor(0.4714)
49 tensor(0.4090)
50 tensor(0.4145)
51 tensor(0.3428)
52 tensor(0.3454)
53 tensor(0.3531)
54 tensor(0.3131)
55 tensor(0.2719)
56 tensor(0.3519)
57 tensor(0.3286)
58 tensor(0.3125)
59 tensor(0.2529)
60 tensor(0.3033)
61 tensor(0.2332)
62 tensor(0.3049)
63 tensor(0.3026)
64 tensor(0.3770)
65 tensor(0.3811)
66 tensor(0.3223)
67 tensor(0.2450)
68 tensor(0.2620)
69 tensor(0.2846)
70 tensor(0.2482)
71 tensor(0.3044)
72 tensor(0.4133)
73 tensor(0.3156)
74 tensor(0.4421)
75 tensor(0.2596)
76 tensor(0.2585)
77 tensor(0.2639)
78 tensor(0.2035)
79 tensor(0.2328)
80 tensor(0.1748)
81 tensor(0.1730)
82 tensor(0.1978)
83 tensor(0.1614)
84 tensor(0.1890)
85 tensor(0.1227)
86 tensor(0.1568)
87 tensor(0.1527)
88 tensor(0.2365)
89 tensor(0.2297)
90 tensor(0.1787)
91 tensor(0.1920)
92 tensor(0.1864)
93 tensor(0.1254)
94 tensor(0.1678)
95 tensor(0.1336)
96 tensor(0.1562)
97 tensor(0.2531)
98 tensor(0.2392)
99 tensor(0.2089)
Validation F1: 0.864
Average batch time: 0.047979302406311035
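The validation score reported above is a micro-averaged F1. When every node has exactly one label, as in Cora, micro F1 equals plain accuracy, because each misclassification counts once as a false positive and once as a false negative. The sketch below checks this by hand; `micro_f1` is a hypothetical helper written for illustration, not the scikit-learn implementation.

```python
def micro_f1(y_true, y_pred):
    """Micro-averaged F1: pool TP, FP, FN over all classes, then compute F1."""
    labels = set(y_true) | set(y_pred)
    tp = fp = fn = 0
    for c in labels:
        for t, p in zip(y_true, y_pred):
            tp += (t == c and p == c)
            fp += (t != c and p == c)
            fn += (t == c and p != c)
    return 2 * tp / (2 * tp + fp + fn)


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
```

Under that reading, a validation F1 of 0.864 on the 500 validation nodes corresponds to 432 correctly classified nodes.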