生活随笔
收集整理的這篇文章主要介紹了
Python社区发现—Louvain—networkx和community
小編覺得挺不錯的,現在分享給大家,幫大家做個參考.
社區
如果一張圖是對一片區域的描述的話,將這張圖劃分為很多個子圖。當子圖之內滿足關聯性盡可能大,而子圖之間關聯性盡可能低時,這樣的子圖可以稱之為一個社區。
社區發現算法
社區發現算法有很多,例如LPA,HANP,SLPA以及Louvain,不同的算法劃分社區的效果不盡相同。Louvain算法是基于模塊度的社區發現算法,該算法在效率和效果上都表現較好,并且能夠發現層次性的社區結構,其優化目標是最大化整個社區網絡的模塊度。
模塊度
模塊度是評估一個社區網絡劃分好壞的度量方法,它的物理含義是社區內節點的連邊數與隨機情況下的邊數只差,它的取值范圍是 [?1/2,1)。可以簡單地理解為社區內部邊的權重減去所有與社區節點相連的邊的權重和,對無向圖更好理解,即社區內部邊的度數減去社區內節點的總度數。
Louvain算法
算法流程: 1、初始時將每個頂點當作一個社區,社區個數與頂點個數相同。 2、依次將每個頂點與之相鄰頂點合并在一起,計算它們的模塊度增益是否大于0,如果大于0,就將該結點放入該相鄰結點所在社區。 3、迭代第二步,直至算法穩定,即所有頂點所屬社區不再變化。 4、將各個社區所有節點壓縮成為一個結點,社區內點的權重轉化為新結點環的權重,社區間權重轉化為新結點邊的權重。 5、重復步驟1-3,直至算法穩定。
import collections
import random
def load_graph ( path
) : G
= collections
. defaultdict
( dict ) with open ( path
) as text
: for line
in text
: vertices
= line
. strip
( ) . split
( ) v_i
= int ( vertices
[ 0 ] ) v_j
= int ( vertices
[ 1 ] ) w
= float ( vertices
[ 2 ] ) G
[ v_i
] [ v_j
] = wG
[ v_j
] [ v_i
] = w
return G
class Vertex ( ) : def __init__ ( self
, vid
, cid
, nodes
, k_in
= 0 ) : self
. _vid
= vidself
. _cid
= cidself
. _nodes
= nodesself
. _kin
= k_in
class Louvain ( ) : def __init__ ( self
, G
) : self
. _G
= Gself
. _m
= 0 self
. _cid_vertices
= { } self
. _vid_vertex
= { } for vid
in self
. _G
. keys
( ) : self
. _cid_vertices
[ vid
] = set ( [ vid
] ) self
. _vid_vertex
[ vid
] = Vertex
( vid
, vid
, set ( [ vid
] ) ) self
. _m
+= sum ( [ 1 for neighbor
in self
. _G
[ vid
] . keys
( ) if neighbor
> vid
] ) def first_stage ( self
) : mod_inc
= False visit_sequence
= self
. _G
. keys
( ) random
. shuffle
( list ( visit_sequence
) ) while True : can_stop
= True for v_vid
in visit_sequence
: v_cid
= self
. _vid_vertex
[ v_vid
] . _cidk_v
= sum ( self
. _G
[ v_vid
] . values
( ) ) + self
. _vid_vertex
[ v_vid
] . _kincid_Q
= { } for w_vid
in self
. _G
[ v_vid
] . keys
( ) : w_cid
= self
. _vid_vertex
[ w_vid
] . _cid
if w_cid
in cid_Q
: continue else : tot
= sum ( [ sum ( self
. _G
[ k
] . values
( ) ) + self
. _vid_vertex
[ k
] . _kin
for k
in self
. _cid_vertices
[ w_cid
] ] ) if w_cid
== v_cid
: tot
-= k_vk_v_in
= sum ( [ v
for k
, v
in self
. _G
[ v_vid
] . items
( ) if k
in self
. _cid_vertices
[ w_cid
] ] ) delta_Q
= k_v_in
- k_v
* tot
/ self
. _m cid_Q
[ w_cid
] = delta_Qcid
, max_delta_Q
= sorted ( cid_Q
. items
( ) , key
= lambda item
: item
[ 1 ] , reverse
= True ) [ 0 ] if max_delta_Q
> 0.0 and cid
!= v_cid
: self
. _vid_vertex
[ v_vid
] . _cid
= cidself
. _cid_vertices
[ cid
] . add
( v_vid
) self
. _cid_vertices
[ v_cid
] . remove
( v_vid
) can_stop
= False mod_inc
= True if can_stop
: break return mod_inc
def second_stage ( self
) : cid_vertices
= { } vid_vertex
= { } for cid
, vertices
in self
. _cid_vertices
. items
( ) : if len ( vertices
) == 0 : continue new_vertex
= Vertex
( cid
, cid
, set ( ) ) for vid
in vertices
: new_vertex
. _nodes
. update
( self
. _vid_vertex
[ vid
] . _nodes
) new_vertex
. _kin
+= self
. _vid_vertex
[ vid
] . _kin
for k
, v
in self
. _G
[ vid
] . items
( ) : if k
in vertices
: new_vertex
. _kin
+= v
/ 2.0 cid_vertices
[ cid
] = set ( [ cid
] ) vid_vertex
[ cid
] = new_vertexG
= collections
. defaultdict
( dict ) for cid1
, vertices1
in self
. _cid_vertices
. items
( ) : if len ( vertices1
) == 0 : continue for cid2
, vertices2
in self
. _cid_vertices
. items
( ) : if cid2
<= cid1
or len ( vertices2
) == 0 : continue edge_weight
= 0.0 for vid
in vertices1
: for k
, v
in self
. _G
[ vid
] . items
( ) : if k
in vertices2
: edge_weight
+= v
if edge_weight
!= 0 : G
[ cid1
] [ cid2
] = edge_weightG
[ cid2
] [ cid1
] = edge_weightself
. _cid_vertices
= cid_verticesself
. _vid_vertex
= vid_vertexself
. _G
= G
def get_communities ( self
) : communities
= [ ] for vertices
in self
. _cid_vertices
. values
( ) : if len ( vertices
) != 0 : c
= set ( ) for vid
in vertices
: c
. update
( self
. _vid_vertex
[ vid
] . _nodes
) communities
. append
( c
) return communities
def execute ( self
) : iter_time
= 1 while True : iter_time
+= 1 mod_inc
= self
. first_stage
( ) if mod_inc
: self
. second_stage
( ) else : break return self
. get_communities
( ) if __name__
== '__main__' : G
= load_graph
( 's.txt' ) algorithm
= Louvain
( G
) communities
= algorithm
. execute
( ) communities
= sorted ( communities
, key
= lambda b
: - len ( b
) ) count
= 0 for communitie
in communities
: count
+= 1 print ( "社區" , count
, " " , communitie
)
networkx和community社區劃分和可視化
安裝
使用community安裝python-louvain即可 pip install python-louvain pip install networkx
使用
最佳劃分
community
. best_partition
( graph
, partition
= None , weight
= 'weight' , resolution
= 1.0 )
Compute the partition of the graph nodes which maximises the modularity (or try…) using the Louvain heuristics. This is the partition of highest modularity, i.e. the highest partition of the dendrogram generated by the Louvain algorithm.
import community
import networkx
as nx
import matplotlib
. pyplot
as plt
G
= nx
. erdos_renyi_graph
( 30 , 0.05 )
partition
= community
. best_partition
( G
)
size
= float ( len ( set ( partition
. values
( ) ) ) )
pos
= nx
. spring_layout
( G
)
count
= 0 .
for com
in set ( partition
. values
( ) ) : count
= count
+ 1 . list_nodes
= [ nodes
for nodes
in partition
. keys
( ) if partition
[ nodes
] == com
] nx
. draw_networkx_nodes
( G
, pos
, list_nodes
, node_size
= 20 , node_color
= str ( count
/ size
) ) nx
. draw_networkx_edges
( G
, pos
, alpha
= 0.5 )
plt
. show
( )
總結
以上是生活随笔 為你收集整理的Python社区发现—Louvain—networkx和community 的全部內容,希望文章能夠幫你解決所遇到的問題。
如果覺得生活随笔 網站內容還不錯,歡迎將生活随笔 推薦給好友。