
Programming Collective Intelligence, Chapter 2 (Part 1)

Published: 2024/9/30

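I. Computing User Similarity
1. Euclidean distance
To make things easier for later readers, all of the code (written for Python 2.6) is collected at the end of this post.
In two-dimensional space, the Euclidean distance is simply the length of the line segment between two points. In a space of n dimensions, for two points A(x1, x2, x3, …, xn) and B(y1, y2, y3, …, yn), the Euclidean distance is computed as

$$ d(A, B) = \sqrt{\sum_{i=1}^{n} (x_i - y_i)^2} $$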

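Corresponding code (the sum under the square root, taken only over items that both people have rated):

    sum_of_squares = sum([pow(prefs[person1][item] - prefs[person2][item], 2)
                          for item in prefs[person1] if item in prefs[person2]])

The distance is then normalized by taking 1 divided by (distance + 1). Adding 1 keeps the denominator from being 0, and the result falls in the range (0, 1], with 1 meaning the two people gave identical ratings.
Corresponding code:

    return 1/(1 + sqrt(sum_of_squares))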
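
To make the two snippets above concrete, here is a minimal self-contained sketch. The function name `euclidean_similarity` and the toy ratings are made up for illustration and do not come from the book; the code is written so that it also runs on Python 3.

```python
from math import sqrt


def euclidean_similarity(ratings_a, ratings_b):
    """Return 1 / (1 + Euclidean distance) over the items both users rated."""
    shared = [item for item in ratings_a if item in ratings_b]
    if not shared:
        return 0.0  # no items in common: treat the users as not similar
    sum_of_squares = sum((ratings_a[item] - ratings_b[item]) ** 2
                         for item in shared)
    return 1 / (1 + sqrt(sum_of_squares))


# Hypothetical ratings, chosen so the distance is exactly 5 (a 3-4-5 triangle).
alice = {'film_x': 1.0, 'film_y': 2.0}
bob = {'film_x': 4.0, 'film_y': 6.0, 'film_z': 1.0}

print(euclidean_similarity(alice, bob))    # 1 / (1 + 5)
print(euclidean_similarity(alice, alice))  # identical ratings give 1.0
```

Note that `film_z` plays no part in the result, because only items rated by both users enter the sum.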

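2. Pearson correlation coefficient
Euclidean distance uses the items as axes and measures the distance between people. The Pearson correlation coefficient instead uses the people as axes and measures how similar two people are by how closely their ratings of the same items agree. The figures in the book make this easy to see.
The Pearson correlation coefficient is computed as:

$$ r = \frac{\sum_{i \in SI} X_i Y_i - \frac{\sum_{i \in SI} X_i \, \sum_{i \in SI} Y_i}{N}}{\sqrt{\left(\sum_{i \in SI} X_i^2 - \frac{\left(\sum_{i \in SI} X_i\right)^2}{N}\right)\left(\sum_{i \in SI} Y_i^2 - \frac{\left(\sum_{i \in SI} Y_i\right)^2}{N}\right)}} $$

where SI = X ∩ Y is the set of items rated by both people, N = len(SI), and X_i and Y_i are the two people's ratings of item i.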
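It looks bulky, but anyone who has studied probability theory will recognize it as the correlation coefficient, with each expectation replaced by an average over the N shared items:

$$ \rho_{X,Y} = \frac{\operatorname{Cov}(X, Y)}{\sigma_X \sigma_Y} = \frac{E[XY] - E[X]E[Y]}{\sqrt{E[X^2] - E[X]^2}\,\sqrt{E[Y^2] - E[Y]^2}} $$

Corresponding code: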

    sum1 = sum([prefs[p1][it] for it in si])
    sum2 = sum([prefs[p2][it] for it in si])
    sum1Sq = sum([pow(prefs[p1][it], 2) for it in si])
    sum2Sq = sum([pow(prefs[p2][it], 2) for it in si])
    pSum = sum([prefs[p1][it] * prefs[p2][it] for it in si])
    # calculate the Pearson correlation score
    num = pSum - (sum1 * sum2/n)
    den = sqrt((sum1Sq - pow(sum1, 2)/n)*(sum2Sq - pow(sum2, 2)/n))
    if den == 0: return 0
    r = num / den
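
One property of this score is easier to see with numbers: a user who consistently rates higher than another can still correlate perfectly with them. The sketch below is a self-contained illustration under that assumption. The function name `pearson` and the three toy users are hypothetical, and the code targets Python 3.

```python
from math import sqrt


def pearson(ratings_a, ratings_b):
    """Pearson correlation over the items that both users rated."""
    shared = [item for item in ratings_a if item in ratings_b]
    n = len(shared)
    if n == 0:
        return 0.0  # nothing in common, so no evidence of correlation
    sum_a = sum(ratings_a[i] for i in shared)
    sum_b = sum(ratings_b[i] for i in shared)
    sum_a_sq = sum(ratings_a[i] ** 2 for i in shared)
    sum_b_sq = sum(ratings_b[i] ** 2 for i in shared)
    p_sum = sum(ratings_a[i] * ratings_b[i] for i in shared)
    num = p_sum - sum_a * sum_b / n
    den = sqrt((sum_a_sq - sum_a ** 2 / n) * (sum_b_sq - sum_b ** 2 / n))
    if den == 0:
        return 0.0  # one user gave every shared item the same rating
    return num / den


# Hypothetical users: 'generous' always gives double the score of 'harsh',
# while 'contrary' ranks the films in the opposite order.
harsh = {'film_x': 1.0, 'film_y': 2.0, 'film_z': 3.0}
generous = {'film_x': 2.0, 'film_y': 4.0, 'film_z': 6.0}
contrary = {'film_x': 3.0, 'film_y': 2.0, 'film_z': 1.0}

print(pearson(harsh, generous))  # 1.0
print(pearson(harsh, contrary))  # -1.0
```

The Euclidean score for `harsh` and `generous` would be well below 1, since their raw ratings differ, yet their tastes are ordered identically.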

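II. Making Recommendations
1. Finding the users most similar to a given user (what the book calls ranking the critics)
The similarity scores have already been computed above, so all that remains is to sort them and keep the top few. See the full code at the end.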
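
As a small illustration of the sorting step, here is a sketch that takes already-computed similarity scores and returns the best matches. The function name `top_matches` and the scores are invented for this example; the full version at the end of the post computes the scores itself.

```python
def top_matches(scores_by_user, n=2):
    """Return the n (score, name) pairs with the highest similarity score."""
    ranked = sorted(((score, name) for name, score in scores_by_user.items()),
                    reverse=True)
    return ranked[:n]


# Hypothetical similarity scores between one user and four others.
similarities = {'ann': 0.2, 'ben': 0.9, 'cat': -0.4, 'dan': 0.5}

print(top_matches(similarities))  # [(0.9, 'ben'), (0.5, 'dan')]
```

Putting the score first in each tuple means an ordinary sort orders by similarity without needing a key function.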
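2. Recommending items to a user
This takes roughly two steps: estimate the score the user would give each item, then sort by those estimates and output the top few.
The first step is the important one, and it introduces the idea of a weight. The more similar another user is to the target user, the more weight that user's ratings carry. In other words, we use the similarity score directly as each user's weight. The estimated score of user u for item i can then be summarized as:

$$ \text{score}(u, i) = \frac{\sum_{v} \text{sim}(u, v) \cdot r_{v,i}}{\sum_{v} \text{sim}(u, v)} $$

where the sums run over every other user v who has rated item i and whose similarity to u is greater than 0, and r_{v,i} is the rating v gave to item i.
Table 2-2 in the book shows this calculation and its results (I won't reproduce the table here). The corresponding code is: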

    totals.setdefault(item, 0)
    totals[item] += prefs[other][item] * sim
    # sum of similarities
    simSums.setdefault(item, 0)
    simSums[item] += sim

The items can then be sorted by their estimated scores.
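
Here is a small worked example of the weighted average, as a sketch rather than the book's code. The function name `predict_scores`, its parameters, and all of the numbers are assumptions made for illustration; it takes precomputed similarities so that the arithmetic is easy to follow, and it targets Python 3.

```python
def predict_scores(similarities, ratings, seen):
    """Rank unseen items by the similarity-weighted average of others' ratings."""
    totals = {}    # item -> sum of (similarity * rating)
    sim_sums = {}  # item -> sum of similarities of the users who rated it
    for other, sim in similarities.items():
        if sim <= 0:
            continue  # ignore users who are not positively similar
        for item, rating in ratings[other].items():
            if item in seen:
                continue  # only score items the target user has not rated
            totals[item] = totals.get(item, 0.0) + rating * sim
            sim_sums[item] = sim_sums.get(item, 0.0) + sim
    return sorted(((totals[item] / sim_sums[item], item) for item in totals),
                  reverse=True)


# Hypothetical data: similarities to the target user and the others' ratings.
similarities = {'ann': 0.5, 'ben': 1.0, 'cat': -0.3}
ratings = {
    'ann': {'film_x': 4.0, 'film_y': 3.0},
    'ben': {'film_x': 1.0, 'film_z': 5.0},
    'cat': {'film_x': 5.0},
}

print(predict_scores(similarities, ratings, seen={'film_y'}))
# [(5.0, 'film_z'), (2.0, 'film_x')]
```

For `film_x` the estimate is (0.5 * 4.0 + 1.0 * 1.0) / (0.5 + 1.0) = 2.0: ben's low rating counts twice as much as ann's high one, and cat is ignored because her similarity is negative.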
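Appendix: the code in recommendations.py so far
Please bear with my comments. My English is not good and I am trying to use it more, so feel free to point out any grammar mistakes. Let's improve together!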

    from math import sqrt

    # A dictionary of movie critics and their ratings of a small set of movies
    critics = {
        'Lisa Rose': {'Lady in the Water': 2.5, 'Snakes on a Plane': 3.5,
                      'Just My Luck': 3.0, 'Superman Returns': 3.5,
                      'You, Me and Dupree': 2.5, 'The Night Listener': 3.0},
        'Gene Seymour': {'Lady in the Water': 3.0, 'Snakes on a Plane': 3.5,
                         'Just My Luck': 1.5, 'Superman Returns': 5.0,
                         'The Night Listener': 3.0, 'You, Me and Dupree': 3.5},
        'Michael Phillips': {'Lady in the Water': 2.5, 'Snakes on a Plane': 3.0,
                             'Superman Returns': 3.5, 'The Night Listener': 4.0},
        'Claudia Puig': {'Snakes on a Plane': 3.5, 'Just My Luck': 3.0,
                         'The Night Listener': 4.5, 'Superman Returns': 4.0,
                         'You, Me and Dupree': 2.5},
        'Mick LaSalle': {'Lady in the Water': 3.0, 'Snakes on a Plane': 4.0,
                         'Just My Luck': 2.0, 'Superman Returns': 3.0,
                         'The Night Listener': 3.0, 'You, Me and Dupree': 2.0},
        'Jack Matthews': {'Lady in the Water': 3.0, 'Snakes on a Plane': 4.0,
                          'The Night Listener': 3.0, 'Superman Returns': 5.0,
                          'You, Me and Dupree': 3.5},
        'Toby': {'Snakes on a Plane': 4.5, 'You, Me and Dupree': 1.0,
                 'Superman Returns': 4.0},
    }

    # Return a distance-based similarity score for person1 and person2
    def sim_distance(prefs, person1, person2):
        # items rated by both person1 and person2
        si = {}
        for item in prefs[person1]:
            if item in prefs[person2]:
                si[item] = 1
        # return 0 if they have no items in common
        if len(si) == 0: return 0
        # sum the squared differences over the shared items
        sum_of_squares = sum([pow(prefs[person1][item] - prefs[person2][item], 2)
                              for item in prefs[person1] if item in prefs[person2]])
        return 1/(1 + sqrt(sum_of_squares))

    # Return the Pearson correlation score between p1 and p2
    def sim_pearson(prefs, p1, p2):
        # items rated by both p1 and p2
        si = {}
        for item in prefs[p1]:
            if item in prefs[p2]:
                si[item] = 1
        # number of shared items
        n = len(si)
        # return 0 if they have no items in common
        if n == 0: return 0
        sum1 = sum([prefs[p1][it] for it in si])
        sum2 = sum([prefs[p2][it] for it in si])
        sum1Sq = sum([pow(prefs[p1][it], 2) for it in si])
        sum2Sq = sum([pow(prefs[p2][it], 2) for it in si])
        pSum = sum([prefs[p1][it] * prefs[p2][it] for it in si])
        # calculate the Pearson correlation score
        num = pSum - (sum1 * sum2/n)
        den = sqrt((sum1Sq - pow(sum1, 2)/n)*(sum2Sq - pow(sum2, 2)/n))
        if den == 0: return 0
        r = num / den
        return r

    # Return the n people most similar to person
    def topMatches(prefs, person, n = 5, similarity = sim_pearson):
        scores = [(similarity(prefs, person, other), other)
                  for other in prefs if other != person]
        # sort so that the highest scores come first
        scores.sort()
        scores.reverse()
        return scores[0:n]

    # Recommend items using a similarity-weighted average of other users' ratings
    def getRecommendations(prefs, person, similarity = sim_pearson):
        totals = {}
        simSums = {}
        for other in prefs:
            # do not compare the person with themselves
            if other == person: continue
            sim = similarity(prefs, person, other)
            # ignore similarity scores of zero or lower
            if sim <= 0: continue
            for item in prefs[other]:
                # only score movies the person has not rated yet
                if item not in prefs[person] or prefs[person][item] == 0:
                    # similarity * rating
                    totals.setdefault(item, 0)
                    totals[item] += prefs[other][item] * sim
                    # sum of similarities
                    simSums.setdefault(item, 0)
                    simSums[item] += sim
        # create the normalized list
        rankings = [(total/simSums[item], item) for item, total in totals.items()]
        # sort so that the highest estimated scores come first
        rankings.sort()
        rankings.reverse()
        return rankings
