Implementing Multi-Collection Joins in MongoDB

Published: 2024/9/15

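Contents

  • Implementing multi-collection joins in MongoDB
    • 1. Iterate over the other collections and insert into the current one
    • 2. Optimization approaches
      • 2.1 MongoDB's $lookup, i.e. the aggregation framework
      • 2.2 MapReduce: distributed multi-collection join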

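Implementing multi-collection joins in MongoDB

How do you join collections that hold tens of millions of documents?

1. Iterate over the other collections and insert into the current one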

    from pymongo import MongoClient

    client = MongoClient("mongodb://192.168.123.64:27017/")
    db = client["gd_raw_data"]
    temp = db["temp"]

    # Collections to join into temp; every one is matched on registno.
    names = [
        "prplregistex", "repairfee", "prplcitemcar", "lossthirdparty_lossmain",
        "lossthirdparty", "lossmain", "citemkind", "check",
    ]

    # no_cursor_timeout keeps the long-running scan from expiring on the server.
    cursor = temp.find({}, no_cursor_timeout=True)
    try:
        for doc in cursor:
            registno = doc["registno"]
            print("Report number: {}".format(registno))
            # One find_one per collection; the match (or None) is stored
            # under "<collection>_info".
            fields = {
                name + "_info": db[name].find_one({"registno": registno})
                for name in names
            }
            temp.update_one({"registno": registno}, {"$set": fields})
    finally:
        cursor.close()  # a no_cursor_timeout cursor must be closed explicitly
        client.close()

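On my PC (6th-generation i7), this multi-collection join over 17 million documents takes 125 hours, a little over five days, and the server tends to hang partway through.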
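
To put that figure in perspective, the throughput can be worked out with a little arithmetic. The sketch below uses the 17 million documents and 125 hours quoted above, plus the nine operations per document in the script (eight find_one calls and one update_one). The function name `throughput` is hypothetical and only there for illustration.

```python
def throughput(total_docs, hours, ops_per_doc):
    """Return (documents/s, operations/s, milliseconds per document)."""
    seconds = hours * 3600
    docs_per_sec = total_docs / seconds
    return docs_per_sec, docs_per_sec * ops_per_doc, 1000 / docs_per_sec


# Figures from the text: 17 million documents, 125 hours, 9 operations each.
docs_s, ops_s, ms_doc = throughput(17_000_000, 125, 9)
print(f"{docs_s:.1f} docs/s, {ops_s:.0f} ops/s, {ms_doc:.1f} ms per document")
```

That works out to roughly 38 documents per second, or about 26 ms per document. This is consistent with the loop being bound by one network round trip per operation rather than by the CPU.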

2. Optimization approaches

Either use multiple threads or go distributed.
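
The article does not show the multithreaded variant, so the following is only a minimal sketch of what it might look like, assuming the same per-document logic as the script in section 1. The helper names `join_one` and `parallel_join` and the worker count are hypothetical. The collections are passed in as parameters; pymongo's MongoClient is thread-safe, so the collection objects from section 1 could be shared across the workers.

```python
from concurrent.futures import ThreadPoolExecutor


def join_one(temp, others, registno):
    """Embed the matching document from each other collection into temp."""
    fields = {
        name + "_info": coll.find_one({"registno": registno})
        for name, coll in others.items()
    }
    temp.update_one({"registno": registno}, {"$set": fields})
    return registno


def parallel_join(temp, others, registnos, workers=8):
    """Run join_one for every registno on a pool of worker threads."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # list() forces evaluation so that worker exceptions surface here.
        return list(pool.map(lambda r: join_one(temp, others, r), registnos))


# Assumed usage with the collections from section 1 (not run here):
# others = {name: db[name] for name in names}
# parallel_join(temp, others, ["<registno 1>", "<registno 2>"])
```

Because the work is I/O-bound, threads can overlap the round trips even under the GIL. Note that `pool.map` submits every item up front, so for 17 million keys it would probably be wiser to feed it in chunks.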

2.1 MongoDB's $lookup, i.e. the aggregation framework

Before running this, be sure to create an index on the join field.
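
The article does not show the index creation itself. A minimal sketch with pymongo, assuming the join field is registno and that `db` is a pymongo Database, might look like this. The helper name `ensure_join_indexes` is hypothetical; `create_index` is the real pymongo call.

```python
def ensure_join_indexes(db, collection_names, field="registno"):
    """Create an ascending index on `field` in each named collection."""
    created = {}
    for name in collection_names:
        # create_index returns the index name; if an identical index
        # already exists, the server leaves it as is.
        created[name] = db[name].create_index(field)
    return created


# Assumed usage (the database name is a guess based on section 1, not run here):
# from pymongo import MongoClient
# db = MongoClient("mongodb://192.168.123.64:27017/")["gd_raw_data"]
# ensure_join_indexes(db, ["lida", "prpldriver", "prplinjured",
#                          "prplinsured", "regist"])
```

Without an index on the foreign field, each $lookup has to scan the joined collection for every input document, which is why this step matters so much at this scale.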

    db.getCollection("prplcmain").aggregate(
        [
            {"$lookup": {"from": "lida", "localField": "registno", "foreignField": "registno", "as": "carinfo"}},
            {"$lookup": {"from": "prpldriver", "localField": "registno", "foreignField": "registno", "as": "prpldriver"}},
            {"$lookup": {"from": "prplinjured", "localField": "registno", "foreignField": "registno", "as": "prplinjured"}},
            {"$lookup": {"from": "prplinsured", "localField": "registno", "foreignField": "registno", "as": "prplinsured"}},
            {"$lookup": {"from": "regist", "localField": "registno", "foreignField": "registno", "as": "regist"}},
            {"$out": "total"}
        ],
        {"allowDiskUse": true}
    );
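
For readers who drive MongoDB from Python, as in section 1, the same pipeline can be built programmatically. This is a sketch rather than the author's code: the helper `build_lookup_pipeline` and the `JOINS` list are hypothetical names, while the collection names, aliases and the "total" output collection are taken from the shell command above.

```python
def build_lookup_pipeline(joins, out, field="registno"):
    """Return a pipeline with one $lookup per (from, as) pair, then $out."""
    pipeline = [
        {"$lookup": {"from": source, "localField": field,
                     "foreignField": field, "as": alias}}
        for source, alias in joins
    ]
    pipeline.append({"$out": out})
    return pipeline


# (source collection, output array field) pairs from the shell command.
JOINS = [("lida", "carinfo"), ("prpldriver", "prpldriver"),
         ("prplinjured", "prplinjured"), ("prplinsured", "prplinsured"),
         ("regist", "regist")]
pipeline = build_lookup_pipeline(JOINS, out="total")

# With a live pymongo Database `db` (assumed, not run here):
# db["prplcmain"].aggregate(pipeline, allowDiskUse=True)
```

Note that each $lookup stores its matches as an array, so the joined fields in "total" are lists, unlike the single embedded documents produced by find_one in section 1.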

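On the same hardware, this finishes in under 2 hours.

2.2 MapReduce: distributed multi-collection join

I have not fully worked this approach out yet. See: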
https://stackoverflow.com/questions/38882184/join-two-collections-with-mapreduce-in-mongodb