Elasticsearch Paginated Queries: From & Size vs. Scroll

Published: 2024/1/23

In ES, an ordinary query for data goes through the following steps:

  • 1 The client sends the request to one node
  • 2 That node forwards the request to every shard, and each shard finds its own top 10 documents
  • 3 The results come back to the node, which merges them and extracts the overall top 10
  • 4 The node returns the result to the requesting client

At this point the response reports the total number of matching documents, but it contains only the default 10 of them, so you need to consider a paginated query.
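The four steps above can be modelled in plain Java. This is only an illustrative sketch, not Elasticsearch code: shards are stand-in lists of relevance scores, and the class and method names (`ScatterGatherSketch`, `topOfShard`, `search`) are made up for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class ScatterGatherSketch {

    // Step 2: a shard sorts its own scores and returns its best `size` entries.
    static List<Double> topOfShard(List<Double> shardScores, int size) {
        List<Double> sorted = new ArrayList<Double>(shardScores);
        Collections.sort(sorted, Collections.reverseOrder());
        return new ArrayList<Double>(sorted.subList(0, Math.min(size, sorted.size())));
    }

    // Step 3: the receiving node merges the per-shard lists and keeps the global top `size`.
    static List<Double> search(List<List<Double>> shards, int size) {
        List<Double> merged = new ArrayList<Double>();
        for (List<Double> shard : shards) {
            merged.addAll(topOfShard(shard, size));
        }
        return topOfShard(merged, size);
    }

    public static void main(String[] args) {
        // Three hypothetical shards holding made-up scores.
        List<List<Double>> shards = Arrays.asList(
                Arrays.asList(0.9, 0.2, 0.7),
                Arrays.asList(0.8, 0.1),
                Arrays.asList(0.95, 0.3));
        System.out.println(search(shards, 2)); // prints [0.95, 0.9]
    }
}
```

Note that every shard has to contribute `size` candidates even though most of them are thrown away in the merge.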

As for data volume: with 8 million documents, From & Size gave me no trouble. One of my jobs, however, had to read roughly 170 million documents, and there From & Size started failing at around the 20 millionth document. I later found that From & Size performance degrades sharply on large data sets, which is what leads to such errors, so my recommendation is: use scroll whenever you can.
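A rough way to see why deep pages get expensive: for a from/size page, each shard has to produce its own top `from + size` candidates, and the coordinating node has to sort all of them. The sketch below only does that arithmetic. The shard count of 5 is an assumption (the post does not state it), and `DeepPagingCost` is a name invented for the example.

```java
public class DeepPagingCost {

    // Candidates the coordinating node must sort for one page:
    // every shard returns its own top (from + size) entries.
    static long entriesToSort(int shards, long from, int size) {
        return (long) shards * (from + size);
    }

    public static void main(String[] args) {
        int shards = 5;    // assumed shard count, not given in the post
        int size = 50000;  // page size used in this post
        System.out.println(entriesToSort(shards, 0L, size));        // first page: 250000
        System.out.println(entriesToSort(shards, 20000000L, size)); // around the 20 millionth document: 100250000
    }
}
```

Under these assumptions a single page near the 20 millionth document makes the coordinating node handle about 100 million candidates, which is consistent with the failures described above.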

Below is the Java code for both approaches.

First, add the elasticsearch jar to your Java project, for example with Maven:

<dependency>
    <groupId>org.elasticsearch</groupId>
    <artifactId>elasticsearch</artifactId>
    <version>2.3.2</version>
</dependency>

Then initialize a client object:

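import java.net.InetAddress;
import java.net.UnknownHostException;
import org.elasticsearch.client.transport.TransportClient;
import org.elasticsearch.common.settings.Settings;
import org.elasticsearch.common.transport.InetSocketTransportAddress;

private static TransportClient client;
private static String INDEX = "index_name";
private static String TYPE = "type_name";

// 2.x client API, matching the 2.3.2 dependency. On a 1.x client, use
// ImmutableSettings.settingsBuilder() and new TransportClient(settings) instead.
public static TransportClient init() throws UnknownHostException {
    Settings settings = Settings.settingsBuilder()
            .put("client.transport.sniff", true)
            .put("cluster.name", "cluster_name")
            .build();
    client = TransportClient.builder().settings(settings).build()
            .addTransportAddress(new InetSocketTransportAddress(InetAddress.getByName("localhost"), 9300));
    return client;
}

public static void main(String[] args) throws UnknownHostException {
    TransportClient client = init();
    // client can now be used to run queries
}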

Next come the two query routines. Here is the code for from-size pagination:

System.out.println("from-size mode started!");
Date begin = new Date();
long count = client.prepareCount(INDEX).setTypes(TYPE).execute().actionGet().getCount();
SearchRequestBuilder requestBuilder = client.prepareSearch(INDEX).setTypes(TYPE)
        .setQuery(QueryBuilders.matchAllQuery());
int pageSize = 50000;
for (int i = 0, sum = 0; sum < count; i++) {
    // from is a document offset, not a page number, so page i starts at i * pageSize
    SearchResponse response = requestBuilder.setFrom(i * pageSize).setSize(pageSize)
            .execute().actionGet();
    int hits = response.getHits().hits().length;
    if (hits == 0) {
        break; // nothing left to read; avoids looping forever
    }
    sum += hits;
    System.out.println("total " + count + ", fetched so far " + sum);
}
Date end = new Date();
System.out.println("elapsed (ms): " + (end.getTime() - begin.getTime()));
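When paging with from-size, keep in mind that `from` counts documents rather than pages, so consecutive pages must advance by the page size. A small stand-alone sketch of the offsets a full pass would need (`PageOffsets` and `offsets` are names invented for this illustration):

```java
import java.util.ArrayList;
import java.util.List;

public class PageOffsets {

    // Returns the `from` value of every page needed to cover `total` documents.
    static List<Long> offsets(long total, int pageSize) {
        if (pageSize <= 0) {
            throw new IllegalArgumentException("pageSize must be positive");
        }
        List<Long> result = new ArrayList<Long>();
        for (long from = 0; from < total; from += pageSize) {
            result.add(from);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(offsets(120000L, 50000)); // prints [0, 50000, 100000]
    }
}
```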

Below is the code for scroll pagination. Note: with the scan search type used here, size applies to each shard, so the number of hits actually returned per batch is up to (number of shards × size).
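To make the per-shard meaning of size concrete, here is a toy model in plain Java. It is a simplification for illustration, not how Elasticsearch is implemented: on each round every shard hands over up to `size` of its remaining documents. The class name and the document counts are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

public class ScanBatchSketch {

    // Returns the number of hits in each scroll batch until all shards are drained.
    static List<Integer> batchSizes(int[] docsPerShard, int size) {
        if (size <= 0) {
            throw new IllegalArgumentException("size must be positive");
        }
        int[] remaining = docsPerShard.clone();
        List<Integer> batches = new ArrayList<Integer>();
        while (true) {
            int batch = 0;
            for (int s = 0; s < remaining.length; s++) {
                int taken = Math.min(size, remaining[s]); // each shard gives at most `size`
                remaining[s] -= taken;
                batch += taken;
            }
            if (batch == 0) {
                return batches; // every shard is exhausted
            }
            batches.add(batch);
        }
    }

    public static void main(String[] args) {
        // Three shards with 25, 25 and 10 documents, size = 10.
        System.out.println(batchSizes(new int[] {25, 25, 10}, 10)); // prints [30, 20, 10]
    }
}
```

The first batch holds 3 × 10 hits, and later batches shrink as individual shards run out, which is why shards × size is an upper bound rather than a fixed batch size.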

System.out.println("scroll mode started!");
begin = new Date();
SearchResponse scrollResponse = client.prepareSearch(INDEX)
        .setSearchType(SearchType.SCAN)
        .setSize(10000)
        .setScroll(TimeValue.timeValueMinutes(1))
        .execute().actionGet();
// The first (scan) response returns no documents, only the total hit count
count = scrollResponse.getHits().getTotalHits();
for (int i = 0, sum = 0; sum < count; i++) {
    scrollResponse = client.prepareSearchScroll(scrollResponse.getScrollId())
            .setScroll(TimeValue.timeValueMinutes(8))
            .execute().actionGet();
    sum += scrollResponse.getHits().hits().length;
    System.out.println("total " + count + ", fetched so far " + sum);
}
end = new Date();
System.out.println("elapsed (ms): " + (end.getTime() - begin.getTime()));

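One more point worth mentioning: ES CRUD operations are generally inefficient when a large number of documents is processed one at a time, so use bulk operations instead, as in the following example: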

// Requires: import static org.elasticsearch.common.xcontent.XContentFactory.jsonBuilder;
// getyear_month_day_hour(...) is the author's own helper; its code is not shown in this post.
public static void updateHourByScroll(String Type) throws IOException {
    System.out.println("scroll mode started!");
    Date begin = new Date();
    SearchResponse scrollResponse = client.prepareSearch(INDEX)
            .setTypes(TYPE)
            .setSearchType(SearchType.SCAN)
            .setSize(5000)
            .setScroll(TimeValue.timeValueMinutes(1))
            .execute().actionGet();
    // The first (scan) response returns no documents, only the total hit count
    long count = scrollResponse.getHits().getTotalHits();
    for (int i = 0, sum = 0; sum < count; i++) {
        scrollResponse = client.prepareSearchScroll(scrollResponse.getScrollId())
                .setScroll(TimeValue.timeValueMinutes(8))
                .execute().actionGet();
        SearchHits searchHits = scrollResponse.getHits();
        if (searchHits.hits().length == 0) {
            break; // nothing left; an empty bulk request would be rejected
        }
        sum += searchHits.hits().length;

        // Build one update request per hit
        List<UpdateRequest> list = new ArrayList<UpdateRequest>();
        for (SearchHit hit : searchHits) {
            String id = hit.getId();
            Map<String, Object> source = hit.getSource();
            Integer year = Integer.valueOf(source.get("Year").toString());
            Integer month = Integer.valueOf(source.get("Mon").toString());
            Integer day = Integer.valueOf(source.get("Day").toString());
            Integer hour = Integer.valueOf(source.get("Hour").toString());
            String time = getyear_month_day_hour(year, month, day, hour);
            System.out.println(time);
            UpdateRequest uRequest = new UpdateRequest()
                    .index(INDEX)
                    .type(Type)
                    .id(id)
                    .doc(jsonBuilder().startObject().field("TimeFormat", time).endObject());
            list.add(uRequest);
        }

        // Execute the whole batch as a single bulk request
        BulkRequestBuilder bulkRequest = client.prepareBulk();
        for (UpdateRequest uprequest : list) {
            bulkRequest.add(uprequest);
        }
        BulkResponse bulkResponse = bulkRequest.execute().actionGet();
        if (bulkResponse.hasFailures()) {
            System.out.println("bulk request reported failures!");
        }
        System.out.println("total " + count + ", fetched so far " + sum);
    }
    Date end = new Date();
    System.out.println("elapsed (ms): " + (end.getTime() - begin.getTime()));
}
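The batching idea behind bulk operations can be shown without any Elasticsearch classes. The sketch below splits a list of pending items into fixed-size chunks, where each chunk would correspond to one bulk request. `BulkChunks`, `chunks` and the chunk size are assumptions made for the example, not part of the ES client.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class BulkChunks {

    // Splits `items` into consecutive chunks of at most `chunkSize` elements.
    static <T> List<List<T>> chunks(List<T> items, int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        List<List<T>> result = new ArrayList<List<T>>();
        for (int start = 0; start < items.size(); start += chunkSize) {
            int end = Math.min(start + chunkSize, items.size());
            result.add(new ArrayList<T>(items.subList(start, end)));
        }
        return result;
    }

    public static void main(String[] args) {
        List<Integer> ids = Arrays.asList(1, 2, 3, 4, 5, 6, 7);
        // Each inner list stands for the contents of one bulk request.
        System.out.println(chunks(ids, 3)); // prints [[1, 2, 3], [4, 5, 6], [7]]
    }
}
```

Seven single-document requests become three round trips here. The right chunk size depends on document size and cluster capacity, so it is worth measuring rather than guessing.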
