Learning Elasticsearch 7.x

Published: 2023/12/18

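Contents

  • 1 Introduction to Elasticsearch
    • 1.1 Background
    • 1.2 What is ES
    • 1.3 ES versus Solr
    • 1.4 Inverted index
  • 2 Installing ES
  • 3 Basic ES operations
    • 3.1 Structure of ES
      • 3.1.1 Index
      • 3.1.2 Type
      • 3.1.3 Document
      • 3.1.4 Field
    • 3.2 RESTful syntax of ES
      • 3.2.1 Using the RESTful syntax
      • 3.2.2 Field types available in ES
  • 4 Working with Elasticsearch from Java
    • 4.1 Environment and base class
    • 4.2 Preparing the index and documents
      • 4.2.1 Index
        • 4.2.1.1 Creating an index
        • 4.2.1.2 Checking whether an index exists
        • 4.2.1.3 Deleting an index
      • 4.2.2 Documents
        • 4.2.2.1 Adding a document
        • 4.2.2.2 Bulk adding
        • 4.2.2.3 Updating a document
        • 4.2.2.4 Deleting a document
      • 4.2.3 Queries
        • 4.2.3.1 term
        • 4.2.3.2 terms
        • 4.2.3.3 match_all
        • 4.2.3.4 match
        • 4.2.3.5 multi_match
        • 4.2.3.6 id
        • 4.2.3.7 ids
        • 4.2.3.8 prefix
        • 4.2.3.9 fuzzy
        • 4.2.3.10 wildcard
        • 4.2.3.11 range
        • 4.2.3.12 regexp
        • 4.2.3.13 Deep paging with scroll
        • 4.2.3.14 delete-by-query
        • 4.2.3.15 bool
        • 4.2.3.16 boosting
        • 4.2.3.17 filter
        • 4.2.3.18 Highlighting
        • 4.2.3.19 Aggregations
        • 4.2.3.20 Cardinality (distinct count) aggregation
        • 4.2.3.21 Range aggregation
        • 4.2.3.22 Statistics aggregation
      • 4.2.4 Geo (latitude/longitude) search
        • 4.2.4.1 Preparing the data
        • 4.2.4.2 Geo query types in ES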

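1 Introduction to Elasticsearch

1.1 Background

  • Searching massive amounts of data with MySQL is too slow.

  • The wanted data should still be found when the keyword is typed inaccurately.

  • The matched keywords should be displayed in red in the results.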

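1.2 What is ES

ES is a search engine framework written in Java on top of Lucene. It provides distributed full-text search and a unified RESTful web interface, and the official clients offer APIs for many languages.

Lucene: the underlying search engine library.

Distributed: ES is designed above all for horizontal scalability.

Full-text search: text is split into terms, and the terms are stored together in a term dictionary. At search time the keyword is looked up in that dictionary to find the matching content (an inverted index).

RESTful web interface: working with ES is simple. You send an HTTP request, and the HTTP method and the parameters it carries decide which operation is performed.

Widely used: GitHub and Wikipedia, among others.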

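1.3 ES versus Solr

  • Solr is somewhat faster than ES when querying static data, but its query performance drops sharply when the data changes in real time, while the query performance of ES stays largely unchanged.
  • A Solr cluster needs ZooKeeper to manage it. ES supports clustering natively and needs no third-party component.
  • The Solr community was very active at first, but little documentation was aimed at users in China. After ES appeared, its community grew rapidly, and the ES documentation is very complete.
  • ES has better support for today's cloud computing and big data workloads.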
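1.4 Inverted index

  • The stored data is split into terms in a defined way (analysis), and the terms are kept in a separate term dictionary.
  • When a user runs a query, the query keywords are analyzed into terms as well.
  • The terms are matched against the term dictionary, which yields the ids of the matching documents.
  • The documents are then fetched from where they are stored by those ids.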
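The four steps above can be sketched in a few lines of plain Java. This is only an illustration of the idea, not how Lucene stores its index: the class name InvertedIndexSketch and the whitespace analyzer are assumptions made for the example.

```java
import java.util.*;

// A toy inverted index: term dictionary plus a document store.
class InvertedIndexSketch {
    // term -> ids of the documents that contain the term
    private final Map<String, Set<Integer>> postings = new HashMap<>();
    // id -> original document text
    private final Map<Integer, String> documents = new HashMap<>();

    // Hypothetical analyzer: lower-case the text and split it on whitespace.
    static List<String> analyze(String text) {
        List<String> terms = new ArrayList<>();
        for (String t : text.toLowerCase().trim().split("\\s+")) {
            if (!t.isEmpty()) {
                terms.add(t);
            }
        }
        return terms;
    }

    // Step 1: analyze the stored text and record each term in the dictionary.
    void add(int id, String text) {
        documents.put(id, text);
        for (String term : analyze(text)) {
            postings.computeIfAbsent(term, k -> new TreeSet<>()).add(id);
        }
    }

    // Steps 2 to 4: analyze the query, collect matching ids, fetch the documents.
    List<String> search(String query) {
        Set<Integer> ids = new TreeSet<>();
        for (String term : analyze(query)) {
            ids.addAll(postings.getOrDefault(term, Collections.emptySet()));
        }
        List<String> result = new ArrayList<>();
        for (int id : ids) {
            result.add(documents.get(id));
        }
        return result;
    }

    public static void main(String[] args) {
        InvertedIndexSketch index = new InvertedIndexSketch();
        index.add(1, "quick brown fox");
        index.add(2, "lazy brown dog");
        System.out.println(index.search("brown"));
        System.out.println(index.search("fox cat"));
    }
}
```

The lookup never scans document text: it goes from term to ids to documents, which is why it stays fast as the data grows.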
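2 Installing ES

A docker-compose.yml that starts a single-node Elasticsearch together with Kibana:

    version: "3.1"
    services:
      elasticsearch:
        image: elasticsearch:7.7.0
        restart: always
        container_name: elasticsearch
        ports:
          - 9200:9200
        environment:
          - ES_JAVA_OPTS=-Xms256m -Xmx256m
          - discovery.type=single-node
      kibana:
        image: kibana:7.7.0
        restart: always
        container_name: kibana
        ports:
          - 5601:5601
        environment:
          # address of the Elasticsearch node (Kibana 7.x reads ELASTICSEARCH_HOSTS)
          - ELASTICSEARCH_HOSTS=http://112.124.21.177:9200
        depends_on:
          - elasticsearch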

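3 Basic ES operations

3.1 Structure of ES

index → type → document → field

3.1.1 Index

  • One ES service can hold many indices.
  • Each index is stored in 1 primary shard by default (5 shards before version 7.0).
  • By default each primary shard has one replica shard.
  • Replica shards protect against data loss and can also serve search requests, which spreads the query load.
  • A replica shard must be placed on a different server from its primary shard.

3.1.2 Type

  • An index has one default type, _doc (5.x allowed several types per index, 6.x only one).

3.1.3 Document

  • A type can hold many documents. A document is comparable to a row in a MySQL table.

3.1.4 Field

  • A document can contain many fields, comparable to the columns of a row in a MySQL table.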

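3.2 RESTful syntax of ES

3.2.1 Using the RESTful syntax

  • GET requests

    http://ip:port/index                   query index information
    http://ip:port/index/type/doc_id       query a specific document (in 7.x the type segment is _doc)

  • POST requests

    http://ip:port/index/_search           search documents; a JSON request body carries the query conditions
    http://ip:port/index/_update/doc_id    update a document; a JSON request body carries the changes

  • PUT requests

    http://ip:port/index                   create an index; the request body specifies the index settings

  • DELETE requests

    http://ip:port/index                   delete an index
    http://ip:port/index/type/doc_id       delete a specific document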
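To make the mapping between HTTP method and operation concrete, here is a minimal sketch that only builds the requests with the JDK's java.net.http API (Java 11 or later). Nothing is sent, so no running cluster is needed. The class name EsRestSketch, the base URL http://localhost:9200 and the helper names are assumptions for illustration.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Builds (but does not send) the kinds of requests listed above.
class EsRestSketch {
    // POST /{index}/_search with a JSON body that carries the query
    static HttpRequest search(String baseUrl, String index, String jsonBody) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/" + index + "/_search"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(jsonBody))
                .build();
    }

    // GET /{index}/_doc/{id} fetches one document
    static HttpRequest getDocument(String baseUrl, String index, String id) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/" + index + "/_doc/" + id))
                .GET()
                .build();
    }

    // DELETE /{index} removes the whole index
    static HttpRequest deleteIndex(String baseUrl, String index) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/" + index))
                .DELETE()
                .build();
    }

    public static void main(String[] args) {
        String base = "http://localhost:9200"; // assumed address
        HttpRequest[] requests = {
            search(base, "sms-logs-index", "{\"query\":{\"match_all\":{}}}"),
            getDocument(base, "sms-logs-index", "1"),
            deleteIndex(base, "test")
        };
        for (HttpRequest r : requests) {
            System.out.println(r.method() + " " + r.uri());
        }
    }
}
```

To actually execute one of these, pass it to HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString()).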

3.2.2 Field types available in ES

https://www.elastic.co/guide/en/elasticsearch/reference/7.7/mapping-types.html

4 Working with Elasticsearch from Java

4.1 Environment and base class

Maven dependencies:

    <!-- 1. elasticsearch -->
    <dependency>
        <groupId>org.elasticsearch</groupId>
        <artifactId>elasticsearch</artifactId>
        <version>7.7.0</version>
    </dependency>
    <!-- 2. elasticsearch high-level REST client -->
    <dependency>
        <groupId>org.elasticsearch.client</groupId>
        <artifactId>elasticsearch-rest-high-level-client</artifactId>
        <version>7.7.0</version>
    </dependency>
    <!-- 3. junit -->
    <dependency>
        <groupId>junit</groupId>
        <artifactId>junit</artifactId>
        <version>4.12</version>
    </dependency>
    <!-- 4. lombok -->
    <dependency>
        <groupId>org.projectlombok</groupId>
        <artifactId>lombok</artifactId>
        <version>1.16.22</version>
    </dependency>

Client factory:

    import org.apache.http.HttpHost;
    import org.elasticsearch.client.RestClient;
    import org.elasticsearch.client.RestClientBuilder;
    import org.elasticsearch.client.RestHighLevelClient;

    public class EsClient {
        // Builds a high-level client that talks to the ES node over HTTP
        public static RestHighLevelClient getClient() {
            HttpHost host = new HttpHost("112.124.21.177", 9200);
            RestClientBuilder builder = RestClient.builder(host);
            return new RestHighLevelClient(builder);
        }
    }

4.2 Preparing the index and documents

  • Index: sms-logs-index

4.2.1 Index

4.2.1.1 Creating an index

  • ES request

    PUT /sms-logs-index
    {
      "settings": {
        "number_of_shards": 1,
        "number_of_replicas": 1
      },
      "mappings": {
        "properties": {
          "createDate": { "type": "date", "format": "yyyy-MM-dd" },
          "sendDate":   { "type": "date", "format": "yyyy-MM-dd" },
          "longCode":   { "type": "keyword" },
          "mobile":     { "type": "keyword" },
          "corpName":   { "type": "text", "analyzer": "ik_max_word" },
          "smsContent": { "type": "text", "analyzer": "ik_max_word" },
          "state":      { "type": "integer" },
          "operatorId": { "type": "integer" },
          "province":   { "type": "keyword" },
          "ipAddr":     { "type": "ip" },
          "replyTotal": { "type": "integer" },
          "fee":        { "type": "long" }
        }
      }
    }

  • Java

    // INDEX (the index name) and client (a RestHighLevelClient, e.g. from EsClient.getClient())
    // are assumed to be fields of the enclosing test class in all Java examples.
    public void createIndex() throws Exception {
        // 1. Prepare the index settings
        Settings.Builder settings = Settings.builder()
                .put("number_of_shards", 1)
                .put("number_of_replicas", 1);
        // 2. Prepare the index mappings
        XContentBuilder mappings = JsonXContent.contentBuilder()
                .startObject()
                .startObject("properties")
                .startObject("corpName").field("type", "text").field("analyzer", "ik_max_word").endObject()
                .startObject("createDate").field("type", "date").field("format", "yyyy-MM-dd").endObject()
                .startObject("fee").field("type", "long").endObject()
                .startObject("ipAddr").field("type", "ip").endObject()
                .startObject("longCode").field("type", "keyword").endObject()
                .startObject("mobile").field("type", "keyword").endObject()
                .startObject("operatorId").field("type", "integer").endObject()
                .startObject("province").field("type", "keyword").endObject()
                .startObject("replyTotal").field("type", "integer").endObject()
                .startObject("sendDate").field("type", "date").field("format", "yyyy-MM-dd").endObject()
                .startObject("smsContent").field("type", "text").field("analyzer", "ik_max_word").endObject()
                .startObject("state").field("type", "integer").endObject()
                .endObject()
                .endObject();
        // 3. Wrap settings and mappings in a request object
        CreateIndexRequest request = new CreateIndexRequest(INDEX)
                .settings(settings)
                .mapping(mappings);
        // 4. Send the request to ES through the client
        CreateIndexResponse response = client.indices().create(request, RequestOptions.DEFAULT);
        System.out.println("response:" + response.toString());
    }

4.2.1.2 Checking whether an index exists

  • ES request

    HEAD /sms-logs-index

  • Java

    public void exists() throws IOException {
        GetIndexRequest request = new GetIndexRequest(INDEX);
        boolean exists = client.indices().exists(request, RequestOptions.DEFAULT);
        System.out.println(exists);
    }

4.2.1.3 Deleting an index

  • ES request

    DELETE /test

  • Java

    public void delete() throws IOException {
        DeleteIndexRequest request = new DeleteIndexRequest("test");
        AcknowledgedResponse delete = client.indices().delete(request, RequestOptions.DEFAULT);
        System.out.println(delete.isAcknowledged());
    }

4.2.2 Documents

4.2.2.1 Adding a document

  • ES request

    # Add a document with an auto-generated id
    POST /book/_doc
    {
      "name": "五三教輔",
      "author": "黃云輝",
      "count": 100000,
      "on-sale": "2001-01-01",
      "descr": "買我必上清華"
    }

    # Add a document with an explicit id
    PUT /book/_doc/1
    {
      "name": "紅樓夢",
      "author": "曹雪芹",
      "count": 10000000,
      "on-sale": "2501-01-01",
      "descr": "中國古代章回體長篇小說,中國古典四大名著之一,一般認為是清代作家曹雪芹所著。小說以賈、史、王、薛四大家族的興衰為背景,以富貴公子賈寶玉為視角,以賈寶玉與林黛玉、薛寶釵的愛情婚姻悲劇為主線,描繪了一批舉止見識出于須眉之上的閨閣佳人的人生百態,展現了真正的人性美和悲劇美"
    }

  • Java

    public void createDoc() throws IOException {
        Student student = new Student("2", "張三2", 22, new Date());
        // Serialize the object to JSON (JSONObject comes from fastjson)
        String s = JSONObject.toJSONString(student);
        // Types are deprecated in 7.x, so only the index and the id are given
        IndexRequest request = new IndexRequest(INDEX).id(student.getId());
        request.source(s, XContentType.JSON);
        IndexResponse response = client.index(request, RequestOptions.DEFAULT);
        System.out.println(response.toString());
    }

4.2.2.2 Bulk adding

  • ES request

    POST _bulk
    { "index" : { "_index" : "test", "_id" : "1" } }
    { "field1" : "value1" }
    { "delete" : { "_index" : "test", "_id" : "2" } }
    { "create" : { "_index" : "test", "_id" : "3" } }
    { "field1" : "value3" }
    { "update" : {"_id" : "1", "_index" : "test"} }
    { "doc" : {"field2" : "value2"} }

  • Java

    public void bulkCreateDoc() throws Exception {
        // 1. Prepare the test data
        String longCode = "1008687";
        String mobile = "18659113636";
        List<String> companies = new ArrayList<>();
        companies.add("騰訊課堂");
        companies.add("阿里旺旺");
        companies.add("海爾電器");
        companies.add("海爾智家公司");
        companies.add("格力汽車");
        companies.add("蘇寧易購");
        List<String> provinces = new ArrayList<>();
        provinces.add("北京");
        provinces.add("重慶");
        provinces.add("上海");
        provinces.add("晉城");
        // 2. Add one IndexRequest per document to the bulk request
        BulkRequest bulkRequest = new BulkRequest();
        for (int i = 1; i < 16; i++) {
            Thread.sleep(1000);
            SmsLogs s1 = new SmsLogs();
            s1.setId(i);
            s1.setCreateDate(new Date());
            s1.setSendDate(new Date());
            s1.setLongCode(longCode + i);
            s1.setMobile(mobile + 2 * i);
            s1.setCorpName(companies.get(i % 5));
            s1.setSmsContent(SmsLogs.doc.substring((i - 1) * 100, i * 100));
            s1.setState(i % 2);
            s1.setOperatorId(i % 3);
            s1.setProvince(provinces.get(i % 4));
            s1.setIpAddr("127.0.0." + i);
            s1.setReplyTotal(i * 3);
            s1.setFee(i * 6 + "");
            String json1 = JSONObject.toJSONString(s1);
            bulkRequest.add(new IndexRequest(INDEX).id(s1.getId().toString()).source(json1, XContentType.JSON));
            System.out.println("data " + i + s1.toString());
        }
        // 3. Execute the bulk request
        BulkResponse responses = client.bulk(bulkRequest, RequestOptions.DEFAULT);
        // 4. Print whether any item failed
        System.out.println("hasFailures: " + responses.hasFailures());
    }

4.2.2.3 Updating a document

  • ES request

    # Overwrite the whole document
    PUT /test/_doc/1
    {
      "name": "c++"
    }

    # Partial update based on doc
    POST /test/_update/1
    {
      "doc": {
        "name": "c++"
      }
    }

  • Java

    public void updateDoc() throws IOException {
        Map<String, Object> map = new HashMap<>(16);
        map.put("name", "李四");
        UpdateRequest request = new UpdateRequest(INDEX, "1");
        request.doc(map);
        UpdateResponse response = client.update(request, RequestOptions.DEFAULT);
        System.out.println(response.toString());
    }

4.2.2.4 Deleting a document

  • ES request

    DELETE /test/_doc/1

  • Java

    public void delDoc() throws IOException {
        DeleteRequest request = new DeleteRequest(INDEX, "1");
        DeleteResponse delete = client.delete(request, RequestOptions.DEFAULT);
        System.out.println(delete.getResult());
    }

4.2.3 Queries

4.2.3.1 term

A term query is an exact match. The search keyword is not analyzed before the search; it is matched as-is against the terms stored for the documents.

  • ES request

    POST /sms-logs-index/_search
    {
      "from": 0,
      "size": 5,
      "query": {
        "term": {
          "province": {
            "value": "北京"
          }
        }
      }
    }

  • Java

    public void searchTermDoc() throws IOException {
        SearchRequest request = new SearchRequest(INDEX);
        SearchSourceBuilder builder = new SearchSourceBuilder();
        builder.from(0);
        builder.size(5);
        builder.query(QueryBuilders.termQuery("province", "北京"));
        request.source(builder);
        SearchResponse response = client.search(request, RequestOptions.DEFAULT);
        SearchHit[] hits = response.getHits().getHits();
        for (SearchHit hit : hits) {
            Map<String, Object> source = hit.getSourceAsMap();
            System.out.println(source);
        }
    }

4.2.3.2 terms

terms works the same way as term: the search keywords are not analyzed before the search and are matched as-is against the stored terms.
terms matches one field against several values:
term:  where province = 北京
terms: where province = 北京 or province = ? (similar to IN in MySQL)
It can also be used on a text field; the keywords are still not analyzed when they are looked up.

  • ES request

    POST /sms-logs-index/_search
    {
      "query": {
        "terms": {
          "province": ["北京", "晉城"]
        }
      }
    }

  • Java

    public void searchTermsDoc() throws IOException {
        SearchRequest request = new SearchRequest(INDEX);
        SearchSourceBuilder builder = new SearchSourceBuilder();
        builder.query(QueryBuilders.termsQuery("province", "北京", "重慶", "上海"));
        request.source(builder);
        SearchResponse response = client.search(request, RequestOptions.DEFAULT);
        SearchHit[] hits = response.getHits().getHits();
        for (SearchHit hit : hits) {
            System.out.println(hit.getIndex());
            System.out.println(hit.getId());
            System.out.println(hit.getFields());
            System.out.println(hit.getSourceAsMap());
        }
    }

4.2.3.3 match_all

Returns all documents; no query condition is specified.

  • ES request

    POST /sms-logs-index/_search
    {
      "query": {
        "match_all": {}
      }
    }

  • Java

    public void searchMatchAllDoc() throws IOException {
        SearchRequest request = new SearchRequest(INDEX);
        SearchSourceBuilder builder = new SearchSourceBuilder();
        builder.query(QueryBuilders.matchAllQuery());
        request.source(builder);
        SearchResponse response = client.search(request, RequestOptions.DEFAULT);
        SearchHit[] hits = response.getHits().getHits();
        for (SearchHit hit : hits) {
            System.out.println(hit.getSourceAsMap());
        }
    }

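4.2.3.4 match

match is a higher-level query that behaves differently depending on the type of the field being queried:
If the field is a date or a number, the query string is converted to a date or a number.
If the field cannot be analyzed (keyword), match does not analyze the keyword you supply.
If the field can be analyzed (text), match analyzes the supplied text in the configured way and looks the resulting terms up in the term dictionary.
Under the hood a match query is simply several term queries whose results are combined for you.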
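The last point, that match boils down to several term lookups whose results are combined, can be sketched in plain Java. The postings map, the whitespace split standing in for the analyzer, and the class name MatchQuerySketch are assumptions for illustration; scoring is left out entirely.

```java
import java.util.*;

// Combines per-term lookups the way a match query with operator OR / AND would.
class MatchQuerySketch {
    enum Operator { OR, AND }

    // postings: term -> ids of the documents containing that term
    static Set<Integer> match(Map<String, Set<Integer>> postings, String query, Operator op) {
        Set<Integer> result = null;
        // Stand-in for the analyzer: split the query text on whitespace
        for (String term : query.trim().split("\\s+")) {
            // One "term query" per analyzed term
            Set<Integer> hits = postings.getOrDefault(term, Collections.emptySet());
            if (result == null) {
                result = new TreeSet<>(hits);
            } else if (op == Operator.AND) {
                result.retainAll(hits); // document must contain every term
            } else {
                result.addAll(hits);    // document may contain any term
            }
        }
        return result == null ? new TreeSet<>() : result;
    }

    public static void main(String[] args) {
        Map<String, Set<Integer>> postings = new HashMap<>();
        postings.put("warrior", new TreeSet<>(Arrays.asList(1, 2)));
        postings.put("team", new TreeSet<>(Arrays.asList(2, 3)));
        System.out.println(match(postings, "warrior team", Operator.OR));
        System.out.println(match(postings, "warrior team", Operator.AND));
    }
}
```

With operator OR (the default for match) a document needs only one of the terms; with AND it needs all of them.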

  • ES request

    POST /sms-logs-index/_search
    {
      "query": {
        "match": {
          "smsContent": "偉大戰士"
        }
      }
    }

    # Boolean match query: must contain both 戰士 and 團隊
    POST /sms-logs-index/_search
    {
      "query": {
        "match": {
          "smsContent": {
            "query": "戰士 團隊",
            "operator": "and"
          }
        }
      }
    }

  • Java

    public void searchMatchDoc() throws IOException {
        SearchRequest request = new SearchRequest(INDEX);
        SearchSourceBuilder builder = new SearchSourceBuilder();
        builder.query(QueryBuilders.matchQuery("smsContent", "在空地"));
        request.source(builder);
        SearchResponse response = client.search(request, RequestOptions.DEFAULT);
        SearchHit[] hits = response.getHits().getHits();
        for (SearchHit hit : hits) {
            System.out.println(hit.getSourceAsMap());
        }
    }

    // Boolean match query
    public void booleanMatchSearch() throws IOException {
        // 1. Create the request object
        SearchRequest request = new SearchRequest(INDEX);
        // 2. Build the query condition
        SearchSourceBuilder builder = new SearchSourceBuilder();
        builder.query(QueryBuilders.matchQuery("smsContent", "戰士 團隊").operator(Operator.AND));
        builder.size(20);
        request.source(builder);
        // 3. Execute the query
        SearchResponse response = client.search(request, RequestOptions.DEFAULT);
        // 4. Print the results
        for (SearchHit hit : response.getHits().getHits()) {
            System.out.println(hit.getSourceAsMap());
        }
        System.out.println(response.getHits().getHits().length);
    }

4.2.3.5 multi_match

match searches a single field; multi_match searches several fields with the same query text.

  • ES request

    POST /sms-logs-index/_search
    {
      "query": {
        "multi_match": {
          "query": "北京",
          "fields": ["province", "smsContent"]
        }
      }
    }

  • Java

    public void searchMultiMatch() throws IOException {
        SearchRequest request = new SearchRequest(INDEX);
        SearchSourceBuilder builder = new SearchSourceBuilder();
        builder.query(QueryBuilders.multiMatchQuery("北京", "province", "smsContent"));
        request.source(builder);
        SearchResponse response = client.search(request, RequestOptions.DEFAULT);
        SearchHit[] hits = response.getHits().getHits();
        for (SearchHit hit : hits) {
            System.out.println(hit.getSourceAsMap());
        }
    }

4.2.3.6 id

  • ES request

    GET /sms-logs-index/_doc/1

  • Java

    public void getById() throws IOException {
        GetRequest request = new GetRequest(INDEX, "1");
        GetResponse response = client.get(request, RequestOptions.DEFAULT);
        System.out.println(response.getSourceAsMap());
    }

4.2.3.7 ids

  • ES request

    POST /sms-logs-index/_search
    {
      "query": {
        "ids": {
          "values": ["1", "2", "3"]
        }
      }
    }

  • Java

    public void getByIds() throws IOException {
        SearchRequest request = new SearchRequest(INDEX);
        SearchSourceBuilder builder = new SearchSourceBuilder();
        builder.query(QueryBuilders.idsQuery().addIds("1", "2", "3"));
        request.source(builder);
        SearchResponse response = client.search(request, RequestOptions.DEFAULT);
        SearchHit[] hits = response.getHits().getHits();
        for (SearchHit hit : hits) {
            System.out.println(hit.getSourceAsMap());
        }
    }

4.2.3.8 prefix

Prefix query: finds documents whose field value starts with the given keyword.

  • ES request

    POST /sms-logs-index/_search
    {
      "query": {
        "prefix": {
          "province": {
            "value": "上"
          }
        }
      }
    }

  • Java

    public void searchPrefixDoc() throws IOException {
        SearchRequest request = new SearchRequest(INDEX);
        SearchSourceBuilder builder = new SearchSourceBuilder();
        builder.query(QueryBuilders.prefixQuery("province", "上"));
        request.source(builder);
        SearchResponse response = client.search(request, RequestOptions.DEFAULT);
        SearchHit[] hits = response.getHits().getHits();
        for (SearchHit hit : hits) {
            System.out.println(hit.getSourceAsMap());
        }
    }

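4.2.3.9 fuzzy

Fuzzy query: you type an approximation of the term (for example with a wrong character) and ES matches content that is close to it. The results are not stable.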
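"Close to it" is measured as edit distance: the number of single-character changes needed to turn one term into the other. The sketch below is a plain Levenshtein distance in Java, written for illustration. It is not the code ES runs (Lucene's implementation is automaton-based and by default also counts a swap of two adjacent characters as one edit); the class name EditDistanceSketch is an assumption.

```java
// Classic two-row dynamic-programming Levenshtein distance.
class EditDistanceSketch {
    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        // Distance from the empty prefix of a to each prefix of b
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(
                        Math.min(curr[j - 1] + 1, prev[j] + 1), // insertion, deletion
                        prev[j - 1] + cost);                     // substitution or match
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    public static void main(String[] args) {
        // One wrong character, as in the fuzzy example of this section
        System.out.println(levenshtein("海爾電氣", "海爾電器")); // prints 1
        System.out.println(levenshtein("kitten", "sitting"));   // prints 3
    }
}
```

A term with distance 1 from the query, like the one-character typo above, falls within the small edit distances a fuzzy query tolerates.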

  • ES request

    POST /sms-logs-index/_search
    {
      "query": {
        "fuzzy": {
          "corpName": {
            "value": "海爾電氣"
          }
        }
      }
    }

  • Java

    public void searchFuzzyDoc() throws IOException {
        SearchRequest request = new SearchRequest(INDEX);
        SearchSourceBuilder builder = new SearchSourceBuilder();
        builder.query(QueryBuilders.fuzzyQuery("corpName", "海爾電氣"));
        request.source(builder);
        SearchResponse response = client.search(request, RequestOptions.DEFAULT);
        SearchHit[] hits = response.getHits().getHits();
        for (SearchHit hit : hits) {
            System.out.println(hit.getSourceAsMap());
        }
    }

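4.2.3.10 wildcard

Wildcard query: works like LIKE in MySQL. The query string may contain the wildcard * (any number of characters) and the placeholder ? (exactly one character).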
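The meaning of the two symbols can be shown by translating a wildcard pattern into a java.util.regex pattern. This is a minimal sketch of the matching rule only; the class name WildcardSketch and its helpers are made up for the example, and ES does not evaluate wildcards this way internally.

```java
import java.util.regex.Pattern;

// Translates a wildcard pattern (* and ?) into an equivalent regular expression.
class WildcardSketch {
    static Pattern toRegex(String wildcard) {
        StringBuilder sb = new StringBuilder();
        for (char c : wildcard.toCharArray()) {
            if (c == '*') {
                sb.append(".*");          // any number of characters
            } else if (c == '?') {
                sb.append('.');           // exactly one character
            } else {
                sb.append(Pattern.quote(String.valueOf(c))); // literal character
            }
        }
        return Pattern.compile(sb.toString(), Pattern.DOTALL);
    }

    // True when the whole term matches the wildcard pattern
    static boolean matches(String wildcard, String term) {
        return toRegex(wildcard).matcher(term).matches();
    }

    public static void main(String[] args) {
        System.out.println(matches("海爾??", "海爾電器"));     // true: two characters follow
        System.out.println(matches("海爾??", "海爾智家公司")); // false: four characters follow
        System.out.println(matches("海爾*", "海爾智家公司"));  // true: * takes any length
    }
}
```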

  • ES request

    POST /sms-logs-index/_search
    {
      "query": {
        "wildcard": {
          "corpName": {
            "value": "海爾??"
          }
        }
      }
    }

  • Java

    public void searchWildCardDoc() throws IOException {
        SearchRequest request = new SearchRequest(INDEX);
        SearchSourceBuilder builder = new SearchSourceBuilder();
        builder.query(QueryBuilders.wildcardQuery("corpName", "海爾*"));
        request.source(builder);
        SearchResponse response = client.search(request, RequestOptions.DEFAULT);
        SearchHit[] hits = response.getHits().getHits();
        for (SearchHit hit : hits) {
            System.out.println(hit.getSourceAsMap());
        }
    }

4.2.3.11 range

Range query: restricts a field to values greater than or less than given bounds. It is used here on a numeric field; date and ip fields accept range queries as well.

  • ES request

    POST /sms-logs-index/_search
    {
      "query": {
        "range": {
          "fee": {
            "gte": 10,
            "lte": 20
          }
        }
      }
    }

  • Java

    public void searchRangeDoc() throws IOException {
        SearchRequest request = new SearchRequest(INDEX);
        SearchSourceBuilder builder = new SearchSourceBuilder();
        builder.query(QueryBuilders.rangeQuery("fee").gte(10).lte(20));
        request.source(builder);
        SearchResponse response = client.search(request, RequestOptions.DEFAULT);
        SearchHit[] hits = response.getHits().getHits();
        for (SearchHit hit : hits) {
            System.out.println(hit.getSourceAsMap());
        }
    }

4.2.3.12 regexp

Regexp query: matches content against a regular expression you write.

Note: prefix, fuzzy, wildcard and regexp queries are comparatively slow. Avoid them when query performance matters.

  • ES request

    POST /sms-logs-index/_search
    {
      "query": {
        "regexp": {
          "mobile": "186[0-9]{9}"
        }
      }
    }

  • Java

    public void searchRegexpDoc() throws IOException {
        SearchRequest request = new SearchRequest(INDEX);
        SearchSourceBuilder builder = new SearchSourceBuilder();
        builder.query(QueryBuilders.regexpQuery("mobile", "186[0-9]{9}"));
        request.source(builder);
        SearchResponse response = client.search(request, RequestOptions.DEFAULT);
        SearchHit[] hits = response.getHits().getHits();
        for (SearchHit hit : hits) {
            System.out.println(hit.getSourceAsMap());
        }
    }

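4.2.3.13 Deep paging with scroll

ES limits from + size: by default their sum may not exceed 10,000 (the index.max_result_window setting), because deep pages are very inefficient.

How a from + size query works:
Step 1: the keywords supplied by the user are analyzed into terms.
Step 2: the terms are looked up in the term dictionary, which yields a set of document ids.
Step 3: the data is fetched from each shard (relatively slow).
Step 4: the data is sorted by score (relatively slow).
Step 5: part of the data is discarded according to from and size.
Step 6: the result is returned.

How a scroll + size query works:
Step 1: the keywords supplied by the user are analyzed into terms.
Step 2: the terms are looked up in the term dictionary, which yields a set of document ids.
Step 3: the document ids are stored in a context.
Step 4: size documents are fetched from ES; the ids of the fetched documents are removed from the context.
Step 5: when the next page is needed, it is taken directly from the context.
Step 6: steps 4 and 5 are repeated.

scroll is not suitable for real-time queries, because it reads from the context that was captured when the scroll started.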
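The difference between the two step lists can be mimicked with ordinary collections. The sketch below is a simplified model, not ES internals: the class name ScrollSketch, the queue used as the "context", and the idea of paging over plain integer ids are all assumptions for illustration.

```java
import java.util.*;

// Models a scroll context as a queue of matching ids that is drained page by page.
class ScrollSketch {
    private final Deque<Integer> context;

    // Step 3 of scroll: keep the ids found by the query in a context
    ScrollSketch(List<Integer> matchingIds) {
        this.context = new ArrayDeque<>(matchingIds);
    }

    // Steps 4 and 5: take the next `size` ids; taken ids leave the context
    List<Integer> nextPage(int size) {
        List<Integer> page = new ArrayList<>();
        while (page.size() < size && !context.isEmpty()) {
            page.add(context.poll());
        }
        return page;
    }

    // from + size for comparison: everything up to from + size is collected,
    // then the first `from` entries are thrown away
    static List<Integer> fromSize(List<Integer> sortedIds, int from, int size) {
        int end = Math.min(sortedIds.size(), from + size);
        if (from >= end) {
            return new ArrayList<>();
        }
        return new ArrayList<>(sortedIds.subList(from, end));
    }

    public static void main(String[] args) {
        List<Integer> ids = Arrays.asList(1, 2, 3, 4, 5);
        ScrollSketch scroll = new ScrollSketch(ids);
        List<Integer> page;
        while (!(page = scroll.nextPage(2)).isEmpty()) {
            System.out.println("scroll page: " + page);
        }
        System.out.println("from=2,size=2: " + fromSize(ids, 2, 2));
    }
}
```

In this model each scroll page costs only `size` work, while a from + size page costs from + size work, which is why deep pages get slower and slower.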

  • ES request

    # scroll query: returns the first page, stores the document ids in an ES context and sets its time to live
    POST /sms-logs-index/_search?scroll=1m
    {
      "query": {
        "match_all": {}
      },
      "size": 2,
      "sort": [
        { "fee": { "order": "desc" } }
      ]
    }

    # Fetch the next page with the scroll id
    POST _search/scroll
    {
      "scroll_id": "DnF1ZXJ5VGhlbkZldGNoAwAAAAAAABbqFk04VlZ1cjlUU2t1eHpsQWNRY1YwWWcAAAAAAAAW7BZNOFZWdXI5VFNrdXh6bEFjUWNWMFlnAAAAAAAAFusWTThWVnVyOVRTa3V4emxBY1FjVjBZZw==",
      "scroll": "1m"
    }

    # Delete the scroll context
    DELETE _search/scroll/DnF1ZXJ5VGhlbkZldGNoAwAAAAAAABchFk04VlZ1cjlUU2t1eHpsQWNRY1YwWWcAAAAAAAAXIBZNOFZWdXI5VFNrdXh6bEFjUWNWMFlnAAAAAAAAFx8WTThWVnVyOVRTa3V4emxBY1FjVjBZZw==

  • Java

    public void searchScrollDoc() throws IOException {
        SearchRequest request = new SearchRequest(INDEX);
        // Keep the scroll context alive for one minute
        request.scroll(TimeValue.timeValueMinutes(1L));
        SearchSourceBuilder builder = new SearchSourceBuilder();
        builder.size(4);
        builder.sort("fee", SortOrder.DESC);
        builder.query(QueryBuilders.matchAllQuery());
        request.source(builder);
        SearchResponse response = client.search(request, RequestOptions.DEFAULT);
        String scrollId = response.getScrollId();
        System.out.println("------------- first page ---------------------");
        SearchHit[] hits = response.getHits().getHits();
        for (SearchHit hit : hits) {
            System.out.println(hit.getSourceAsMap());
        }
        while (true) {
            SearchScrollRequest request1 = new SearchScrollRequest(scrollId);
            request1.scroll(TimeValue.timeValueMinutes(1L));
            SearchResponse scroll = client.scroll(request1, RequestOptions.DEFAULT);
            // Always continue with the most recent scroll id
            scrollId = scroll.getScrollId();
            SearchHit[] hits1 = scroll.getHits().getHits();
            if (hits1 != null && hits1.length != 0) {
                System.out.println("------------- next page ---------------------");
                for (SearchHit hit : hits1) {
                    System.out.println(hit.getSourceAsMap());
                }
            } else {
                System.out.println("------------- end ---------------------");
                break;
            }
        }
        // Release the scroll context on the server
        ClearScrollRequest clearScrollRequest = new ClearScrollRequest();
        clearScrollRequest.addScrollId(scrollId);
        ClearScrollResponse clearScrollResponse = client.clearScroll(clearScrollRequest, RequestOptions.DEFAULT);
        System.out.println("scroll deleted: " + clearScrollResponse.isSucceeded());
    }

4.2.3.14 delete-by-query

Deletes large numbers of documents selected by a term, match or other query.
Note: if most of the data in the index is to be deleted, it is better to create a new index and add the documents you want to keep to it.

  • ES request

    POST /sms-logs-index/_delete_by_query
    {
      "query": {
        "range": {
          "fee": {
            "gte": 10,
            "lte": 20
          }
        }
      }
    }

  • Java

    public void deleteByQuery() throws IOException {
        DeleteByQueryRequest request = new DeleteByQueryRequest(INDEX);
        // Note: this deletes every document with fee < 50, a wider range than the REST example
        request.setQuery(QueryBuilders.rangeQuery("fee").lt(50));
        BulkByScrollResponse response = client.deleteByQuery(request, RequestOptions.DEFAULT);
        System.out.println(response.toString());
    }

4.2.3.15 bool

Compound query that combines several query conditions with boolean logic.

must: all conditions must match (AND).
must_not: none of the conditions may match (NOT).
should: at least one of the conditions should match (OR).

  • ES request

    POST /sms-logs-index/_search
    {
      "query": {
        "bool": {
          "should": [
            { "term": { "province": { "value": "晉城" } } },
            { "term": { "province": { "value": "北京" } } }
          ],
          "must_not": [
            { "term": { "operatorId": { "value": "2" } } }
          ],
          "must": [
            { "match": { "smsContent": "戰士" } },
            { "match": { "smsContent": "的" } }
          ]
        }
      }
    }

  • Java

    public void boolSearch() throws IOException {
        SearchRequest request = new SearchRequest(INDEX);
        SearchSourceBuilder builder = new SearchSourceBuilder();
        BoolQueryBuilder boolQueryBuilder = new BoolQueryBuilder();
        boolQueryBuilder.should(QueryBuilders.termQuery("province", "北京"));
        boolQueryBuilder.should(QueryBuilders.termQuery("province", "晉城"));
        boolQueryBuilder.mustNot(QueryBuilders.termQuery("operatorId", 2));
        boolQueryBuilder.must(QueryBuilders.matchQuery("smsContent", "戰士"));
        boolQueryBuilder.must(QueryBuilders.matchQuery("smsContent", "的"));
        builder.query(boolQueryBuilder);
        request.source(builder);
        SearchResponse response = client.search(request, RequestOptions.DEFAULT);
        for (SearchHit hit : response.getHits().getHits()) {
            System.out.println(hit.getSourceAsMap());
        }
    }

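4.2.3.16 boosting

A boosting query lets us influence the score of the results.
positive: only documents that match the positive query are included in the result set.
negative: if a document matches both the positive and the negative query, its score is lowered.
negative_boost: the factor applied to such documents; it must be less than 1.

How the score is calculated at query time:
The more often the search keyword occurs in a document, the higher the score.
The shorter the document content, the higher the score.
The search keyword is analyzed as well; the more of its terms are matched in the term dictionary, the higher the score.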
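The effect of negative_boost is a simple multiplication, sketched here in Java. The class and method names are hypothetical, and the input score stands for whatever relevance score the positive query produced.

```java
// Applies the boosting rule: matching the negative query scales the score down.
class BoostingSketch {
    static double score(double positiveScore, boolean matchesNegative, double negativeBoost) {
        if (negativeBoost < 0 || negativeBoost > 1) {
            throw new IllegalArgumentException("negative_boost must be between 0 and 1");
        }
        // Documents that only match the positive query keep their score
        return matchesNegative ? positiveScore * negativeBoost : positiveScore;
    }

    public static void main(String[] args) {
        System.out.println(score(2.0, false, 0.5)); // prints 2.0
        System.out.println(score(2.0, true, 0.5));  // prints 1.0
    }
}
```

A document that matches the negative query is therefore still returned, just ranked lower; use must_not in a bool query to exclude it altogether.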

  • ES request

    POST /sms-logs-index/_search
    {
      "query": {
        "boosting": {
          "positive": {
            "match": { "smsContent": "戰士" }
          },
          "negative": {
            "match": { "smsContent": "實力" }
          },
          "negative_boost": 0.5
        }
      }
    }

  • Java

    public void boostSearch() throws IOException {
        SearchRequest request = new SearchRequest(INDEX);
        SearchSourceBuilder builder = new SearchSourceBuilder();
        BoostingQueryBuilder boost = QueryBuilders.boostingQuery(
                        QueryBuilders.matchQuery("smsContent", "戰士"),   // positive
                        QueryBuilders.matchQuery("smsContent", "實力"))   // negative
                .negativeBoost(0.2f);
        builder.query(boost);
        request.source(builder);
        SearchResponse response = client.search(request, RequestOptions.DEFAULT);
        for (SearchHit hit : response.getHits().getHits()) {
            System.out.println(hit.getSourceAsMap());
        }
    }

4.2.3.17 filter

query: computes how well each document matches the conditions, produces a score and sorts by it. The results are not cached.

filter: selects documents by the conditions without computing a score, and ES caches frequently used filters.

  • ES request

    POST /sms-logs-index/_search
    {
      "query": {
        "bool": {
          "filter": [
            { "term": { "corpName": "格力汽車" } },
            { "range": { "fee": { "gte": 50 } } }
          ]
        }
      }
    }

  • Java

    public void filterSearch() throws IOException {
        SearchRequest request = new SearchRequest(INDEX);
        SearchSourceBuilder builder = new SearchSourceBuilder();
        BoolQueryBuilder boolQueryBuilder = new BoolQueryBuilder();
        boolQueryBuilder.filter(QueryBuilders.termQuery("corpName", "格力汽車"));
        boolQueryBuilder.filter(QueryBuilders.rangeQuery("fee").gte(50));
        builder.query(boolQueryBuilder);
        request.source(builder);
        SearchResponse response = client.search(request, RequestOptions.DEFAULT);
        for (SearchHit hit : response.getHits().getHits()) {
            System.out.println(hit.getSourceAsMap());
        }
    }

4.2.3.18 Highlighting

Highlighting shows the keywords the user typed in a distinct style, so the user can see why a result was returned.
The highlighted data is itself a field of the document; that field is returned separately in highlight form.
ES provides a highlight attribute at the same level as query.
fragment_size: how many characters of highlighted text to return
pre_tags: the opening tag
post_tags: the closing tag

  • ES request

    POST /sms-logs-index/_search
    {
      "query": {
        "match": { "smsContent": "戰士" }
      },
      "highlight": {
        "fields": {
          "smsContent": {}
        },
        "pre_tags": "<font color='red'>",
        "post_tags": "</font>",
        "fragment_size": 10
      }
    }

  • Java

    public void highLightSearch() throws IOException {
        SearchRequest request = new SearchRequest(INDEX);
        SearchSourceBuilder builder = new SearchSourceBuilder();
        builder.query(QueryBuilders.matchQuery("smsContent", "戰士"));
        HighlightBuilder highlightBuilder = new HighlightBuilder();
        // Highlight smsContent with a fragment size of 10 characters
        highlightBuilder.field("smsContent", 10)
                .preTags("<font color='red'>")
                .postTags("</font>");
        builder.highlighter(highlightBuilder);
        request.source(builder);
        SearchResponse response = client.search(request, RequestOptions.DEFAULT);
        for (SearchHit hit : response.getHits().getHits()) {
            System.out.println(hit.getHighlightFields().get("smsContent"));
        }
    }

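4.2.3.19 Aggregations

ES aggregations are similar to aggregate queries in MySQL but considerably more powerful; ES offers many different ways to compute statistics.

    # RESTful syntax of an ES aggregation query
    POST /index/_search
    {
      "aggs": {
        "(name)agg": {
          "agg_type": {
            "attribute": "value"
          }
        }
      }
    }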

4.2.3.20 Cardinality (distinct count) aggregation

cardinality removes duplicates from one field of the returned documents and counts how many distinct values remain.

  • ES request

    POST /sms-logs-index/_search
    {
      "aggs": {
        "provinceAggs": {
          "cardinality": {
            "field": "province"
          }
        }
      }
    }

  • Java

    public void aggCardinalityC() throws IOException {
        SearchRequest request = new SearchRequest(INDEX);
        SearchSourceBuilder builder = new SearchSourceBuilder();
        AggregationBuilder aggregationBuilder = AggregationBuilders.cardinality("provinceAggs").field("province");
        builder.aggregation(aggregationBuilder);
        request.source(builder);
        SearchResponse response = client.search(request, RequestOptions.DEFAULT);
        Aggregations aggregations = response.getAggregations();
        Cardinality provinceAggs = aggregations.get("provinceAggs");
        System.out.println(provinceAggs.getValue());
    }

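4.2.3.21 Range aggregation

Counts the documents that fall into given ranges, for example how many documents have a field value in 0~100, 100~200 and 200~300.
Range aggregations work on plain numbers, on dates and on IP addresses:
numbers: range
dates: date_range
IP addresses: ip_range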
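The bucket rule used by these aggregations is that from is inclusive and to is exclusive, so a value that sits exactly on a boundary lands in the upper bucket. A minimal Java sketch of that counting rule follows; the class name RangeAggSketch and the sample fee values are made up for illustration.

```java
import java.util.Arrays;

// Counts values per bucket; each bucket includes its lower bound and excludes its upper bound.
class RangeAggSketch {
    // bounds[i] separates bucket i from bucket i + 1, so n bounds give n + 1 buckets
    static long[] count(double[] values, double... bounds) {
        long[] counts = new long[bounds.length + 1];
        for (double v : values) {
            int bucket = 0;
            // Move up while the value has reached the next lower bound
            while (bucket < bounds.length && v >= bounds[bucket]) {
                bucket++;
            }
            counts[bucket]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        double[] fees = {6, 12, 30, 49, 50, 90}; // hypothetical fee values
        // Buckets: (.., 30), [30, 50), [50, ..)
        System.out.println(Arrays.toString(count(fees, 30, 50))); // prints [2, 2, 2]
    }
}
```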

  • ES request

    # Numeric range aggregation: from is inclusive, to is exclusive
    POST /sms-logs-index/_search
    {
      "aggs": {
        "rangAggs": {
          "range": {
            "field": "fee",
            "ranges": [
              { "to": 30 },
              { "from": 30, "to": 50 },
              { "from": 50 }
            ]
          }
        }
      }
    }

    # Date range aggregation
    POST /sms-logs-index/_search
    {
      "aggs": {
        "rangAggs": {
          "date_range": {
            "field": "sendDate",
            "format": "yyyy-MM-dd",
            "ranges": [
              { "to": "2020-08-25" },
              { "from": "2020-08-25", "to": "2021-08-25" },
              { "from": "2021-08-25" }
            ]
          }
        }
      }
    }

    # IP range aggregation
    POST /sms-logs-index/_search
    {
      "aggs": {
        "agg": {
          "ip_range": {
            "field": "ipAddr",
            "ranges": [
              { "to": "127.0.0.8" },
              { "from": "127.0.0.8" }
            ]
          }
        }
      }
    }

  • Java

    public void aggRangeC() throws IOException {
        SearchRequest request = new SearchRequest(INDEX);
        SearchSourceBuilder builder = new SearchSourceBuilder();
        AggregationBuilder aggregationBuilder = AggregationBuilders.range("feeAggs")
                .field("fee")
                .addUnboundedTo(30)
                .addRange(30, 60)
                .addUnboundedFrom(60);
        builder.aggregation(aggregationBuilder);
        request.source(builder);
        SearchResponse response = client.search(request, RequestOptions.DEFAULT);
        Aggregations aggregations = response.getAggregations();
        Range feeAggs = aggregations.get("feeAggs");
        // Print the number of documents in each bucket
        for (Range.Bucket bucket : feeAggs.getBuckets()) {
            System.out.println(bucket.getDocCount());
        }
    }

4.2.3.22 Statistics aggregation

  • ES request

    POST /sms-logs-index/_search
    {
      "aggs": {
        "agg": {
          "extended_stats": {
            "field": "fee"
          }
        }
      }
    }

  • Java

    public void aggExtendedStatsC() throws IOException {
        SearchRequest request = new SearchRequest(INDEX);
        SearchSourceBuilder builder = new SearchSourceBuilder();
        AggregationBuilder aggregationBuilder = AggregationBuilders.extendedStats("feeAggs").field("fee");
        builder.aggregation(aggregationBuilder);
        request.source(builder);
        SearchResponse response = client.search(request, RequestOptions.DEFAULT);
        Aggregations aggregations = response.getAggregations();
        ExtendedStats feeAggs = aggregations.get("feeAggs");
        System.out.println("max: " + feeAggs.getMax() + " avg: " + feeAggs.getAvg());
    }

4.2.4 Geo (latitude/longitude) search

4.2.4.1 Preparing the data

    PUT /map
    {
      "settings": {
        "number_of_shards": 1,
        "number_of_replicas": 1
      },
      "mappings": {
        "properties": {
          "name": { "type": "text" },
          "location": { "type": "geo_point" }
        }
      }
    }

    PUT /map/_doc/1
    {
      "name": "天安門",
      "location": { "lon": 116.403694, "lat": 39.914492 }
    }

    PUT /map/_doc/2
    {
      "name": "百望山",
      "location": { "lon": 116.26284, "lat": 40.036576 }
    }

    PUT /map/_doc/3
    {
      "name": "北京動物園",
      "location": { "lon": 116.347352, "lat": 39.947468 }
    }

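4.2.4.2 Geo query types in ES

geo_distance: search by straight-line distance from a point.
geo_bounding_box: two points define a rectangle; returns the data inside the rectangle.
geo_polygon: several points define a polygon; returns all data inside the polygon.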
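For geo_distance, "straight-line distance" on the earth's surface means great-circle distance. The sketch below computes it with the haversine formula in plain Java. It is an illustration only: the class name GeoDistanceSketch and the mean earth radius of 6,371,000 m are assumptions, and ES may use a slightly different earth model.

```java
// Great-circle distance between two latitude/longitude points (haversine formula).
class GeoDistanceSketch {
    static final double EARTH_RADIUS_M = 6_371_000; // assumed mean earth radius in metres

    static double haversineMeters(double lat1, double lon1, double lat2, double lon2) {
        double dLat = Math.toRadians(lat2 - lat1);
        double dLon = Math.toRadians(lon2 - lon1);
        double a = Math.sin(dLat / 2) * Math.sin(dLat / 2)
                + Math.cos(Math.toRadians(lat1)) * Math.cos(Math.toRadians(lat2))
                * Math.sin(dLon / 2) * Math.sin(dLon / 2);
        return 2 * EARTH_RADIUS_M * Math.asin(Math.sqrt(a));
    }

    public static void main(String[] args) {
        // Query point (lat 39.909816, lon 116.43438) and the 天安門 document from the sample data
        double d = haversineMeters(39.909816, 116.43438, 39.914492, 116.403694);
        System.out.printf("distance: %.0f m%n", d);
        // geo_distance keeps a document when its distance is within the given radius
        System.out.println("within 2700 m: " + (d <= 2700));
    }
}
```

Under these assumptions the 天安門 point comes out a little under 2,700 m from the query point, so it falls inside the radius used in the geo_distance request of this section.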

  • ES request

    # geo_distance
    POST /map/_search
    {
      "query": {
        "geo_distance": {
          "location": { "lon": 116.43438, "lat": 39.909816 },
          "distance": 2700,
          "distance_type": "arc"
        }
      }
    }

    # geo_bounding_box
    POST /map/_search
    {
      "query": {
        "geo_bounding_box": {
          "location": {
            "top_left": { "lon": 116.278722, "lat": 40.005937 },
            "bottom_right": { "lon": 116.433661, "lat": 39.909705 }
          }
        }
      }
    }

    # geo_polygon
    POST /map/_search
    {
      "query": {
        "geo_polygon": {
          "location": {
            "points": [
              { "lon": 116.220296, "lat": 40.075013 },
              { "lon": 116.346777, "lat": 40.044751 },
              { "lon": 116.236106, "lat": 39.981533 }
            ]
          }
        }
      }
    }

  • Java

    // geo_distance
    public void geoDistance() throws IOException {
        SearchRequest request = new SearchRequest("map");
        SearchSourceBuilder builder = new SearchSourceBuilder();
        GeoDistanceQueryBuilder location = QueryBuilders.geoDistanceQuery("location");
        location.distance("3000");               // metres
        location.point(39.909816, 116.43438);    // lat, lon
        builder.query(location);
        request.source(builder);
        SearchResponse search = client.search(request, RequestOptions.DEFAULT);
        for (SearchHit hit : search.getHits().getHits()) {
            System.out.println(hit.getSourceAsMap());
        }
    }

    // geo_bounding_box
    public void geoBoundingBox() throws IOException {
        SearchRequest request = new SearchRequest("map");
        SearchSourceBuilder builder = new SearchSourceBuilder();
        GeoBoundingBoxQueryBuilder location = QueryBuilders.geoBoundingBoxQuery("location");
        location.topLeft().reset(40.005937, 116.278722);      // lat, lon
        location.bottomRight().reset(39.909705, 116.433661);  // lat, lon
        builder.query(location);
        request.source(builder);
        SearchResponse search = client.search(request, RequestOptions.DEFAULT);
        for (SearchHit hit : search.getHits().getHits()) {
            System.out.println(hit.getSourceAsMap());
        }
    }

    // geo_polygon
    public void geoPolygon() throws IOException {
        SearchRequest request = new SearchRequest("map");
        SearchSourceBuilder builder = new SearchSourceBuilder();
        List<GeoPoint> points = new ArrayList<>();
        points.add(new GeoPoint(40.075013, 116.220296));
        points.add(new GeoPoint(40.044751, 116.346777));
        points.add(new GeoPoint(39.981533, 116.236106));
        builder.query(QueryBuilders.geoPolygonQuery("location", points));
        request.source(builder);
        SearchResponse response = client.search(request, RequestOptions.DEFAULT);
        for (SearchHit hit : response.getHits().getHits()) {
            System.out.println(hit.getSourceAsMap());
        }
    }

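5 Elasticsearch cluster

The two most important cluster settings are node.name and network.host; both must differ on every node. node.name is the name of the node and mainly serves to tell the nodes apart in Elasticsearch's own logs.
discovery.zen.ping.unicast.hosts lists the nodes of the cluster, either as IP addresses or as host names (which must be resolvable). In 7.x the discovery.zen.* settings are deprecated: discovery.seed_hosts takes the place of discovery.zen.ping.unicast.hosts, and discovery.zen.minimum_master_nodes is ignored, with cluster.initial_master_nodes used to bootstrap a new cluster instead.

    # elasticsearch.yml
    cluster.name: my-els                  # cluster name
    node.name: els-node1                  # node name; descriptive only, used to tell nodes apart in the logs
    #node.master: true                    # whether the node takes part in master election
    #node.data: true                      # whether the node stores data
    path.data: /opt/elasticsearch/data    # where the data is stored
    path.logs: /opt/elasticsearch/log     # where the logs are stored
    network.host: 192.168.60.201          # IP address of this node
    http.port: 9200                       # port for client requests; 9300 is the cluster transport port
    # add the following
    # cluster transport port
    transport.tcp.port: 9300
    transport.tcp.compress: true
    # IP addresses of the cluster nodes; names such as els or els.shuaiguoxia.com also work
    # if every node can resolve them. A distributed cluster should have an odd number of nodes.
    discovery.zen.ping.unicast.hosts: ["192.168.60.201", "192.168.60.202","192.168.60.203"]
    # Minimum number of nodes for a master election. Always set it to N/2+1, where N is the
    # number of master-eligible nodes, not the total number of nodes in the cluster.
    discovery.zen.minimum_master_nodes: 2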
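The N/2+1 rule from the last comment uses integer division. A tiny Java sketch makes the arithmetic explicit; the class and method names are hypothetical.

```java
// Quorum size for master election: more than half of the master-eligible nodes.
class QuorumSketch {
    static int minimumMasterNodes(int masterEligibleNodes) {
        if (masterEligibleNodes < 1) {
            throw new IllegalArgumentException("need at least one master-eligible node");
        }
        return masterEligibleNodes / 2 + 1; // integer division rounds down
    }

    public static void main(String[] args) {
        for (int n = 1; n <= 5; n++) {
            System.out.println(n + " master-eligible nodes -> quorum " + minimumMasterNodes(n));
        }
    }
}
```

For the three nodes configured above the result is 2, which matches the value in the file. Four nodes need a quorum of 3 and so tolerate only one failure, the same as three nodes, which is one reason an odd node count is recommended.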
