Elasticsearch: Searching and Retrieving Data

Published: 2024/1/17

Searching for "elasticsearch" in the group type

curl -XGET "localhost:9200/get-together/group/_search?q=elasticsearch&fields=name,location&size=1&pretty"

The URL says where to search: in the group type of the get-together index.
The URI parameters give the details of the search: find documents containing "elasticsearch", but return only the name and location fields of the top-ranked result.
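
As a rough illustration of how the URI parameters map onto the request, the following Python sketch assembles the same URL with the standard library. The helper name build_search_url is hypothetical and not part of any Elasticsearch client; it only builds a string and sends nothing.

```python
from urllib.parse import urlencode


def build_search_url(host, index, doc_type, **params):
    """Assemble a URI search URL (hypothetical helper, for illustration only)."""
    # Path: /<index>/<type>/_search, skipping any empty part.
    path = "/".join(part for part in (index, doc_type, "_search") if part)
    # safe="," keeps the comma-separated field list readable in the URL.
    query = urlencode(params, safe=",")
    return f"http://{host}/{path}?{query}"


url = build_search_url(
    "localhost:9200", "get-together", "group",
    q="elasticsearch",        # what to search for
    fields="name,location",   # which fields to return
    size=1,                   # how many results to return
    pretty="true",            # pretty=true behaves like the bare pretty flag
)
print(url)
```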

1. Where to search

You can tell ES to search in a specific type of a specific index, but you can also search in multiple types of the same index, in multiple indices, or in all indices.
(1). To search in multiple types, use a comma-separated list. For example, to search in both the group and event types:

curl "localhost:9200/get-together/group,event/_search?q=elasticsearch&pretty"

(2). To search in all types of an index, send the request to the _search endpoint of the index URL:

curl "localhost:9200/get-together/_search?q=elasticsearch&pretty"

(3). As with types, to search in multiple indices, separate them with commas:

curl "localhost:9200/get-together,other_index/_search?q=elasticsearch&pretty"

This particular request fails if other_index has not been created beforehand. To ignore missing indices, add the ignore_unavailable flag in the same way as the pretty flag:

curl "localhost:9200/get-together,other_index/_search?q=elasticsearch&ignore_unavailable&pretty"

(4). To search in all indices, omit the index name altogether:

curl "localhost:9200/_search?q=elasticsearch&ignore_unavailable&pretty"

You can also use the _all placeholder as the index name to search in all indices. This comes in handy when you need to search in a single type across all indices, for example:

curl "localhost:9200/_all/event/_search?q=elasticsearch&ignore_unavailable&pretty"
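
The four cases above follow one pattern, which the Python sketch below tries to capture. The function search_path is a hypothetical helper written for this article, not an Elasticsearch API; it only derives the request path from optional lists of indices and types.

```python
def search_path(indices=None, types=None):
    """Build the _search path for the given indices and types (illustrative)."""
    if indices:
        index_part = ",".join(indices)
    elif types:
        # A type without an index needs the _all placeholder.
        index_part = "_all"
    else:
        index_part = ""
    type_part = ",".join(types) if types else ""
    parts = [p for p in (index_part, type_part, "_search") if p]
    return "/" + "/".join(parts)


print(search_path(["get-together"], ["group", "event"]))  # case (1)
print(search_path(["get-together"]))                      # case (2)
print(search_path(["get-together", "other_index"]))       # case (3)
print(search_path())                                      # case (4)
print(search_path(types=["event"]))                       # _all placeholder
```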

2. Contents of the reply

Besides the documents that match the search criteria, the search reply contains other valuable information for checking the performance of the search or the relevance of the results.

curl "localhost:9200/get-together/group/_search?q=Test&ignore_unavailable&pretty"
{
  "took" : 24,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  },
  "hits" : {
    "total" : 1,
    "max_score" : 0.15342641,
    "hits" : [ {
      "_index" : "get-together",
      "_type" : "group",
      "_id" : "1",
      "_score" : 0.15342641,
      "_source" : {
        "name" : "ES Test",
        "organizer" : "Feng"
      }
    } ]
  }
}

Breaking down the reply

"took" : 24,
"timed_out" : false,
How long the request took and whether it timed out

"total" : 5,
"successful" : 5,
"failed" : 0
How many shards were queried

"total" : 1,
"max_score" : 0.15342641,
"hits" : [ {
Statistics on all matching documents

"hits" : [ {
  "_index" : "get-together",
  "_type" : "group",
  "_id" : "1",
  "_score" : 0.15342641,
  "_source" : {
    "name" : "ES Test",
    "organizer" : "Feng"
  }
} ]
The results array

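(1). Time

"took" : 24, "timed_out" : false,

The took field tells you how long ES took to process the request, in milliseconds, and the timed_out field shows whether the search timed out. By default, searches never time out, but you can set a limit with the timeout parameter. For example, to set a 3-second timeout:

curl "localhost:9200/get-together/group/_search?q=Test&ignore_unavailable&pretty&timeout=3s"

If the search times out, the value of timed_out is true and you get only the results gathered before the timeout.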

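(2). Shards

"_shards" : { "total" : 5, "successful" : 5, "failed" : 0 },

The search ran against an index with 5 shards and all of them replied, so successful is 5 and failed is 0.
When a node goes down and a shard cannot reply to the search request, ES returns the results from the healthy shards and reports the number of unsearchable shards in the failed field.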
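
Because a timed-out or partially failed search still returns results, a client may want to check these fields before trusting the hits. The sketch below is one possible way to do that in Python; search_warnings is a hypothetical helper, and the degraded reply is made-up sample data shaped like the reply above.

```python
def search_warnings(reply):
    """Return human-readable warnings for an incomplete search reply."""
    warnings = []
    if reply["timed_out"]:
        warnings.append("search timed out; results are partial")
    shards = reply["_shards"]
    if shards["failed"] > 0:
        warnings.append(f'{shards["failed"]} of {shards["total"]} shards failed')
    return warnings


# The healthy reply from the article: nothing to report.
healthy = {
    "took": 24,
    "timed_out": False,
    "_shards": {"total": 5, "successful": 5, "failed": 0},
}
# Made-up example of a degraded reply (one shard down, timeout hit).
degraded = {
    "took": 3000,
    "timed_out": True,
    "_shards": {"total": 5, "successful": 4, "failed": 1},
}

print(search_warnings(healthy))
print(search_warnings(degraded))
```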

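(3). Hit statistics

The last element of the reply is hits, which contains the array of matching documents. Before that array come a few statistics:

"total" : 1, "max_score" : 0.15342641,

total is the total number of matching documents, and max_score is the highest score among them.
The score of a document returned by a search measures how relevant that document is to the given search criteria. In the ES releases whose syntax this article uses, the score is calculated by default with the TF-IDF (term frequency-inverse document frequency) algorithm; ES 5.0 and later default to BM25. Term frequency means that the more often a search term appears in a document, the higher that document scores. Inverse document frequency means that the fewer documents in the whole collection contain the term, the higher the score, because the term is then considered more relevant to the documents that do contain it. A term that appears in many other documents is probably a common word and carries less relevance.
The total number of matching documents and the number of documents in the reply may differ, because ES returns 10 documents by default. Use the size parameter to change the number of results returned.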
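
To make the two ideas concrete, here is a small Python sketch of the textbook TF-IDF formula (raw term count multiplied by the logarithm of the inverse document frequency). It is a simplification for illustration, not the scoring function Lucene actually uses, which adds normalization factors. The three tokenized documents are made-up sample data.

```python
import math


def tf_idf(term, doc_tokens, corpus):
    """Textbook TF-IDF: term count in the document times log(N / df)."""
    tf = doc_tokens.count(term)                      # term frequency
    df = sum(1 for doc in corpus if term in doc)     # document frequency
    if tf == 0 or df == 0:
        return 0.0
    idf = math.log(len(corpus) / df)                 # rarer terms weigh more
    return tf * idf


# Made-up corpus: "es" appears everywhere, "test" in one document only.
corpus = [
    ["es", "test"],
    ["es", "meetup"],
    ["es", "es", "search"],
]

common = tf_idf("es", corpus[2], corpus)    # frequent in the doc, but in every doc
rare = tf_idf("test", corpus[0], corpus)    # appears once, in a single doc
print(common, rare)
```

Even though "es" occurs twice in the third document, it appears in every document, so its inverse document frequency is log(1) = 0 and the rare term "test" ends up with the higher score.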

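(4). Result documents

"hits" : [ {
  "_index" : "get-together",
  "_type" : "group",
  "_id" : "1",
  "_score" : 0.15342641,
  "_source" : {
    "name" : "ES Test",
    "organizer" : "Feng"
  }
} ]

Each matching document is shown with the index and type it belongs to, its ID, and its score. If the query does not specify which fields to return through fields, the _source field is shown. Like _all, _source is a special field; by default ES stores the original JSON document in it.
A query that specifies fields (the sample document has no location field, so only name comes back):

curl "localhost:9200/get-together/group/_search?q=Test&fields=name,location&ignore_unavailable&pretty"
"hits" : [ {
  "_index" : "get-together",
  "_type" : "group",
  "_id" : "1",
  "_score" : 0.15342641,
  "fields" : {
    "name" : [ "ES Test" ]
  }
} ]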
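
The two hit shapes differ in one detail that is easy to miss: values under fields come back as arrays, while _source holds the original document. A client that has to cope with both could normalize them along these lines; hit_values is a hypothetical helper, and taking the first array element is an assumption that suits single-valued fields only.

```python
def hit_values(hit, names):
    """Extract the requested field values from a hit of either shape."""
    if "fields" in hit:
        # Values under "fields" are arrays; take the first element
        # (assumption: the fields are single-valued).
        return {n: hit["fields"][n][0] for n in names if n in hit["fields"]}
    source = hit.get("_source", {})
    return {n: source[n] for n in names if n in source}


source_hit = {
    "_index": "get-together", "_type": "group", "_id": "1",
    "_score": 0.15342641,
    "_source": {"name": "ES Test", "organizer": "Feng"},
}
fields_hit = {
    "_index": "get-together", "_type": "group", "_id": "1",
    "_score": 0.15342641,
    "fields": {"name": ["ES Test"]},
}

print(hit_values(source_hit, ["name", "location"]))
print(hit_values(fields_hit, ["name", "location"]))
```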

3. How to search

ES lets you specify all search criteria in JSON. As searches become more complex, JSON is easier to read and write and offers more functionality.

curl 'localhost:9200/get-together/group/_search?pretty' -d '{
  "query": {
    "query_string": {
      "query": "Test"
    }
  }
}'

This runs a query of type query_string whose string is Test.

(1). Setting query string options

By default, ES searches the _all field. To search in the group name instead, specify:

"default_field": "name"

Also by default, ES returns documents matching any of the specified terms (the default operator is OR). To match all of the terms, specify:

"default_operator": "AND"

The modified query:

curl 'localhost:9200/get-together/group/_search?pretty' -d '{
  "query": {
    "query_string": {
      "query": "ES san francisco",
      "default_field": "name",
      "default_operator": "AND"
    }
  }
}'

Another way to get the same results is to specify the field and operator in the query string itself:

"query": "name:ES AND name:san AND name:francisco"
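
When the request body is assembled in a program rather than typed by hand, building it as a dictionary and serializing it avoids quoting mistakes. The Python sketch below shows one way to do that for the options discussed here; query_string_body is a hypothetical helper name, and only the query, default_field, and default_operator options from this section are covered.

```python
import json


def query_string_body(query, default_field=None, default_operator=None):
    """Build a query_string request body with the optional settings."""
    options = {"query": query}
    if default_field is not None:
        options["default_field"] = default_field        # instead of _all
    if default_operator is not None:
        options["default_operator"] = default_operator  # "AND" or "OR"
    return {"query": {"query_string": options}}


body = query_string_body("ES san francisco", "name", "AND")
# json.dumps always emits valid JSON with straight double quotes.
print(json.dumps(body, indent=2))
```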

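(2). Choosing the right query type

If you are looking for the single word "Test" in the name field, a term query is probably faster and more direct. A term query is not analyzed, so the value must match the indexed term exactly; assuming name uses the default analyzer, which lowercases tokens, the term to look for is test:

curl 'localhost:9200/get-together/group/_search?pretty' -d '{
  "query": {
    "term": {
      "name": "test"
    }
  }
}'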

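(3). Using filters

If you are not interested in the score, you can run a filtered query instead. A filter only cares whether a result matches the search criteria, so compared with the equivalent query it is faster and easier to cache. As with the term query, the term filter is not analyzed, so the value is given in lowercase on the assumption that name uses the default analyzer.

curl 'localhost:9200/get-together/group/_search?pretty' -d '{
  "query": {
    "filtered": {
      "filter": {
        "term": {
          "name": "test"
        }
      }
    }
  }
}'

The results are the same as for the equivalent term query, but they are not sorted by score (because every result has a score of 1.0).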
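
The filtered query belongs to the older request syntax used throughout this article; it was deprecated in Elasticsearch 2.0 and removed in 5.0, where a bool query with a filter clause plays the same role. The Python sketch below builds both forms for comparison. filter_only_body and its legacy switch are hypothetical names chosen for this example, and which form applies depends on the version of the cluster.

```python
import json


def filter_only_body(field, value, legacy=True):
    """Build a term-filter-only request body in either syntax."""
    term_filter = {"term": {field: value}}
    if legacy:
        # Syntax used in this article (filtered query).
        return {"query": {"filtered": {"filter": term_filter}}}
    # Replacement syntax: bool query with a filter clause.
    return {"query": {"bool": {"filter": term_filter}}}


old_style = filter_only_body("name", "test")
new_style = filter_only_body("name", "test", legacy=False)
print(json.dumps(old_style))
print(json.dumps(new_style))
```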

(4). Applying aggregations

Besides queries and filters, you can compute all kinds of statistics through aggregations. One example is the terms aggregation, which shows a count for each term that appears in the specified field.

curl 'localhost:9200/get-together/group/_search?pretty' -d '{
  "aggregations": {
    "organizers": {
      "terms": { "field": "organizer" }
    }
  }
}'

The aggregation request reads: give me an aggregation named organizers, of type terms, that looks at the organizer field.

{
  "took" : 25,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  },
  "hits" : {
    "total" : 2,
    "max_score" : 1.0,
    "hits" : [ {
      "_index" : "get-together",
      "_type" : "group",
      "_id" : "2",
      "_score" : 1.0,
      "_source" : {
        "name" : "Din",
        "organizer" : "DinDin"
      }
    }, {
      "_index" : "get-together",
      "_type" : "group",
      "_id" : "1",
      "_score" : 1.0,
      "_source" : {
        "name" : "ES Test",
        "organizer" : "Feng"
      }
    } ]
  },
  "aggregations" : {
    "organizers" : {
      "doc_count_error_upper_bound" : 0,
      "sum_other_doc_count" : 0,
      "buckets" : [ {
        "key" : "dindin",
        "doc_count" : 1
      }, {
        "key" : "feng",
        "doc_count" : 1
      } ]
    }
  }
}

The results show that "feng" appears once and "dindin" appears once. The keys are lowercase because the aggregation counts the analyzed terms stored in the index, not the original values in _source.
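
Reading the counts out of the reply is a matter of walking to the buckets array under the aggregation's name. A minimal Python sketch, using only the aggregations part of the reply shown above; bucket_counts is a hypothetical helper name.

```python
def bucket_counts(reply, agg_name):
    """Map each bucket key of a terms aggregation to its document count."""
    buckets = reply["aggregations"][agg_name]["buckets"]
    return {bucket["key"]: bucket["doc_count"] for bucket in buckets}


# The aggregations section of the reply shown in the article.
reply = {
    "aggregations": {
        "organizers": {
            "doc_count_error_upper_bound": 0,
            "sum_other_doc_count": 0,
            "buckets": [
                {"key": "dindin", "doc_count": 1},
                {"key": "feng", "doc_count": 1},
            ],
        }
    }
}

counts = bucket_counts(reply, "organizers")
print(counts)
```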

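4. Getting documents by ID

To retrieve a specific document, you must know the index and type it belongs to, as well as its ID.

curl 'localhost:9200/get-together/group/1?pretty'
{
  "_index" : "get-together",
  "_type" : "group",
  "_id" : "1",
  "_version" : 1,
  "found" : true,
  "_source" : {
    "name" : "ES Test",
    "organizer" : "Feng"
  }
}

The reply includes the index, type, and ID you specified. If the document exists, the found field is true and the reply also carries its version and source. If the document does not exist, found is false.
Getting a document by ID is faster than searching and costs fewer resources. It also happens in real time: as soon as an indexing operation completes, the new document can be retrieved through the GET API. Searches, by contrast, are near-real-time, because they have to wait for the refresh, which by default runs once per second.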
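
A client reading such a reply only needs to branch on found. The Python sketch below shows that check; document_source is a hypothetical helper, and the not-found reply is an assumed shape based on the description above (same identifying fields, found set to false, no _source).

```python
def document_source(reply):
    """Return the stored document, or None when the ID does not exist."""
    if reply.get("found"):
        return reply["_source"]
    return None


# The reply from the article.
found_reply = {
    "_index": "get-together", "_type": "group", "_id": "1",
    "_version": 1, "found": True,
    "_source": {"name": "ES Test", "organizer": "Feng"},
}
# Assumed shape of a reply for a missing document.
missing_reply = {
    "_index": "get-together", "_type": "group", "_id": "doesnt-exist",
    "found": False,
}

print(document_source(found_reply))
print(document_source(missing_reply))
```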

Reposted from: https://www.cnblogs.com/EnzoDin/p/11000967.html
