10.completion_suggester

Published: 2024/2/28

Contents

- 1. Completion Suggester overview
- 2. Indexing documents
- 2. Querying
- 3. Skipping duplicate suggestions
- 4. Fuzzy queries
- 4. Regex queries

1. Completion Suggester overview

To understand the format of suggestions, read the Suggesters page first. For more flexible search-as-you-type searches that do not use suggesters, see the search_as_you_type field type.

The completion suggester provides auto-complete/search-as-you-type functionality. This is a navigational feature: it proposes completions that guide users to relevant results as they type, improving search precision. It is not meant for spell correction or did-you-mean functionality like the term or phrase suggesters.

Ideally, auto-complete should be as fast as the user types, giving instant feedback relevant to what has already been typed. The completion suggester is therefore optimized for speed: it uses data structures that enable fast lookups, but these are costly to build and are stored in memory.
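The trade-off described above (an expensive build step in exchange for fast in-memory prefix lookups) can be illustrated with a small sketch. This is not the data structure Elasticsearch uses; it is a stand-in based on a sorted list and binary search, and the names build_index and complete, as well as the sample titles, are made up for illustration.

```python
from bisect import bisect_left


def build_index(inputs):
    """Build step: lowercase and sort all inputs once (the costly part)."""
    return sorted(text.lower() for text in inputs)


def complete(index, prefix, size=5):
    """Lookup step: binary-search the first candidate, then scan forward."""
    prefix = prefix.lower()
    position = bisect_left(index, prefix)
    results = []
    # All entries sharing the prefix are adjacent in the sorted list.
    while position < len(index) and len(results) < size:
        entry = index[position]
        if not entry.startswith(prefix):
            break
        results.append(entry)
        position += 1
    return results


# Illustrative data only.
index = build_index(["Nevermind", "Nirvana", "Nine Inch Nails"])
print(complete(index, "ni"))
```

The lookup touches only the entries that share the prefix, which is why this kind of structure can keep up with typing speed once it has been built.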

Mapping

To use this feature, specify a special mapping for the field, which indexes the field values for fast completions.

PUT music
{
  "mappings": {
    "properties": {
      "suggest": { "type": "completion" },
      "title": { "type": "keyword" }
    }
  }
}

Mapping supports the following parameters:

1.analyzer: the index analyzer to use. Defaults to simple.

2.search_analyzer: the search analyzer to use. Defaults to the value of analyzer.

3.preserve_separators: preserves the separators. Defaults to true. If disabled, suggesting for foof can find a field starting with Foo Fighters.

4.preserve_position_increments: enables position increments. Defaults to true. If disabled and an analyzer that removes stopwords is used, suggesting for b can return a field starting with The Beatles. Note: if you are able to enrich your data, you can achieve the same result by indexing two inputs, Beatles and The Beatles, without moving away from the simple analyzer.

5.max_input_length: limits the length of a single input. Defaults to 50 UTF-16 code points. This limit is applied only at index time, to reduce the total number of characters per input string and prevent massive inputs from bloating the underlying data structure. Most use cases are unaffected by the default, since prefix completions seldom grow beyond a handful of characters.
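As a rough illustration of how these parameters fit together, the sketch below assembles a mapping body as a plain Python dict and prints it as JSON. The helper name completion_mapping is hypothetical; only the parameter names and defaults come from the list above, and the result would still have to be sent to the cluster with whatever client you use.

```python
import json


def completion_mapping(field="suggest", analyzer="simple", search_analyzer=None,
                       preserve_separators=True,
                       preserve_position_increments=True,
                       max_input_length=50):
    """Build a mapping body for one completion field (hypothetical helper)."""
    properties = {
        "type": "completion",
        "analyzer": analyzer,
        # search_analyzer falls back to the index analyzer when not given.
        "search_analyzer": search_analyzer or analyzer,
        "preserve_separators": preserve_separators,
        "preserve_position_increments": preserve_position_increments,
        "max_input_length": max_input_length,
    }
    return {"mappings": {"properties": {field: properties}}}


# Body for something like: PUT music
print(json.dumps(completion_mapping(preserve_separators=False), indent=2))
```

Spelling out the defaults makes the mapping longer than necessary, but it shows at a glance which behavior a given index relies on.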

2. Indexing documents

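Suggestions are indexed like any other document. Note that the suggest field in the examples below is not a special field: suggest is simply the field name defined in the mapping, and any other name would work.
When indexing, a suggestion can carry parameters such as input and weight.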

PUT music/_doc/1?refresh
{
  "suggest": {
    "input": [ "Nevermind", "Nirvana" ],
    "weight": 34
  }
}

The following parameters are supported:

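1.input: the input to store. This can be an array of strings or a single string. This field is mandatory.
This value cannot contain the following UTF-16 control characters: \u0000 (null), \u001f (information separator one), \u001e (information separator two).

2.weight: a positive integer, or a string containing a positive integer, that defines a weight and lets you rank your suggestions. This field is optional.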

To index several suggestions for one document, each with its own weight, use an array:

PUT music/_doc/1?refresh
{
  "suggest": [
    { "input": "Nevermind", "weight": 10 },
    { "input": "Nirvana", "weight": 3 }
  ]
}

Or use the shorthand form, in which no weight can be specified:

PUT music/_doc/1?refresh
{
  "suggest": [ "Nevermind", "Nirvana" ]
}
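The accepted shapes of the suggest value (one object, an array of objects, or plain strings) can be brought into a single form before indexing. The following is a client-side sketch, not part of Elasticsearch: normalize_suggest and its helpers are hypothetical names. It checks the reserved control characters listed above and, as an assumption of this sketch, rejects only negative weights.

```python
RESERVED = ("\u0000", "\u001e", "\u001f")  # control characters not allowed in input


def _check_input(text):
    if any(ch in text for ch in RESERVED):
        raise ValueError("input contains a reserved control character")
    return text


def _weight(value):
    """Accept an int or a string holding an int; None means 'not set'."""
    if value is None:
        return None
    if isinstance(value, bool) or not isinstance(value, (int, str)):
        raise TypeError("weight must be an integer or a string containing one")
    weight = int(value)
    if weight < 0:  # assumption: this sketch only rejects negative weights
        raise ValueError("weight must not be negative")
    return weight


def normalize_suggest(value):
    """Turn any accepted suggest value into a list of {input: [...], weight}."""
    items = value if isinstance(value, list) else [value]
    normalized = []
    for item in items:
        if isinstance(item, str):  # shorthand form: a bare string, no weight
            item = {"input": item}
        inputs = item["input"]
        if isinstance(inputs, str):
            inputs = [inputs]
        normalized.append({
            "input": [_check_input(text) for text in inputs],
            "weight": _weight(item.get("weight")),
        })
    return normalized


print(normalize_suggest({"input": ["Nevermind", "Nirvana"], "weight": 34}))
print(normalize_suggest(["Nevermind", "Nirvana"]))
```

Validating on the client is optional, but it surfaces a bad input before the indexing request is sent.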

2. Querying

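Suggesting works as usual, except that the suggest type must be specified as completion. Suggestions are near real-time: new suggestions become visible after a refresh, and documents that have been deleted are never shown.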

POST music/_search?pretty
{
  "suggest": {
    "song-suggest": {          # name of the suggestion
      "prefix": "nir",         # prefix to search for
      "completion": {          # type of suggestion
        "field": "suggest"     # field to search for suggestions in
      }
    }
  }
}

returns

{
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": ...
  "took": 2,
  "timed_out": false,
  "suggest": {
    "song-suggest": [
      {
        "text": "nir",
        "offset": 0,
        "length": 3,
        "options": [
          {
            "text": "Nirvana",
            "_index": "music",
            "_type": "_doc",
            "_id": "1",
            "_score": 1.0,
            "_source": {
              "suggest": ["Nevermind", "Nirvana"]
            }
          }
        ]
      }
    ]
  }
}

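The _source metadata field must be enabled, which is the default behavior, for _source to be returned with suggestions.

The weight configured for a suggestion is returned as _score. The text field uses the input of the indexed suggestion. Suggestions return the full document _source by default. The size of _source can affect performance because of disk fetch and network transport overhead. To save network overhead, use source filtering to remove unneeded fields from _source and keep it small. Note that the _suggest endpoint does not support source filtering, but using suggest on the _search endpoint does: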

POST music/_search
{
  "_source": "suggest",
  "suggest": {
    "song-suggest": {
      "prefix": "nir",
      "completion": {
        "field": "suggest",
        "size": 5
      }
    }
  }
}

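_source: filters the source to return only the suggest field.
field: the name of the field to search for suggestions in.
size: the number of suggestions to return.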

{
  "took": 6,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 0,
      "relation": "eq"
    },
    "max_score": null,
    "hits": []
  },
  "suggest": {
    "song-suggest": [
      {
        "text": "nir",
        "offset": 0,
        "length": 3,
        "options": [
          {
            "text": "Nirvana",
            "_index": "music",
            "_type": "_doc",
            "_id": "1",
            "_score": 1.0,
            "_source": {
              "suggest": ["Nevermind", "Nirvana"]
            }
          }
        ]
      }
    ]
  }
}
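On the client side, the request body and the interesting part of the response are both plain JSON. The sketch below builds a request like the ones above and pulls the option texts and scores out of a response. The function names completion_request and option_texts are hypothetical, and SAMPLE_RESPONSE is a trimmed copy of the response shown above rather than live output.

```python
import json


def completion_request(prefix, field="suggest", name="song-suggest",
                       size=5, source=None):
    """Build a _search body containing one completion suggestion."""
    body = {
        "suggest": {
            name: {
                "prefix": prefix,
                "completion": {"field": field, "size": size},
            }
        }
    }
    if source is not None:
        body["_source"] = source  # source filtering, e.g. "suggest"
    return body


def option_texts(response, name="song-suggest"):
    """Return (text, score) pairs for every option of the named suggestion."""
    return [
        (option["text"], option["_score"])
        for entry in response["suggest"][name]
        for option in entry["options"]
    ]


# Trimmed copy of the sample response, for illustration only.
SAMPLE_RESPONSE = json.loads("""
{
  "suggest": {
    "song-suggest": [
      {"text": "nir", "offset": 0, "length": 3,
       "options": [
         {"text": "Nirvana", "_index": "music", "_id": "1", "_score": 1.0,
          "_source": {"suggest": ["Nevermind", "Nirvana"]}}
       ]}
    ]
  }
}
""")

print(json.dumps(completion_request("nir", source="suggest"), indent=2))
print(option_texts(SAMPLE_RESPONSE))
```

Note that the options sit two levels down: each suggestion name maps to a list of entries, and each entry has its own options list.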

The basic completion suggester query supports the following parameters:

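1.field: the name of the field on which to run the query (required).
2.size: the number of suggestions to return (defaults to 5).
3.skip_duplicates: whether duplicate suggestions should be filtered out (defaults to false).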

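The completion suggester considers all documents in the index. See Context Suggester for an explanation of how to query a subset of documents instead.

When a completion query spans more than one shard, the suggest is executed in two phases, and the last phase fetches the relevant documents from the shards. Because of this document fetch overhead, completion requests against a single shard perform better. For the best completion performance, index completions into a single-shard index. If heap usage is high because of shard size, it is still recommended to split the index into multiple shards rather than optimize for completion performance.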

3. Skipping duplicate suggestions

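Queries can return duplicate suggestions coming from different documents. This behavior can be changed by setting skip_duplicates to true. When set, this option filters out documents with duplicate suggestions from the result.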

POST music/_search?pretty
{
  "suggest": {
    "song-suggest": {
      "prefix": "nor",
      "completion": {
        "field": "suggest",
        "skip_duplicates": true
      }
    }
  }
}

When set to true, this option can slow down the search, because more suggestions need to be visited to find the top N.
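Why skipping duplicates costs extra work can be seen in a toy model. The sketch below is not how Elasticsearch implements the option; it simply walks a list of candidates already sorted by descending weight and counts how many it has to visit. The function name top_suggestions and the candidate data are made up for illustration.

```python
def top_suggestions(candidates, size=5, skip_duplicates=False):
    """Pick the first `size` suggestions from weight-ordered candidates.

    Returns (results, visited), where visited counts inspected candidates.
    """
    results, seen, visited = [], set(), 0
    for text, weight in candidates:
        if len(results) == size:
            break
        visited += 1
        if skip_duplicates and text in seen:
            continue  # same suggestion text from another document
        seen.add(text)
        results.append((text, weight))
    return results, visited


# Illustrative candidates: three documents share the same suggestion text.
candidates = [("Nirvana", 34), ("Nirvana", 10), ("Nirvana", 3), ("Nevermind", 2)]
print(top_suggestions(candidates, size=2))
print(top_suggestions(candidates, size=2, skip_duplicates=True))
```

Without the option the first two candidates fill the result; with it, all four have to be visited before two distinct suggestions are found.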

4. Fuzzy queries

The completion suggester also supports fuzzy queries: you can have a typo in your search and still get results back.

POST music/_search?pretty
{
  "suggest": {
    "song-suggest": {
      "prefix": "nor",
      "completion": {
        "field": "suggest",
        "fuzzy": {
          "fuzziness": 2
        }
      }
    }
  }
}

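Suggestions that share the longest prefix with the query prefix are scored higher.
The fuzzy query can take specific fuzzy parameters. The following parameters are supported:

1.fuzziness: the fuzziness factor. Defaults to AUTO. See Fuzziness for allowed settings.

2.transpositions: if set to true, a transposition counts as one change instead of two. Defaults to true.

3.min_length: the minimum length of the input before fuzzy suggestions are returned. Defaults to 3.

4.prefix_length: the minimum length of the input that is not checked for fuzzy alternatives. Defaults to 1.

5.unicode_aware: if true, all measurements (such as fuzzy edit distance, transpositions, and lengths) are made in Unicode code points instead of bytes. This is slightly slower than using raw bytes, so it defaults to false.

If you want to stick with the default values but still use fuzzy matching, you can use either fuzzy: {} or fuzzy: true.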
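To make the fuzzy parameters more concrete, here is a simplified model of fuzzy prefix matching. It is a sketch under several assumptions, not Elasticsearch's implementation: fuzziness is a plain integer rather than AUTO, inputs are assumed to be already lowercased, and the function names are hypothetical. Python strings are sequences of code points, so the model behaves like unicode_aware set to true.

```python
def edit_distance(a, b, transpositions=True):
    """Edit distance; an adjacent swap costs 1 if transpositions is true."""
    before_previous = None
    previous = list(range(len(b) + 1))
    for i in range(1, len(a) + 1):
        current = [i] + [0] * len(b)
        for j in range(1, len(b) + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            current[j] = min(previous[j] + 1,         # deletion
                             current[j - 1] + 1,      # insertion
                             previous[j - 1] + cost)  # substitution or match
            if (transpositions and i > 1 and j > 1
                    and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]):
                current[j] = min(current[j], before_previous[j - 2] + 1)
        before_previous, previous = previous, current
    return previous[len(b)]


def fuzzy_prefix_match(query, candidate, fuzziness=1, transpositions=True,
                       min_length=3, prefix_length=1):
    """Toy check: does candidate start with something close to query?"""
    if len(query) < min_length:
        return candidate.startswith(query)  # too short: exact prefix only
    head, tail = query[:prefix_length], query[prefix_length:]
    if not candidate.startswith(head):      # the head must match exactly
        return False
    rest = candidate[len(head):]
    # Compare the fuzzy part of the query with every prefix of the remainder.
    return any(edit_distance(tail, rest[:n], transpositions) <= fuzziness
               for n in range(len(rest) + 1))


print(fuzzy_prefix_match("nor", "nirvana", fuzziness=1))
print(fuzzy_prefix_match("nor", "nevermind", fuzziness=1))
print(fuzzy_prefix_match("nor", "nevermind", fuzziness=2))
```

In this model the prefix nor reaches nirvana with a single substitution, while nevermind only matches once two edits are allowed, which is the setting used in the request above.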

4. Regex queries

The completion suggester also supports regex queries, meaning you can express a prefix as a regular expression.

POST music/_search?pretty
{
  "suggest": {
    "song-suggest": {
      "regex": "n[ever|i]r",
      "completion": {
        "field": "suggest"
      }
    }
  }
}

The regex query can take specific regex parameters. The following parameters are supported:

1.flags: possible flags are ALL (default), ANYSTRING, COMPLEMENT, EMPTY, INTERSECTION, INTERVAL, or NONE. See regexp-syntax for their meaning.

2.max_determinized_states: regular expressions are dangerous because it’s easy to accidentally create an innocuous-looking one that requires an exponential number of internal determinized automaton states (and corresponding RAM and CPU) for Lucene to execute. Lucene prevents these using the max_determinized_states setting (defaults to 10000). You can raise this limit to allow more complex regular expressions to execute.
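It is worth checking what the example pattern n[ever|i]r actually accepts. The square brackets form a character class, so the pattern means n, then one character out of e, v, r, | or i, then r; it is not an alternation between "ever" and "i". The sketch below tests this with Python's re module. Python's regex syntax differs from Lucene's in general, so treating the two as equivalent for this simple pattern is an assumption, and regex_prefix_match is a hypothetical helper.

```python
import re


def regex_prefix_match(pattern, candidate):
    """True if the start of candidate matches pattern (prefix semantics)."""
    # re.match anchors at the beginning only, so the pattern acts as a prefix.
    return re.match(pattern, candidate) is not None


pattern = "n[ever|i]r"
# Lowercased inputs, as the default simple analyzer would produce.
for candidate in ["nirvana", "nevermind", "nerd"]:
    print(candidate, regex_prefix_match(pattern, candidate))
```

Under this reading nirvana and nerd match, while nevermind does not, because its third character is v rather than r.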
