Lucene4: Building a Query and Highlighting the Query Keywords

Published: 2023/12/19

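1. Requirements

Environment:

  Lucene 4.1 / IKAnalyzer 2012 FF / mmseg4j 1.9
Features:
  1). Highlighted query demo

Note:

Starting with this article, the index directory is no longer the sample directory; it points at real data instead. That is, LUCENE_INDEX_DIR = "C:\\lucene\\data" has been changed to LUCENE_INDEX_DIR = "C:\\solr\\news\\data\\index".

2. Implementation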

package com.clzhang.sample.lucene;

import java.io.File;
import java.io.StringReader;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.document.Document;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.Term;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;
import org.apache.lucene.queryparser.classic.QueryParser;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.search.TermQuery;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.search.highlight.Fragmenter;
import org.apache.lucene.search.highlight.Highlighter;
import org.apache.lucene.search.highlight.QueryScorer;
import org.apache.lucene.search.highlight.SimpleSpanFragmenter;
import org.apache.lucene.search.highlight.SimpleHTMLFormatter;
import org.apache.lucene.search.highlight.TokenSources;
import org.apache.lucene.util.Version;

import com.chenlb.mmseg4j.Dictionary;
import com.chenlb.mmseg4j.analysis.SimpleAnalyzer;
import com.chenlb.mmseg4j.analysis.ComplexAnalyzer;

import org.junit.Test;

/**
 * Environment: Lucene 4.1 / IKAnalyzer 2012 FF / mmseg4j 1.9
 * Features:
 * 1. Highlighted query demo
 * @author Administrator
 */
public class HighlightDemo {
    // Path to the mmseg4j dictionary
    private static final String MMSEG4J_DICT_PATH = "C:\\solr\\news\\conf";
    private static Dictionary dictionary = Dictionary.getInstance(MMSEG4J_DICT_PATH);

    // Path to the Lucene index
    private static final String LUCENE_INDEX_DIR = "C:\\solr\\news\\data\\index";

    @Test
    public void testHighlighting() throws Exception {
        // Standalone highlighting test: no index is involved, only a string and a query
        String text = "臺保釣人士擬起訴日當局 感謝大陸海監船馳援";
        TermQuery query = new TermQuery(new Term("title", "當局"));

        TokenStream tokenStream = new ComplexAnalyzer(dictionary).tokenStream("title", new StringReader(text));
        QueryScorer scorer = new QueryScorer(query, "title");
        Fragmenter fragmenter = new SimpleSpanFragmenter(scorer);
        // No formatter is passed here, so the default <B>...</B> tags are used
        Highlighter highlighter = new Highlighter(scorer);
        highlighter.setTextFragmenter(fragmenter);

        String hlText = highlighter.getBestFragment(tokenStream, text);
        System.out.println(hlText);
        System.out.println("--------------------------");
    }

    @Test
    public void doHighlightQuery() throws Exception {
        // Instantiate the IKAnalyzer tokenizer
        // Analyzer analyzer = new IKAnalyzer();
        // Instantiate the mmseg4j tokenizer
        Analyzer analyzer = new SimpleAnalyzer(dictionary);

        // Instantiate the searcher
        Directory directory = FSDirectory.open(new File(LUCENE_INDEX_DIR));
        DirectoryReader reader = DirectoryReader.open(directory);
        IndexSearcher searcher = new IndexSearcher(reader);

        final String FIELD_NAME = "webTitle";
        String keyword = "記者";

        // Build the Query object with the QueryParser
        QueryParser qp = new QueryParser(Version.LUCENE_41, FIELD_NAME, analyzer);
        Query query = qp.parse(keyword);

        // Fetch the 5 highest-scoring records
        TopDocs hits = searcher.search(query, 5);
        System.out.println("命中:" + hits.totalHits);

        // Highlighting, part 1: scorer, formatter and fragmenter
        QueryScorer scorer = new QueryScorer(query, FIELD_NAME);
        // The formatter sets the tags wrapped around each highlighted term
        SimpleHTMLFormatter simpleHtmlFormatter = new SimpleHTMLFormatter("<EM>", "</EM>");
        Highlighter highlighter = new Highlighter(simpleHtmlFormatter, scorer);
        highlighter.setTextFragmenter(new SimpleSpanFragmenter(scorer));

        // Print the results
        for (ScoreDoc scoreDoc : hits.scoreDocs) {
            Document doc = searcher.doc(scoreDoc.doc);
            String title = doc.get(FIELD_NAME);

            // Highlighting, part 2: get a token stream for the field and pick the best fragment
            TokenStream stream = TokenSources.getAnyTokenStream(searcher.getIndexReader(), scoreDoc.doc, FIELD_NAME, doc, analyzer);
            String fragment = highlighter.getBestFragment(stream, title);
            System.out.println(fragment);
        }

        reader.close();
        directory.close();
        System.out.println("--------------------------");
    }
}
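One detail in the result loop above is worth guarding against: Highlighter.getBestFragment returns null when no query term is found in the text it is given, so the loop would print "null" for such a document. The sketch below shows one possible fallback in plain Java. The class FragmentFallbackSketch and the method fragmentOrOriginal are hypothetical names introduced only for this illustration; they are not part of Lucene or of the original code.

```java
// A minimal sketch, assuming the caller already has the (possibly null)
// fragment returned by Highlighter.getBestFragment and the stored field text.
// FragmentFallbackSketch and fragmentOrOriginal are hypothetical names.
final class FragmentFallbackSketch {

    // Returns the highlighted fragment when there is one; otherwise falls
    // back to the plain stored text, or to "" if that is missing as well.
    static String fragmentOrOriginal(String fragment, String original) {
        if (fragment != null) {
            return fragment;
        }
        return original != null ? original : "";
    }

    public static void main(String[] args) {
        // A highlighted fragment is passed through unchanged.
        System.out.println(fragmentOrOriginal("<EM>reporter</EM> exposes", "reporter exposes"));
        // No fragment: the plain text is printed instead of "null".
        System.out.println(fragmentOrOriginal(null, "no query term in this title"));
    }
}
```

In the loop this would wrap the existing call, for example by printing fragmentOrOriginal(fragment, title) instead of fragment.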

Output:

臺保釣人士擬起訴日<B>當局</B> 感謝大陸海監船馳援
--------------------------
命中:125
浙江杭州一男子涉嫌毆打<EM>記者</EM>被警方抓獲
領導快看;<EM>記者</EM>曝光!
[視頻]節前聚焦煙花爆竹安全 居民樓內存花炮 <EM>記者</EM>舉報無人監管 20130203
老夫看過<EM>記者</EM>關于肖某勒索的調查視頻,可以說,“脅從犯罪”的證據極為明顯——問題就在于,曾經處理方哦,算是結了案,再次處理,法理上有疑問
<EM>記者</EM>調查:重慶忠縣一樁疑竇重生的受賄案(轉載)
--------------------------
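The two runs use different tags: the first test builds the Highlighter from the scorer alone, so it falls back to the default formatter and wraps the match in <B>...</B>, while the second passes a SimpleHTMLFormatter with <EM> and </EM>. The sketch below illustrates only the wrapping idea in plain Java. It is not Lucene's implementation: it uses exact substring matching in place of analyzer tokens and query scoring, and TagWrapSketch and wrap are hypothetical names.

```java
// A minimal sketch of tag wrapping, assuming exact substring matching.
// Lucene's Highlighter works on analyzed tokens instead; TagWrapSketch
// and wrap are hypothetical names used only for this illustration.
final class TagWrapSketch {

    // Wraps every occurrence of term in text with preTag and postTag.
    static String wrap(String text, String term, String preTag, String postTag) {
        if (term.isEmpty()) {
            return text; // nothing to highlight
        }
        StringBuilder out = new StringBuilder();
        int from = 0;
        int hit;
        while ((hit = text.indexOf(term, from)) >= 0) {
            // Copy the text before the match, then the tagged match itself.
            out.append(text, from, hit).append(preTag).append(term).append(postTag);
            from = hit + term.length();
        }
        // Copy whatever follows the last match.
        return out.append(text, from, text.length()).toString();
    }

    public static void main(String[] args) {
        String text = "police detain man for assaulting reporter";
        // Default-style tags, as in the first test.
        System.out.println(wrap(text, "reporter", "<B>", "</B>"));
        // Custom tags, as in the second test.
        System.out.println(wrap(text, "reporter", "<EM>", "</EM>"));
    }
}
```

Keeping the tags as parameters mirrors the way the second test chooses its own markup without touching the rest of the highlighting setup.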
