paoding java: the Chinese word segmenter PaodingAnalyzer
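Add the dependencies to pom.xml:
<dependency>
    <groupId>com.thihy</groupId>
    <artifactId>elasticsearch-analysis-paoding</artifactId>
    <version>1.4.2.1</version>
</dependency>
<dependency>
    <groupId>org.elasticsearch</groupId>
    <artifactId>elasticsearch</artifactId>
    <version>1.5.2</version>
</dependency>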
Download the paoding analyzer from the web.
Create the file paoding-analysis.properties under src/main/resources/paoding with the following content:
paoding.analyzer.mode=most-words
paoding.analyzer.dictionaries.compiler=net.paoding.analysis.analyzer.impl.MostWordsModeDictionariesCompiler
paoding.dic.home=classpath:paoding/dic
paoding.dic.detector.interval=60
paoding.knife.class.letterKnife=net.paoding.analysis.knife.LetterKnife
paoding.knife.class.numberKnife=net.paoding.analysis.knife.NumberKnife
paoding.knife.class.cjkKnife=net.paoding.analysis.knife.CJKKnife
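A mistyped key in this file is easy to overlook, so a quick sanity check can help before wiring up the analyzer. The sketch below is not part of Paoding: the class name PaodingConfigCheck and the method missingKeys are hypothetical, and the list of expected keys is simply the set of keys used in this article (whether Paoding strictly requires each of them is an assumption). It uses only java.util.Properties from the JDK.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

// Hypothetical helper, not part of Paoding: reports which of the keys
// used in this article are absent or blank in a properties text.
public class PaodingConfigCheck {

    // Assumption: these are just the keys set in the article's sample file.
    static final String[] EXPECTED_KEYS = {
        "paoding.analyzer.mode",
        "paoding.analyzer.dictionaries.compiler",
        "paoding.dic.home",
        "paoding.dic.detector.interval",
        "paoding.knife.class.letterKnife",
        "paoding.knife.class.numberKnife",
        "paoding.knife.class.cjkKnife"
    };

    static List<String> missingKeys(String propertiesText) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(propertiesText));
        } catch (IOException e) {
            // A StringReader should not fail; rethrow unchecked just in case.
            throw new UncheckedIOException(e);
        }
        List<String> missing = new ArrayList<>();
        for (String key : EXPECTED_KEYS) {
            String value = props.getProperty(key);
            if (value == null || value.trim().isEmpty()) {
                missing.add(key);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Only two keys are present here, so the other five are reported.
        String partial = "paoding.analyzer.mode=most-words\n"
                + "paoding.dic.home=classpath:paoding/dic\n";
        System.out.println("Missing keys: " + missingKeys(partial));
    }
}
```

In a real project the text would more likely come from the classpath, for example by passing the stream from getClassLoader().getResourceAsStream("paoding/paoding-analysis.properties") to Properties.load; the string input here just keeps the sketch self-contained.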
Copy the dic folder into src/main/resources/paoding.
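Test:
import java.io.IOException;
import net.paoding.analysis.analyzer.PaodingAnalyzer;
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;
import org.junit.Test;

@Test
public void test() throws IOException {
    // Load Paoding with the properties file created above.
    Analyzer analyzer = new PaodingAnalyzer("classpath:paoding/paoding-analysis.properties");
    String text = "我愛北京天安門";
    TokenStream tokenStream = analyzer.tokenStream("", text);
    // Register the term attribute once, before consuming the stream.
    CharTermAttribute charTermAttribute = tokenStream.addAttribute(CharTermAttribute.class);
    tokenStream.reset();
    while (tokenStream.incrementToken()) {
        System.out.println(charTermAttribute);
    }
    tokenStream.end();
    tokenStream.close();
}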
Run the unit test; the console prints:
我愛
北京
天安
天安門