
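Lucene is an open-source Apache project that implements a full-text search engine in Java. It is powerful, and its API is simple. Broadly speaking, building and searching an index with Lucene is a lot like working with a database: a Document can be thought of as a row, and a Field as a column. Building a search engine with Lucene is about as simple as connecting to a database with JDBC.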

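This tutorial uses Lucene 2.0, which is not compatible with Lucene 1.4.3, the version that was widely used and documented before it. Lucene 2.0 can be downloaded from http://apache.justdn.org/lucene/java/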

Example 1:

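1. On Windows, create a folder named s on the C: drive and put three txt files in it. The names do not matter; here they are "1.txt", "2.txt" and "3.txt".
The content of 1.txt is as follows (it reads "People's Republic of China", "the people of the whole country", "the year 2006"):

中華人民共和國
全國人民
2006年

"2.txt" and "3.txt" can contain anything you like; to keep things short, they are simply copies of 1.txt.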

2. Download the Lucene jar and put it on the classpath.
Build the index:

package lighter.iteye.com;

import java.io.BufferedReader;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.util.Date;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexWriter;

/**
 * author lighter date 2006-8-7
 */
public class TextFileIndexer {
    public static void main(String[] args) throws Exception {
        /* Folder whose files will be indexed: the s folder on the C: drive */
        File fileDir = new File("c:\\s");

        /* Where the index files are stored */
        File indexDir = new File("c:\\index");
        Analyzer luceneAnalyzer = new StandardAnalyzer();
        // The last argument (true) creates a new index, replacing any existing one
        IndexWriter indexWriter = new IndexWriter(indexDir, luceneAnalyzer,
                true);
        File[] textFiles = fileDir.listFiles();
        long startTime = new Date().getTime();

        // Add one Document per .txt file to the index
        for (int i = 0; i < textFiles.length; i++) {
            if (textFiles[i].isFile()
                    && textFiles[i].getName().endsWith(".txt")) {
                System.out.println("File " + textFiles[i].getCanonicalPath()
                        + " is being indexed.");
                // The sample files are read as GBK-encoded text
                String temp = FileReaderAll(textFiles[i].getCanonicalPath(),
                        "GBK");
                System.out.println(temp);
                Document document = new Document();
                // path: stored so it can be shown in results, but not searchable
                Field FieldPath = new Field("path", textFiles[i].getPath(),
                        Field.Store.YES, Field.Index.NO);
                // body: stored, tokenized by the analyzer, with term vectors
                Field FieldBody = new Field("body", temp, Field.Store.YES,
                        Field.Index.TOKENIZED,
                        Field.TermVector.WITH_POSITIONS_OFFSETS);
                document.add(FieldPath);
                document.add(FieldBody);
                indexWriter.addDocument(document);
            }
        }
        // optimize() optimizes the index
        indexWriter.optimize();
        indexWriter.close();

        // Measure how long indexing took
        long endTime = new Date().getTime();
        System.out.println("It took " + (endTime - startTime)
                + " ms to add the documents to the index! "
                + fileDir.getPath());
    }

    /* Reads a whole file in the given charset; lines are joined without separators */
    public static String FileReaderAll(String FileName, String charset)
            throws IOException {
        BufferedReader reader = new BufferedReader(new InputStreamReader(
                new FileInputStream(FileName), charset));
        try {
            StringBuilder temp = new StringBuilder();
            String line;
            while ((line = reader.readLine()) != null) {
                temp.append(line);
            }
            return temp.toString();
        } finally {
            // Close the reader even if reading fails
            reader.close();
        }
    }
}

Indexing output:

File C:\s\1.txt is being indexed.
中華人民共和國全國人民2006年
File C:\s\2.txt is being indexed.
中華人民共和國全國人民2006年
File C:\s\3.txt is being indexed.
中華人民共和國全國人民2006年
It took 297 ms to add the documents to the index! c:\s
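
Note that each file's three lines appear joined into one in the output, because the FileReaderAll helper appends lines without any separator. If line breaks matter for your data, a variant could take the separator as a parameter. The following is only a sketch using the JDK alone; the names LineJoin and readAll are hypothetical, and the demo writes a temporary UTF-8 file rather than touching c:\s.

```java
import java.io.BufferedReader;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Writer;

public class LineJoin {
    /** Reads a whole file in the given charset, joining its lines with the separator. */
    public static String readAll(String fileName, String charset, String separator)
            throws IOException {
        BufferedReader reader = new BufferedReader(
                new InputStreamReader(new FileInputStream(fileName), charset));
        try {
            StringBuilder text = new StringBuilder();
            boolean first = true;
            String line;
            while ((line = reader.readLine()) != null) {
                // Put the separator between lines, never before the first one
                if (!first) {
                    text.append(separator);
                }
                text.append(line);
                first = false;
            }
            return text.toString();
        } finally {
            reader.close();
        }
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical demo data written to a temporary file
        File tmp = File.createTempFile("lucene-demo", ".txt");
        tmp.deleteOnExit();
        Writer out = new OutputStreamWriter(new FileOutputStream(tmp), "UTF-8");
        out.write("first line\nsecond line\n");
        out.close();

        // An empty separator reproduces the joined output seen above
        System.out.println(readAll(tmp.getPath(), "UTF-8", ""));
        // A space keeps words on different lines apart
        System.out.println(readAll(tmp.getPath(), "UTF-8", " "));
    }
}
```

Passing an empty separator gives the same behavior as the original helper, so the choice is left to the caller.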

3. With the index built, it is time to query it.

package lighter.iteye.com;

import java.io.IOException;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.queryParser.ParseException;
import org.apache.lucene.queryParser.QueryParser;
import org.apache.lucene.search.Hits;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;

public class TestQuery {
    public static void main(String[] args) throws IOException, ParseException {
        // "中華" appears in the first line of every sample file
        String queryString = "中華";
        IndexSearcher searcher = new IndexSearcher("c:\\index");
        // Use the same analyzer that was used when building the index
        Analyzer analyzer = new StandardAnalyzer();

        // Parse the query against the "body" field. A ParseException now
        // propagates; swallowing it would leave query null and fail later.
        QueryParser qp = new QueryParser("body", analyzer);
        Query query = qp.parse(queryString);

        Hits hits = searcher.search(query);
        if (hits.length() > 0) {
            System.out.println("Found: " + hits.length() + " results!");
        }
        searcher.close();
    }
}

Its output:

Found: 3 results!

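Lucene is really quite simple: it mainly does two things, building an index and searching it.
Here are some terms used in Lucene. This is not meant as a detailed introduction, just a quick pointer, because there is a good thing in this world called search.

IndexWriter: one of the most important classes in Lucene. It adds documents to the index and controls a number of parameters of the indexing process.

Analyzer: the analyzer, which processes the various kinds of text the search engine encounters. Commonly used ones include StandardAnalyzer, StopAnalyzer and WhitespaceAnalyzer.

Directory: where the index is stored. Lucene provides two kinds of storage, on disk and in memory, with the corresponding classes FSDirectory and RAMDirectory. The index is normally kept on disk.

Document: a document. A Document is one unit to be indexed; any file that should be indexed must first be converted into a Document object.

Field: a field of a Document.

IndexSearcher: the most basic search tool in Lucene; every search goes through an IndexSearcher.

Query: a query. Lucene supports fuzzy queries, semantic queries, phrase queries, combined queries and so on, through classes such as TermQuery, BooleanQuery, RangeQuery and WildcardQuery.

QueryParser: a tool that parses user input; it scans the string the user typed and produces a Query object.

Hits: once a search has finished, the results have to be returned and shown to the user; only then has the search served its purpose. In Lucene, the set of search results is represented by an instance of the Hits class.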
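
To make these roles a little more concrete, here is a minimal sketch in plain Java, with no Lucene dependency, of what an analyzer, an index writer and a searcher do together. Everything in it is hypothetical and chosen for illustration: the class name ToyIndex, its methods, and the tiny stop-word list. It is not how Lucene is implemented, and Lucene's real classes do far more.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Toy inverted index; illustrative only, not Lucene's implementation. */
public class ToyIndex {
    // Hypothetical stop-word list, much smaller than a real analyzer's
    private static final Set<String> STOP_WORDS =
            new HashSet<String>(Arrays.asList("is", "the", "are", "on"));

    // term -> ids of the documents that contain it
    private final Map<String, Set<Integer>> postings =
            new HashMap<String, Set<Integer>>();
    private int nextId = 0;

    /** "Analyzer" role: lowercase, split on non-letter/non-digit runs, drop stop words. */
    public static List<String> analyze(String text) {
        List<String> tokens = new ArrayList<String>();
        for (String raw : text.toLowerCase().split("[^\\p{L}\\p{N}]+")) {
            if (!raw.isEmpty() && !STOP_WORDS.contains(raw)) {
                tokens.add(raw);
            }
        }
        return tokens;
    }

    /** "IndexWriter" role: assign an id and record every term of the document. */
    public int addDocument(String text) {
        int id = nextId++;
        for (String term : analyze(text)) {
            Set<Integer> ids = postings.get(term);
            if (ids == null) {
                ids = new TreeSet<Integer>();
                postings.put(term, ids);
            }
            ids.add(id);
        }
        return id;
    }

    /** "IndexSearcher" role: ids of the documents that contain every query term. */
    public Set<Integer> search(String query) {
        Set<Integer> result = null;
        for (String term : analyze(query)) {
            Set<Integer> ids = postings.get(term);
            if (ids == null) {
                return new TreeSet<Integer>(); // an unknown term matches nothing
            }
            if (result == null) {
                result = new TreeSet<Integer>(ids);
            } else {
                result.retainAll(ids);
            }
        }
        return result == null ? new TreeSet<Integer>() : result;
    }

    public static void main(String[] args) {
        ToyIndex index = new ToyIndex();
        index.addDocument("lighter javaeye com is the are on");
        index.addDocument("Lucene is a search library");
        System.out.println(analyze("lighter javaeye com is the are on"));
        System.out.println(index.search("search lucene"));
    }
}
```

The point of the sketch is the division of labor: text is analyzed into terms once at indexing time, and the same analysis is applied to the query so that both sides agree on what a term is.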

After all these definitions, let's look at a few simple examples:
1. A simple StandardAnalyzer test

package lighter.iteye.com;

import java.io.IOException;
import java.io.StringReader;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.Token;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.standard.StandardAnalyzer;

public class StandardAnalyzerTest {
    // Constructor
    public StandardAnalyzerTest() {
    }

    public static void main(String[] args) {
        // Create a StandardAnalyzer object
        Analyzer aAnalyzer = new StandardAnalyzer();
        // Test string
        StringReader sr = new StringReader("lighter javaeye com is the are on");
        // Create a TokenStream object
        TokenStream ts = aAnalyzer.tokenStream("name", sr);
        try {
            int i = 0;
            Token t = ts.next();
            while (t != null) {
                // Counter used to number the output lines
                i++;
                // NOTE: the source text is cut off at this point. The rest of
                // this listing is a reconstruction (an assumption), not the
                // author's original code.
                System.out.println("Line " + i + ": " + t.termText());
                // Fetch the next token
                t = ts.next();
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}

2010-10-15 17:03
[Repost] Lucene 2 Tutorial
