Building a Simple Java Search Tool with Lucene

2019-11-18 16:18:32


Source: reposted

Contributed by: a site user

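I am new to Lucene and have only just started learning about search engines. Knowing a little, I wanted to build a small tool that searches Java source files by "word". For example, entering "String" finds the Java source files that use that class.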

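The idea goes back to when I was first learning Java: the CD bundled with a Java basics textbook contained a program, written by the author, that let beginners look up which examples a given class appeared in. I did not pay much attention at the time and only thought the author's code was long. Now I want to write a small program like that myself.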

Development tools and runtime environment: the Lucene 2.0 package and JDK 1.5, running on Windows XP.

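Analysis and design:

Apart from the necessary Lucene operations, the program consists of basic I/O. Indexing every Java source file in a directory and its subdirectories calls for recursion, and files that are not Java source must be filtered out. With that in mind, I designed the following five classes.

Main classes: the indexing class (IndexJavaFiles) and the search class (SearchJavaFiles).

Exception classes: the indexing exception (IndexException) and the search exception (SearchException).

There is also a file filter factory class (FileFilterFactory).

The exception classes are not strictly necessary; they exist to wrap I/O exceptions, file exceptions, and Lucene exceptions. The file filter factory is not there to look clever either: I simply did not want too much code in one place, so the file filter lives in its own class. The complete, commented code follows.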

IndexJavaFiles.java

/**
 * Index the Java source files.
 */
package powerwind;

import java.io.*;
import java.util.Date;

import org.apache.lucene.document.*;
import org.apache.lucene.index.IndexWriter;

/**
 * @author Powerwind
 * @version 1.0
 */
public class IndexJavaFiles {

    /**
     * Default constructor.
     */
    public IndexJavaFiles() {
    }

    /**
     * Private recursive method called by index, which guarantees that the
     * file it passes in is a directory rather than a regular file.
     *
     * @param writer
     * @param file
     * @param filter
     * @throws IndexException
     */
    private void indexDirectory(IndexWriter writer, File file, FileFilter filter) throws IndexException {
        if (file.isDirectory()) {
            // List only the files and subdirectories accepted by the filter
            File[] files = file.listFiles(filter);
            // listFiles returns null when the directory cannot be read
            if (files != null) {
                for (int i = 0; i < files.length; i++) {
                    indexDirectory(writer, files[i], filter);
                }
            }
        } else {
            try {
                // file has already passed the filter at this point
                writer.addDocument(parseFile(file));
                System.out.println("Added file: " + file);
            } catch (IOException ioe) {
                throw new IndexException(ioe.getMessage());
            }
        }
    }

    /**
     * Indexes the argument directly when it is a file; when it is a
     * directory, hands it to indexDirectory for recursion.
     *
     * @param writer
     * @param file
     * @param filter
     * @throws IndexException
     */
    public void index(IndexWriter writer, File file, FileFilter filter) throws IndexException {
        // Make sure the path exists and is readable
        if (file.exists() && file.canRead()) {
            if (file.isDirectory()) {
                indexDirectory(writer, file, filter);
            } else if (filter.accept(file)) {
                try {
                    writer.addDocument(parseFile(file));
                    System.out.println("Added file: " + file);
                } catch (IOException ioe) {
                    throw new IndexException(ioe.getMessage());
                }
            } else {
                System.out.println("Invalid file or directory; indexing was not completed");
            }
        }
    }

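    /**
     * Converts a File into a Document.
     *
     * @param file
     */
    private Document parseFile(File file) throws IndexException {
        Document doc = new Document();
        // Store the path as a single untokenized term so it can be shown in results
        doc.add(new Field("path", file.getAbsolutePath(), Field.Store.YES,
                Field.Index.UN_TOKENIZED));
        try {
            // The contents are read through a Reader: tokenized and indexed
            doc.add(new Field("contents", new FileReader(file)));
        } catch (FileNotFoundException fnfe) {
            throw new IndexException(fnfe.getMessage());
        }
        return doc;
    }

    // The main method listed later in the article also belongs in this class.
}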


index(IndexWriter writer, File file, FileFilter filter) calls the private method indexDirectory(IndexWriter writer, File file, FileFilter filter) to index the files.
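The traversal itself does not depend on Lucene, so it can be tried in isolation. The following is a minimal sketch under that assumption: JavaFileWalker, JAVA_OR_DIR, and collect are hypothetical names that do not appear in the program, and matching files are collected into a list at the point where the program would call writer.addDocument.

```java
import java.io.File;
import java.io.FileFilter;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for illustration; not part of the original program. */
class JavaFileWalker {

    /** Accepts directories (so the recursion can descend) and *.java files. */
    static final FileFilter JAVA_OR_DIR = new FileFilter() {
        public boolean accept(File f) {
            return f.isDirectory() || f.getName().endsWith(".java");
        }
    };

    /** Recursively collects every regular file under root that passes the filter. */
    static List<File> collect(File root, FileFilter filter) {
        List<File> result = new ArrayList<File>();
        walk(root, filter, result);
        return result;
    }

    private static void walk(File file, FileFilter filter, List<File> out) {
        if (file.isDirectory()) {
            // listFiles returns null when the directory cannot be read
            File[] children = file.listFiles(filter);
            if (children != null) {
                for (File child : children) {
                    walk(child, filter, out);
                }
            }
        } else {
            // Children reach this branch only after passing the filter;
            // the root itself is not filtered here.
            out.add(file);
        }
    }

    public static void main(String[] args) {
        File root = new File(args.length > 0 ? args[0] : ".");
        for (File f : collect(root, JAVA_OR_DIR)) {
            System.out.println(f.getPath());
        }
    }
}
```

Because the filter accepts directories as well as Java files, a single listFiles(filter) call serves both the descent and the selection, which is the same trick the article's filter relies on.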

Below is the IndexException class.

IndexException.java

package powerwind;

public class IndexException extends Exception {

    public IndexException(String message) {
        super("Throw IndexException while indexing files: " + message);
    }
}
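IndexException keeps only the message of the exception it wraps, so the original stack trace is lost. One possible variant, sketched here with hypothetical names (CauseKeepingIndexException and WrapDemo are not part of the program), also passes the cause to the standard Exception(String, Throwable) constructor.

```java
import java.io.FileNotFoundException;
import java.io.FileReader;
import java.io.IOException;
import java.io.Reader;

/** Hypothetical variant of IndexException that also keeps the underlying cause. */
class CauseKeepingIndexException extends Exception {
    CauseKeepingIndexException(String message, Throwable cause) {
        super("Error while indexing files: " + message, cause);
    }
}

/** Hypothetical demo class showing where the wrapping happens. */
class WrapDemo {

    /** Opens a file for reading and converts the low-level failure into the wrapper type. */
    static Reader open(String path) throws CauseKeepingIndexException {
        try {
            return new FileReader(path);
        } catch (FileNotFoundException e) {
            // Pass e along so callers can still inspect the original exception
            throw new CauseKeepingIndexException(e.getMessage(), e);
        }
    }

    public static void main(String[] args) {
        try {
            open("no-such-file.java").close();
        } catch (CauseKeepingIndexException e) {
            System.out.println(e.getMessage());
            System.out.println("cause: " + e.getCause().getClass().getName());
        } catch (IOException e) {
            // Only close() can raise this here
            System.out.println(e.getMessage());
        }
    }
}
```

Whether the extra constructor is worth it depends on how the tool reports errors; for a console utility that only prints the message, the article's simpler form is enough.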

Below is the FileFilterFactory class, which returns a specific file filter (FileFilter).

FileFilterFactory.java

package powerwind;

import java.io.*;

public class FileFilterFactory {

    /**
     * Filter held in a static field and implemented as an anonymous inner class.
     * It accepts directories, plus non-empty .java files smaller than 1 MB.
     */
    private static FileFilter filter = new FileFilter() {
        public boolean accept(File file) {
            long len;
            return file.isDirectory()
                    || (file.getName().endsWith(".java")
                        && ((len = file.length()) > 0) && len < 1024 * 1024);
        }
    };

    public static FileFilter getFilter() {
        return filter;
    }
}

The main method (belongs to the IndexJavaFiles class)

    /**
     * main method
     */
    public static void main(String[] args) throws Exception {
        IndexJavaFiles ijf = new IndexJavaFiles();
        Date start = new Date();
        try {
            // IndexWriterFactory is not listed in this article. In Lucene 2.0,
            // new IndexWriter("./index", new StandardAnalyzer(), true) should
            // create a comparable writer.
            IndexWriter writer = IndexWriterFactory.newInstance().createWriter("./index", true);
            System.out.println("Indexing ...");
            ijf.index(writer, new File("."), FileFilterFactory.getFilter());
            System.out.println("Optimizing...");
            writer.optimize();
            writer.close();
            Date end = new Date();
            System.out.println(end.getTime() - start.getTime() + " total milliseconds");
        } catch (IOException e) {
            System.out.println(" caught a " + e.getClass() + "\n with message: " + e.getMessage());
        }
    }

SearchJavaFiles.java

package powerwind;

import java.io.*;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.queryParser.*;
import org.apache.lucene.search.*;

public class SearchJavaFiles {

    private IndexSearcher searcher;

    private QueryParser parser;

    /**
     * @param searcher
     */
    public SearchJavaFiles(IndexSearcher searcher) {
        this.searcher = searcher;
    }

    /**
     * @param field
     * @param analyzer
     */
    public void setParser(String field, Analyzer analyzer) {
        setParser(new QueryParser(field, analyzer));
    }

    /**
     * @param parser
     */
    public void setParser(QueryParser parser) {
        this.parser = parser;
    }

    /**
     * @param query
     * @return Hits
     * @throws SearchException
     */
    public Hits serach(Query query) throws SearchException {
        try {
            return searcher.search(query);
        } catch (IOException ioe) {
            throw new SearchException(ioe.getMessage());
        }
    }

    /**
     * @param queryString
     * @return Hits
     * @throws SearchException
     */
    public Hits serach(String queryString) throws SearchException {
        if (parser == null)
            throw new SearchException("parser is null!");
        try {
            return searcher.search(parser.parse(queryString));
        } catch (IOException ioe) {
            throw new SearchException(ioe.getMessage());
        } catch (ParseException pe) {
            throw new SearchException(pe.getMessage());
        }
    }

    /**
     * Prints the hits from start (inclusive) to end (exclusive).
     *
     * @param hits
     * @param start
     * @param end
     * @throws SearchException
     */
    public static Hits display(Hits hits, int start, int end) throws SearchException {
        try {
            while (start < end) {
                Document doc = hits.doc(start);
                String path = doc.get("path");
                if (path != null) {
                    System.out.println((start + 1) + "- " + path);
                } else {
                    System.out.println((start + 1) + "- " + "No such path");
                }
                start++;
            }
        } catch (IOException ioe) {
            throw new SearchException(ioe.getMessage());
        }
        return hits;
    }

    // The main method listed next in the article also belongs in this class.
}


The main method (belongs to the SearchJavaFiles class)

    /**
     * @param args
     */
    public static void main(String[] args) throws Exception {
        String field = "contents";
        String index = "./index";
        final int rows_per_page = 2;
        final char NO = 'n';
        SearchJavaFiles sjf = new SearchJavaFiles(new IndexSearcher(IndexReader.open(index)));
        sjf.setParser(field, new StandardAnalyzer());
        BufferedReader in = new BufferedReader(new InputStreamReader(System.in, "UTF-8"));
        while (true) {
            System.out.println("Query: ");
            String line = in.readLine();
            // End of input or a query shorter than two characters ends the session
            if (line == null || line.length() < 2) {
                System.out.println("exit query");
                break;
            }
            Hits hits = sjf.serach(line);
            System.out.println("searching for " + line + " Result is ");
            int len = hits.length();
            int i = 0;
            if (len > 0)
                while (true) {
                    if (i + rows_per_page >= len) {
                        // Last page: print whatever remains
                        SearchJavaFiles.display(hits, i, len);
                        break;
                    } else {
                        // Arguments are evaluated left to right, so display
                        // receives the old i as start and the advanced i as end
                        SearchJavaFiles.display(hits, i, i += rows_per_page);
                        System.out.println("more y/n?");
                        line = in.readLine();
                        if (line == null || line.length() < 1 || line.charAt(0) == NO)
                            break;
                    }
                }
            else
                System.out.println("not found");
        }
    }
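The paging loop in main shows results rows_per_page at a time by computing a half-open range [start, end) for each call to display. The same arithmetic can be sketched on its own, without Lucene; Pager and pages are hypothetical names introduced only for this illustration.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for illustration; not part of the original program. */
class Pager {

    /** Splits `total` results into consecutive {start, end} pairs, end exclusive. */
    static List<int[]> pages(int total, int rowsPerPage) {
        if (rowsPerPage <= 0) {
            throw new IllegalArgumentException("rowsPerPage must be positive");
        }
        List<int[]> result = new ArrayList<int[]>();
        for (int start = 0; start < total; start += rowsPerPage) {
            // The last page may be shorter than rowsPerPage
            int end = Math.min(start + rowsPerPage, total);
            result.add(new int[] { start, end });
        }
        return result;
    }

    public static void main(String[] args) {
        // Five results, two per page
        for (int[] page : pages(5, 2)) {
            System.out.println(page[0] + ".." + page[1]);
        }
    }
}
```

Each pair can be passed directly as the start and end arguments of a method like display, which treats end as exclusive.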

SearchException.java

package powerwind;

public class SearchException extends Exception {

    public SearchException(String message) {
        super("Throw SearchException while searching files: " + message);
    }
}

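Ideas for improvement:

1. File formats:

Handle Zip and Jar files and index the Java source files inside them.

Index class files through reflection.

2. Input and output:

Besides console input and output, optionally read query keywords from a file and write the results to a file.

3. User interface:

A graphical interface in which double-clicking a result opens the corresponding file.

4. Performance:

Use caching and multithreading when indexing files.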
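As a starting point for the first idea, the standard java.util.zip package can list the Java source entries in a Zip or Jar archive. This is only a sketch under assumptions: ArchiveSourceLister and javaEntries are hypothetical names, and the code stops at listing entry names rather than indexing them.

```java
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

/** Hypothetical helper for illustration; not part of the original program. */
class ArchiveSourceLister {

    /** Returns the names of all *.java entries in a Zip or Jar archive. */
    static List<String> javaEntries(File archive) throws IOException {
        List<String> names = new ArrayList<String>();
        ZipFile zip = new ZipFile(archive);
        try {
            Enumeration<? extends ZipEntry> entries = zip.entries();
            while (entries.hasMoreElements()) {
                ZipEntry entry = entries.nextElement();
                if (!entry.isDirectory() && entry.getName().endsWith(".java")) {
                    names.add(entry.getName());
                }
            }
        } finally {
            zip.close();
        }
        return names;
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 0) {
            System.out.println("usage: ArchiveSourceLister <archive>");
            return;
        }
        for (String name : javaEntries(new File(args[0]))) {
            System.out.println(name);
        }
    }
}
```

To index an entry, its content could be read with zip.getInputStream(entry) wrapped in an InputStreamReader and passed to the same Field(String, Reader) constructor that parseFile uses; the archive would then need to stay open until the document has been added.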


(Source: http://m.survivalescaperooms.com)
