What is the correct way to rebuild a Lucene index?

  •  1
  • Roman  · Tech Community  · 14 years ago

    deletable file. I think this happens because I clear the index every time I want to rebuild it. Here is the code that handles the index:

    public class SearchService : ISearchService
    {
        Directory   IndexFileLocation;
        IndexWriter Writer;
        IndexReader Reader; 
        Analyzer    Analyzer;
    
        public SearchService(String indexLocation)
        {
            IndexFileLocation = FSDirectory.GetDirectory(indexLocation, System.IO.Directory.Exists(indexLocation) == false);
            Reader            = IndexReader.Open(IndexFileLocation);
            Writer            = new IndexWriter(IndexFileLocation, Analyzer, IndexFileLocation.List().Length == 0);
            Analyzer          = new StandardAnalyzer();
        }
    
        public void ClearIndex()
        {
            var DocumentCount = Writer.DocCount();
            if (DocumentCount == 0)
                return;
    
            for (int i = 0; i < DocumentCount; i++)
                Reader.DeleteDocument(i);
        }
    
        public void AddToSearchIndex(ISearchableData Data)
        {
            Document Doc = new Document();
    
            foreach (var Entry in Data)
            {
                Field field = new Field(Entry.Key, 
                                        Entry.Value, 
                                        Lucene.Net.Documents.Field.Store.NO, 
                                        Lucene.Net.Documents.Field.Index.TOKENIZED, 
                                        Lucene.Net.Documents.Field.TermVector.WITH_POSITIONS_OFFSETS);
                Doc.Add(field);
            }
    
            Field KeyField = new Field(
                SearchField.Key.ToString(), 
                Data.Key, 
                Lucene.Net.Documents.Field.Store.YES, 
                Lucene.Net.Documents.Field.Index.NO);
    
            Doc.Add(KeyField);
            Writer.AddDocument(Doc);
        }
    
        public void Dispose()
        {
            Writer.Optimize();
            Writer.Close();
            Reader.Close();
        }
    }
    

    And here is the code that runs all of this:

        private void btnRebuildIndex_Click(object sender, EventArgs e)
        {
            using (var SearchService = new SearchService(Application.StartupPath + @"\indexs\"))
            {
                SearchService.ClearIndex();
            }
    
            using (var SearchService = new SearchService(Application.StartupPath + @"\indexs\"))
            {
                Int32 BatchSize = 50;
                Int32 Current = 0;
                var TotalQuestions = SubmissionService.GetQuestionsCount();
    
                while (Current < TotalQuestions)
                {
                    var Questions = SubmissionService.ListQuestions(Current, BatchSize, "Id", Qsparx.SortOrder.Asc);
    
                    foreach (var Question in Questions)
                    {
                        SearchService.AddToSearchIndex(Question.ToSearchableData());
                    }
    
                    Current += BatchSize;
                }
            }
        }
    

    2 replies  |  as of 14 years ago
        1
  •  2
  •   Mikos    14 years ago

    I'm not sure why you recreate the index every time. You can append to the existing index like this:

    Writer = new IndexWriter(IndexFileLocation, Analyzer, false);
    

    The final false flag tells the IndexWriter to open in append mode (that is, without overwriting the existing index).
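When a full rebuild really is wanted, the same flag can be set to true instead, which starts a fresh index and replaces whatever was there, so no per-document deletion pass is needed. The following is only a minimal sketch, assuming the Lucene.Net 2.x API that the question's code uses; the `IndexRebuilder` class and `Rebuild` method are hypothetical names, not part of the original posts.

```csharp
using System.Collections.Generic;
using Lucene.Net.Analysis.Standard;
using Lucene.Net.Documents;
using Lucene.Net.Index;

// Hypothetical helper for illustration only; assumes Lucene.Net 2.x.
public static class IndexRebuilder
{
    public static void Rebuild(Lucene.Net.Store.Directory indexDirectory,
                               IEnumerable<Document> documents)
    {
        // create = true: start a new, empty index, replacing any existing one.
        var writer = new IndexWriter(indexDirectory, new StandardAnalyzer(), true);
        try
        {
            foreach (var doc in documents)
                writer.AddDocument(doc);

            writer.Optimize();
        }
        finally
        {
            // Always release the write lock, even if indexing fails.
            writer.Close();
        }
    }
}
```

With this approach the separate ClearIndex pass from the question would no longer be required, because the writer itself discards the old index when it is opened.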

        2
  •  0
  •   Roman    14 years ago

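    It turns out that creating the IndexReader before the IndexWriter is not a good idea when no index files exist yet. I also realized that although IndexWriter's AddDocument method has two overloads (one with and one without an Analyzer parameter), only the overload that takes an Analyzer parameter worked for me.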
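Both observations are consistent with the constructor in the question: the reader is opened before any index exists, and the writer is constructed while the Analyzer field is still null, because the analyzer is assigned one line later. That would explain why only the AddDocument overload with an explicit Analyzer worked. The sketch below shows one possible ordering, assuming Lucene.Net 2.x; the `SearchIndexWriter` class name is hypothetical, and the reader is left out because this sketch only writes.

```csharp
using System;
using Lucene.Net.Analysis;
using Lucene.Net.Analysis.Standard;
using Lucene.Net.Documents;
using Lucene.Net.Index;
using Lucene.Net.Store;

// Hypothetical class for illustration only; assumes Lucene.Net 2.x.
public class SearchIndexWriter : IDisposable
{
    private readonly Lucene.Net.Store.Directory indexDirectory;
    private readonly Analyzer analyzer;
    private readonly IndexWriter writer;

    public SearchIndexWriter(string indexLocation)
    {
        // Assign the analyzer first so the writer never receives null.
        analyzer = new StandardAnalyzer();

        // Create the folder only if it does not exist yet.
        bool folderMissing = !System.IO.Directory.Exists(indexLocation);
        indexDirectory = FSDirectory.GetDirectory(indexLocation, folderMissing);

        // Create a new index only when none exists; otherwise open in append mode.
        bool createIndex = !IndexReader.IndexExists(indexDirectory);
        writer = new IndexWriter(indexDirectory, analyzer, createIndex);
    }

    public void Add(Document doc)
    {
        // Uses the analyzer that was passed to the writer's constructor.
        writer.AddDocument(doc);
    }

    public void Dispose()
    {
        writer.Optimize();
        writer.Close();
    }
}
```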