OrientDB returns incorrect query results for Lucene search

orientdb2.2 orientdb

Marcelo D. Ré · 7 years ago

I'm having a problem with an OrientDB Lucene index. When I query with it, it returns an incomplete result set. Here is an example:

create class Foo extends V
create property Foo.text string
create index Foo.text_spanish on Foo(text) fulltext engine lucene metadata 
        { "analyzer": "org.apache.lucene.analysis.es.SpanishAnalyzer", 
          "index": "org.apache.lucene.analysis.es.SpanishAnalyzer", 
          "query": "org.apache.lucene.analysis.es.SpanishAnalyzer", 
          "allowLeadingWildcard": true             
}

insert into Foo (text) values ("axxx")
insert into Foo (text) values ("áxxx")
insert into Foo (text) values ("xxxa")
insert into Foo (text) values ("xxxá")
insert into Foo (text) values ("xxaxx")
insert into Foo (text) values ("xxáxx")

select from Foo where text lucene "*a*"

I get:

xxáxx
xxaxx
xxxa
axxx

It misses:

áxxx
xxxá

If I run this:

select from Foo where text lucene "*á*"

áxxx
xxxá

It misses the rest. Even in this case, it should show xxáxx. What am I doing wrong?

1 reply | 7 years ago

dgiannotti · 7 years ago

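By default, OrientDB supports all the analyzers listed here, but some characters are not treated as "Basic Latin". To fold them to their ASCII equivalents, you need a custom analyzer that applies ASCIIFoldingFilter.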

After you create and compile the class, import its .jar, then create the index with the custom analyzer.
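To illustrate what folding accomplishes, here is a minimal sketch in plain Java. It does not use Lucene: it approximates the effect of ASCIIFoldingFilter with java.text.Normalizer, which only strips combining accents, whereas the Lucene filter covers many more characters. The class name FoldingSketch and its methods are hypothetical and exist only for this example. Once both the stored values and the search term are folded, a single pattern matches all six sample rows.

```java
import java.text.Normalizer;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; not part of OrientDB or Lucene.
final class FoldingSketch {

    // Decompose to NFD, then drop combining marks, so "a with acute" becomes "a".
    public static String fold(String s) {
        String decomposed = Normalizer.normalize(s, Normalizer.Form.NFD);
        return decomposed.replaceAll("\\p{M}+", "");
    }

    // Return the values whose folded form contains the folded needle.
    public static List<String> containing(List<String> values, String needle) {
        String foldedNeedle = fold(needle);
        List<String> hits = new ArrayList<>();
        for (String v : values) {
            if (fold(v).contains(foldedNeedle)) {
                hits.add(v);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // U+00E1 (a with acute accent) is written as a Unicode escape
        // to keep the source ASCII-only.
        List<String> rows = Arrays.asList(
            "axxx", "\u00e1xxx", "xxxa", "xxx\u00e1", "xxaxx", "xx\u00e1xx");

        // Both searches return all six rows once folding is applied.
        System.out.println(containing(rows, "a").size());
        System.out.println(containing(rows, "\u00e1").size());
    }
}
```

A custom Lucene analyzer would apply the same idea at index time and at query time, which is why the index has to be recreated with that analyzer.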

In the meantime, a quick workaround is:

SELECT FROM Foo WHERE text LUCENE "*a*" OR text LUCENE "*á*";
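If the search term has several accented variants, the OR clause can be assembled programmatically. The following is a small sketch under that assumption; the LuceneOrClause class and its build method are hypothetical helpers, not an OrientDB API, and the variants are assumed to be trusted literals since no escaping is performed.

```java
import java.util.Arrays;
import java.util.List;
import java.util.StringJoiner;

// Hypothetical helper for illustration; not part of OrientDB.
final class LuceneOrClause {

    // Builds: field LUCENE "*v1*" OR field LUCENE "*v2*" ...
    // Variants are inserted verbatim, so they must not contain quotes.
    public static String build(String field, List<String> variants) {
        StringJoiner clause = new StringJoiner(" OR ");
        for (String v : variants) {
            clause.add(field + " LUCENE \"*" + v + "*\"");
        }
        return clause.toString();
    }

    public static void main(String[] args) {
        // U+00E1 is the accented variant of "a", written as an escape
        // to keep the source ASCII-only.
        String where = build("text", Arrays.asList("a", "\u00e1"));
        System.out.println("SELECT FROM Foo WHERE " + where);
    }
}
```

For the two variants above, this produces the same WHERE clause as the workaround query.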