Discussion:
Lucene : Numeric value ignored in search query
Jean-Marc Fontaine
2008-09-23 13:25:26 UTC
Hello,

I use Lucene to search documents. The actual content of the documents is
stored in the database. I index the document IDs and content in Lucene for
search purposes.

When a document is removed from my database, I need to remove it from the
Lucene index. To do so, I need to find the Lucene document id, which is
different from my document id.

When I search for "id:2", for example, my query is considered insignificant
by the query parser. I tried adding a prefix to get around a possible minimum
term length, but then only the prefix is searched for.

Any ideas, anyone? :)
--
View this message in context: http://www.nabble.com/Lucene-%3A-Numeric-value-ignored-in-search-query-tp19627596p19627596.html
Sent from the Zend MFS mailing list archive at Nabble.com.
Jean-Marc Fontaine
2008-09-23 13:50:10 UTC
Found the solution: you must use an analyzer with "Num" in its name
(e.g. Zend_Search_Lucene_Analysis_Analyzer_Common_Utf8Num).

Set it as the default analyzer like this:

Zend_Search_Lucene_Analysis_Analyzer::setDefault(
    new Zend_Search_Lucene_Analysis_Analyzer_Common_Utf8Num()
);
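
To illustrate why "id:2" came out as insignificant and why only the prefix was searched for, here is a minimal sketch. It assumes that the default analyzer keeps only runs of letters, while the "Num" analyzers keep runs of letters and digits. The two helper functions are hypothetical stand-ins written for this example; they are not the library's own tokenizer.

```php
<?php
// Simplified illustration only: NOT Zend_Search_Lucene code.
// Assumption: a letters-only analyzer discards digits, a "Num" analyzer keeps them.

/** Hypothetical helper: split text the way a letters-only analyzer would. */
function tokenizeLettersOnly($text)
{
    preg_match_all('/\p{L}+/u', $text, $matches);
    return $matches[0];
}

/** Hypothetical helper: split text the way a letters-and-digits analyzer would. */
function tokenizeLettersAndDigits($text)
{
    preg_match_all('/[\p{L}\p{N}]+/u', $text, $matches);
    return $matches[0];
}

// Compare what survives tokenization for a bare number and a prefixed number.
foreach (array('2', 'doc2') as $value) {
    echo $value, ' letters only: ', json_encode(tokenizeLettersOnly($value)), "\n";
    echo $value, ' letters and digits: ', json_encode(tokenizeLettersAndDigits($value)), "\n";
}
```

Under that assumption, a bare "2" produces no tokens at all with the letters-only variant, and a prefixed value such as "doc2" is reduced to its prefix, which matches both symptoms described in the first message.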
Alexander Veremyev
2008-10-11 11:07:17 UTC
Hi!

Indexed documents may have two types of IDs:

1. Internal document id returned by $hit->id and used by
$index->getDocument(), $index->delete() and some other methods.

This id _may_ and _will_ change during index optimization (or
auto-optimization), so it can't be used as a persistent reference to an
indexed document.

This id also can't be used in search queries.


2. A unique (or non-unique) value added to the document at indexing time:
...
$doc->addField(Zend_Search_Lucene_Field::Keyword('DB_id', $dbId));
...

This field can be used to search for the document:
$hits = $index->find('DB_id:2');

or (better) to retrieve the matching documents directly:
...
$docIDs = $index->termDocs(new Zend_Search_Lucene_Index_Term('2', 'DB_id'));
foreach ($docIDs as $docId) {
    $index->delete($docId);
}
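
The lookup-and-delete steps above can be wrapped in a small helper. This is a sketch under stated assumptions: Zend Framework 1 classes are autoloadable, documents were indexed with a Keyword field named 'DB_id' as shown earlier, and the function name deleteByDbId is hypothetical. The final commit() call is added here on the understanding that it commits changes made by delete(); the code in the post above does not include it.

```php
<?php
// Sketch only: assumes Zend Framework 1 is autoloadable when run against a real index.

/**
 * Hypothetical helper: delete every indexed document whose Keyword field
 * $fieldName equals $dbId. Returns the number of documents deleted.
 */
function deleteByDbId($index, $dbId, $fieldName = 'DB_id')
{
    // Term constructor takes the term text first, then the field name.
    $term = new Zend_Search_Lucene_Index_Term((string) $dbId, $fieldName);

    $deleted = 0;
    // termDocs() yields internal Lucene ids, which is what delete() expects.
    foreach ($index->termDocs($term) as $internalId) {
        $index->delete($internalId);
        $deleted++;
    }

    // Commit the pending deletions.
    $index->commit();

    return $deleted;
}
```

With a real index this might be called as deleteByDbId(Zend_Search_Lucene::open($path), 2), where $path is a placeholder for the index directory. The internal ids are looked up fresh on each call, so nothing that optimization could change is stored between calls.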


PS All these things are described in the documentation ;)

With best regards,
Alexander Veremyev.
Jean-Marc Fontaine
2008-10-12 09:24:42 UTC
Hi Alexander,

Thank you for your answer, but I think you missed the point of my question.
:)

I read the documentation and I know the difference between my DB ids and
Lucene ids. BTW, you can name your search field "id" if you want. The only
pitfall if you do so is that $document->id will return the Lucene id and not
your DB id. To get the DB id you must use the
$document->getField('id') method.

As I said in my second message, my problem came from the default analyzer,
which does not index numeric values. Switching to another analyzer solved
the problem.

Anyway, thank you for trying to help. ;)

Regards,

Jean-Marc