Discussion:
Zend_Search_Lucene: combine wildcard search with other terms?
Alexander Veremyev
2008-01-10 17:58:46 UTC
Could you give an example of queries which don't work?


With best regards,
Alexander Veremyev.
-----Original Message-----
Sent: Friday, January 04, 2008 3:00 PM
Subject: Re: [fw-formats] Zend_Search_Lucene: combine wildcard search with
other terms?
Hi,
I downloaded the current snapshot which has the class
Zend_Search_Lucene_Search_Query_Wildcard included. So now, my wildcard
search works. But unfortunately only for a single term.
What do I need to do if I want to use more complex queries such as
contents:whatever AND (destination:1.40.44.* OR site:2)
or
+contents:whatever +(destination:1.40.44.* site:2)
Any hints how I can extend Zend_Search_Lucene to get this working?
Best Regards,
Ralf
Ralf Eggert
2008-01-17 07:55:42 UTC
Hi Alexander,

sorry, must have missed your mail. Thanks for your reply.

What I would like to do is send a query that combines a wildcard
search with other terms and also uses boolean operators:

+contents:whatever +(destination:1.40.44.* site:2)

This should get all documents with the term "whatever" in the contents
field AND (the wildcard value "1.40.44.*" in the destination field OR
the value "2" in the site field).

I was able to successfully process such a query with Luke 0.7.1 but I am
afraid this is not possible with Zend_Search_Lucene (yet).

If I am right that this is not possible yet, and you are too busy to
solve this issue in the near future, maybe you could be so kind as to
assist me in solving it myself. I really do need this feature as soon
as possible. Please advise and point me in the right direction!

Thanks and Best Regards,

Ralf
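
The intended meaning of the query above can be written down as a small reference check. This is only a sketch of the expected boolean semantics, not of how Zend_Search_Lucene or Luke evaluate queries: the sample documents and the `matches` helper are made up for illustration, and the `destination` and `site` fields are assumed to be compared as whole, untokenized values.

```python
from fnmatch import fnmatchcase

def matches(doc):
    """Expected semantics of: +contents:whatever +(destination:1.40.44.* site:2)"""
    # Required clause: the term "whatever" occurs in the contents field.
    has_term = "whatever" in doc["contents"].lower().split()
    # Required group: at least one of the two optional clauses must match.
    dest_ok = fnmatchcase(doc["destination"], "1.40.44.*")
    site_ok = doc["site"] == "2"
    return has_term and (dest_ok or site_ok)

# Hypothetical sample documents, one per interesting case.
docs = [
    {"id": 1, "contents": "whatever you like", "destination": "1.40.44.7", "site": "1"},
    {"id": 2, "contents": "whatever else", "destination": "9.9.9.9", "site": "2"},
    {"id": 3, "contents": "whatever", "destination": "9.9.9.9", "site": "1"},
    {"id": 4, "contents": "nothing here", "destination": "1.40.44.7", "site": "2"},
]

hits = [d["id"] for d in docs if matches(d)]
print(hits)  # [1, 2]
```

Document 3 matches only the contents clause and document 4 only the group, so neither should be returned.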
Alexander Veremyev
2008-01-17 17:47:49 UTC
Hi Ralf,

Queries like "+contents:whatever +(destination:1.40.44.* site:2)" should work correctly.

The problem is in the "1.40.44.*" parsing.
1) You should use the TextNum analyzer for both indexing and searching if you want numbers to be interpreted as parts of terms.
2) '.' is treated as a word delimiter. So 'destination:1.40.44.xx' is transformed into the phrase 'destination:"1 40 44 xx"', but if you use 'destination:1.40.44.*' you will get the exception 'Wildcard search is supported only for non-multiple word terms'.
Use your own analyzer or replace '.' with some letter.


PS: Keyword fields are intended for this case, but the Zend_Search_Lucene query parser does not support non-tokenized fields yet (see http://framework.zend.com/issues/browse/ZF-623 for details).


With best regards,
Alexander Veremyev.
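
The tokenization issue described in this reply can be illustrated with a simplified model. This sketch is not the Zend_Search_Lucene analyzer code: it assumes that a text-only analyzer keeps runs of letters, that a TextNum-style analyzer keeps runs of letters and digits, and that a wildcard pattern is rejected when it splits into more than one word. The names `tokenize`, `wildcard_allowed` and `encode` are hypothetical helpers.

```python
import re

def tokenize(text, keep_digits):
    """Split text into lowercase tokens; every other character is a delimiter."""
    pattern = r"[a-zA-Z0-9]+" if keep_digits else r"[a-zA-Z]+"
    return [t.lower() for t in re.findall(pattern, text)]

def wildcard_allowed(pattern_text, keep_digits=True):
    """A wildcard pattern is usable only if it stays a single word."""
    # Drop the wildcard characters before counting the remaining words.
    literal = pattern_text.replace("*", "").replace("?", "")
    return len(tokenize(literal, keep_digits)) <= 1

def encode(value):
    """Workaround from the reply: replace '.' with a letter before indexing and searching."""
    return value.replace(".", "x")

print(tokenize("1.40.44.xx", keep_digits=True))   # ['1', '40', '44', 'xx']
print(tokenize("1.40.44.xx", keep_digits=False))  # ['xx']
print(wildcard_allowed("1.40.44.*"))              # False
print(wildcard_allowed(encode("1.40.44.*")))      # True
```

The same encoding has to be applied to the field value at indexing time and to the query string at search time, otherwise the encoded pattern has nothing to match against.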
Ralf Eggert
2008-01-20 08:40:45 UTC
Hi Alexander,

thanks, I will give it a try during the next week and send a note to the
list whether I got it to work or not.

Best Regards,

Ralf
Ralf Eggert
2008-01-28 09:23:11 UTC
Hi Alexander,

after upgrading to ZF 1.5.0PR and changing "1.40.44.*" to
"1x40x44x*", I basically got the wildcard search running. But now I
have encountered other problems.

First, I would like to build this query with the Query Construction API
but I don't know how to do it.

+contents:whatever +(destination:1x40x44x* site:2)

Second, when I use the query string above directly I get different
results between Zend_Search_Lucene and Luke. It seems as if Luke
presents the correct results, while Zend_Search_Lucene passes all
documents that match only "+contents:whatever". The rest of the query
string seems to be ignored.

Third, so far I have indexed only 3,000 of 150,000 documents and
optimized the index afterwards. While Luke shows the results almost
immediately, Zend_Search_Lucene already takes 1 second to find them.
Now I am afraid that, because of my wildcard search construct, the
search time will rise even more once all 150,000 documents are indexed.

Any ideas on how to solve any of these problems?

Thanks and Best Regards,

Ralf
