Discussion:
Zend_Search_Lucene: wildcard search
Ralf Eggert
2008-01-03 17:32:28 UTC
Hi again,

Sorry for writing so many new mails about Zend_Search_Lucene, but
questions keep coming up as I work with it.

In the documentation there is a chapter about wildcard search:

http://framework.zend.com/manual/en/zend.search.lucene.query-api.html#zend.search.lucene.queries.wildcard

When I try to use this, I get a fatal error because the class
Zend_Search_Lucene_Search_Query_Wildcard does not exist.

Is the documentation outdated? How can I process a wildcard search now?

I downloaded Luke 0.7.1 to access the index directly. I am able to enter
a search expression like this:

contents:whatever AND (destination:1.40.44.* OR site:2)

This is parsed to

+contents:whatever +(destination:1.40.44.* site:2)

Does Zend_Search_Lucene support this kind of query yet? If so, how
should I build the query with the Query Construction API?

Thanks and Best Regards,

Ralf
Ralf Eggert
2008-01-03 18:10:23 UTC
Hi,

This is weird.
Post by Ralf Eggert
Is the documentation outdated? How can I process a wildcard search now?
In ZF 1.0.3 the Zend_Search_Lucene_Search_Query_Wildcard class is
missing. But when I look in SVN, the class was changed on 17.07.2007,
so it should be available in the current version.

Why is the file missing?

Best Regards,

Ralf
Ralf Eggert
2008-01-04 12:00:01 UTC
Hi,

I downloaded the current snapshot which has the class
Zend_Search_Lucene_Search_Query_Wildcard included. So now, my wildcard
search works. But unfortunately only for a single term.

What do I need to do when I want to use more complex queries
like this one, combining wildcard search with AND and OR operators:

contents:whatever AND (destination:1.40.44.* OR site:2)

or

+contents:whatever +(destination:1.40.44.* site:2)

Any hints on how I can extend Zend_Search_Lucene to get this working?

Best Regards,

Ralf
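
For reference, a minimal sketch of how the query above might be assembled
with the Query Construction API. It assumes Zend Framework 1 is on the
include path and that the build ships Zend_Search_Lucene_Search_Query_Wildcard.
It also assumes the destination and site fields were indexed so that values
such as "1.40.44.x" and "2" exist as whole terms, since terms handed to the
query classes are, as far as I know, not run through the analyzer. The
function name buildWildcardBooleanQuery() and the index path in the usage
note are hypothetical.

```php
<?php
require_once 'Zend/Search/Lucene.php';
require_once 'Zend/Search/Lucene/Index/Term.php';
require_once 'Zend/Search/Lucene/Search/Query/Term.php';
require_once 'Zend/Search/Lucene/Search/Query/Wildcard.php';
require_once 'Zend/Search/Lucene/Search/Query/Boolean.php';

// Hypothetical helper: builds
//   +contents:whatever +(destination:1.40.44.* site:2)
function buildWildcardBooleanQuery()
{
    // contents:whatever
    $contents = new Zend_Search_Lucene_Search_Query_Term(
        new Zend_Search_Lucene_Index_Term('whatever', 'contents')
    );

    // destination:1.40.44.*
    $destination = new Zend_Search_Lucene_Search_Query_Wildcard(
        new Zend_Search_Lucene_Index_Term('1.40.44.*', 'destination')
    );

    // site:2
    $site = new Zend_Search_Lucene_Search_Query_Term(
        new Zend_Search_Lucene_Index_Term('2', 'site')
    );

    // (destination:1.40.44.* OR site:2): both subqueries optional (sign = null)
    $either = new Zend_Search_Lucene_Search_Query_Boolean();
    $either->addSubquery($destination, null);
    $either->addSubquery($site, null);

    // Both top-level clauses required (sign = true)
    $query = new Zend_Search_Lucene_Search_Query_Boolean();
    $query->addSubquery($contents, true);
    $query->addSubquery($either, true);

    return $query;
}
```

Usage would be along the lines of
$index = Zend_Search_Lucene::open('/path/to/index'); followed by
$hits = $index->find(buildWildcardBooleanQuery());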
dinok
2008-07-16 09:14:58 UTC
Hi guys,

I'm pleasantly surprised by the current version. The search runs very well
and returns good results. Also the numeric function is working now, thanks
for the utf8num analyzer.
But wildcard search brings one problem with it. If I search for
"php" in an index with about 1000 documents, I get the result (about 100
hits) in 0.2 seconds.
This is really nice! But when I try "php *", the search is very slow (it takes
about 7 seconds!).
Yet another "deadly query" is "*", which times out after 60
seconds (Fatal error: Maximum execution time of 60 seconds exceeded in
Zend\Search\Lucene\Storage\File.php on line 302).
Now you might say: check that the query is longer than 3 characters. But what
if the query is "*?**??*"? That also times out.
So is there a way to block such system-killing queries?
The only solution I have found is to allow only one wildcard per query.
But this doesn't solve "* php" or "php *" or "php (*)" and so on :-/

Any ideas?
Best regards
--
View this message in context: http://www.nabble.com/Zend_Search_Lucene%3A-wildcard-search-tp14601520p18483566.html
Sent from the Zend MFS mailing list archive at Nabble.com.
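
One way to catch the queries listed above before they reach the index is to
require a minimum number of literal characters in front of the first wildcard
of every token. The following is only a sketch in plain PHP. The function
hasSafeWildcards() and the default of 3 characters are assumptions for
illustration and are not part of Zend_Search_Lucene.

```php
<?php
// Hypothetical pre-check, not part of Zend_Search_Lucene.
// Returns false if any token has fewer than $minPrefix literal
// characters before its first '*' or '?'.
function hasSafeWildcards($userQuery, $minPrefix = 3)
{
    // Split on whitespace and parentheses so "(*)" is checked as "*".
    $tokens = preg_split('/[\s()]+/', $userQuery, -1, PREG_SPLIT_NO_EMPTY);

    foreach ($tokens as $token) {
        // Drop a leading "field:" part so "title:*" is checked as "*".
        $colon = strrpos($token, ':');
        if ($colon !== false) {
            $token = (string) substr($token, $colon + 1);
        }

        // Number of characters before the first wildcard.
        $prefixLength = strcspn($token, '*?');
        if ($prefixLength === strlen($token)) {
            continue; // no wildcard in this token
        }
        if ($prefixLength < $minPrefix) {
            return false; // wildcard appears too early
        }
    }

    return true;
}
```

With this check, "php" and "php*" are accepted, while "php *", "* php",
"php (*)" and "*?**??*" are rejected and can be answered with an error
message instead of a search.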
Matthew Ratzloff
2008-07-16 17:18:37 UTC
Unfortunately we've had several problems with Zend_Search_Lucene, including
this. Ultimately we were forced to filter out asterisks and question marks
entirely.
-Matt
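
A minimal sketch of that kind of filtering, in plain PHP. The function name
stripWildcards() is hypothetical, and this is not necessarily the code used
in the project mentioned above. It simply removes "*" and "?" from the user
input before the string is handed to the search.

```php
<?php
// Hypothetical helper, not part of Zend_Search_Lucene.
// Removes the wildcard characters '*' and '?' from user input.
function stripWildcards($userQuery)
{
    // Delete every '*' and '?'.
    $cleaned = str_replace(array('*', '?'), '', $userQuery);

    // Collapse the whitespace left behind and trim both ends.
    return trim(preg_replace('/\s+/', ' ', $cleaned));
}
```

The caller should skip the search when the result is an empty string, which
is what a query such as "*?**??*" is reduced to. Input like "php (*)" still
leaves an empty pair of parentheses behind, so that case needs separate
handling.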
Wil Sinclair
2008-07-16 18:40:32 UTC
Is there an issue in the issue tracker for this? If so, please vote on it. If not, please create one (and vote on it :) ).

Time is of the essence; we will be prioritizing issues later this week to fix during our bug squashing next week.



,Wil