Discussion:
Zend Search Lucene : Allowed Memory Exhausted
lekshmi
2008-07-24 05:20:22 UTC
Hi,

I am getting an "allowed memory exhausted" error when using find() to
read indexed records. This is the code I'm using:

Zend_Search_Lucene_Analysis_Analyzer::setDefault(
    new StandardAnalyzer_Analyzer_Standard_English()
);
$index = Zend_Search_Lucene::open(INDEX_PATH);
$userQuery = Zend_Search_Lucene_Search_QueryParser::parse($query);
$userQuery = $index->find($userQuery, 'biz_type', SORT_NUMERIC, SORT_ASC);

If I use the $index->find() statement alone, there is no memory exhausted
error. I need to get all 'id' and 'biz_type' values from the index, sorted
by biz_type. I don't know where I'm going wrong. Can someone please help me
solve this problem? This is very urgent :-(

Thanks,
Lekshmi.
--
View this message in context: http://www.nabble.com/Zend-Search-Lucene-%3A-Allowed-Memory-Exhausted-tp18625572p18625572.html
Sent from the Zend MFS mailing list archive at Nabble.com.
Alexander Veremyev
2008-07-25 12:51:17 UTC
Hi,

You probably have too large a result set.

If a non-default sort order is used, Zend_Search_Lucene has to
retrieve all matched documents from the index (otherwise only the set of
document IDs and scores is collected).

This dramatically increases search time and memory usage.

The first way to solve this problem is to use the result set limit
functionality:
-------------------------------
<?php
...
Zend_Search_Lucene::setResultSetLimit($N);
$hits = $index->find($userQuery, 'biz_type', SORT_NUMERIC, SORT_ASC);
...
---------------
But remember that this gives the "first N" results, not the "best N"
in terms of score or of the ordering field.

The second way is to retrieve no more than N results with the best scores
(without additional sorting options), retrieve the field values, and sort
the result manually.
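
A minimal sketch of this second approach, using plain arrays as a
hypothetical stand-in for the hits (with a real index, the list would come
from $index->find($userQuery), which orders hits by score by default, and
biz_type would be read from each hit or document). The values, the
compareByBizType() helper and $N = 3 are illustrative assumptions only:

```php
<?php
// Hypothetical stand-in for hits returned by $index->find($userQuery),
// assumed to be already ordered by descending score.
$hits = array(
    array('id' => 7, 'score' => 0.91, 'biz_type' => 3),
    array('id' => 2, 'score' => 0.85, 'biz_type' => 1),
    array('id' => 9, 'score' => 0.60, 'biz_type' => 2),
    array('id' => 4, 'score' => 0.42, 'biz_type' => 1),
);

$N = 3; // illustrative cap on how many results to keep

// Keep only the N best-scoring hits.
$top = array_slice($hits, 0, $N);

// Order by biz_type ascending; break ties by score, best first.
function compareByBizType($a, $b)
{
    if ($a['biz_type'] != $b['biz_type']) {
        return ($a['biz_type'] < $b['biz_type']) ? -1 : 1;
    }
    if ($a['score'] == $b['score']) {
        return 0;
    }
    return ($a['score'] > $b['score']) ? -1 : 1;
}
usort($top, 'compareByBizType');

// Collect the document IDs in their final order.
$sortedIds = array();
foreach ($top as $row) {
    $sortedIds[] = $row['id'];
}
```

Only the N rows that are kept need their field values loaded, which is
what bounds the memory usage here.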


The third way (which may be combined with the second) is to go through the
complete result set one hit at a time, store only the necessary fields in
arrays, and destroy the $hit objects that have already been processed:
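-------------------------------
<?php
...
$hits = $index->find($userQuery);

// Keep only the lightweight IDs and scores, then drop the hit objects.
$docIDs    = array();
$docScores = array();
foreach ($hits as $hitId => $hit) {
    $docIDs[]    = $hit->id;
    $docScores[] = $hit->score;
}
unset($hit, $hits);

// Load each document once and keep only the fields that are needed.
$biz_types      = array();
$someOtherField = array();
foreach ($docIDs as $id) {
    $doc = $index->getDocument($id);

    $biz_types[]      = $doc->biz_type;
    $someOtherField[] = $doc->someOtherField;
}

// Sort by biz_type, then by score (best first), then by document ID;
// $someOtherField is reordered to stay aligned with the other arrays.
array_multisort($biz_types, SORT_NUMERIC, SORT_ASC,
                $docScores, SORT_NUMERIC, SORT_DESC,
                $docIDs, SORT_NUMERIC, SORT_ASC,
                $someOtherField);
---------------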
That removes the memory overhead of keeping the whole retrieved result
set wrapped in hit objects (retrieving any stored field through a hit
triggers loading of the full document).
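
To see how array_multisort() keeps such parallel arrays aligned, here is a
small standalone illustration. The values are made up and simply stand in
for data collected from an index:

```php
<?php
// Illustrative values standing in for data collected from the index.
$biz_types = array(2, 1, 2, 1);
$docScores = array(0.5, 0.9, 0.8, 0.3);
$docIDs    = array(10, 11, 12, 13);

// Sort by biz_type ascending, then by score descending; $docIDs is
// reordered in step so each position still describes the same document.
array_multisort($biz_types, SORT_NUMERIC, SORT_ASC,
                $docScores, SORT_NUMERIC, SORT_DESC,
                $docIDs,    SORT_NUMERIC, SORT_ASC);

// $biz_types is now array(1, 1, 2, 2)
// $docScores is now array(0.9, 0.3, 0.8, 0.5)
// $docIDs    is now array(11, 13, 12, 10)
```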


With best regards,
Alexander Veremyev.