Search index warnings on live server but not locally
Jack Sleight
2008-06-26 11:06:05 UTC
Hi,
I have a script that builds the Lucene search index. When I run it
locally everything is fine, but when I run it on the live server I get
a lot of these warnings:

Notice: iconv() [function.iconv]: Detected an illegal character in
input string in
...1.5.2/library/Zend/Search/Lucene/Analysis/Analyzer/Common/Text.php
on line 56

My local server is Windows running PHP 5.2.4; the live server is
CentOS running PHP 5.1.6.

I'm assuming it's some sort of character set/encoding issue, but I've
got no idea what's causing it. Any ideas?
Thanks,
--
Jack
Jack Sleight
2008-06-26 12:21:31 UTC
Sorry, should have given the manual a better read, fixed it now.
http://framework.zend.com/manual/en/zend.search.lucene.best-practice.html#zend.search.lucene.best-practice.encoding
--
Jack
Matthew Weier O'Phinney
2008-06-26 12:30:56 UTC
I've noticed the same on a couple of servers I run as well. The good
news is that it doesn't appear to affect indexing. Perhaps Alex can
chime in and indicate what PHP settings may be involved?
--
Matthew Weier O'Phinney
Software Architect | matthew-C1q0ot2/***@public.gmane.org
Zend Framework | http://framework.zend.com/
Jack Sleight
2008-06-26 12:59:29 UTC
Hi Matthew,
Well, I figured out what was causing it: some of the fields contained ü
characters. The values were UTF-8 encoded, but on the live server the
Zend_Search_Lucene_Field factory methods didn't treat them as UTF-8.
I resolved it by specifying 'UTF-8' as the third argument. No idea why
my local version didn't need that; I guess it's something to do with
the locales.

If this is the same reason it's happening for you then you may wish to
double check that it's not affecting indexing, because it was for me.
Take the following field value:

aaaa bbbb cccc dddd uuüu xxxx zzzz

For me, the ü was obviously causing the error, but only the words before
the one containing ü got indexed. A search for xxxx or zzzz would return
nothing.
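
A minimal sketch of the fix described above, assuming Zend Framework 1.x
is on the include path. The index path and the field name 'body' are
made up for illustration; the only part taken from this thread is
passing 'UTF-8' as the third argument to the field factory method.

```php
<?php
// Sketch only: assumes Zend Framework 1.x is on the include path.
require_once 'Zend/Search/Lucene.php';

// Hypothetical index location, for illustration.
$index = Zend_Search_Lucene::create('/tmp/example-index');

// The sample value from this post, written as UTF-8 bytes
// (ü is 0xC3 0xBC in UTF-8).
$value = "aaaa bbbb cccc dddd uu\xC3\xBCu xxxx zzzz";

$doc = new Zend_Search_Lucene_Document();

// The third argument names the encoding of $value. As I understand the
// manual, leaving it out means the current locale decides, which is
// presumably why the two servers behaved differently.
$doc->addField(Zend_Search_Lucene_Field::Text('body', $value, 'UTF-8'));

$index->addDocument($doc);
$index->commit();

// With the encoding given, words after the ü should be searchable too.
$hits = $index->find('xxxx');
echo count($hits), "\n";
```

If the search for xxxx still comes back empty, the value may not really
be UTF-8 by the time it reaches the field, so that is worth checking
before blaming the index.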
--
Jack