Full Text search with MySQL

Goodbye MySQL

I was optimistic that I could make http://www.cenite.com, a price monitoring website to use the fulltext search of mysql. Unfortuntly I have found so many drawbacks that I have to leave this idea. The main source for information for me was:

http://dev.mysql.com/doc/refman/5.0/en/fulltext-search.html

http://devzone.zend.com/node/view/id/1304#Heading14

At first the speed was wonderfull. I was searching in 300k> records apx. 350mb. But then I have to surrender. I cant configure mysql to work as I want. I know that If I spend two days to become expert on C/C++ with Unicode I will success but this is not the case. I want working solution.

The resons that make me not to use mysql for searching:

There is no way to change the default operator by default it is OR. You must parse the user query and rewrite it.

I want automaticaly truncation on all my terms.

There is no way to tell MySQL what are characters, and what are not…..sorry, there are two ways:

1. Touching the sources,

2. Configuring in xmls

No documentation on both. Maybe there is …somewhere.

If you use the default configuration then you will wonder how you get or not get the required results.

Here is a summary of the comands that I use to tweak my mysql server:

SHOW VARIABLES LIKE ‘ft%’
SET @global.ft_min_word_len=2;
SET @local.ft_min_word_len=2;

But it is better to have those options in my.conf
[mysqld]
ft_min_word_len=3
ft_stopword_file=”C:\MySQL\stop.txt”

[myisamchk]
ft_min_word_len=3
ft_stopword_file=”C:\MySQL\stop.txt”

To check what characterset is using your mysql:

SHOW VARIABLES LIKE ‘character_sets_dir’;

/usr/share/mysql/charsets | E:MySQL Server 5.1sharecharsets

SHOW VARIABLES LIKE ‘characte%’
After that you need to rebuild your index with one of those commands
slow: REPAIR TABLE products QUICK;
slow: myisamchk –recover –ft_min_word_len=3 tbl_name.MYI
fastest: DROP INDEX …; CREATE INDEX….;

Alternatives

http://www.sphinxsearch.com/

http://endeca.com/

http://lucene.apache.org/solr/