Full Text search with MySQL

Goodbye MySQL

I was optimistic that I could make http://www.cenite.com, a price-monitoring website, use MySQL's full-text search. Unfortunately, I found so many drawbacks that I had to abandon the idea. My main sources of information were:

http://dev.mysql.com/doc/refman/5.0/en/fulltext-search.html

http://devzone.zend.com/node/view/id/1304#Heading14

At first the speed was wonderful. I was searching more than 300k records, approx. 350 MB. But then I had to give up: I could not configure MySQL to work the way I wanted. I know that if I spent two days becoming an expert in C/C++ and Unicode I would succeed, but that is not the point. I want a working solution.

The reasons that made me decide not to use MySQL for searching:

There is no way to change the default operator; by default it is OR. You must parse the user query and rewrite it.

I want automatic truncation on all my terms.

There is no way to tell MySQL which characters are word characters and which are not... sorry, there are two ways:

1. Touching the sources,

2. Configuring it in XML files.

I found no documentation for either. Maybe there is some... somewhere.

If you use the default configuration, you will keep wondering why you do or do not get the results you expect.
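The query rewriting from the first two points can be sketched in Python. This is only a minimal sketch, not what cenite.com runs: the function name to_boolean_query is made up, and treating everything matched by \w as a word character is my assumption, which may differ from what MySQL's parser actually indexes. It relies on the boolean-mode operators + (term is required) and * (truncation).

```python
import re

# Assumption of this sketch: anything matched by \w counts as a word character.
# MySQL's own full-text parser may split words differently.
_TERM = re.compile(r"\w+")


def to_boolean_query(user_query: str) -> str:
    """Rewrite free text so that every term is required (+) and truncated (*)."""
    # Dropping everything that is not a word character also removes any
    # boolean operators the user typed, so they cannot change the query.
    terms = _TERM.findall(user_query)
    return " ".join(f"+{term}*" for term in terms)


if __name__ == "__main__":
    print(to_boolean_query("nokia 6300 black"))  # prints: +nokia* +6300* +black*
```

The rewritten string is then passed as a bound parameter to something like MATCH (name) AGAINST (%s IN BOOLEAN MODE), where the column name is hypothetical.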

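Here is a summary of the commands that I used to tweak my MySQL server:

SHOW VARIABLES LIKE 'ft%';
-- System variables need the @@ prefix. Note that ft_min_word_len is read-only
-- at runtime, so expect the server to reject these two statements:
SET @@global.ft_min_word_len=2;
SET @@session.ft_min_word_len=2;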

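But it is better to put those options in my.cnf (use forward slashes in Windows paths, because a backslash starts an escape sequence in option files):
[mysqld]
ft_min_word_len=3
ft_stopword_file="C:/MySQL/stop.txt"

[myisamchk]
ft_min_word_len=3
ft_stopword_file="C:/MySQL/stop.txt"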
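To see why some searches return nothing with these settings, here is a rough Python approximation of the two filters configured above. It is a sketch under assumptions: the function name indexable_terms is made up, the \w+ tokenization is only a guess at how MySQL splits words, and other rules (maximum word length, the 50% threshold in natural language mode) are ignored.

```python
import re


def indexable_terms(text, min_word_len=3, stopwords=frozenset()):
    """Return the words that pass the minimum-length and stopword filters."""
    # Lowercase first so the stopword comparison is case-insensitive.
    words = re.findall(r"\w+", text.lower())
    return [w for w in words if len(w) >= min_word_len and w not in stopwords]


if __name__ == "__main__":
    # With ft_min_word_len=3, "tv", "lg" and "42" are too short to be indexed.
    print(indexable_terms("TV LG 42 inch the", 3, {"the"}))  # prints: ['inch']
```

For a price-monitoring site this matters, because short brand names and model numbers are exactly what people search for.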

To check which character sets your MySQL is using:

SHOW VARIABLES LIKE 'character_sets_dir';

/usr/share/mysql/charsets | E:\MySQL Server 5.1\share\charsets

SHOW VARIABLES LIKE 'characte%';

After changing the full-text settings you need to rebuild the index with one of these commands:
slow: REPAIR TABLE products QUICK;
slow: myisamchk --recover --ft_min_word_len=3 tbl_name.MYI
fastest: DROP INDEX …; CREATE INDEX….;

Alternatives

http://www.sphinxsearch.com/

http://endeca.com/

http://lucene.apache.org/solr/

Everyday SQL statements

Tools

Status

SHOW STATUS WHERE Variable_name LIKE 'Th%' OR Variable_name LIKE '%Connec%';
SHOW [GLOBAL | SESSION] STATUS [LIKE 'pattern' | WHERE expr]

Check/Repair tables

mysqlcheck -u root -p***** --auto-repair --check --optimize --all-databases

Profiling

watch -n 0.5 'mysql -u root -ppass -e "SHOW FULL PROCESSLIST" | grep Query'

http://opendba.blogspot.com/2008/03/mysql-finally-ability-to-traceprofile.html

Dump

pg_dump -U test arachnid_archiv_test --inserts -h chaos.spider.bg --encoding=utf8 -f pgsql.sql

Dump for full backup with flushing of the log files

mysqldump -h "$MYSQL_HOST" -u "$MYSQL_USER" -p"$MYSQL_PASS" \
  --single-transaction --all-databases --delete-master-logs --flush-logs --master-data=2 \
  > backup_sunday_1_PM.sql

Encoding problems

http://www.hostbulgaria.com/tutorials/mysql-charset-encoding.aspx

SHOW VARIABLES LIKE 'character_set_%';
curl -i http://system3.spider.bg

Creating a database

create database re_production DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;

Creating a user

GRANT ALL PRIVILEGES ON arachnid_production.* TO 'payak'@'%' IDENTIFIED BY 'payakpassword' WITH GRANT OPTION;
GRANT ALL PRIVILEGES ON system3_production.* TO 'payak'@'%' IDENTIFIED BY 'payakpassword' WITH GRANT OPTION;

mysqladmin -u [user] -h localhost -p password '[new_password]'

SQL for a table

SHOW CREATE TABLE tblname;

MySQL tunnel to another machine

ssh -N -f -l root -L 0.0.0.0:3307:91.196.240.132:3306 s1
This logs in to s1 as root, opens port 3307 on the local machine and forwards it to 91.196.240.132:3306.
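A quick way to check that the tunnel is up before pointing an application at it is a plain TCP connect, sketched here in Python. The helper name port_is_open is made up. A successful connect only proves that ssh is listening on the local port, not that MySQL answers on the far end.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, timed out, unreachable: all count as "not open".
        return False


if __name__ == "__main__":
    # 3307 is the local end of the tunnel from the ssh command above.
    print(port_is_open("127.0.0.1", 3307))
```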

Replace text

UPDATE script_histories SET cod_script = REPLACE(cod_script, 'observer.ArchiveObserver(siteId)', 'observer.ArchiveObserver(siteId, script_id, owned_source_id)');

Copy from one table to another

DELETE FROM system3_production.articles;
INSERT INTO system3_production.articles SELECT * FROM arachnid_from_screen.articles;

Sessions for Rails

select count(*) from sessions where updated_at < DATE_SUB(now(), INTERVAL 3 DAY);

Binary logging

http://dev.mysql.com/doc/refman/5.0/en/recovery-from-backups.html

Check this attachment here: mysql-presentation on replication etc.

  • See the status of the log files
    SHOW BINARY LOGS;
    SHOW MASTER STATUS;
  • Clean the binary logs instantly
    RESET MASTER;
  • Clean binary logs to date/name
    PURGE BINARY LOGS TO 'mysqld-bin.00XXXX';
  • Configurations in my.cnf
    log-bin
    server-id = 1
    expire_logs_days = 1
    max_binlog_size = 100M

Configuration

max_allowed_packet = 50M
wait_timeout=720
max_connections=1000
connect_timeout=20

query_cache_limit=8M #~~~ removed, default 1M; maximum allowed size of a single cached query result
query_cache_size=128M #~~~ 32M, default 0
query_cache_type=1

Restoring the Debian maintenance user