Jump to content

Recommended Posts

It's not nearly that simple.

 

I think you could learn more from google, but here's a short explanation of the pros and cons of the two most widely used MySQL engines, InnoDB and MyISAM.

 

 

----InnoDB---

--Pros--

-Row level locking (can have concurrent INSERTs/UPDATEs as long as it's not the same row)

-Better data recoverability

-Handles unexpected shutdowns better

--Cons--

-Slower extraction

-No fulltext indexes

 

---MyISAM---

--Pros--

-Faster data extraction

-fulltext indexes

 

--Cons--

-Table level locking  (Reads can be done concurrently, but altercations are queued linearly)

-Data is more prone to loss (although usually this is not a significant risk)

 

 

Basically for data without a fulltext search need, if concurrency of insert/updates is needed, InnoDB is your best option.

 

If there will be few writes and a lot of reads, MyISAM is your best bet.

 

If fulltext searching is needed, MyISAM is your only option in MySQL (I hear that the Falcon engine in MySQL 6 handles fulltext well, but no idea how accurate that is or not).

If I were to use InnoDB, what would a related query to this be since InnoDB doesn't support full-text searches?

 

SELECT phrase,count,
MATCH(phrase) AGAINST ('$q' IN BOOLEAN MODE) AS score
FROM keyWords WHERE MATCH(phrase) AGAINST ('$q' IN BOOLEAN MODE) AND
phrase != '$q'
ORDER BY score DESC, count DESC LIMIT 15

If I were to use InnoDB, what would a related query to this be since InnoDB doesn't support full-text searches?

 

SELECT phrase,count,
MATCH(phrase) AGAINST ('$q' IN BOOLEAN MODE) AS score
FROM keyWords WHERE MATCH(phrase) AGAINST ('$q' IN BOOLEAN MODE) AND
phrase != '$q'
ORDER BY score DESC, count DESC LIMIT 15

 

That would be a reason why you would use MyISAM. Corbin, as far as MyISAM having faster data extraction, that all depends on your schema and your use of indexes. I get much better performance from InnoDB than MyISAM in a table with over 10 million records and growing by the day. MyISAM is also very prone to corruption and is just such a hassle to deal with. Also, if you want to have replication, you'll need InnoDB. Like I said unless you have a reason for MyISAM(like fulltext searches) then use InnoDB.

Also, if you want to have replication, you'll need InnoDB.

 

You're sure about that? Or do you mean transactions?

AFAIK replication is available for MyISAM, although with table level locking it might be performing poorly

I mean you could use MyISAM, but you also could jump off a bridge. Just because it's possible, doesn't make it a valid alternative. So I would not recommend it. For those who don't know much about replication, a simple answer of me saying , "yes its possible with both myisam and innodb" could lead someone to screw up their entire project. That's why I suggest InnoDB for replication.

Yeah, I must agree with the replication point.  I should have thought about that actually.  Semi-side note:  Does MyISAM have a binary log like InnoDB?  Guessing no.

If I were to use InnoDB, what would a related query to this be since InnoDB doesn't support full-text searches?

 

 

SELECT phrase,count,
MATCH(phrase) AGAINST ('$q' IN BOOLEAN MODE) AS score
FROM keyWords WHERE MATCH(phrase) AGAINST ('$q' IN BOOLEAN MODE) AND
phrase != '$q'
ORDER BY score DESC, count DESC LIMIT 15

 

You would have to use LIKE clauses that could not use indexes, which would cause entire table scans.  Even worse, if the table were large, there would probably be a huge amount of disk I/O.

That would be a reason why you would use MyISAM. Corbin, as far as MyISAM having faster data extraction, that all depends on your schema and your use of indexes. I get much better performance from InnoDB than MyISAM in a table with over 10 million records and growing by the day. MyISAM is also very prone to corruption and is just such a hassle to deal with. Also, if you want to have replication, you'll need InnoDB. Like I said unless you have a reason for MyISAM(like fulltext searches) then use InnoDB.

 

Just of curiosity, why do you think InnoDB is faster?  I've never heard of a query that properly uses indexes in both MyISAM and InnoDB being faster in InnoDB.  Perhaps some memory settings would need to be tweaked.  For example, MyISAM index searches are a lot faster if the entire index is in memory for the table.  Or maybe InnoDB just handles some types of indexes better or something.

 

As far as data loss, I think that problem is over exaggerated by most.  Yeah, MyISAM is quite sensitive.  If you shut MySQL down improperly, you're supposed to repair tables, backups are a bit more difficult, so on...  But I still don't think that data loss because of MyISAM is that common.  In my experience, MySQL is pretty good about doing the maintenance on MyISAM that people don't realize that they need to do.  Note that I'm not saying that data loss isn't a lot more common, I just feel that "very prone" is a bit overstating it.

 

Oh also with your InnoDB table, on MyISAM tables, locking can happen at all kinds of weird times (for example, a long select can actually hold up other selects if a write is pending in between the reads).  So perhaps InnoDB is faster due to locking stuff.

To me full-text search just seem super slow.

 

I have a table with 16k rows, the table size is about 100MB, and for me it just seems very sluggish...

 

After a test, I found my reason for the sluggishness, it was because I am doing the query twice, once with the limit and once without the limit using SQL_CALC_FOUND_ROWS:

 

$query_count = sprintf("SELECT SQL_CACHE SQL_CALC_FOUND_ROWS
title,
content,
URL,
id
FROM %s
WHERE
	MATCH(
		`URL`,
		`title`,
		`content`
	) AGAINST ('%s' IN BOOLEAN MODE) AND
	content IS NOT NULL
LIMIT $limitvalue, $limit",
mysql_real_escape_string($searchType),
mysql_real_escape_string($searchQuery));

$sql = mysql_query($query_count)or die(mysql_error()); 
$result_count = mysql_query("SELECT FOUND_ROWS()")or die(mysql_error());
$total = mysql_fetch_array($result_count);
$totalrows = $total[0];

 

Is there a workaround to this? This really slows down my query on 16k rows, and soon a row increase.

Binary log is handled by MySQL not by storage engines.

As far as efficency goes, besides clearly outdated sources saying InnoDB is so slow it's useless, I could only find benchmarks where both engines score pretty the same as long as number of concurrent threads is not above number of availabe CPU cores. Above that MyISAM seems to scale a bit better. Guys at Percona have done a few such comparisons (and have also thrown their own PBXT storage engine into mix). Good reading.

 

[added]

 

Oh yeah. And everytime I see my site down because one of my tables got corrupted, I want to kick my webhost's ass for disabling InnoDB. Hopefully I'll be moving away from them really soon.

Your webhost disabled InnoDB?  Lame.

 

 

Perhaps the reason that I've never had many issues with MyISAM corruption is that I rarely ever use MyISAM unless I need fulltext indexes.  As for the speed thing, I'm probably also thinking of old benchmarks.

 

 

So in that case, ignore everything I said about MyISAM pretty much ;p.

 

 

Hrmmm, The Little Guy, this is a random hunch, but try removing content IS NOT NULL and see if the speed gets any better.

 

 

Edit:  Nevermind.  Based on some ghetto benchmarking, it's definitely the SQL_CALC... I thought maybe there wasn't an index on content that could be used to check null-ness and that causing it.

 

 

 

Perhaps the reason that I've never had many issues with MyISAM corruption is that I rarely ever use MyISAM unless I need fulltext indexes. 

MyISAM has a bad habit of becoming corrupt from overuse. If you delete too many rows the table becomes unusable. A database table should never become corrupt from be used "too much". Also, I had a nasty index corruption involving dates which left all of my queries involving dates to return zero rows. MyISAM needs work, InnoDB is an industry standard.

SQL_CALC_ROWS does not run your query twice, It does however run it without LIMIT and then cuts the resultset down.

16k rows is not that much, try running EXPLAIN on this query and see if indexes are used wher possible.

 

AS to your second question, I've no idea... where do you see it exactly?

 

 

I think I did explain correctly...

 

EXPLAIN webSearch

 

Result:

Field		Type		Null	Key	Default	Extra
id		bigint(20)	NO	PRI	NULL	auto_increment
URL		varchar(255)	NO	MUL	NULL	 
title		varchar(255)	NO	 	NULL	 
content		text		YES	 	NULL	 
dateAdd		timestamp	NO	 	CURRENT_TIMESTAMP	 
lastUpdate	timestamp	NO	 	0000-00-00 00:00:00	 
clickCount	bigint(20)	NO	 	0	 
lang		varchar(5)	NO	 	X	 

 

otherwise if I do this:

 

EXPLAIN SELECT SQL_CACHE SQL_CALC_FOUND_ROWS title, content, URL, id
FROM webSearch
WHERE 
MATCH (
`URL` ,  `title` ,  `content`
)
AGAINST (
'free'
)
AND content IS NOT NULL 
LIMIT 0 , 10

 

Result:

id	select_type	table		type		possible_keys	key	key_len		ref	rows	Extra
1	SIMPLE		webSearch	fulltext	URL		URL	0	 		1	Using where

SQL_CALC_ROWS does not run your query twice, It does however run it without LIMIT and then cuts the resultset down.

16k rows is not that much, try running EXPLAIN on this query and see if indexes are used wher possible.

 

How do I know if indexes are correct?

http://dev.mysql.com/doc/refman/5.0/en/fulltext-search.html

 

Did you create FULLTEXT indexes on all columns you do fulltext search against?

 

yeah I did...

 

I have all three searched columns indexed in one index, and I have them all indexed separately as well. I also changed my query too look like this:

 

SELECT SQL_CACHE
id,
URL,
title,
content,
MATCH(`title`) AGAINST ('%s') as score1,
MATCH(`content`) AGAINST ('%s') as score2,
MATCH(`URL`) AGAINST ('%s') as score3
FROM %s
WHERE
	MATCH(
		`URL`,
		`title`,
		`content`
	) AGAINST ('%s') AND
	content IS NOT NULL
	ORDER BY score1 DESC, score2 DESC, score3 DESC
	LIMIT $limitvalue, $limit

 

those MATCH ... AGAINST statements in the select, is that bad to have that many, including the one in the WHERE clause?

http://dev.mysql.com/doc/refman/5.0/en/fulltext-search.html

 

Did you create FULLTEXT indexes on all columns you do fulltext search against?

 

yeah I did...

 

I have all three searched columns indexed in one index, and I have them all indexed separately as well. I also changed my query too look like this:

 

SELECT SQL_CACHE
id,
URL,
title,
content,
MATCH(`title`) AGAINST ('%s') as score1,
MATCH(`content`) AGAINST ('%s') as score2,
MATCH(`URL`) AGAINST ('%s') as score3
FROM %s
WHERE
	MATCH(
		`URL`,
		`title`,
		`content`
	) AGAINST ('%s') AND
	content IS NOT NULL
	ORDER BY score1 DESC, score2 DESC, score3 DESC
	LIMIT $limitvalue, $limit

 

those MATCH ... AGAINST statements in the select, is that bad to have that many, including the one in the WHERE clause?

 

It will be slower. I dont have much experience with text searches, but what I do know is that a multiple column index is only useful if you are A. using WHERE with the first column in the index, B. using WHERE with the first column and the second column in that specific order and so on with more columns C. mysql only uses the best index, or a join of indexes. having an index on all three columns and having a separate index on one of the columns does not help you in your situation. your best best would be a multiple column index in the order of (url,title,content) and something similar for your order by clause.

This thread is more than a year old. Please don't revive it unless you have something important to add.

Join the conversation

You can post now and register later. If you have an account, sign in now to post with your account.

Guest
Reply to this topic...

×   Pasted as rich text.   Restore formatting

  Only 75 emoji are allowed.

×   Your link has been automatically embedded.   Display as a link instead

×   Your previous content has been restored.   Clear editor

×   You cannot paste images directly. Upload or insert images from URL.

×
×
  • Create New...

Important Information

We have placed cookies on your device to help make this website better. You can adjust your cookie settings, otherwise we'll assume you're okay to continue.