Jump to content

Archived

This topic is now archived and is closed to further replies.

asmint3

Trie Data Structure for Autocomplete?

Recommended Posts

Hi,

 

I'm implementing an AJAX autocomplete against a fairly hefty database. I started out using "select * from table where name like '%bob%'" but that's running like a dog due to the table size. Apparently the trie data structure is a good way to go to optimise my autocomplete lookup but I haven't found any kind of integration of extension for PHP. If you've no idea what I'm talking about checkout http://en.wikipedia.org/wiki/Trie and http://www.chokkan.org/software/dastrie/ for an implementation.

 

Can anyone advise whether they've seen any use of trie from a php application (Drupal in my case) or any other alternative?

 

Thanks,

Share this post


Link to post
Share on other sites

I don't know if it's possible to implement a Trie tree with with MySQL (never tried to use anything besides the typical indexes that MySQL uses).

 

Unfortunately, when you search with something like LIKE '%blah%', indexes become entirely useless, and that is probably where your delay lies.

 

 

You should probably limit search results by the way (or did you just post a snippet?).  What if the query returns 1000 rows lol?

 

By the way, I don't think Trie trees would be able to efficiently search how you're wanting to search anyway, assuming I glanced through the wiki and read stuff right ;p.

 

 

 

Have you considered using bob% instead of %bob%?  That would speed things up considerably if you have indexes on the name field.  I guess you want it where name contains instead of name starts with.

 

Is that so last names work?  Are you doing that so someone can put in John or Smith and John Smith will show up?  If so, you could store name parts in different columns.

 

WHERE first_name LIKE '{$name}%' OR last_name LIKE '{$name}%' OR last_name LIKE '{$name}%'

 

 

 

 

By the way, how big is the table?

Share this post


Link to post
Share on other sites

surely for autocomplete you should be using "select * from table where name like 'bob%'"

Share this post


Link to post
Share on other sites

@corbin - thanks. That was just an example, I actually do "where name like 'bob%'" in the real world which is why a trie solution would be much more efficient. Sorry for posting the wrong example! The table is currently ~100k rows and is growing fairly rapidly. The query is slowing as the table grows, it's clearly not a long term solution since it's not going to scale (yes, I do limit the query size as well).

 

@Mark Baker - I think it depends on the site and it's usage. This is a large site and one of the key modes of operation is autocomplete search. Even with a large DB, once a few hundred people start to query it at the same time it doesn't take long for the performance to drop off. From the reading I've done the large sites all seem to have in memory data structures (like a trie) to aid autocomplete.

 

 

 

 

Share this post


Link to post
Share on other sites

×

Important Information

We have placed cookies on your device to help make this website better. You can adjust your cookie settings, otherwise we'll assume you're okay to continue.