Jump to content

Trie Data Structure for Autocomplete?


asmint3

Recommended Posts

Hi,

 

I'm implementing an AJAX autocomplete against a fairly hefty database. I started out using "select * from table where name like '%bob%'" but that's running like a dog due to the table size. Apparently the trie data structure is a good way to go to optimise my autocomplete lookup but I haven't found any kind of integration of extension for PHP. If you've no idea what I'm talking about checkout http://en.wikipedia.org/wiki/Trie and http://www.chokkan.org/software/dastrie/ for an implementation.

 

Can anyone advise whether they've seen any use of trie from a php application (Drupal in my case) or any other alternative?

 

Thanks,

Link to comment
Share on other sites

I don't know if it's possible to implement a Trie tree with with MySQL (never tried to use anything besides the typical indexes that MySQL uses).

 

Unfortunately, when you search with something like LIKE '%blah%', indexes become entirely useless, and that is probably where your delay lies.

 

 

You should probably limit search results by the way (or did you just post a snippet?).  What if the query returns 1000 rows lol?

 

By the way, I don't think Trie trees would be able to efficiently search how you're wanting to search anyway, assuming I glanced through the wiki and read stuff right ;p.

 

 

 

Have you considered using bob% instead of %bob%?  That would speed things up considerably if you have indexes on the name field.  I guess you want it where name contains instead of name starts with.

 

Is that so last names work?  Are you doing that so someone can put in John or Smith and John Smith will show up?  If so, you could store name parts in different columns.

 

WHERE first_name LIKE '{$name}%' OR last_name LIKE '{$name}%' OR last_name LIKE '{$name}%'

 

 

 

 

By the way, how big is the table?

Link to comment
Share on other sites

@corbin - thanks. That was just an example, I actually do "where name like 'bob%'" in the real world which is why a trie solution would be much more efficient. Sorry for posting the wrong example! The table is currently ~100k rows and is growing fairly rapidly. The query is slowing as the table grows, it's clearly not a long term solution since it's not going to scale (yes, I do limit the query size as well).

 

@Mark Baker - I think it depends on the site and it's usage. This is a large site and one of the key modes of operation is autocomplete search. Even with a large DB, once a few hundred people start to query it at the same time it doesn't take long for the performance to drop off. From the reading I've done the large sites all seem to have in memory data structures (like a trie) to aid autocomplete.

 

 

 

 

Link to comment
Share on other sites

This thread is more than a year old. Please don't revive it unless you have something important to add.

Join the conversation

You can post now and register later. If you have an account, sign in now to post with your account.

Guest
Reply to this topic...

×   Pasted as rich text.   Restore formatting

  Only 75 emoji are allowed.

×   Your link has been automatically embedded.   Display as a link instead

×   Your previous content has been restored.   Clear editor

×   You cannot paste images directly. Upload or insert images from URL.

×
×
  • Create New...

Important Information

We have placed cookies on your device to help make this website better. You can adjust your cookie settings, otherwise we'll assume you're okay to continue.