quick retrieval from HUGE database


redbullmarky


Hi All

Just fishing for a few ideas on this one, to try and work out the best approach.

I have a database of over 3 million worldwide cities, complete with the city name, country code, longitude and latitude. What I'm hoping to do is offer a text field on a site that, as you type, shows a suggestion box of matching cities (I'm using AJAX for this).

The way I've thought of is to have a 'quick reference' table, searched first, that accumulates the exact matches over time: in other words, a table of the most popular entered cities. In practice this seems to be working very well and has cut the lookup time significantly. But just wondering: can anyone suggest a better/alternative way of accessing such a huge database via AJAX? Should I use a database other than MySQL? Is there a table type better suited than MyISAM for this sort of thing?
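A quick-reference table along these lines could look something like the following sketch. The table and column names (popular_cities, hits) and the sample city values are hypothetical, purely to illustrate the idea: store each confirmed selection with a hit counter, and query this small table before falling back to the full cities table.

```sql
-- Hypothetical cache of the most frequently chosen cities.
CREATE TABLE popular_cities (
    country   CHAR(2)      NOT NULL,
    city      VARCHAR(20)  NOT NULL,
    longitude DOUBLE       NOT NULL,
    latitude  DOUBLE       NOT NULL,
    hits      INT UNSIGNED NOT NULL DEFAULT 1,
    PRIMARY KEY (country, city)
);

-- Each time a user picks a city, insert it or bump its counter:
INSERT INTO popular_cities (country, city, longitude, latitude)
VALUES ('GB', 'London', -0.1276, 51.5072)
ON DUPLICATE KEY UPDATE hits = hits + 1;

-- Check the cache first, most popular matches first:
SELECT country, city, longitude, latitude
FROM popular_cities
WHERE country = 'GB' AND city LIKE 'Lon%'
ORDER BY hits DESC
LIMIT 10;
```

Only if this returns too few rows would the AJAX script need to hit the full table.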

any advice would be appreciated

Cheers
Mark

That's a very interesting question.

Ultimately, query speed comes down to the effectiveness of the index, how long that index stays in memory, and whether or not a scan of portions of the table needs to be performed. This is all very dependent on the server configuration, and there are ways to tweak it, even for just a small number of indexes. Needless to say, with a smaller table holding a few orders of magnitude fewer rows, you'll notice quite a difference (which you have observed). In addition, depending on how much you're willing to compromise on "accuracy", there are all sorts of shortcuts you can take. Google has an implementation of "search as you type" as well.

QUOTE (fenway @ Jun 10 2006, 05:40 PM):
"That's a very interesting question. Ultimately, the speed of query comes down to the effectiveness of the index, how long this index is in memory, and whether or not a scan of portions of the table need to be performed. [...]"

Sure. The only problem is, the actual city data comes in two types: lame and unusable, or huge and concise. There doesn't seem to be a middle ground, and lame + unusable is not an option, so really it has to be some form of workaround for the huge database.

There is no 'index' in the normal sense in the DB (i.e. no auto-increment/primary key); it literally has 4 fields: country (2-char code), city (VARCHAR(20)), and longitude and latitude (both DOUBLE).

Queries to the database are always based on city and country, e.g. SELECT * FROM lookup_cities WHERE city LIKE '$city%' AND country = '$country', where $city and $country are two PHP vars inserted into the query when the script is called via AJAX.

If you have any other ideas on how you personally would do it if you were stuck with 3 million records, please share :)
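For what it's worth, that query shape (equality on country plus a prefix LIKE on city) is exactly the pattern a composite index can serve, since a LIKE with no leading wildcard can use an index on the column. A sketch, assuming the lookup_cities table from the post (the index name and the 'Lon%' prefix are illustrative):

```sql
-- A composite index matching the query's access pattern:
-- equality on country first, then a prefix match on city.
ALTER TABLE lookup_cities ADD INDEX idx_country_city (country, city);

-- The autocomplete query itself; LIMIT keeps the suggestion list small.
SELECT country, city, longitude, latitude
FROM lookup_cities
WHERE country = 'GB' AND city LIKE 'Lon%'
LIMIT 10;
```

As an aside, interpolating $city and $country straight into the SQL string invites SQL injection; escaping them first (e.g. with mysql_real_escape_string() in the PHP of that era) would be safer.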

Cheers
Mark

QUOTE (redbullmarky @ Jun 10 2006, 02:57 PM):
"There is no 'index' in the normal sense in the DB [...] queries to the database are always based on city and country, eg SELECT * FROM lookup_cities WHERE city LIKE '$city%' AND country = '$country' [...]"
Well, you must be using indexes -- a multi-column index on, say, city/country, I suppose? Plus, you must be using a LIMIT clause, too....

QUOTE (fenway @ Jun 10 2006, 09:17 PM):
"Well, you must be using indexes -- a multi-column index on, say, city/country, I suppose? Plus, you must be using a LIMIT clause, too...."

Hi
Yeah, LIMIT was already in there. Now that I've extended the index across both columns and combined it with the quick-reference approach I already had, the whole thing is pretty much instantaneous!
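For anyone landing here later, EXPLAIN is a handy way to confirm the new index is actually being used (a sketch, assuming the composite index was named idx_country_city):

```sql
-- Ask MySQL how it plans to execute the autocomplete query.
-- In the output, the 'key' column should show idx_country_city rather
-- than NULL, and 'rows' should be small; a type of 'ALL' would mean
-- the query is still doing a full table scan.
EXPLAIN
SELECT city, longitude, latitude
FROM lookup_cities
WHERE country = 'GB' AND city LIKE 'Lon%'
LIMIT 10;
```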

Cheers
Mark
