Jump to content


Photo

quick retrieval from HUGE database


  • Please log in to reply
4 replies to this topic

#1 redbullmarky

redbullmarky
  • Staff Alumni
  • Advanced Member
  • 2,863 posts
  • LocationBedfordshire, England

Posted 10 June 2006 - 12:18 PM

Hi All

Just fishing for a few ideas on this one, to try and work out the best way.

I have a database that contains over 3 million worldwide cities, complete with the city name, country code, longitude and latitude. What I'm hoping to do is to offer a text field on a site that, as you type, it will offer a suggestion box of matching cities (I'm using AJAX for this).

The way I've thought is to have a 'quick reference' table where the exact matches over a period of time will be added to it, which will be searched first - ie, a table of the most popular entered cities. In practice, this seems to be working very well and has cut the time it takes down significantly. But just wondering - can anyone suggest a better/alternative way of accessing such a huge database via AJAX? Should I use a different database other than mysql? is their a table type better than MyISAM for this sort of thing?

any advice would be appreciated

Cheers
Mark
"you have to keep pissing in the wind to learn how to keep your shoes dry..."

I say old chap, that is rather amusing!

#2 fenway

fenway
  • Staff Alumni
  • MySQL Si-Fu / PHP Resident Alien
  • 16,199 posts
  • LocationToronto, ON

Posted 10 June 2006 - 04:40 PM

That's a very interesting question.

Ultimately, the speed of query comes down to the effectiveness of the index, how long this index is in memory, and whether or not a scan of portions of the table need to be performed. This is all very dependent on the server configuration, and there are ways to tweak this, ever for just a small number of indexes. Needless to say, if you have a smaller table with a few orders of magnitude fewer data, you'll notice quite a difference (which you have observed). In addition, depending on how much you want to compromise the "accuracy", there are all sorts of shortcuts you can take. Google had an implementation of a "search as you go" as well.
Seriously... if people don't start reading this before posting, I'm going to consider not answering at all.

#3 redbullmarky

redbullmarky
  • Staff Alumni
  • Advanced Member
  • 2,863 posts
  • LocationBedfordshire, England

Posted 10 June 2006 - 06:57 PM

[!--quoteo(post=382260:date=Jun 10 2006, 05:40 PM:name=fenway)--][div class=\'quotetop\']QUOTE(fenway @ Jun 10 2006, 05:40 PM) View Post[/div][div class=\'quotemain\'][!--quotec--]
That's a very interesting question.

Ultimately, the speed of query comes down to the effectiveness of the index, how long this index is in memory, and whether or not a scan of portions of the table need to be performed. This is all very dependent on the server configuration, and there are ways to tweak this, ever for just a small number of indexes. Needless to say, if you have a smaller table with a few orders of magnitude fewer data, you'll notice quite a difference (which you have observed). In addition, depending on how much you want to compromise the "accuracy", there are all sorts of shortcuts you can take. Google had an implementation of a "search as you go" as well.
[/quote]

Sure. Only problem is, the actual city data comes in two types - lame and unusable or huge and concise. There doesnt seem to be a middle ground, and the lame+unusable is not an option, so really it has to be some form of workaround for the huge database.
There is no 'index' in the normal sense in the DB (ie, auto inc/primary key) - it literally has 4 fields - country (2 char code), city (varchar 20), longitude and latitude (both DOUBLE).
queries to the database are always based on city and country, eg SELECT * FROM lookup_cities WHERE city LIKE '$city%' AND country = '$country' where city and country are two php vars inserted into the query when the script is called by ajax.

If you have any other ideas on how you personally would do it if you were stuck with 3million records, please share :)

Cheers
Mark
"you have to keep pissing in the wind to learn how to keep your shoes dry..."

I say old chap, that is rather amusing!

#4 fenway

fenway
  • Staff Alumni
  • MySQL Si-Fu / PHP Resident Alien
  • 16,199 posts
  • LocationToronto, ON

Posted 10 June 2006 - 08:17 PM

[!--quoteo(post=382290:date=Jun 10 2006, 02:57 PM:name=redbullmarky)--][div class=\'quotetop\']QUOTE(redbullmarky @ Jun 10 2006, 02:57 PM) View Post[/div][div class=\'quotemain\'][!--quotec--]
Sure. Only problem is, the actual city data comes in two types - lame and unusable or huge and concise. There doesnt seem to be a middle ground, and the lame+unusable is not an option, so really it has to be some form of workaround for the huge database.
There is no 'index' in the normal sense in the DB (ie, auto inc/primary key) - it literally has 4 fields - country (2 char code), city (varchar 20), longitude and latitude (both DOUBLE).
queries to the database are always based on city and country, eg SELECT * FROM lookup_cities WHERE city LIKE '$city%' AND country = '$country' where city and country are two php vars inserted into the query when the script is called by ajax.

If you have any other ideas on how you personally would do it if you were stuck with 3million records, please share :)

Cheers
Mark
[/quote]
Well, you must be using indexes -- a multi-column index on, say, city/country, I suppose? Plus, you must be using a LIMIT clause, too....
Seriously... if people don't start reading this before posting, I'm going to consider not answering at all.

#5 redbullmarky

redbullmarky
  • Staff Alumni
  • Advanced Member
  • 2,863 posts
  • LocationBedfordshire, England

Posted 11 June 2006 - 09:44 AM

[!--quoteo(post=382319:date=Jun 10 2006, 09:17 PM:name=fenway)--][div class=\'quotetop\']QUOTE(fenway @ Jun 10 2006, 09:17 PM) View Post[/div][div class=\'quotemain\'][!--quotec--]
Well, you must be using indexes -- a multi-column index on, say, city/country, I suppose? Plus, you must be using a LIMIT clause, too....
[/quote]

Hi
yeah LIMIT was already in there and now i've extended the index across columns and combined it with the other way i had, the whole thing is pretty instantaneous now!

Cheers
Mark
"you have to keep pissing in the wind to learn how to keep your shoes dry..."

I say old chap, that is rather amusing!




0 user(s) are reading this topic

0 members, 0 guests, 0 anonymous users