Jump to content


Photo

help with huge table


  • Please log in to reply
10 replies to this topic

#1 eric1235711

eric1235711
  • Members
  • PipPipPip
  • Advanced Member
  • 107 posts
  • LocationSão Paulo - Brazil

Posted 03 October 2006 - 12:55 PM

hello,

I got a table with more than 500.000 records. It´s a list of adresses and postal-codes (I don´t know if this is the right word, here it cals CEP)

I find that this table overloaded the server, a simple query in it was pocessing a long time... I find that it hapened when a lot of people accessed it at same time...


Do you think that this table is too large?

In all the system there is an only one simple query that calls this table, looking for the address of a CEP number. This CEP field is a key.

Now the server looks like running normaly, but I´m thinking about hashing this table... in ten or a hundred tables...

What do you think?
Programming is like building your own universe...
But let me ask you something:
Your world... It´s cool? Or it sucks?

#2 obsidian

obsidian
  • Staff Alumni
  • Advanced Member
  • 3,202 posts
  • LocationSeattle, WA

Posted 03 October 2006 - 01:05 PM

500,000 records shouldn't be too much of a load for a simple query if the table and the query both are properly optimized (well, for a good server, that is). if your table can be normalized further into wider categorization, that would be good. if you need some input on that aspect, just post your table schema and we may be able to help with that, too.
You can't win, you can't lose, you can't break even... you can't even get out of the game.

<?php
while (count($life->getQuestions()) > 0)
{   $life->study(); } ?>
  LINKS: PHP: Manual MySQL: Manual PostgreSQL: Manual (X)HTML: Validate It! CSS: A List Apart | IE bug fixes | Zen Garden | Validate It! JavaScript: Reference Cards RegEx: Everything RegEx

#3 eric1235711

eric1235711
  • Members
  • PipPipPip
  • Advanced Member
  • 107 posts
  • LocationSão Paulo - Brazil

Posted 03 October 2006 - 01:18 PM

the query is very simple, eg: SELECT * FROM addresses WHERE cep = '01001000'

It always returns a single row.

a shema is this?

Field Type Null Default
ID  int(11) Yes  NULL 
TYPE varchar(20) Yes  NULL 
ADDRESS varchar(60) Yes  NULL 
NUMBSTRT varchar(8) Yes  NULL 
NUMBEND varchar(8) Yes  NULL 
SIDE varchar(1) Yes  NULL 
SUBURB varchar(50) Yes  NULL 
CEP  int(8) Yes  NULL 
CITY varchar(38) Yes  NULL 
CITYID varchar(5) Yes  NULL 
STATE varchar(2) Yes  NULL 
ESPECIAL  varchar(70) Yes  NULL 

Indexes:
Keyname Type Cardinality Field
PRIMARY  PRIMARY  559819  ID 
CEP  INDEX  559819  CEP 

Space usage:
Type Usage
Data 39,707 KB
Index 11,797 KB
Total 51,504 KB

    Row Statistics:
Statements Value
Format dynamic 
Rows 559,819 
Row length ø 72 
Row size  ø 94 Bytes 
Next Autoindex 559,820 
Creation Oct 02, 2006 at 06:30 PM 
Last update Oct 02, 2006 at 06:30 PM 
Last check Oct 02, 2006 at 06:30 PM 



Programming is like building your own universe...
But let me ask you something:
Your world... It´s cool? Or it sucks?

#4 fenway

fenway
  • Staff Alumni
  • MySQL Si-Fu / PHP Resident Alien
  • 16,199 posts
  • LocationToronto, ON

Posted 03 October 2006 - 03:02 PM

500K is not a big deal... there are a few things you can do to speed things up, though. 

First, fixed-width tables are faster to query.. consider changing your VARCHARs to CHARs. 

Second, the index is very large -- you can make it smaller by (A) having the CEP serve as the PK (especially if you ZEROFILL it, it can stay an INT) and then (B) use a short prefix.  If you can load the index into memory, it will be much faster to query.

Third, if you don't need back * always, you can make a multi-column index, and never have to read the table for the one column you need back, which will be really fast.

Of course, splitting the table horizontally may also be required, but I wouldn't suggest it just yet.
Seriously... if people don't start reading this before posting, I'm going to consider not answering at all.

#5 eric1235711

eric1235711
  • Members
  • PipPipPip
  • Advanced Member
  • 107 posts
  • LocationSão Paulo - Brazil

Posted 03 October 2006 - 05:42 PM

Fenway,

you said to (B) use a shot prefix, you mean that I shoud make a column of (eg) the two (or three) first digits of CEP?
Programming is like building your own universe...
But let me ask you something:
Your world... It´s cool? Or it sucks?

#6 fenway

fenway
  • Staff Alumni
  • MySQL Si-Fu / PHP Resident Alien
  • 16,199 posts
  • LocationToronto, ON

Posted 03 October 2006 - 08:47 PM

Well, since it's 5.4 postal codes, I'd say 5 would be a good length to choose ;-)
Seriously... if people don't start reading this before posting, I'm going to consider not answering at all.

#7 eric1235711

eric1235711
  • Members
  • PipPipPip
  • Advanced Member
  • 107 posts
  • LocationSão Paulo - Brazil

Posted 04 October 2006 - 12:30 PM

5.3 ...
Programming is like building your own universe...
But let me ask you something:
Your world... It´s cool? Or it sucks?

#8 eric1235711

eric1235711
  • Members
  • PipPipPip
  • Advanced Member
  • 107 posts
  • LocationSão Paulo - Brazil

Posted 04 October 2006 - 12:58 PM

I saw that what was overloading is that list of Sleeping processes...
Programming is like building your own universe...
But let me ask you something:
Your world... It´s cool? Or it sucks?

#9 Vikas Jayna

Vikas Jayna
  • Members
  • PipPipPip
  • Advanced Member
  • 121 posts
  • LocationNoida, Delhi, India

Posted 04 October 2006 - 03:48 PM

If its the number of sleeping processes is large then you may have to check the wait_timeout variable in mysql and reduce it if its set too large. Ideally, if there are too many waiting processes then the problem should not be with this query but establishing the mysql connection. Are you getting the 'Too many connections' error?
Vikas Jayna,
Project Manager - Jeevansathi.com
7 yrs. exp. in LAMP
Certified Mysql DBA
Zend Certified Engineer

#10 eric1235711

eric1235711
  • Members
  • PipPipPip
  • Advanced Member
  • 107 posts
  • LocationSão Paulo - Brazil

Posted 05 October 2006 - 07:11 PM

Are you getting the 'Too many connections' error?

yes, some times...
Programming is like building your own universe...
But let me ask you something:
Your world... It´s cool? Or it sucks?

#11 eric1235711

eric1235711
  • Members
  • PipPipPip
  • Advanced Member
  • 107 posts
  • LocationSão Paulo - Brazil

Posted 05 October 2006 - 07:12 PM

I made a "serial-killer" program... lol

it kills the processes that sleeps too much
Programming is like building your own universe...
But let me ask you something:
Your world... It´s cool? Or it sucks?




0 user(s) are reading this topic

0 members, 0 guests, 0 anonymous users