Jump to content

Fastest COUNT plus data method?


joeygallo

Recommended Posts

I need to COUNT then get data if rows exist.  Something like this:

$something = mysql_query("select COUNT(*) from table");
if (mysql_result($something,0) > 0) {
$qry = mysql_query("select field1, field2, field3 from table");
$result = mysql_fetch_array($qry);
echo $qry['field1'];
// etc. 
} else {
echo "no data found";
}

 Is the above the best way to do this or can the two queries be combined?  I'm looking for the fastest possible method as we'll be dealing with a table that will eventually have 50 million+ rows.

 

Thank you.

Edited by joeygallo
Link to comment
Share on other sites

If it'll have that much data then you don't need to bother with a count, right?

 

Not exactly.  We still need to display the count number.  We found X number of results.. Using mysql_num_rows is not an option with that many records.

Link to comment
Share on other sites

Not exactly.  We still need to display the count number.  We found X number of results.. Using mysql_num_rows is not an option with that many records.

 

Also, the better answer would be select COUNT(*) from table where field1 = 'xyz'.  There we have 50 million rows in the table but maybe not for the specific search.  This is the primary reason COUNT is still needed.

Link to comment
Share on other sites

With that WHERE the question makes more sense.

 

If you need accuracy then a SELECT COUNT(*) will do it, but make sure you have indexes on the fields you're using. If you don't need an exact figure an EXPLAIN SELECT may give you numbers close enough.

Link to comment
Share on other sites

You could always use the "num_rows" property/function of the SQL library you're using.

For MySQLI it's mysqli_result::$num_rows and for PDO it's PDOStatement::rowCount. Unfortuantely PDO doesn't guarantee it's available for all RDBM systems, but it should be available for MySQL.

...Which would require querying for and downloading all howevermanymillion rows. No thanks.
Link to comment
Share on other sites

From the opening post it sounds like he's going to get them anyway once he's counted how many there are

He said the table will have millions of rows. Unless there's going to be a very specific WHERE, there'll be some amount of pagination.
Link to comment
Share on other sites

Sorry for the long reply. There is a "WHERE".

 

select COUNT (*) from ip_logs where ip = '123.123.123.123'

 

IF FOUND:

 

select field1, field2, field3 from ip_logs where ip = '123.123.123.123'

 

In this query, I just need to know if it exists, not an actual count.  If so, move on to the next query to get data from that row.  So my question is.. what is the fastest possible method?  I assume the two can be combined but my concern is speed.

Link to comment
Share on other sites

I did my own benchmarking on a 200K record table

$t1 = microtime(1);

$sql = "SELECT COUNT(*) FROM votes WHERE type=2";
$res = $db->query($sql);
$row = $res->fetch_row();
$count = $row[0];
if ($count > 0) {
    $res = $db->query("SELECT * FROM votes WHERE type=2");
}

$t2 = microtime(1);

$sql = "SELECT * FROM votes WHERE type=2";
$res = $db->query($sql);
$count = $res->num_rows;

$t3 = microtime(1);

printf("Rows found : %d<br>Method A : %0.4f<br>Method B : %0.4f", $count, $t2-$t1, $t3-$t2);

/******* RESULTS *****************************
Rows found : 0
Method A : 0.0009
Method B : 0.0008

Rows found : 9278
Method A : 0.0582
Method B : 0.0246 
**********************************************/
Link to comment
Share on other sites

select COUNT (*) from ip_logs where ip = '123.123.123.123'

If you are actually going to be storing millions of IP Addresses, you may want to look at mySql's INET_ATON() and INET_NTOA() functions. These convert the dotted-decimal IPv4 address to a single 4-byte unsigned integer. Instead of storing 7 to 17 characters for each address, you store 4 bytes each. Your indexes will also be smaller.

Link to comment
Share on other sites

 

I did my own benchmarking on a 200K record table

$t1 = microtime(1);

$sql = "SELECT COUNT(*) FROM votes WHERE type=2";
$res = $db->query($sql);
$row = $res->fetch_row();
$count = $row[0];
if ($count > 0) {
    $res = $db->query("SELECT * FROM votes WHERE type=2");
}

$t2 = microtime(1);

$sql = "SELECT * FROM votes WHERE type=2";
$res = $db->query($sql);
$count = $res->num_rows;

$t3 = microtime(1);

printf("Rows found : %d<br>Method A : %0.4f<br>Method B : %0.4f", $count, $t2-$t1, $t3-$t2);

/******* RESULTS *****************************
Rows found : 0
Method A : 0.0009
Method B : 0.0008

Rows found : 9278
Method A : 0.0582
Method B : 0.0246 
**********************************************/

Thanks.  Method A is what we have now.  I think it sounds like we just need some indexing done to the database.  50 million rows is basically a timeout.  Indexing will likely fix it.

Link to comment
Share on other sites

The results were pretty much what I expected. (Times above in seconds)

 

Not much in it if no records are found but, once you find records, method B is 2 times faster than A.

 

Method A reads the records twice (count them, fetch them) whereas B counts them while it is fetching them.

Edited by Barand
Link to comment
Share on other sites

This thread is more than a year old. Please don't revive it unless you have something important to add.

Join the conversation

You can post now and register later. If you have an account, sign in now to post with your account.

Guest
Reply to this topic...

×   Pasted as rich text.   Restore formatting

  Only 75 emoji are allowed.

×   Your link has been automatically embedded.   Display as a link instead

×   Your previous content has been restored.   Clear editor

×   You cannot paste images directly. Upload or insert images from URL.

×
×
  • Create New...

Important Information

We have placed cookies on your device to help make this website better. You can adjust your cookie settings, otherwise we'll assume you're okay to continue.