The Little Guy Posted January 16, 2009 Share Posted January 16, 2009 I am trying to do a query on a table that is ~140MB in size, containing nearly 400,000 rows. What I would like to do is try to optimize this: $sql1 = mysql_query(sprintf("SELECT COUNT(DISTINCT ip) as uniqueCount FROM stats WHERE owner_id = '%s' AND project_id = '%s' AND `date` >= DATE_SUB(CURDATE(),INTERVAL 1 MONTH) GROUP BY project_id", mysql_real_escape_string($row['owner_id']), mysql_real_escape_string($row['id'])))or die(mysql_error()); $sql2 = mysql_query(sprintf("SELECT COUNT(*) as totalCount FROM stats WHERE owner_id = '%s' AND project_id = '%s' AND `date` >= DATE_SUB(CURDATE(),INTERVAL 1 MONTH) GROUP BY project_id", mysql_real_escape_string($row['owner_id']), mysql_real_escape_string($row['id']))); $row1 = array_merge(mysql_fetch_array($sql1), mysql_fetch_array($sql2)); $sql1 is getting a count of all the individual ip addresses within the last month $sql2 is getting a count of all the ip addresses (duplicates included) within the last month If I were to put this in a loop and run it 4 times each with a different ower_id and different project_id, it takes about 1 minute, which to me seems too slow, so how can I optimize this? Quote Link to comment Share on other sites More sharing options...
RussellReal Posted January 16, 2009 Share Posted January 16, 2009 $sql1 = mysql_query(sprintf("SELECT COUNT(DISTINCT ip) as uniqueCount, COUNT(ip) As allIps FROM stats WHERE owner_id = '%s' AND project_id = '%s' AND `date` >= DATE_SUB(CURDATE(),INTERVAL 1 MONTH) GROUP BY project_id", mysql_real_escape_string($row['owner_id']), mysql_real_escape_string($row['id'])))or die(mysql_error()); u could try this but I dont know if it'd work ;P Quote Link to comment Share on other sites More sharing options...
premiso Posted January 16, 2009 Share Posted January 16, 2009 One way is to not do count(*) instead do count(ip) it should speed up the SQL. Other than that, no clue. EDIT: Reason being is counting 1 column is more efficient then counting all columns. Depending how many columns you have that could really speed it up. Anyhow that is just better practice. Quote Link to comment Share on other sites More sharing options...
Mark Baker Posted January 16, 2009 Share Posted January 16, 2009 Russell's statement to return both values in a single query should work. And I don't see why you have a GROUP BY clause. There's no need for it because you're not returning project_id, but it would add overhead when executing the query Quote Link to comment Share on other sites More sharing options...
The Little Guy Posted January 16, 2009 Author Share Posted January 16, 2009 $sql1 = mysql_query(sprintf("SELECT COUNT(DISTINCT ip) as uniqueCount, COUNT(ip) As allIps FROM stats WHERE owner_id = '%s' AND project_id = '%s' AND `date` >= DATE_SUB(CURDATE(),INTERVAL 1 MONTH) GROUP BY project_id", mysql_real_escape_string($row['owner_id']), mysql_real_escape_string($row['id'])))or die(mysql_error()); u could try this but I dont know if it'd work ;P This is what it said for just one query, and for not multiple: Showing rows 0 - 0 (1 total, Query took 10.2801 sec) Quote Link to comment Share on other sites More sharing options...
premiso Posted January 16, 2009 Share Posted January 16, 2009 $sql1 = mysql_query(sprintf("SELECT COUNT(DISTINCT ip) as uniqueCount, COUNT(ip) As allIps FROM stats WHERE owner_id = '%s' AND project_id = '%s' AND `date` >= DATE_SUB(CURDATE(),INTERVAL 1 MONTH) GROUP BY project_id", mysql_real_escape_string($row['owner_id']), mysql_real_escape_string($row['id'])))or die(mysql_error()); u could try this but I dont know if it'd work ;P This is what it said for just one query, and for not multiple: Showing rows 0 - 0 (1 total, Query took 10.2801 sec) For that many rows with doing it multiple times, I would say you are sort of stuck up a tree. The query times will vary depending on how many ips the owner/project has. Limit it to a certain number, or cache the results and update it every x hours in the backend with CRON. Doubt it helps, but yea. EDIT: Thinking about it, why do multiple loops? Why not use the IN() keyword in mysql and pull it up where projects are IN that id? Or owners are IN that id then just use the loop and display all the projects for an owner? A single query would speed it up a ton. Quote Link to comment Share on other sites More sharing options...
The Little Guy Posted January 16, 2009 Author Share Posted January 16, 2009 AWESOME! I speed it up 9 seconds! I added the IP and date to my index list! now it takes ~1.0784 seconds or so for one query! Any other optimization tips would be great! Quote Link to comment Share on other sites More sharing options...
The Little Guy Posted January 16, 2009 Author Share Posted January 16, 2009 $sql1 = mysql_query(sprintf("SELECT COUNT(DISTINCT ip) as uniqueCount, COUNT(ip) As allIps FROM stats WHERE owner_id = '%s' AND project_id = '%s' AND `date` >= DATE_SUB(CURDATE(),INTERVAL 1 MONTH) GROUP BY project_id", mysql_real_escape_string($row['owner_id']), mysql_real_escape_string($row['id'])))or die(mysql_error()); u could try this but I dont know if it'd work ;P This is what it said for just one query, and for not multiple: Showing rows 0 - 0 (1 total, Query took 10.2801 sec) For that many rows with doing it multiple times, I would say you are sort of stuck up a tree. The query times will vary depending on how many ips the owner/project has. Limit it to a certain number, or cache the results and update it every x hours in the backend with CRON. Doubt it helps, but yea. EDIT: Thinking about it, why do multiple loops? Why not use the IN() keyword in mysql and pull it up where projects are IN that id? Or owners are IN that id then just use the loop and display all the projects for an owner? A single query would speed it up a ton. This is the cron, I am trying to speed up my cron Quote Link to comment Share on other sites More sharing options...
premiso Posted January 16, 2009 Share Posted January 16, 2009 $sql1 = mysql_query(sprintf("SELECT COUNT(DISTINCT ip) as uniqueCount, COUNT(ip) As allIps FROM stats WHERE owner_id = '%s' AND project_id = '%s' AND `date` >= DATE_SUB(CURDATE(),INTERVAL 1 MONTH) GROUP BY project_id", mysql_real_escape_string($row['owner_id']), mysql_real_escape_string($row['id'])))or die(mysql_error()); u could try this but I dont know if it'd work ;P This is what it said for just one query, and for not multiple: Showing rows 0 - 0 (1 total, Query took 10.2801 sec) For that many rows with doing it multiple times, I would say you are sort of stuck up a tree. The query times will vary depending on how many ips the owner/project has. Limit it to a certain number, or cache the results and update it every x hours in the backend with CRON. Doubt it helps, but yea. EDIT: Thinking about it, why do multiple loops? Why not use the IN() keyword in mysql and pull it up where projects are IN that id? Or owners are IN that id then just use the loop and display all the projects for an owner? A single query would speed it up a ton. This is the cron, I am trying to speed up my cron Gotcha. You found the indexing feature, that was my next suggestion. Other than that I am not sure. You may try doing the date part with PHP, meaning instead of using DATE_SUB put the actual date in there you get from PHP. That way MySQL does not have to keep processing the DATE_SUB function... Quote Link to comment Share on other sites More sharing options...
The Little Guy Posted January 16, 2009 Author Share Posted January 16, 2009 oh wow! that makes it even faster, less than 1 second! Quote Link to comment Share on other sites More sharing options...
premiso Posted January 16, 2009 Share Posted January 16, 2009 oh wow! that makes it even faster, less than 1 second! I think that is probably about as good as it gets =) Quote Link to comment Share on other sites More sharing options...
Recommended Posts
Join the conversation
You can post now and register later. If you have an account, sign in now to post with your account.