Jump to content

Recommended Posts

I am trying to get a list of all entries and duplicate the entries as many as they are elegible.  The problem is teh queries take forever (hours upon hours).  There are 250,000 rows in Entries and 50,000 rows in ReferAFriend.  I can download the database no problem, it's the second query that seems to be causing the timing issue.  I can even run a query like:

 

SELECT b.*,(IF((SELECT count(a.Email) FROM ReferAFriend a WHERE a.EntryID = b.ID AND (SELECT count(ID) FROM Entries WHERE Email = a.Email) > 0 GROUP BY a.Email),(SELECT count(a.Email) FROM ReferAFriend a WHERE a.EntryID = b.ID AND (SELECT count(ID) FROM Entries WHERE Email = a.Email) > 0 GROUP BY a.Email),0) + IF(b.UPC != 0,2,1)) as EmailCount FROM Entries b

 

This query causes no issues the problem is that it puts the number of Count into EmailCount.  I want:

 

instead of: 

Name,Other,EmailCount

MyName,MyOther,3

 

show:

Name,Other

MyName,MyOther

MyName,MyOther

MyName,MyOther

 

The php code and tables are below.  Thanks in advance for the help.

 

$f = fopen("drawdata.csv", "w"); 

fwrite($f,'"First Name","Last Name","Address","Address 2","City","Province","Postal Code","Phone","Date of Birth","Email"');
fwrite($f,"\n");

$q = mysql_query("SELECT c.*, IF(c.UPC != 0,2,1) as EmailCount FROM Entries c");
while ($f = mysql_fetch_array($q)) {
$qa = mysql_query("SELECT count(DISTINCT a.Email) as EmailCount FROM ReferAFriend a, Entries b WHERE a.Email = b.Email AND a.EntryID = '$f[iD]'");
$fa = mysql_fetch_array($qa);
for($count = 1; $count <= $f[EmailCount] + $fa[EmailCount]; $count++) {
	fwrite($f,'"'.$f['FirstName'].'","'.$f['LastName'].'","'.$f['Address'].'","'.$f['Address2'].'","'.$f['City'].'","'.$f['Province'].'","'.$f['PostalCode'].'","'.$f['Phone'].'","'.$f['DOB'].'","'.$f['Email'].'"');
	fwrite($f,"\n");
}
}
fclose($f); 

 

CREATE TABLE IF NOT EXISTS `Entries` (
  `ID` int(10) NOT NULL auto_increment,
  `FirstName` varchar(200) NOT NULL,
  `LastName` varchar(200) NOT NULL,
  `Address` varchar(200) NOT NULL,
  `Address2` varchar(200) NOT NULL,
  `City` varchar(100) NOT NULL,
  `Province` char(2) NOT NULL,
  `PostalCode` varchar(7) NOT NULL,
  `Phone` varchar(20) NOT NULL,
  `DOB` date NOT NULL,
  `Email` varchar(200) NOT NULL,
  `UPC` varchar(15) NOT NULL,
  `SignUp` tinyint(1) NOT NULL,
  PRIMARY KEY  (`ID`)
) ENGINE=MyISAM  DEFAULT CHARSET=latin1 AUTO_INCREMENT=245135;

CREATE TABLE IF NOT EXISTS `ReferAFriend` (
  `ID` int(10) NOT NULL auto_increment,
  `EntryID` int(10) NOT NULL,
  `Email` varchar(200) NOT NULL,
  PRIMARY KEY  (`ID`)
) ENGINE=MyISAM  DEFAULT CHARSET=latin1 AUTO_INCREMENT=49235;

Link to comment
https://forums.phpfreaks.com/topic/232414-query-taking-forever-help-optimizing/
Share on other sites

haha I was afraid of that.  Ok basically "Entries" is all the registration data and "ReferAFriend" is all the people that have been sent a referral by one of the "Entries".  What happens is everytime it meets criteria:

 

Each Entry:

- Look for UPC

- Look for ReferAFriend emails addresses that are entries

 

Everytime one of the above matches, they get a bonus entry into the draw, so the another row needs to be added to the final output. 

 

So for example, say you registered:

 

Name,Email,UPC

Keith,keith@phpfreaks.com,12345

 

That Entry in Entries, would be in the final output:

 

Name,Email

Keith,keith@phpfreaks.com

Keith,keith@phpfreaks.com

 

That is because you have a UPC.  Now if you referred anyone and they are in the Entries.  Then you get more rows. 

 

Hope that explains it better.

 

Thanks.

Hi

 

Right, let me see if I have go this straight.

 

A person will have a record in Entries and 0 to many records on ReferAFriend.

 

What you want output:-

 

1 record per row on Entries

+ 1 record per row on Entries which has a UPC

+ 1 record per row on Entries per record on ReferAFriend

 

If so I think I would use UNION ALL with 3 queries.

 

SELECT a.FirstName, a.LastName, a.Email FROM Entries a
UNION ALL
SELECT a.FirstName, a.LastName, a.Email FROM Entries a WHERE UPC != 0
UNION ALL
SELECT a.FirstName, a.LastName, a.Email FROM Entries a INNER JOIN ReferAFriend b ON a.Email = b.Email

 

However I might still not have got my head around what you want.

 

All the best

 

Keith

Thanks, I had to change it to :

 

SELECT a.* FROM Entries a 
UNION ALL 
SELECT a.* FROM Entries a WHERE UPC != 0 
UNION ALL 
SELECT a.* FROM Entries a INNER JOIN ReferAFriend b ON b.EntryID = a.ID WHERE (SELECT count(ID) FROM Entries WHERE Email = b.Email) > 0

 

But it works as one query now.  Still have slow issues though.  If you look at the PHP code above, I changed it to:

 

$f = fopen("drawdata.csv", "w"); 

fwrite($f,'"First Name","Last Name","Address","Address 2","City","Province","Postal Code","Phone","Date of Birth","Email"');
fwrite($f,"\n");

$q = mysql_query("SELECT a.* FROM Entries a UNION ALL SELECT a.* FROM Entries a WHERE UPC != 0 UNION ALL SELECT a.* FROM Entries a INNER JOIN ReferAFriend b ON b.EntryID = a.ID WHERE (SELECT count(ID) FROM Entries WHERE Email = b.Email) > 0");
while ($f = mysql_fetch_array($q)) {
fwrite($f,'"'.$f['FirstName'].'","'.$f['LastName'].'","'.$f['Address'].'","'.$f['Address2'].'","'.$f['City'].'","'.$f['Province'].'","'.$f['PostalCode'].'","'.$f['Phone'].'","'.$f['DOB'].'","'.$f['Email'].'"');
fwrite($f,"\n");
}
fclose($f); 

 

With this, it outputs the header into the file, but continues to be doing something, but nothing else gets put in the document.  It is not the code, because I made a smaller Entries database that works fine.  Is there a setting or something I can change in the my.ini? 

Hi

 

Not sure why you had to change it. The 3rd part of the query would only find any records where there was at least 1 matching record on the ReferAFriend table. With the change you have you are forcing a LOT of subselects which will be very inefficient.

 

However I am unsure what you are trying to do with the change you did. I think I have misunderstood the linking between the 2 tables and how you want to pull rows out.

 

All the best

 

Keith

The reason for the change is that EntryID is what links the Entry to the ReferAFriend.  Then from there, we need to make sure that the email address in ReferAFriend is in Entries.  As people can refer lots of people, but they only get a bonus entry if that email has entered. 

 

Sorry if it's confusing.  I appreciate all the help.

 

Do you happen to know why I can't even get 1 record outputted?  I am running this query through MySQL as well and it runs fast with smaller tables, but it takes forever for this large table, which I am sure at some point will time out. 

Hi

 

Suspect you are not getting 1 record due to it timing out.

 

Try this:-

 

SELECT a.* FROM Entries a

UNION ALL

SELECT a.* FROM Entries a WHERE UPC != 0

UNION ALL

SELECT a.* FROM Entries a INNER JOIN ReferAFriend b ON b.EntryID = a.ID INNER JOIN Entries c ON b.Email = c.Email

 

(this is assuming that Email is unique on the Entries table)

 

All the best

 

Keith

Hi

 

No as the idea is to use an INNER JOIN so non matching records will be dropped. Using a LEFT OUTER JOIN would mean that records would still be brought back when there was no matching record on ReferAFriend.

 

You can try this to remove the duplicates:-

 

SELECT a.* FROM Entries a 
UNION ALL 
SELECT a.* FROM Entries a WHERE UPC != 0 
UNION ALL 
SELECT a.* FROM Entries a INNER JOIN ReferAFriend b ON b.EntryID = a.ID INNER JOIN (SELECT DISTINCT Email FROM Entries) c ON b.Email = c.Email 

 

Not sure about changing a setting in MySQL, but in effect you are trying to perform 250000 separate selects which will take ages.

 

All the best

 

Keith

That works much faster, thanks. 

 

Still not sure how to get this to work on the large database though.  Through PHPMyAdmin and PHP script as shown above (only with adjust queries) does the same thing and I guess times out before it outputs anything.   

Hi

 

What will probably help greatly is some suitable indexes on the tables (ie, on EntryID on ReferAFriend, and on Email on both tables).

 

Best to also only return the columns that you actually want to write out (SELECT * is a useful feature when knocking up quick queries but shouldn't really be used for production code).

 

On the php side you may as well put the carraige return write onto the end of the main write, rather than do 2 separate fwrite statements.

 

All the best

 

Keith

Ok will try the indexes.

 

I usually do, but in this case we are using all fields, actually now that it is all in one query, we arn't using ID.

 

The reason I do this is that it has to be within double quotes and I use single quotes to avoid having to escape all the double quotes needed.  Your right though.

 

Thanks for the help.  Think I need to speak to a server administrator about MySQL performance for this stuff now. 

Hi

 

You aren't outputting UPC or signup either.

 

If you only bring back the relevant fields then you could do something like:-

 

fwrite($file,"'".implode("','",$f)."'\n");

 

By the way just noticed you are using the same variable name ($f) for both the file and for the array you are retrieving rows into.

 

All the best

 

Keith

This thread is more than a year old. Please don't revive it unless you have something important to add.

Join the conversation

You can post now and register later. If you have an account, sign in now to post with your account.

Guest
Reply to this topic...

×   Pasted as rich text.   Restore formatting

  Only 75 emoji are allowed.

×   Your link has been automatically embedded.   Display as a link instead

×   Your previous content has been restored.   Clear editor

×   You cannot paste images directly. Upload or insert images from URL.

×
×
  • Create New...

Important Information

We have placed cookies on your device to help make this website better. You can adjust your cookie settings, otherwise we'll assume you're okay to continue.