abazoskib Posted July 26, 2009 Share Posted July 26, 2009 i have a live data feed coming into my database from several different sources. i do not accept duplicate entries for the email address field, and so i check each email coming in with a function: function checkDuplicates($email){ $queryCD = "SELECT email FROM customer WHERE email='$email'"; $resultCD = mysql_query($queryCD); $numRows = mysql_num_rows($resultCD); if($numRows > 0) return false; //EMAIL IS A DUPLICATE else return true; //EMAIL IS GOOD } I need to optimize everything to the max. Am I doing it the best way possible? Quote Link to comment https://forums.phpfreaks.com/topic/167549-searching-for-duplicates-before-insertion/ Share on other sites More sharing options...
ldougherty Posted July 26, 2009 Share Posted July 26, 2009 You could add LIMIT 1 to the end of your query so if it does exist it will only find the first result and then exit the query from there rather than continuing the search. SELECT email FROM customer WHERE email='$email' LIMIT 1 I'd also probably change the query itself to select count so it returns a number rather that doing a query and then returning num rows. My function would be.. function checkDuplicates($email){ $queryCount = "mysql_query(SELECT count(email) WHERE email='$email' LIMIT 1)"; if($queryCount > 0) return false; //EMAIL IS A DUPLICATE else return true; //EMAIL IS GOOD } Quote Link to comment https://forums.phpfreaks.com/topic/167549-searching-for-duplicates-before-insertion/#findComment-883533 Share on other sites More sharing options...
abazoskib Posted July 26, 2009 Author Share Posted July 26, 2009 yep, i was thinking that. however i could not find documentation that count(*) or count(column_name) is faster than what i am doing here. or maybe i should use 'having'? Quote Link to comment https://forums.phpfreaks.com/topic/167549-searching-for-duplicates-before-insertion/#findComment-883535 Share on other sites More sharing options...
fenway Posted August 3, 2009 Share Posted August 3, 2009 No, you shouldn't use HAVING... and the difference between COUNT(*) and COUNT(col) only has to do with NULL values. Quote Link to comment https://forums.phpfreaks.com/topic/167549-searching-for-duplicates-before-insertion/#findComment-889926 Share on other sites More sharing options...
abazoskib Posted August 3, 2009 Author Share Posted August 3, 2009 ok thanks. also i noticed a simple select is faster than a select count. i guess this query is as optimized as it can get Quote Link to comment https://forums.phpfreaks.com/topic/167549-searching-for-duplicates-before-insertion/#findComment-889936 Share on other sites More sharing options...
fenway Posted August 3, 2009 Share Posted August 3, 2009 also i noticed a simple select is faster than a select count. What do you mean?? Quote Link to comment https://forums.phpfreaks.com/topic/167549-searching-for-duplicates-before-insertion/#findComment-889990 Share on other sites More sharing options...
abazoskib Posted August 4, 2009 Author Share Posted August 4, 2009 there are over a million records in the one table i have and currently growing. i noticed that if i do a select where email=$email for example, i get a result faster than a select count(email) where email=$email. A few seconds faster actually which makes all the difference. Quote Link to comment https://forums.phpfreaks.com/topic/167549-searching-for-duplicates-before-insertion/#findComment-890175 Share on other sites More sharing options...
roopurt18 Posted August 4, 2009 Share Posted August 4, 2009 You know there is a INSERT ... ON DUPLICATE KEY UPDATE ... syntax, right? Quote Link to comment https://forums.phpfreaks.com/topic/167549-searching-for-duplicates-before-insertion/#findComment-890228 Share on other sites More sharing options...
abazoskib Posted August 4, 2009 Author Share Posted August 4, 2009 You know there is a INSERT ... ON DUPLICATE KEY UPDATE ... syntax, right? yes, but i need to know when the email is a duplicate for analytical reasons. also, the primary key is composed of an auto incremented key and an email, so an email could potentially be inserted twice, knowing that the first part of the key will always be unique. Quote Link to comment https://forums.phpfreaks.com/topic/167549-searching-for-duplicates-before-insertion/#findComment-890319 Share on other sites More sharing options...
aschk Posted August 4, 2009 Share Posted August 4, 2009 I'm taking a guess here (because i've not checked MySQL docs) but if you make your email column a unique index (as you should be doing if it's unique anyway) then the INSERT ... ON DUPLICATE KEY UPDATE... should work for you. However, because you don't actually want to UPDATE anything on a DUPLICATE then you're better off using IGNORE e.g. INSERT IGNORE INTO <tablename>('col', ....) VALUES(1,...); However, because an INSERT has a higher overhead than a SELECT you would probably be better off doing the SELECT 1st (as you were). Going back to the original point SELECT query, I see no need for the LIMIT 1 because if you're saying that your email field is unique (because you don't accept dupes) then you will only ever get 1 result. Thus: <?php /** * Email class for performing acts on email addresses. */ class Email { private $_email; private $_isDuplicate; public function __construct($email, $db = null){ $this->_email = $email; if(null === $db) { $db = mysql_connect('localhost','username','password'); mysql_select_db('database'); } $this->_db = $db; } private function getDB(){ return $this->_db; } public function getEmail(){ return $this->_email; } public function isDuplicate(){ if(null === $this->_isDuplicate){ $this->_isDuplicate = false; $result = mysql_fetch_row(mysql_query( sprintf( "SELECT COUNT(*) FROM customer WHERE email='%s'", mysql_real_escape_string($this->getEmail()) ) , $this->getDB() )); if($result[0]){ $this->_isDuplicate = true; } } return $this->_isDuplicate; } } /** * @example */ foreach($emails as $email){ $test = new Email($email, $db); if(!$test->isDuplicate()){ // INSERT new record. } } ?> The above code is pretty extravagant for what you're looking for, but you get the idea I hope. Quote Link to comment https://forums.phpfreaks.com/topic/167549-searching-for-duplicates-before-insertion/#findComment-890450 Share on other sites More sharing options...
Recommended Posts
Join the conversation
You can post now and register later. If you have an account, sign in now to post with your account.