RyanSF07 Posted January 9, 2011

Hello,

I'm using the plain and simple script below to count page views. Is there a way to identify a spider bot with PHP and then do something like: if this visitor is a bot, don't count it; if not, count it?

Here's my simple script:

    $querySelect = mysql_query("SELECT * FROM video WHERE video.id = '$_GET[id]'");
    $rowcount = mysql_fetch_assoc($querySelect);
    $count = $rowcount['counter'];
    if (empty($count)) {
        $counter = 1;
        $insert = mysql_query("INSERT INTO video (counter) VALUES ($counter) WHERE video.id = '$_GET[id]'");
    }
    $add = $count+1;
    $insertNew = mysql_query("UPDATE video SET counter='$add' WHERE video.id = '$_GET[id]'");

Thanks,
Ryan
BlueSkyIS Posted January 9, 2011

A good bot is hard to spot. Bad bots aren't as hard to spot. As far as I know, the most sure-fire way to detect a bot is to require that the visitor execute JavaScript code to count as a hit. But with PHP you can check things like the user agent, a reverse DNS lookup to try to acquire the remote domain name, other header information that might be available, etc.
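The reverse DNS check mentioned above can be sketched as a forward-confirmed lookup: resolve the IP to a host name, check the host against an allowed suffix, then resolve the host back and confirm it returns the same IP. This is only a sketch under assumptions: the function name `isVerifiedCrawler`, the injectable `$reverse`/`$forward` parameters, and the suffix list passed in are illustrative, not something from the thread. It handles IPv4 only, because `gethostbynamel()` returns IPv4 addresses.

```php
<?php
// Sketch: forward-confirmed reverse DNS check for a visitor claiming to be a crawler.
// $reverse and $forward are hypothetical injection points so the logic can be
// exercised without network access; by default they use PHP's built-in resolvers.
function isVerifiedCrawler(
    string $ip,
    array $allowedSuffixes,
    ?callable $reverse = null,
    ?callable $forward = null
): bool {
    $reverse = $reverse ?? 'gethostbyaddr';
    $forward = $forward ?? 'gethostbynamel';

    $host = $reverse($ip);
    // gethostbyaddr() returns the unmodified IP on failure and false on malformed input.
    if ($host === false || $host === $ip) {
        return false;
    }

    // The host name must end in one of the allowed domain suffixes.
    $host = strtolower(rtrim($host, '.'));
    $suffixMatches = false;
    foreach ($allowedSuffixes as $suffix) {
        $suffix = strtolower($suffix);
        if (substr($host, -strlen($suffix)) === $suffix) {
            $suffixMatches = true;
            break;
        }
    }
    if (!$suffixMatches) {
        return false;
    }

    // Forward-confirm: the host name must resolve back to the original IP.
    $ips = $forward($host);
    return is_array($ips) && in_array($ip, $ips, true);
}
```

A call might look like `isVerifiedCrawler($_SERVER['REMOTE_ADDR'], ['.googlebot.com'])`. DNS lookups can be slow, so a real page would probably cache the result per IP rather than resolve on every hit.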
Pikachu2000 Posted January 9, 2011

As an aside, there's no reason to run 3 database queries in that script. In phpMyAdmin, alter the table so any future entries get 0 as a default value in the `counter` field, then set all of the empty `counter` fields to 0 (only needs to be done once):

    UPDATE `video` SET `counter` = 0 WHERE `counter` = '' OR `counter` IS NULL

Then change the script above so that when a valid visitor is detected it executes this query (after validating/sanitizing $_GET['id'], of course):

    "UPDATE `video` SET `counter` = (`counter` + 1) WHERE `id` = {$_GET['id']}"
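One way the "validating/sanitizing" step might look is sketched below. It swaps in PDO with a prepared statement instead of the `mysql_*` functions used in the thread, since that extension was removed in PHP 7. The helper names `validVideoId` and `incrementCounter` are made up for illustration, and `$pdo` is assumed to be an already connected PDO handle.

```php
<?php
// Returns the id as a positive integer, or null if it is missing or not a valid integer.
function validVideoId($raw): ?int
{
    $id = filter_var($raw, FILTER_VALIDATE_INT, ['options' => ['min_range' => 1]]);
    return $id === false ? null : $id;
}

// Increments the counter in a single UPDATE. $pdo is an assumed, already
// connected PDO handle; the table and column names are the ones from the thread.
function incrementCounter(PDO $pdo, int $id): bool
{
    $stmt = $pdo->prepare('UPDATE `video` SET `counter` = `counter` + 1 WHERE `id` = :id');
    return $stmt->execute([':id' => $id]);
}
```

Usage would be along the lines of `$id = validVideoId($_GET['id'] ?? null);` followed by `incrementCounter($pdo, $id)` only when `$id !== null`.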
RyanSF07 (Author) Posted January 13, 2011

Thank you, Pikachu. That worked perfectly. Here's what I have now (below). My question now is: do I have to run this array -- or -- is there some identifying "tag" that all bots have that flags them as a bot? That way I could just check for that tag, and if it's present -- not count the page view.

Please let me know if you have any ideas. Thank you again for your help.

Ryan

    $botarray = array(
        "Teoma", "alexa", "froogle", "inktomi", "looksmart", "URL_Spider_SQL",
        "Firefly", "NationalDirectory", "Ask Jeeves", "TECNOSEEK", "InfoSeek",
        "WebFindBot", "girafabot", "crawler", "Googlebot", "Scooter", "Slurp",
        "appie", "FAST", "WebBug", "Spade", "ZyBorg");

    foreach($botarray as $botname) {
        if(ereg($botname, $HTTP_USER_AGENT)) {
            $recep = "me@yahoo.com";
            $subject = "... bot";
            $text = "$botname";
            $headers = "X-Mailer: PHP\n";
            mail("$recep","$subject","$text","$headers");
        } else {
            $a = TRUE;
        }
    }

    if ($a) {
        mysql_query("UPDATE `video` SET `counter` = (`counter` + 1) WHERE `id` = {$_GET['id']}");
    };
RyanSF07 (Author) Posted January 13, 2011

BlueSky -- I'm googling around for a JavaScript snippet that, once executed, would trigger a page count...
BlueSkyIS Posted January 13, 2011

Nice bots will read and follow your robots.txt file, and they will give you plenty of information in headers to recognize them, including user agent.

Bad bots (yandex.ru for one) and data scraper bots will ignore robots.txt, and they will give no indication that they are bots except that they (probably) will not execute JavaScript. They will usually include a regular web browser user agent.
RyanSF07 (Author) Posted January 13, 2011

Thanks, BlueSky.

So is there a way to skip the array and do something like this?

    if(ereg($HTTP_USER_AGENT = bot)) {
        $a = FALSE;
    } else {
        $a = TRUE;
    }

If so, can you point me in the right direction?

Thanks!
Ryan
BlueSkyIS Posted January 13, 2011

I would probably loop over the array of bot agents, using stristr() to compare each one to the user agent.

BTW: don't use ereg, as it's deprecated. If you want regex matching, use preg_match or the other preg_ functions.
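The loop suggested above might look roughly like the sketch below. The function name `isKnownBot` and the shortened signature list are illustrative only. It reads the user agent from `$_SERVER['HTTP_USER_AGENT']`, since the bare `$HTTP_USER_AGENT` variable depends on `register_globals`, which was removed in PHP 5.4. It also returns as soon as one signature matches; the earlier loop set `$a = TRUE` for every name that did not match, so in practice every visit was counted.

```php
<?php
// Returns true if the user agent contains any of the given bot signatures.
// stristr() is case-insensitive and returns false when the needle is absent.
function isKnownBot(string $userAgent, array $signatures): bool
{
    foreach ($signatures as $signature) {
        if (stristr($userAgent, $signature) !== false) {
            return true;
        }
    }
    return false;
}

// Illustrative subset of the array from the thread.
$botSignatures = ['Googlebot', 'Slurp', 'Teoma', 'crawler', 'ZyBorg'];

// The header may be absent, so fall back to an empty string.
$userAgent = $_SERVER['HTTP_USER_AGENT'] ?? '';
$shouldCount = !isKnownBot($userAgent, $botSignatures);
```

The counter update would then run only when `$shouldCount` is true. As noted elsewhere in the thread, this only filters bots that identify themselves honestly.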
RyanSF07 (Author) Posted January 13, 2011

K. Thank you very much for your help, BlueSky.

If anyone knows of a way I can skip the array by identifying some tell-tale attribute of a bot -- something all or most bots have in common -- please let me know.

Thanks again,
Ryan
BlueSkyIS Posted January 14, 2011

There is nothing that all bots have in common. The lowest common denominator is their usual inability to parse and execute JavaScript, but even then, smarter bots can parse and execute some JavaScript.

For what it's worth: I write bot code, and I write code to (attempt to) detect bots.
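A minimal sketch of the JavaScript-gated count described above, kept in PHP: the page emits a tiny inline script that requests a counting endpoint, so a client that never runs JavaScript never triggers the count. The helper name `renderCountBeacon` and the `/count.php` endpoint are assumptions for illustration; the endpoint itself would validate the id and run the single UPDATE query.

```php
<?php
// Hypothetical helper: returns an inline <script> that requests the counting
// endpoint. Clients that do not execute JavaScript never send that request.
function renderCountBeacon(int $videoId, string $endpoint = '/count.php'): string
{
    $url = $endpoint . '?id=' . $videoId;
    // json_encode() produces a quoted JavaScript string literal; the HEX flags
    // escape characters that could otherwise break out of the <script> element.
    $jsUrl = json_encode($url, JSON_HEX_TAG | JSON_HEX_AMP | JSON_HEX_APOS | JSON_HEX_QUOT);
    return '<script>new Image().src = ' . $jsUrl . ';</script>';
}
```

The video page would `echo renderCountBeacon($id);` somewhere in its HTML instead of updating the counter directly. This still counts any bot that does execute JavaScript, which matches the caveat in the post above.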