Joeker Posted August 21, 2008

I'm working on a simple search engine, but I'm having a problem with the scraper I wrote. All I want to do is fetch each page with file_get_contents and insert the title into my database. The problem is, since it's scraping so many pages, the script is timing out or not scraping properly. Here's my code:

```php
<?php
set_time_limit(0);

$dbhost = "localhost";
$dbuser = "*****";
$dbpass = "*****";
$dbname = "*****";

mysql_connect($dbhost, $dbuser, $dbpass);
mysql_select_db($dbname);

function get($a, $b, $c) {
    $y = explode($b, $a);
    $x = explode($c, $y[1]);
    return $x[0];
}

for ($i = 1; $i <= 1000000; $i++) {
    $content = file_get_contents("http://www.website.com/page.php?id=$i");
    $title = get($content, "<title>", "</title>");
    mysql_query("INSERT INTO spider (title) VALUES ('$title')");
}
?>
```

Anyone?

https://forums.phpfreaks.com/topic/120700-help-with-scraper-for-spider/
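The explode()-based extraction is one likely cause of pages "not scraping properly": if file_get_contents fails or a page has no &lt;title&gt; tag, $y[1] is undefined and the script emits notices and inserts garbage. A sketch of a safer helper (the name get_between is mine, not from the original post) that returns null instead of blowing up:

```php
<?php
// Extract the text between two delimiters, or null when either is missing.
// Delimiter matching is case-insensitive, since <TITLE> is also legal HTML.
function get_between($haystack, $start, $end) {
    if (!is_string($haystack)) {
        return null; // e.g. file_get_contents() returned false
    }
    $s = stripos($haystack, $start);
    if ($s === false) {
        return null;
    }
    $s += strlen($start);
    $e = stripos($haystack, $end, $s);
    if ($e === false) {
        return null;
    }
    return trim(substr($haystack, $s, $e - $s));
}
```

In the loop you would then skip the INSERT when null comes back, and run the title through mysql_real_escape_string() before building the query, since a quote in a page title will otherwise break (or inject into) the SQL.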
The Little Guy Posted August 21, 2008

With a script like this, you should really run it from the command line.
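One reason for this advice: PHP's CLI binary defaults max_execution_time to 0 (unlimited), and the web server's own connection timeouts don't apply. A sketch of a guard you could put at the top of the scraper (php_sapi_name() reports how the script was invoked; the filename scraper.php is just an example):

```php
<?php
// Refuse to run under a web server; a million-page crawl belongs on the
// CLI, where PHP's max_execution_time defaults to 0 (unlimited).
if (php_sapi_name() !== 'cli') {
    die("Run this script from the command line: php scraper.php\n");
}
```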
Archived
This topic is now archived and is closed to further replies.