kailash001, posted December 26, 2010:

Hello guys, I'm trying to screen-scrape the original content behind every item in an RSS feed. Reading the RSS feed works fine; the problem starts when I scrape each article with the Simple HTML DOM library. The first item works, but when the script tries to extract the second item's original content I get this error:

Fatal error: Cannot redeclare file_get_html() (previously declared in C:\wamp\www\mashup\protected\views\articles\simple_html_dom.php:37) in C:\wamp\www\mashup\protected\views\articles\simple_html_dom.php on line 41

Part of my code is as follows:

    foreach($RSS_DOC->channel->item as $RSSitem)
    {
        $item_id    = md5($RSSitem->title);
        $item_title = $RSSitem->title;
        $item_date  = date("Y-m-j G:i:s", strtotime($RSSitem->pubDate));
        $item_url   = $RSSitem->link;

        echo "Processing item '" , $item_id , "<br/>";
        echo $item_title, " - ";
        echo $item_date, "<br/>";
        echo $item_url, "<br/>";

        //screen scrape original article
        include('simple_html_dom.php');
        $html = file_get_dom($item_url);
        foreach($html->find('td[class=rel_headline_cmt]') as $element)
        {
            echo $element;
        }
    }

Any help with this?
johnny86, posted December 26, 2010:

Move the line include('simple_html_dom.php'); outside the foreach loop. You don't need to include the file on every iteration; including it a second time declares its functions again, which is what triggers the "Cannot redeclare" error.
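The advice above can be sketched in a self-contained way. This is only an illustration under stated assumptions: the temporary file and the function demo_get_html() are hypothetical stand-ins for simple_html_dom.php and its functions, so the snippet runs without the real library or any network access.

```php
<?php
// Hypothetical stand-in for simple_html_dom.php: a file that declares one function.
$lib = sys_get_temp_dir() . DIRECTORY_SEPARATOR . 'demo_lib_' . uniqid() . '.php';
file_put_contents($lib, '<?php function demo_get_html($url) { return "<html>$url</html>"; }');

// Made-up URLs standing in for the feed's item links.
$urls = array('http://example.com/a', 'http://example.com/b');
$results = array();

// Load the library once, before the loop.
require_once $lib;

foreach ($urls as $url) {
    // A plain include($lib) here would fail on the second pass with
    // "Cannot redeclare demo_get_html()". require_once does nothing after
    // the first load, so it is harmless even inside the loop.
    require_once $lib;
    $results[] = demo_get_html($url);
}

unlink($lib);
echo count($results), "\n"; // 2
```

Either change is enough on its own: moving the include above the loop, or switching it to include_once / require_once.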
kailash001 (author), posted December 26, 2010:

Thanks for the help. But now I'm getting another problem. The first article is extracted properly, but the second one is extracted twice, then the third one once, and then I get this error:

Fatal error: Maximum execution time of 60 seconds exceeded in C:\wamp\www\mashup\protected\views\articles\simple_html_dom.php on line 70

Can you tell me how I can make the script run faster? Or is there any other solution?
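One hedged way to narrow down a timeout like this is to time each item, cap how many articles a single run processes, and restart the timeout counter per item with set_time_limit(). The sketch below does not fix whatever makes the scrape slow; fetch_article(), the URLs, and the values of $maxItems and 30 seconds are all hypothetical placeholders chosen for illustration.

```php
<?php
// Hypothetical stand-in for the slow step (downloading and parsing one article).
function fetch_article($url) {
    usleep(1000); // pretend to do some work
    return "content of $url";
}

// Made-up URLs standing in for the feed's item links.
$urls = array('http://example.com/1', 'http://example.com/2', 'http://example.com/3');
$maxItems = 2;        // assumption: cap the number of articles per run
$processed = array();

foreach ($urls as $i => $url) {
    if ($i >= $maxItems) {
        break; // leave the remaining items for a later run
    }
    set_time_limit(30); // restart the timeout counter for this item
    $start = microtime(true);
    $processed[$url] = fetch_article($url);
    printf("%s took %.3f s\n", $url, microtime(true) - $start);
}
```

The per-item timings show whether one particular page is responsible for most of the 60 seconds or whether every fetch is slow.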
johnny86, posted December 26, 2010:

It's hard to say where your problem is, but it's definitely some loop problem. One thing that caught my eye is this:

foreach($RSS_DOC->channel->item as $RSSitem) {

Do you really need to loop through one item ($RSS_DOC->channel->item)? Maybe loop through the channel only?
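Whether that loop is the culprit can be checked in isolation. The sketch below assumes $RSS_DOC is a SimpleXMLElement (the thread never shows how it was created) and uses a made-up three-item feed. Under that assumption, iterating over channel->item visits every <item> element, while iterating over channel visits only the single <channel> element.

```php
<?php
// Assumption: $RSS_DOC in the thread is a SimpleXMLElement. The feed below is made up.
$xml = <<<XML
<rss version="2.0">
  <channel>
    <title>Demo feed</title>
    <item><title>First</title><link>http://example.com/1</link></item>
    <item><title>Second</title><link>http://example.com/2</link></item>
    <item><title>Third</title><link>http://example.com/3</link></item>
  </channel>
</rss>
XML;

$rss = simplexml_load_string($xml);

// Looping over channel->item yields one SimpleXMLElement per <item>.
$titles = array();
foreach ($rss->channel->item as $item) {
    $titles[] = (string) $item->title; // cast to get a plain string
}
echo implode(', ', $titles), "\n"; // First, Second, Third

// Looping over channel yields the <channel> element itself, once.
$channels = 0;
foreach ($rss->channel as $channel) {
    $channels++;
}
echo $channels, "\n"; // 1
```

If the feed is loaded this way, the outer loop in the original code runs once per feed item, so the repeated output is more likely to come from the inner find() loop matching more than one element on a page.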