wwfc_barmy_army Posted July 12, 2010

Hello. I am testing out web scraping, and my test page has a div with the class 'total_price'. I am using the Simple HTML DOM Parser with this code:

    // Create DOM from URL or file
    $html = file_get_html('**My Test Page**');
    $ret = $html->find('.total_price');
    print_r($ret);

However, it seems to get stuck in some kind of loop, and I get a lot of output text:

    Array ( [0] => simple_html_dom_node Object ( [nodetype] => 1 [tag] => p [attr] => Array ( [class] => total_price ) [children] => Array ( ) [nodes] => Array ( [0] => simple_html_dom_node Object ( [nodetype] => 3 [tag] => text [attr] => Array ( ) [children] => Array ( ) [nodes] => Array ( ) [parent] => simple_html_dom_node Object *RECURSION* [_] => Array ( [4] => £44.99 ) [dom:private] => simple_html_dom Object ( [root] => simple_html_dom_node Object ( [nodetype] => 5 [tag] => root [attr] => Array ( ) [children] => Array ( [0] => simple_html_dom_node Object ( [nodetype] => 6 [tag] => unknown [attr] => Array ( ) [children] => Array ( ) [nodes] => Array ( ) [parent] => simple_html_dom_node Object *RECURSION* [_] => Array ( [0] => 2 [4] => ) [dom:private] => simple_html_dom Object *RECURSION* ) [1] => simple_html_dom_node Object ( [nodetype] => 1 [tag] => html [attr] => Array ( [xmlns] => http://www.w3.org/1999/xhtml [lang] => en-GB ) [children] => Array ( [0] => simple_html_dom_node Object ( [nodetype] => 1 [tag] => head [attr] => Array ( [id] => ctl00_HtmlHead1_Head1 ) [children] => Array ( [0] => simple_html_dom_node Object ( [nodetype] => 1 [tag] => script [attr] => Array ( [type] => text/javascript ) [children] ..etc...etc...

At one point it does return the value I am trying to get (the price), which is:

    Array ( [4] => £44.99 )

Does anyone see what I'm doing wrong? Thanks.

.josh Posted July 12, 2010

Not really enough info... my best guess is that your test page's HTML is so malformed that the parser fails to parse it correctly.

wwfc_barmy_army (Author) Posted July 12, 2010

My test page is simply:

    <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
    <html xmlns="http://www.w3.org/1999/xhtml">
    <head>
    <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
    <title>Untitled Document</title>
    </head>
    <body>
    <h1>My test page</h1>
    <div id="header"></div>
    <div class="total_price">£44.99</div>
    <div id="footer">FOOTER</div>
    </body>
    </html>

Any ideas? Thanks.

.josh Posted July 12, 2010

Okay, well, what does file_get_html() look like?
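
For what it's worth, the output in the first post does not look like an infinite loop. A likely explanation is that print_r() is dumping the whole document: each matched node holds references to its parent and to the DOM object, and print_r() follows them, marking the places where it meets an object it is already printing with *RECURSION*. The sketch below reproduces that effect without the parser. DemoNode and its properties are hypothetical stand-ins made up for this illustration; they are not part of Simple HTML DOM.

```php
<?php
// Hypothetical stand-in for a parser node (NOT the Simple HTML DOM class):
// like the dumped objects above, each node keeps a reference to its parent
// as well as to its children.
class DemoNode
{
    public $tag;
    public $text;
    public $parent = null;
    public $children = array();

    public function __construct($tag, $text = '')
    {
        $this->tag = $tag;
        $this->text = $text;
    }

    // Attach a child and set its back-reference to this node.
    public function append(DemoNode $child)
    {
        $child->parent = $this;
        $this->children[] = $child;
        return $child;
    }
}

// Build a tiny tree that mirrors part of the test page.
$root = new DemoNode('root');
$body = $root->append(new DemoNode('body'));
$body->append(new DemoNode('h1', 'My test page'));
$price = $body->append(new DemoNode('div', '£44.99'));

// print_r() follows $price->parent up to the root and back down through
// every sibling, so dumping one node prints the whole tree. It finishes,
// but the output is large and contains *RECURSION* markers.
$dump = print_r($price, true);

// Reading a single property yields only the value of interest.
echo $price->text, "\n";
```

With the real parser, the equivalent step would be to read a property of the matched node instead of dumping it, for example echo $html->find('.total_price', 0)->plaintext; (find() with an index and the plaintext property are described in the Simple HTML DOM manual, but check them against the version you have installed).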