n1concepts Posted July 13, 2014

Hi,

I was reviewing a PHP web scraping write-up at http://imbuzu.wordpress.com/tag/web-scraping and discovered a syntax error in the author's code. The error is on this line (the full code is on the author's site, linked above, and I'm also posting it below):

for ($i = 0; $i getElementsByTagName('td');

Note: I can't follow the logic of the 'for' loop and the getElementsByTagName call well enough to fix the problem myself, so I'm asking for help to make this work as the author intended.

<?php
error_reporting(E_ERROR);

$url = "http://www.imdb.com/chart/";
$curl = curl_init($url);
curl_setopt($curl, CURLOPT_RETURNTRANSFER, true);
$document = curl_exec($curl);
//echo $document;

$dom_rep = new DOMDocument;
$dom_rep->loadHTML($document);

$all_trs = $dom_rep->getElementsByTagName('tr');
$trs_we_want = array();
foreach ($all_trs as $tr) {
    $class_name = $tr->getAttribute('class');
    if (preg_match("/chart_(even|odd)_row/", $class_name)) {
        $trs_we_want[] = $tr;
    }
}

for ($i = 0; $i getElementsByTagName('td');
    $the_tds_arr = array();
    foreach ($the_tds as $td) {
        $the_tds_arr[] = $td;
    }

    $movie_title = $the_tds_arr[2]->nodeValue;
    $rank = $the_tds_arr[0]->nodeValue;
    $weekend = $the_tds_arr[3]->nodeValue;
    $gross = $the_tds_arr[4]->nodeValue;
    $weeks = $the_tds_arr[5]->nodeValue;

    echo "<div>";
    echo "<h2>$movie_title</h2>";
    echo "Rank: $rank<br />";
    echo "Weekend: $weekend<br />";
    echo "Gross: $gross<br />";
    echo "Weeks: $weeks<br />";
    echo "</div>";
}
?>
requinix Posted July 13, 2014

Yeah. See how "$movie_title" is messed up? I'd make an educated guess that the next character after the "$i" was a less-than sign - you know, the symbol that marks the beginning of an HTML tag? The blog and/or author and/or code plugin, whatever it is, is stupid with HTML to the point of leaving some HTML markup unescaped and "sanitizing" the rest. Looks like it should read:

for ($i = 0; $i < count($trs_we_want); $i++) { // everything from the <
    $the_tds = $trs_we_want[$i]->getElementsByTagName('td'); // to the next > was removed
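To make the guess above concrete, here is a minimal sketch in two parts. The first part runs a naive "remove anything between < and >" regex over the repaired lines; that regex is only a stand-in for whatever the blog or its code plugin actually did, which is not known. The second part runs the repaired loop against a small inline HTML table, so it needs no network access. The sample HTML, the movie names and figures, and the $rows array are all made up for illustration, and item() is used instead of copying the cells into $the_tds_arr.

```php
<?php
// Part 1: reproduce the damage with a naive tag stripper.
// ASSUMPTION: this regex only stands in for the blog's unknown "sanitizer".
$original = <<<'CODE'
for ($i = 0; $i < count($trs_we_want); $i++) {
    $the_tds = $trs_we_want[$i]->getElementsByTagName('td');
CODE;

// Everything from the first "<" up to and including the next ">" is removed.
$mangled = preg_replace('/<[^>]*>/', '', $original);
echo $mangled, "\n";
// prints: for ($i = 0; $i getElementsByTagName('td');

// Part 2: the repaired loop, run against made-up sample rows.
// Single-quoted strings keep "$10M" from being treated as a variable.
$html = '<table>'
      . '<tr class="chart_even_row"><td>1</td><td></td><td>Movie A</td>'
      . '<td>$10M</td><td>$10M</td><td>1</td></tr>'
      . '<tr class="header"><td>Rank</td></tr>'
      . '<tr class="chart_odd_row"><td>2</td><td></td><td>Movie B</td>'
      . '<td>$5M</td><td>$20M</td><td>3</td></tr>'
      . '</table>';

$dom_rep = new DOMDocument;
$dom_rep->loadHTML($html);

// Keep only the rows whose class marks them as chart rows.
$trs_we_want = array();
foreach ($dom_rep->getElementsByTagName('tr') as $tr) {
    if (preg_match('/chart_(even|odd)_row/', $tr->getAttribute('class'))) {
        $trs_we_want[] = $tr;
    }
}

// The repaired loop: visit each kept row and read its cells by position.
$rows = array();
for ($i = 0; $i < count($trs_we_want); $i++) {
    $the_tds = $trs_we_want[$i]->getElementsByTagName('td');
    $rows[] = array(
        'rank'  => $the_tds->item(0)->nodeValue,
        'title' => $the_tds->item(2)->nodeValue,
        'weeks' => $the_tds->item(5)->nodeValue,
    );
}

print_r($rows);
```

The loop counts from 0 up to count($trs_we_want) and, for each matching row, asks that row for its own td elements. A foreach over $trs_we_want would do the same job without the index variable, if you prefer that style.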