phpx Posted January 18, 2012

Hello everyone, this is my first post. I am making a traffic cam app and I would like to use page scraping to pull information from trafficland.com into a different page:

http://trafficland.com/city/DET/index.html

The site generates a new token every 10 minutes. I would like to pull only the pubtoken value (pubtoken=283d5730c53e406f7ea051c4c89c31fd) into a different page.

Source code of http://trafficland.com/city/DET/index.html:

<div id="cams" class='module'>
<div style="height: 125px;">
<img id="8480" width="115" height="78" src="http://pub2.camera.trafficland.com/image/live.jpg?webid=8480&system=topcams&size=half&pubtoken=283d5730c53e406f7ea051c4c89c31fd" onmouseout='hideHelper()' onmouseover='showHelper("Detroit","I-275 @ Grand River Ave")' onclick='viewTopTen("DET","I-275 @ Grand River Ave","South","MDOT","4000","48335","8480")'/>

Any help would be appreciated. Thank you.
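Once the full `src` value has been captured, the token itself can be read from the query string without another regular expression. The sketch below is only an illustration: `extract_pubtoken()` is a hypothetical helper name, and the sample URL is copied from the post above, so a live page would carry a newer token. It assumes the parameter is always named `pubtoken`.

```php
<?php
// Sample src value copied from the post above; the token on the live
// page changes, so this string is only a stand-in.
$src = 'http://pub2.camera.trafficland.com/image/live.jpg'
     . '?webid=8480&system=topcams&size=half'
     . '&pubtoken=283d5730c53e406f7ea051c4c89c31fd';

// Hypothetical helper (not from the original post): returns the
// pubtoken value from a URL, or null when there is none.
function extract_pubtoken($url)
{
    // PHP_URL_QUERY yields the part after "?", or null if it is absent.
    $query = parse_url($url, PHP_URL_QUERY);
    if (!is_string($query)) {
        return null;
    }

    // Raw HTML often encodes "&" as "&amp;", so normalise it first.
    $query = str_replace('&amp;', '&', $query);

    // parse_str() splits "a=1&b=2" into an associative array.
    parse_str($query, $params);

    return isset($params['pubtoken']) ? $params['pubtoken'] : null;
}

print extract_pubtoken($src) . "\n";
```

With the sample URL this prints `283d5730c53e406f7ea051c4c89c31fd`. Because the token expires, the page would need to be fetched again each time rather than storing the value.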
phpx (Author) Posted January 18, 2012

This is what I have so far. The image URL is only being pulled up to

http://pub2.camera.trafficland.com/image/live.jpg

How can I make it pull the whole URL, including the token?

http://pub2.camera.trafficland.com/image/live.jpg?webid=8480&system=topcams&size=full&pubtoken=283d5730c53e406f7ea051c4c89c31fd

<?php
/* Basic scraping demo with "foreach" and "regex" parsing
 * Owen Mundy Copyright 2011 GNU/GPL */

// url to start
$url = "http://trafficland.com/city/DET/index.html";

// get contents of url in an array
$lines = file($url);

// initialise state so the loop does not read undefined variables
$data = "";
$get_content = false;

// look for the string
foreach ($lines as $line_num => $line) {
    // find opening string (strpos returns 0 for a match at the start
    // of the line, so compare against false explicitly)
    if (strpos($line, '<div style="height: 125px;">') !== false) {
        $get_content = true;
    }
    // if opening string is found
    // then collect content until closing string appears
    if ($get_content == true) {
        $data .= $line . "\n";
    }
    // closing string
    if (strpos($line, "</div>") !== false) {
        $get_content = false;
    }
}

// use regular expressions to extract only what we need...

// png, jpg, or gif inside a src="..." or src='...'
$pattern = "/src=[\"']?([^\"']?.*(png|jpg|gif))[\"']?/i";
preg_match_all($pattern, $data, $images);

// text from link
$pattern = "/(<a.*>)(\w.*)(<.*>)/ismU";
preg_match_all($pattern, $data, $text);

// link
$pattern = "/(href=[\"'])(.*?)([\"'])/i";
preg_match_all($pattern, $data, $link);

/* // test if you like
print "<pre>";
print_r($images);
print_r($text);
print_r($link);
print "</pre>";
*/
?>
<html>
<head>
<style>
body { margin:0; }
.textblock { position:absolute; top:600px; left:0px; }
span { font:5.0em/1.0em Arial, Helvetica, sans-serif; line-height:normal; background:url(trans.png); color:#fff; font-weight:bold; padding:5px }
a { text-decoration:none; color:#900 }
</style>
</head>
<body>
<img src="<?php print $images[1][0] ?>" height="100%"> </div>
<div class="textblock"><span><a href="<?php print "http://www.bbc.co.uk".$link[2][0] ?>"><?php print $text[2][0] ?></a></span><br> </div>
</body>
</html>
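A likely reason the URL is cut off: the image pattern in the post above requires the captured text to end in `png`, `jpg`, or `gif`, and the closing quote after it is optional, so the match ends at `live.jpg` and the query string is dropped. One possible fix, sketched below, is to capture everything between the opening quote and the matching closing quote. `find_image_srcs()` is a hypothetical helper name, and the sample markup is trimmed from the first post rather than fetched from the live site.

```php
<?php
// Sample markup trimmed from the first post; a real run would use the
// lines collected from the page instead.
$html = '<img id="8480" width="115" height="78" '
      . 'src="http://pub2.camera.trafficland.com/image/live.jpg'
      . '?webid=8480&system=topcams&size=half'
      . '&pubtoken=283d5730c53e406f7ea051c4c89c31fd" '
      . "onmouseout='hideHelper()'/>";

// Hypothetical helper (not from the original post): returns every src
// attribute value found in the given HTML, query string included.
function find_image_srcs($html)
{
    // (["']) remembers which quote opened the value, (.*?) lazily
    // captures the value, and \1 requires the same quote to close it.
    $pattern = '/\bsrc=(["\'])(.*?)\1/i';
    preg_match_all($pattern, $html, $matches);

    // Group 2 holds the attribute values.
    return $matches[2];
}

$srcs = find_image_srcs($html);
print $srcs[0] . "\n";

// The page serves size=half thumbnails; swapping the parameter is an
// assumption about how the full-size image is requested.
$full = str_replace('size=half', 'size=full', $srcs[0]);
print $full . "\n";
```

In the original script, the value would then come from `$images[2][0]` rather than `$images[1][0]`, since the first group now holds the quote character. For anything beyond a quick scrape, an HTML parser such as PHP's `DOMDocument` tends to be more robust than regular expressions.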