lip9000 Posted May 10, 2008
Hey guys, just wondering how I would go about creating a script that scans a URL's source code and brings back data contained within certain regions of that page. For example, how would I get it to return the text inside the site's title tags, and the information contained within a certain field on that page?
So the script would take a URL, scan that page, pull the values from the spots we want (like the title), and bring them back in variables so we can then insert them into a database.
Thanks in advance.
bilis_money Posted May 10, 2008
Try regex.
lip9000 (Author) Posted May 10, 2008
But how do I get PHP to fetch a URL's source code first? Then I use regular expressions on it, right?
xenophobia Posted May 10, 2008
Try cURL.
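A minimal sketch of what the cURL suggestion could look like, assuming PHP's cURL extension is installed and enabled. The function name `fetch_page_source` is hypothetical, and the timeout and redirect settings are illustrative choices rather than anything stated in the thread.

```php
<?php
// Hypothetical helper: fetch a page's HTML source with the cURL extension.
// Returns the response body as a string, or null if the request failed.
function fetch_page_source(string $url): ?string
{
    $ch = curl_init($url);
    if ($ch === false) {
        return null;
    }
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // return the body instead of printing it
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true); // follow HTTP redirects
    curl_setopt($ch, CURLOPT_TIMEOUT, 10);          // give up after 10 seconds (assumed value)

    $body = curl_exec($ch);

    return $body === false ? null : $body;
}
```

Calling `fetch_page_source('http://example.com')` would then give a string that the regex or string functions discussed in this thread can work on.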
lip9000 (Author) Posted May 10, 2008
Cool, thanks. So it would be like:
    <?php print read_file('http://example.com'); ?>
Then use a regular expression to find what's in between the title tags?
lip9000 (Author) Posted May 10, 2008
What the???
    Fatal error: Call to undefined function read_file() in D:\wamp\www\phpnuke\html\tp\url.php on line 9
lip9000 (Author) Posted May 10, 2008
Could someone please explain how I should be using this?
    <?php print read_file('http://example.com'); ?>
I get that error whenever I run it.
BlueSkyIS Posted May 10, 2008
The function is readfile(), not read_file().
lip9000 (Author) Posted May 10, 2008
How would I use preg_replace when I'm not replacing anything? What would the code be to find the value in between the title tags?
    <title>find this info</title>
DarkWater Posted May 10, 2008
    $str = ""; // Whatever you use to get your page source...
    $title = eregi_replace("<title>([a-zA-Z0-9[:punct:][:space:]]+)</title>", "\\1", $str);
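Since the question was how to find a value without replacing anything, here is a sketch using `preg_match()`, which captures a match instead of substituting it. It is offered as an alternative to the snippet above, not as the poster's method: the `ereg` family of functions was removed in PHP 7, so the PCRE functions are the ones available on current versions. The function name `extract_title` and the sample HTML are assumptions for illustration.

```php
<?php
// Hypothetical helper: return the text between <title> and </title>,
// or null when the page has no title element.
function extract_title(string $html): ?string
{
    // "~" is the pattern delimiter, so "/" needs no escaping.
    // i = case-insensitive, s = "." also matches newlines.
    // (.*?) is non-greedy, so it stops at the first closing tag.
    if (preg_match('~<title\b[^>]*>(.*?)</title>~is', $html, $matches)) {
        return trim($matches[1]);
    }
    return null;
}

// Sample input standing in for a downloaded page.
$html = "<html><head><title>find this info</title></head><body></body></html>";
echo extract_title($html), "\n"; // prints "find this info"
```

A regex like this is fine for pulling out one simple element; for anything more structured, an HTML parser such as PHP's DOM extension is usually the safer tool.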
lip9000 (Author) Posted May 10, 2008
Doesn't seem to work. What am I doing wrong?
    $str = readfile('http://youtube.com/watch?v=7pVmmsuuc5U');
    $title = eregi_replace("<title>([a-zA-Z0-9[:punct:][:space:]]+)</title>", "\\1", $str);
    echo $title;
So basically I want it to return the title of the YouTube page.
DarkWater Posted May 10, 2008
Oh, it's a YouTube page.
    $begin = strpos($str, "<title>");
    $end = strpos($str, "</title>");
    // The third argument of substr() is a length, so subtract the start offset.
    $title = substr($str, $begin + 7, $end - $begin - 7);
That works for me.
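A sketch of the same strpos()/substr() idea wrapped in a function, with checks for a missing tag. One detail that is easy to get wrong: the third argument of `substr()` is a length, not an end position, so it has to be computed from the two offsets. The function name `title_between_tags` and the sample string are hypothetical.

```php
<?php
// Hypothetical helper: extract the title using plain string functions.
function title_between_tags(string $html): ?string
{
    $open  = '<title>';
    $begin = stripos($html, $open);           // case-insensitive search
    if ($begin === false) {
        return null;                          // no opening tag found
    }
    $begin += strlen($open);                  // skip past "<title>" itself

    $end = stripos($html, '</title>', $begin);
    if ($end === false) {
        return null;                          // no closing tag found
    }

    // Length of the title text = end offset minus start offset.
    return substr($html, $begin, $end - $begin);
}

$str = "<html><head><title>My video</title></head><body>...</body></html>";
echo title_between_tags($str), "\n"; // prints "My video"
```

Note the strict `=== false` comparisons: `strpos()` and `stripos()` can legitimately return 0 when the match is at the very start of the string.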
lip9000 (Author) Posted May 10, 2008
Sorry to be annoying, but the code just displays the YouTube page and doesn't echo out the title. My code is:
    $str = print readfile('http://youtube.com/watch?v=7pVmmsuuc5U');
    $begin = strpos($str, "<title>");
    $end = strpos($str, "</title>");
    $title = substr($str, $begin+7, $end);
    echo $title;
What's going on? It just displays the page at that URL, with no echoed title.
BlueSkyIS Posted May 10, 2008
Remove the print before readfile.
DarkWater Posted May 10, 2008
Yeah, don't do this:
    $str = print readfile();
lip9000 (Author) Posted May 10, 2008
    $str = readfile('http://youtube.com/watch?v=7pVmmsuuc5U');
    $begin = strpos($str, "<title>");
    $end = strpos($str, "</title>");
    $title = substr($str, $begin+7, $end);
    echo $title;
Even without the print before readfile, it still displays the whole YouTube page and still does not echo out the title! I'm testing locally on WAMP; does it need to be tested live? Or is there still something wrong with the code?
thebadbad Posted May 10, 2008
readfile() writes the file straight to the output and returns the number of bytes read, not the contents. Use file_get_contents(), which returns the contents as a string.
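To make the difference concrete, here is a small sketch comparing the two functions. It reads a temporary local file rather than a live URL so that it runs without network access; the file contents and variable names are made up for the example.

```php
<?php
// Write a small sample page to a temporary file (stands in for a remote URL).
$path = tempnam(sys_get_temp_dir(), 'page');
file_put_contents($path, '<html><head><title>Sample page</title></head></html>');

// readfile() sends the contents to the output and returns the byte count.
// Output buffering is used here only to capture what it would have printed.
ob_start();
$bytes   = readfile($path);
$printed = ob_get_clean();

// file_get_contents() prints nothing and returns the contents as a string.
$source = file_get_contents($path);

unlink($path); // clean up the temporary file

var_dump(is_int($bytes));       // bool(true): a number, not the HTML
var_dump($printed === $source); // bool(true): readfile() printed what file_get_contents() returned
```

This is why `$str = readfile(...)` leaves `$str` holding an integer while the page itself gets dumped to the browser. Passing an http:// URL to `file_get_contents()` works the same way as a path, provided `allow_url_fopen` is enabled in php.ini.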
lip9000 (Author) Posted May 11, 2008
Ahh, cheers mate, got it working :D