vanner Posted April 18, 2006

I'm writing a script that looks for an image in a webpage's source and prints it out. Can someone help me out with the regular expression? Also, is my code going in the right direction for what I want to do?

[code]
<?php
// Fetch webpage
$url = file_get_contents('http://example.com/todays_image/');

// Look for this image, which is changed daily:
// <img src="/todays_image/girls/<RANDOM FOLDER NAME>/<RANDOM FILENAME>.jpg"
preg_match("/<img src=\"\/todays_girl\/girls\/([a-zA-Z_0-9]\/([a-zA-Z_0-9]\+.jpg)/i", $url, $matches);
$file = $matches[1];

echo '<html><head><meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1"><img src="http://example.com/todays_girl/$file"></head></html>';
?>
[/code]
ypirc Posted April 18, 2006

I altered your regex a bit. First of all, in the example you provided the directory is called todays_image, but in the regex you use todays_girl. Also, you are escaping the '+' sign when in fact you should be escaping the period. I also fixed a problem you were having with the '()' grouping. Hope this helps. The rest of your code looks like it should work; however, if you give us the actual link so we can check it out, we might be able to form a better regex.

I'm attaching the modified version of your regex, and then another one that I made separately that uses lookbehind/lookahead assertions.

[code]
// Fixed version of yours:
preg_match("#<img src=\"/todays_image/girls/([a-zA-Z_0-9]+/[a-zA-Z_0-9]+\.jpg)#i", $url, $matches);

// Lookahead/lookbehind assertions:
preg_match('#(?<=<img src="/todays_image/girls/).+\.jpg(?=">)#i', $url, $matches);

// Note: with the lookahead/lookbehind version you use $matches[0], not $matches[1].
[/code]
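To make the $matches[0] versus $matches[1] note concrete, here is a minimal sketch that runs both patterns from the post above against a made-up line of markup. The sample string and the variable names ($html, $viaGroup, $viaLookaround) are assumptions for illustration and do not come from the real page.

```php
<?php
// Hypothetical sample markup standing in for the fetched page source.
$html = '<p><img src="/todays_image/girls/some_folder/some_file.jpg"></p>';

// Capture-group version: the "<img src=..." prefix is part of the overall
// match ($m[0]), so the folder/file path is read from group 1.
preg_match('#<img src="/todays_image/girls/([a-zA-Z_0-9]+/[a-zA-Z_0-9]+\.jpg)#i', $html, $m);
$viaGroup = $m[1] ?? null;

// Lookaround version: the prefix and the closing "> are only asserted,
// not consumed, so the overall match ($m2[0]) is already the bare path
// and there is no group 1 at all.
preg_match('#(?<=<img src="/todays_image/girls/).+\.jpg(?=">)#i', $html, $m2);
$viaLookaround = $m2[0] ?? null;

var_dump($viaGroup, $viaLookaround, isset($m2[1]));
```

Both variables should hold the same relative path; the only difference is which index of the match array it ends up in.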
vanner Posted April 18, 2006

Thanks for the quick reply. I'm trying to grab Maxim Online's daily girl image on http://maximonline.com/todays_girl/. They put the daily image in the http://maximonline.com/todays_girl/girls/ directory.

Example: http://maximonline.com/todays_girl/girls/rachel_nichols/gfd_med.jpg

So basically, I want my script to look at the source of http://maximonline.com/todays_girl and grab:

[code]
<img src="/todays_girl/girls/rachel_nichols/gfd_med.jpg"
[/code]

and print out the image on my webpage. Go ahead and test the code yourself:

[code]
<?php
// Fetch webpage
$url = file_get_contents('http://maximonline.com/todays_girl/');

// Look for this image: <img src="/todays_girl/girls/*/*.jpg"
preg_match('#(?<=<img src="/todays_image/girls/).+\.jpg(?=">)#i', $url, $matches);
$file = $matches[1];

echo '
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
<img src="http://maximonline.com/todays_girl/girls/$file"></head></html>';
?>
[/code]
ypirc Posted April 18, 2006

Well, at first you said the directory name was todays_image, and I looked and saw that it is actually todays_girl. Also, I didn't realize there were extra attributes in the img tag. Use this:

[code]
preg_match('#(?<=<img src="/todays_girl/girls/).+\.jpg(?=")#i', $url, $matches);
[/code]

[quote]
% php -f test.php
Array
(
    [0] => rachel_nichols/gfd_med.jpg
)
[/quote]
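For anyone reusing this, here is a rough sketch of how the pieces could fit together, not a definitive version. It makes a few assumptions beyond the thread: extractImagePath is a hypothetical helper name, the sample markup (including its width and border attributes) is invented, and the pattern swaps the `.+` above for `[^"]+` so the match cannot run past the closing quote if another .jpg appears later on the same line. It also reads $matches[0] and builds the tag by concatenation, because PHP does not expand $file inside a single-quoted string.

```php
<?php
// Returns the image path relative to /todays_girl/girls/, or null when
// nothing matches. The function name is hypothetical, not from the thread.
function extractImagePath(string $html): ?string
{
    // [^"]+ is a swap-in for the thread's .+ (see the note above).
    $pattern = '#(?<=<img src="/todays_girl/girls/)[^"]+\.jpg(?=")#i';
    if (preg_match($pattern, $html, $matches) === 1) {
        return $matches[0]; // lookarounds capture nothing, so use index 0
    }
    return null;
}

// Invented sample line; the real page's extra attributes are not shown
// in the thread.
$sample = '<td><img src="/todays_girl/girls/rachel_nichols/gfd_med.jpg" width="300" border="0"></td>';

$file = extractImagePath($sample);
if ($file === null) {
    $output = 'Image not found.';
} else {
    // Concatenate instead of writing $file inside single quotes.
    $src = 'http://maximonline.com/todays_girl/girls/' . $file;
    $output = '<img src="' . htmlspecialchars($src, ENT_QUOTES) . '">';
}
echo $output, "\n";
```

In the real script, $sample would be replaced by the result of file_get_contents('http://maximonline.com/todays_girl/'), with a check that the fetch did not return false before matching.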
vanner Posted April 18, 2006

Thanks! That worked. Much appreciated.