russelburgraymond Posted June 8, 2009

I have a page whose source code looks similar to this:

<div class="middle">
<div id="displayimage">
<a href="http://aserver.com/347398r"><img src="http://aserver.com/images/no_pic.gif" alt="" /></a>
</div>

Of course, this sits inside a page that is around 110 KB and is crammed with image links, JavaScript, etc. What I want to do is load the page remotely and extract that image link. I tried fopen and several other functions, but they throw an error if the file does not exist. I then tried preg_match_quote to extract this info, but that did not work. While the div is always the same, the image changes on every page.

This is for a script where someone can add their MySpace ID, and it will get their profile image and show it on my page for them. Any help would be greatly appreciated.
russelburgraymond (Author) Posted June 8, 2009

Basically, I want to echo a variable and have http://aserver.com/images/no_pic.gif display. Of course, that image will be different. If I can just get this to display, I will be happy:

<a href="http://aserver.com/347398r"><img src="http://aserver.com/images/no_pic.gif" alt="" /></a>
thebadbad Posted June 8, 2009

Regular expressions:

<?php
// For the live page: uncomment these three lines and insert the real URL.
//ini_set('user_agent', 'Mozilla/5.0 (Windows; U; Windows NT 6.0; da; rv:1.9.0.10) Gecko/2009042316 Firefox/3.0.10');
//$url = 'http://example.com/';
//$html = file_get_contents($url);

// Sample markup for testing; remove this assignment when fetching the real page.
$html = '
<div class="middle">
<div id="displayimage">
<a href="http://aserver.com/347398r"><img src="http://aserver.com/images/no_pic.gif" alt="" /></a>
</div>';

// Capture the src value of the image inside the displayimage div.
preg_match('~<div id="displayimage">\s*<a[^>]+><img src="([^"]+)~i', $html, $matches);
// $matches[1] holds the captured URL (it is only set if the pattern matched).
$link = $matches[1];
?>

Just uncomment the first three lines, insert the real URL, and remove the other $html assignment. Then you should be good.
russelburgraymond (Author) Posted June 8, 2009

You totally rock. Thx. I will give that a try.
russelburgraymond (Author) Posted June 8, 2009

Didn't work. This is what I got.

function add($a)
{
    ini_set('user_agent', 'Mozilla/5.0 (Windows; U; Windows NT 6.0; da; rv:1.9.0.10) Gecko/2009042316 Firefox/3.0.10');
    $html = file_get_contents($a);
    preg_match('~<div id="displayimage">\s*<a[^>]+><img src="([^"]+)~i', $html, $matches);
    $link = $matches[1];
    echo "$link";
    die();
}

All it returns is an empty page.
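An empty page can mean either that the download failed or that the pattern did not match, and the function above cannot tell the two apart. A minimal sketch of one way to find out, assuming a hypothetical helper named extract_profile_image (not from the thread) and the sample markup from the first post; file_get_contents returns false on failure and preg_match returns 0 when nothing matched:

```php
<?php
// Hypothetical helper (not from the thread): returns the image URL, or null
// with the reason stored in $error, instead of silently printing nothing.
function extract_profile_image($html, &$error)
{
    if ($html === false) {
        // file_get_contents() returns false when the page could not be loaded.
        $error = 'download failed';
        return null;
    }

    $pattern = '~<div id="displayimage">\s*<a[^>]+><img src="([^"]+)~i';
    $result = preg_match($pattern, $html, $matches);

    if ($result === false) {
        $error = 'regex error';
        return null;
    }
    if ($result === 0) {
        $error = 'pattern did not match';
        return null;
    }

    $error = null;
    return $matches[1];
}

// With a live page (placeholder URL), the @ suppresses the warning on failure:
// $html = @file_get_contents('http://example.com/');
// $link = extract_profile_image($html, $error);

// Sample markup from the first post, used here so the sketch runs offline.
$sample = '<div id="displayimage">
<a href="http://aserver.com/347398r"><img src="http://aserver.com/images/no_pic.gif" alt="" /></a>
</div>';

$link = extract_profile_image($sample, $error);
echo $link === null ? "No image found: $error\n" : "$link\n";
```

If the message is "pattern did not match", the markup on the live page probably differs from the sample in some small way (quotes, spacing, extra attributes), and the pattern needs more slack.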
.josh Posted June 8, 2009

preg_match('~<div id="displayimage">.*?<img src="([^"]*)~is',$html,$matches);

If that doesn't work, try:

preg_match('~<div[^>]*id\s?=\s?["\']displayimage["\'][^>]*>.*?<img[^>]*src\s?=\s?["\']([^"\']*)~is',$html,$matches);

The second is not as efficient, but it leaves some breathing room for variations in the markup: extra attributes, single or double quotes, and optional whitespace around the = sign.
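To see the difference between the two patterns, here is a small sketch that runs both against a made-up variant of the markup. The variant (single quotes, an extra class attribute, alt before src) is an assumption for illustration only, not what the real page necessarily contains:

```php
<?php
// The two patterns from the post above.
$strict  = '~<div id="displayimage">.*?<img src="([^"]*)~is';
$lenient = '~<div[^>]*id\s?=\s?["\']displayimage["\'][^>]*>.*?<img[^>]*src\s?=\s?["\']([^"\']*)~is';

// Made-up markup variant: single quotes, extra attribute, alt before src.
$html = "<div class='box' id='displayimage'>\n"
      . "<a href='http://aserver.com/347398r'><img alt='' src='http://aserver.com/images/no_pic.gif' /></a>\n"
      . "</div>";

$strictHit  = preg_match($strict, $html, $m1);   // 0: the literal text is not present
$lenientHit = preg_match($lenient, $html, $m2);  // 1: the variations are tolerated

echo "strict matched:  ", $strictHit, "\n";
echo "lenient matched: ", $lenientHit, "\n";
if ($lenientHit === 1) {
    echo "captured: ", $m2[1], "\n";
}
```

The strict pattern only matches when the page contains exactly the text `<div id="displayimage">` and `<img src="`, so any of these small differences makes it fail.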
russelburgraymond (Author) Posted June 9, 2009

The second one works. You guys totally rock. This is awesome.
russelburgraymond (Author) Posted August 14, 2009

preg_match('~<div[^>]*id\s?=\s?["\']displayimage["\'][^>]*>.*?<img[^>]*src\s?=\s?["\']([^"\']*)~is',$html,$matches);

Can you guys point me to where I can find a breakdown of this? For instance, what does [^'] mean? I can't seem to find anything about it in the PHP manual.
thebadbad Posted August 14, 2009

Here's a regular expressions reference I often use: http://www.regular-expressions.info/reference.html
.josh Posted August 14, 2009

[pre]
~<div[^>]*id\s?=\s?["\']displayimage["\'][^>]*>.*?<img[^>]*src\s?=\s?["\']([^"\']*)~is

~              start of pattern delimiter
<div           literal match
[^>]*          match 0 or more of anything that is not a >
id             literal match
\s?            match 0 or 1 whitespace character (space, tab, or newline)
=              literal match
\s?            match 0 or 1 whitespace character (space, tab, or newline)
["\']          match a single or double quote (single quote escaped since it is used to wrap the pattern)
displayimage   literal match
["\']          match a single or double quote (single quote escaped since it is used to wrap the pattern)
[^>]*          match 0 or more of anything that is not a >
>              literal match
.*?            non-greedy match of 0 or more of anything
<img           literal match
[^>]*          match 0 or more of anything that is not a >
src            literal match
\s?            match 0 or 1 whitespace character (space, tab, or newline)
=              literal match
\s?            match 0 or 1 whitespace character (space, tab, or newline)
["\']          match a single or double quote (single quote escaped since it is used to wrap the pattern)
(              start of a group/match capture
[^"\']*        match 0 or more of anything that is not a single or double quote
)              end of a group/match capture
~              end of pattern delimiter
i              modifier to make matching case-insensitive
s              modifier to make the dot (.) match newline characters too, so .*? can span lines
[/pre]
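The same breakdown can also live inside the pattern itself. PCRE's x modifier ignores whitespace in the pattern and treats # as the start of a comment, so a sketch of the pattern written that way could look like the following. The layout and comments are my own, and the sample markup is the one from the first post:

```php
<?php
// Same pattern as above, with the x modifier added so each piece can carry
// its own comment. The comments avoid quote characters and the ~ delimiter,
// since those would end the PHP string or the pattern early.
$pattern = '~
    <div [^>]*               # opening div tag, then any other attributes
    id \s? = \s? ["\']       # id attribute, optional whitespace, opening quote
    displayimage ["\']       # the id value and its closing quote
    [^>]* >                  # the rest of the div tag
    .*?                      # as little as possible, newlines included
    <img [^>]*               # the first img tag that follows
    src \s? = \s? ["\']      # src attribute and its opening quote
    ( [^"\']* )              # group 1: everything up to the next quote
~isx';

// Sample markup from the first post.
$html = '<div class="middle">
<div id="displayimage">
<a href="http://aserver.com/347398r"><img src="http://aserver.com/images/no_pic.gif" alt="" /></a>
</div>';

$found = preg_match($pattern, $html, $matches);
if ($found === 1) {
    echo $matches[1], "\n";
}
```

Because [^'] and similar negated classes mean "any character except the ones listed", the capture group stops at the first quote it meets, which is the end of the src value.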