
I have a page whose source code looks similar to this:

 

<div class="middle">				
	<div id="displayimage">
		<a href="http://aserver.com/347398r"><img src="http://aserver.com/images/no_pic.gif" alt="" /></a>
	</div>

 

Of course, this sits inside a page that is actually around 110 KB and is crammed with image links, JavaScript, etc. What I want to do is load the page remotely and extract that image link.

 

I tried fopen and several others, but they throw an error if the file does not exist. I then tried preg_match_quote to extract this info, but that did not work. The div is always the same, but the image changes on every page.

 

This is for a script where someone can add their MySpace ID, and it will get their profile image and show it on my page for them. Any help would be greatly appreciated.


Basically, I want to echo a variable and have

http://aserver.com/images/no_pic.gif

display. Of course, that image will be different.

 

If I can just get this

<a href="http://aserver.com/347398r"><img src="http://aserver.com/images/no_pic.gif" alt="" /></a>

 

to display I will be happy.

Regular expressions:

 

<?php
//ini_set('user_agent', 'Mozilla/5.0 (Windows; U; Windows NT 6.0; da; rv:1.9.0.10) Gecko/2009042316 Firefox/3.0.10');
//$url = 'http://example.com/';
//$html = file_get_contents($url);
$html = '   <div class="middle">            
      <div id="displayimage">
         <a href="http://aserver.com/347398r"><img src="http://aserver.com/images/no_pic.gif" alt="" /></a>
      </div>';
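// Capture the src of the <img> that follows the displayimage div and its link.
if (preg_match('~<div id="displayimage">\s*<a[^>]+><img src="([^"]+)~i', $html, $matches)) {
    $link = $matches[1];
} else {
    $link = null; // pattern did not match, so there is no image URL
}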
?>

Just uncomment the first three lines, insert the real URL and remove the other $html. Then you should be good.
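
For what it's worth, here is a minimal sketch of the same idea with failure checks added. file_get_contents() returns false when the page cannot be loaded, and $matches[1] is not set when the pattern does not match, so both cases are handled explicitly. The helper names fetch_page() and extract_display_image() are made up for illustration, and the sample markup is the snippet from the first post.

```php
<?php
// Hypothetical helper: load a page, returning null instead of raising a warning.
function fetch_page(string $url): ?string
{
    // @ suppresses the warning; file_get_contents() returns false on failure.
    $html = @file_get_contents($url);
    return $html === false ? null : $html;
}

// Hypothetical helper: return the image URL inside the displayimage div, or null.
function extract_display_image(string $html): ?string
{
    $pattern = '~<div id="displayimage">\s*<a[^>]+><img src="([^"]+)~i';
    if (preg_match($pattern, $html, $matches) === 1) {
        return $matches[1];
    }
    return null;
}

// Sample markup copied from the first post.
$html = '<div class="middle">
    <div id="displayimage">
        <a href="http://aserver.com/347398r"><img src="http://aserver.com/images/no_pic.gif" alt="" /></a>
    </div>';

$link = extract_display_image($html);
echo $link === null ? "No image found\n" : $link . "\n";
```

To use it against a live page, pass the result of fetch_page() to extract_display_image() after checking that it is not null.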

 

Didn't work. This is what I got.

 

function add($a)
{
ini_set('user_agent', 'Mozilla/5.0 (Windows; U; Windows NT 6.0; da; rv:1.9.0.10) Gecko/2009042316 Firefox/3.0.10');
$html = file_get_contents($a);
preg_match('~<div id="displayimage">\s*<a[^>]+><img src="([^"]+)~i', $html, $matches);
$link = $matches[1];
echo "$link";
die();
}

All it returns is an empty page.

preg_match('~<div id="displayimage">.*?<img src="([^"]*)~is',$html,$matches);

 

If that doesn't work, try:

 

preg_match('~<div[^>]*id\s?=\s?["\']displayimage["\'][^>]*>.*?<img[^>]*src\s?=\s?["\']([^"\']*)~is',$html,$matches);

 

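The second is not as efficient, but it gives a bit of breathing room for variations in the markup: extra attributes on the tags, single or double quotes, and optional whitespace around the = sign.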
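
To see the difference between the two patterns, here is a rough sketch that runs both against a few markup variations. The sample strings and the first_capture() helper are invented for illustration; only the two patterns come from the post above.

```php
<?php
$strict  = '~<div id="displayimage">.*?<img src="([^"]*)~is';
$lenient = '~<div[^>]*id\s?=\s?["\']displayimage["\'][^>]*>.*?<img[^>]*src\s?=\s?["\']([^"\']*)~is';

// Made-up markup variations (assumptions, not taken from the real page).
$samples = [
    'as posted'     => '<div id="displayimage"><a href="#"><img src="a.gif" alt="" /></a></div>',
    'single quotes' => "<div id='displayimage'><a href='#'><img src='b.gif' alt='' /></a></div>",
    'extra attrs'   => '<div class="x" id="displayimage"><a href="#"><img alt="" src="c.gif" /></a></div>',
];

// Hypothetical helper: return the first capture group, or null if there is no match.
function first_capture(string $pattern, string $html): ?string
{
    return preg_match($pattern, $html, $m) === 1 ? $m[1] : null;
}

$results = [];
foreach ($samples as $label => $html) {
    $results[$label] = [
        'strict'  => first_capture($strict, $html),
        'lenient' => first_capture($lenient, $html),
    ];
    printf(
        "%-14s strict=%s lenient=%s\n",
        $label,
        var_export($results[$label]['strict'], true),
        var_export($results[$label]['lenient'], true)
    );
}
```

On these samples the first pattern only matches the markup exactly as posted, while the second matches all three.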

  • 2 months later...


preg_match('~<div[^>]*id\s?=\s?["\']displayimage["\'][^>]*>.*?<img[^>]*src\s?=\s?["\']([^"\']*)~is',$html,$matches);

 

Can you guys point me to where I can find a breakdown of this? For instance, what does [^'] mean? I can't seem to find anything about it in the PHP manual.

[pre]
~<div[^>]*id\s?=\s?["\']displayimage["\'][^>]*>.*?<img[^>]*src\s?=\s?["\']([^"\']*)~is

~              start of pattern delimiter
<div           literal match
[^>]*          match 0 or more characters that are not a >
id             literal match
\s?            match 0 or 1 whitespace character (space, tab, newline, etc.)
=              literal match
\s?            match 0 or 1 whitespace character
["\']          match a single or double quote (single quote escaped since it is used to wrap the pattern)
displayimage   literal match
["\']          match a single or double quote (single quote escaped since it is used to wrap the pattern)
[^>]*          match 0 or more characters that are not a >
>              literal match
.*?            non-greedy match of 0 or more of any character
<img           literal match
[^>]*          match 0 or more characters that are not a >
src            literal match
\s?            match 0 or 1 whitespace character
=              literal match
\s?            match 0 or 1 whitespace character
["\']          match a single or double quote (single quote escaped since it is used to wrap the pattern)
(              start of a group/match capture
[^"\']*        match 0 or more characters that are not a single or double quote
)              end of a group/match capture
~              end of pattern delimiter
i              modifier to make matching case-insensitive
s              modifier to make the dot (.) match newline characters as well
[/pre]
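
As a small illustration of the two modifiers at the end of the pattern, the sketch below runs a shortened version of the pattern with and without each one. The sample string is made up: upper-case tags with line breaks between them.

```php
<?php
// Made-up sample: upper-case tags with newlines between them.
$html = "<DIV ID=\"displayimage\">\n<a href=\"#\">\n<IMG SRC=\"pic.gif\" /></a></DIV>";

// Without s, "." does not match newlines, so .*? cannot cross the line breaks.
$without_s = preg_match('~<div id="displayimage">.*?<img src="([^"]*)~i', $html, $m1);

// With s, "." also matches newlines, so the <img> on a later line is reached.
$with_s = preg_match('~<div id="displayimage">.*?<img src="([^"]*)~is', $html, $m2);

// Without i, the lower-case pattern does not match the upper-case tags.
$without_i = preg_match('~<div id="displayimage">.*?<img src="([^"]*)~s', $html, $m3);

var_dump($without_s, $with_s, $without_i);
if ($with_s === 1) {
    echo $m2[1], "\n";
}
```

Only the version with both modifiers matches this sample. The [^"]* inside the parentheses is what stops the capture at the closing quote of the src attribute.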
