Get the value of "src" from an html tag.


daydreamer

Hi,

 

I am trying to get the URL from the src= of this html tag:

 

<img src="/folder/one/029225.jpg" alt="this is a photo" id="uniqueid" width="20" height="20" class="snoopyiscool" />

 

This is what I have so far:

 

<?php
preg_match("/(<img.*id=\"uniqueid\".*/>)/", $xxx, $matches);
?>

 

I get "Unknown modifier '>' ".

 

What reg exp should I be using? Thanks


/(<img.*id=\"uniqueid\".*/>)/

 

 

 

Since / was used to start your pattern, the next unescaped / will be taken as the end of your pattern.

 

 

The basic syntax of a regexp is:

 

<start/end character><actual pattern><start/end character><modifiers>

 

 

Since you have

 

/(<img.*id=\"uniqueid\".*/>)/

 

The / at the end of .*/ is seen as the end of the pattern, so the regexp engine treats everything after it, >)/, as modifiers. Since > is not a valid modifier, you get "Unknown modifier '>'".

 

 

Whenever you use the start/end character inside a pattern, you must escape it (in PHP, the escape sequence is \<char to be escaped>).

 

/(<img.*id=\"uniqueid\".*\/>)/

 

Or, you can always use a start/end char that you don't think you're likely to use in the pattern:

 

~(<img.*id=\"uniqueid\".*/>)~
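
To check both fixes side by side, here is a minimal sketch that runs the two patterns against the tag from the original post. The variable names ($html, $escaped, $tilde) are just for illustration, and the patterns are written in single quotes so the double quotes inside them don't need escaping.

```php
<?php
// Sample input taken from the original question.
$html = '<img src="/folder/one/029225.jpg" alt="this is a photo" id="uniqueid" width="20" height="20" class="snoopyiscool" />';

// Option 1: keep / as the delimiter and escape the literal / inside the pattern.
$escaped = '/(<img.*id="uniqueid".*\/>)/';

// Option 2: use ~ as the delimiter so the literal / needs no escaping.
$tilde = '~(<img.*id="uniqueid".*/>)~';

// preg_match() returns 1 on a match, 0 on no match, false on error.
$okEscaped = preg_match($escaped, $html, $m1);
$okTilde   = preg_match($tilde, $html, $m2);

var_dump($okEscaped, $okTilde);  // return values of the two calls
var_dump($m1[0] === $m2[0]);     // whether both patterns matched the same text
```

Neither version triggers the "Unknown modifier" warning, because the only unescaped delimiter left is the one that really ends the pattern.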

 

 

 

Anyway, on to why the pattern isn't working.

 

Let's assume your pattern were ~(<img.*id=\"uniqueid\".*/>)~.

 

Whenever you use (), whatever is matched inside the () is captured into a result, but in this case your entire pattern sits inside a single set of capturing parentheses.

 

So, assuming the string matches the pattern, your captured result will just be the entire match again (the whole img tag).  Not very useful.
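
A quick sketch of what that looks like in practice, using the tag from the original post ($html and $matches are illustrative names): $matches[0] always holds the full match, and $matches[1] holds the first capture group, which here wraps the whole pattern.

```php
<?php
// Sample input taken from the original question.
$html = '<img src="/folder/one/029225.jpg" alt="this is a photo" id="uniqueid" width="20" height="20" class="snoopyiscool" />';

// The whole pattern is wrapped in one capturing group.
preg_match('~(<img.*id="uniqueid".*/>)~', $html, $matches);

// $matches[0] is the full match, $matches[1] is capture group 1.
var_dump(count($matches));
var_dump($matches[0] === $matches[1]);  // the group captures nothing the full match doesn't already give you
```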

 

Anyway, you basically want to capture just the src part, so you should think about what an img tag is:

 

<img attributes>

 

Technically src is an attribute just like style, height, width, border, and so on, so the attributes can appear in any order; the tag will not always look like <img src="...">.

 

 

Because of that, I would look at it this way:

 

 

<img{stuff here}src="{what you want to get}"{anything that's not >}>

 

So, since you don't care about the {stuff here} and {anything that's not >} parts, you can just blindly match those:

 

 

~<img.*?src="([^"]+)"[^>]*>~
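
Plugged into preg_match, that pattern could be used roughly like this. It is only a sketch: $html, $pattern and $src are illustrative names, and the input is the tag from the original post.

```php
<?php
// Sample input taken from the original question.
$html = '<img src="/folder/one/029225.jpg" alt="this is a photo" id="uniqueid" width="20" height="20" class="snoopyiscool" />';

// Only the part between the quotes after src= is inside the capturing group.
$pattern = '~<img.*?src="([^"]+)"[^>]*>~';

// $matches[1] holds the captured src value when the pattern matches.
$src = preg_match($pattern, $html, $matches) === 1 ? $matches[1] : null;

echo $src, PHP_EOL;
```

Keep in mind that this pattern does not look at the id attribute at all, so on a full page it picks up the first img tag that has a src, whichever one that is.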


Assuming you only want to fetch the src value of an image tag with the id="uniqueid", you could resort to DOM / XPath:

 

Example:

$str = <<<EOF
<img src="/folder/one/029225.jpg" alt="this is a photo" id="uniqueid" width="20" height="20" class="snoopyiscool" />
EOF;

$dom = new DOMDocument;
@$dom->loadHTML($str); // @ hides warnings from sloppy markup; to read a page instead, change loadHTML to loadHTMLFile and pass a real url
$xpath = new DOMXPath($dom);
// select every img element whose id attribute is "uniqueid"
$imgNodes = $xpath->query('//img[@id = "uniqueid"]');
foreach ($imgNodes as $img) {
    echo $img->getAttribute('src');
}


Are there any advantages to doing it the DOMDocument way, nrg_alpha? Can you use javascript functions with the DOMDocument?

 

No.  DOMDocument is an extension that takes a string (loaded from a file or passed in directly), parses it, and builds a DOM out of the tags in it.  It more or less lets you access and manipulate things in the string the same way you can with javascript on the DOM of a rendered page.  You cannot use javascript functions themselves, but the extension does have methods and properties that mimic a lot of what you can do with javascript (like getElementById, etc.).

 

It is also important to note that it does not manipulate, "in real time", the DOM that is generated on the client. In other words, as usual, php is server-side, and as far as it is concerned, everything that is output to the client is arbitrary text.
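
To give a feel for how close the PHP methods are to their javascript counterparts, here is a small sketch that uses getElementsByTagName and getAttribute on a shortened version of the tag from the original post. The variable names are illustrative, and libxml_use_internal_errors() is used here in place of @ to keep parser warnings quiet. Everything runs on the server against a plain string; nothing in the browser is touched.

```php
<?php
// Shortened version of the tag from the original question.
$html = '<img src="/folder/one/029225.jpg" alt="this is a photo" id="uniqueid" />';

libxml_use_internal_errors(true);  // collect parser warnings instead of printing them
$dom = new DOMDocument;
$dom->loadHTML($html);
libxml_clear_errors();

$src = null;
// Walk the img elements, much like document.getElementsByTagName('img') in javascript.
foreach ($dom->getElementsByTagName('img') as $img) {
    if ($img->getAttribute('id') === 'uniqueid') {
        $src = $img->getAttribute('src');
        break;
    }
}

echo $src, PHP_EOL;
```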

