I have cURL grabbing a web page, but I need to parse the page and get a link tag. Is this the best way to grab the tag (I still haven't tested it)?

 

if(preg_match("~<link(.*?)rel=\"image_src\"(.*?)href=\"(.*?)\"~",$opt,$matches)){
$title = $matches[3]; // group 3 is the href value; group 1 is only the text between "<link" and rel
}else{
$title = 'No Title Found!';
}

https://forums.phpfreaks.com/topic/143567-link-tag-with-certain-attributes/

OK, I modified it, this does work, but is it the best way to do it?

 

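// Check every <link> tag, not just the first, and keep each result in its own variable
if(preg_match_all("~<link[^>]+>~",$opt,$tags)){
	foreach($tags[0] as $tag){
		if(preg_match("~rel=\"image_src\"~",$tag) && preg_match("~href=\"(.*?)\"~",$tag,$href)){
			$imgSrc = '<img src="'.$href[1].'" />'; // $href[1] is the captured URL
			break;
		}
	}
}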
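
On the "is it the best way" question, one alternative to regex is to let an HTML parser find the tag. The following is a minimal sketch, assuming PHP's DOM extension is available; `findLinkHref` is a hypothetical helper name and the sample markup in `$opt` is made up for illustration. It swaps in `DOMDocument` rather than the `preg_match` approach used above.

```php
<?php
// Hypothetical helper: returns the href of the first <link> whose rel
// attribute equals $rel (case-insensitive), or null if there is none.
function findLinkHref(string $html, string $rel): ?string
{
    $doc = new DOMDocument();
    // Real-world HTML is rarely valid; collect parser warnings quietly.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    foreach ($doc->getElementsByTagName('link') as $link) {
        if (strcasecmp($link->getAttribute('rel'), $rel) === 0) {
            return $link->getAttribute('href');
        }
    }
    return null;
}

// Made-up page content standing in for the cURL response.
$opt = '<html><head><link rel="stylesheet" href="/css/site.css" />'
     . '<link href="/img/thumb.png" rel="image_src" /></head><body></body></html>';

$href = findLinkHref($opt, 'image_src');
$imgSrc = $href !== null ? '<img src="' . htmlspecialchars($href) . '" />' : '';
```

Because the parser reads attributes by name, the order of rel and href inside the tag does not matter here.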

Don't "yeah but" me, son.  You've been here long enough and have posted more than enough to know how it goes.  You didn't get a regex that accounts for that, because you didn't ask for it.  You didn't even say what you were trying to get inside the link tag.  I just made a guess.  Only thing you did actually say is that you were trying to "grab a link tag".  Technically I could have given you ~<link[^>]*>~ and sent you on your way.  It's obvious to you that that's not what you want, because you know what you want.  We don't.  We're not psychic.  So we've gone from:

 

"I need to grab a link tag"

~<link[^>]*>~

 

to

 

"I need to get the stuff in-between the quotes of the href="..." inside a link tag" (that extra part I assumed from your coding efforts, not from anything you actually bothered to mention)

~<link.+?href="([^"]*)"[^>]*>~

 

Hopefully you can see the difference between those two patterns, or at least see that they are different, because your information was more specific.  Now you want to be more specific and only grab the info from tags that contain rel="image_src".  So is that the exact thing that's going to be in there, or is image_src going to be saying different things, and you really mean to say rel="anythingcanbehere" ?
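
To make the difference concrete, here is a small sketch that runs the two patterns above, plus a third that also requires rel="image_src", against the same string. The sample markup is made up for illustration, and the third pattern assumes that rel appears before href and that the value is exactly image_src.

```php
<?php
// Made-up sample markup for illustration.
$html = '<link rel="stylesheet" href="/css/site.css" />'
      . '<link rel="image_src" href="/img/thumb.png" />';

// 1. "I need to grab a link tag": matches the first whole <link ...> tag.
preg_match('~<link[^>]*>~', $html, $m);
$wholeTag = $m[0];

// 2. "I need the href inside a link tag": captures the first href value found.
preg_match('~<link.+?href="([^"]*)"[^>]*>~', $html, $m);
$firstHref = $m[1];

// 3. "I need the href, but only from the tag with rel="image_src"".
//    Assumes rel comes before href inside the tag.
preg_match('~<link[^>]*rel="image_src"[^>]*href="([^"]*)"[^>]*>~', $html, $m);
$imageSrc = $m[1];
```

Each added requirement changes which tag matches and what gets captured, which is why the exact requirement needs to be stated up front.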

This is why I cannot emphasize enough that people posting a regex question need to stop and think about exactly what they are looking for. It is extremely commonplace, especially here in the regex forum, for people to ask one version of their problem, have it answered, and then come back with requirements that weren't specified initially. Truth be told, it isn't fair to the people helping out, as more often than not it turns out to be a waste of time in the end, because the solution isn't adequate (simply due to miscommunication).

 

There is a reason for this sticky thread. People really ought to adhere to what is mentioned within that thread. It makes logical sense. While I am certainly not trying to pick sides, it feels almost like an epidemic developing. I often request an example string or two and the end results they are looking for (as in, let's pretend we plugged in the correct regex: show me what the absolute end results should look like [not in regex form, but in string form, or array matched/captured form], and include notes on what in the string might be dynamic, and what MUST be matched/captured/what have you). It saves time and frustration for both parties involved.

Sorry...

 

I have another one, I just can't seem to grab it, please help.

 

preg_match("~<(link(.+?)rel=\"(shortcut icon|icon)\"[^>])*>~");

 

Here is what I would like this preg_match to match (if this tag exists in the HTML document):

 

- rel="shortcut icon" OR rel="icon"

 

I can't get it because:

- rel and href are sometimes reversed

- all the link tags are on the same line

 

 

Thanks for the help...

$str = '<link href="/css/somefile.css" rel="stylesheet" type="text/css" /><link rel="shortcut icon" href="/favicon.ico" />';
preg_match('#<link.+?(rel=([\'"])(?:shortcut )?icon\2)#i', $str, $match);
echo $match[1];

 

Output:

 

rel="shortcut icon"

 

If it is simply a boolean test you are looking for, you can use:

if(preg_match('#<link.+?rel=([\'"])(?:shortcut )?icon\1#i', $str)){
   echo 'true';
} else {
   echo 'false';
}

 

Thanks, can I get that to return the href value though?

$str = '<link rel="shortcut icon" href="/favicon.ico" />';
if(preg_match('#<link.+?rel=([\'"])(?:shortcut )?icon\1.+?href=([\'"])(.+?)\2[^/>]*/?>#i', $str, $match)){ // lazy (.+?) stops at the closing quote; a backreference cannot be used inside a character class
   echo $match[3];
} else {
   echo 'No match found.';
}

Will that work if the tag looks like one of these?

 

<link rel="shortcut icon" href="/favicon.ico" />

<link href="/favicon.ico" rel="shortcut icon" />

 

I have seen sites where rel is before href and vice versa, so I was just wondering if that will work on those sites?

Ok, Little Guy.. I supplied the meat and potatoes.. you supply the gravy:

 

#$str = '<link rel="shortcut icon" href="/favicon.ico" />';
#$str = '<link href="/favicon.ico" />';
#$str = '<link rel="shortcut icon" />';
$str = '<link href="/favicon.ico" rel="shortcut icon" />';
if(preg_match('#<link(?:.+?(?:href|rel)=[\'"](?:(?<icon>(?:shortcut )?icon)|(?<path>[^\"]+))[\'"])+.*>#i', $str, $match)){
    foreach($match as $key=>$val){
        if(empty($key) || $key!='icon' && $key != 'path' || $val == ''){
            unset($match[$key]);
        }
    }
    echo '<pre>'.print_r($match, true);
} else {
    echo 'Error... no valid link tag found...';
}

 

I gave the captures some names to make it easier for you to choose which one to use (icon and path) [I really didn't need to do that; it's just clearer labelling for you]. Unfortunately, the regex engine by nature will still assign values to $1, $2, etc., so I just strip out any empty, non-icon and non-path results. What you are left with is simply [icon] and/or [path]. Configure and fine-tune this to your liking.
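
If only the href is needed and the order of rel and href is unknown, another option is a lookahead that checks for the rel attribute without consuming the tag. This is a minimal sketch rather than a replacement for the pattern above: `findIconHref` is a hypothetical helper name, and it assumes quoted attribute values and a tag that contains no ">" inside an attribute.

```php
<?php
// Hypothetical helper: returns the favicon href, or null if no
// <link rel="icon"> / <link rel="shortcut icon"> tag is found.
function findIconHref(string $html): ?string
{
    $pattern = '#<link\b'
             // Lookahead: somewhere in this tag there must be rel="icon"
             // or rel="shortcut icon" (single or double quotes).
             . '(?=[^>]*\brel=([\'"])(?:shortcut )?icon\1)'
             // Then capture the href value, wherever it sits in the tag.
             . '[^>]*\bhref=([\'"])(.*?)\2'
             . '[^>]*>#i';

    return preg_match($pattern, $html, $m) ? $m[3] : null;
}

$a = findIconHref('<link rel="shortcut icon" href="/favicon.ico" />');
$b = findIconHref('<link href="/favicon.ico" rel="icon" />');
$c = findIconHref('<link href="/css/somefile.css" rel="stylesheet" type="text/css" />');
```

Using `[^>]*` instead of `.+?` keeps each attempt inside a single tag, so a stylesheet link earlier on the same line is not merged with the icon link.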

 

Cheers
