ira19

extract url from anchor tag

Hello,
  Can anyone please tell me how to extract the URL from an anchor tag?
E.g.: <a href="www.url.com/index.html">HTML Code </a>
I want both www.url.com/index.html and the link text "HTML Code".
Please help!

$tag = '<a href="www.url.com/index.html">';

$url = ereg_replace('/<a href="(?!")">/','\1',$tag);

I get an error with:
$tag = '<a href="www.url.com/index.html">';
$url = ereg_replace('/<a href="(?!\")">/','\1',$tag);

Besides, the tag can use either double quotes, as in <a href="www.something.com/index.php">,
or single quotes, as in <a href='www.something.com'>

try this:
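[code]
<?php
$string = "<a href=\"www.url.com/index.html\">My URL</a>\n";
// Group 1 captures the href value, group 2 the link text.
if (preg_match('|<a.+?href="(.+?)".*?>(.+?)</a>|i', $string, $match)) {
    $url = $match[1];  // www.url.com/index.html
    $text = $match[2]; // My URL
}
?>
[/code]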

just tried it, and it seems to work well.
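
The pattern above only handles double-quoted attributes, while an earlier post noted that the href may also be single-quoted. Here is a minimal sketch that accepts either quote style. The function name `extract_first_link` is made up for illustration, and the type hints assume PHP 7.1 or later.

```php
<?php
// Hypothetical helper: returns [url, text] for the first anchor found, or null.
function extract_first_link(string $html): ?array
{
    // (["\']) captures the opening quote; \1 requires the same quote to close.
    $pattern = '~<a\s[^>]*?href\s*=\s*(["\'])(.*?)\1[^>]*>(.*?)</a>~is';
    if (preg_match($pattern, $html, $m)) {
        return [$m[2], trim($m[3])];
    }
    return null;
}

print_r(extract_first_link('<a href="www.something.com/index.php">HTML Code </a>'));
print_r(extract_first_link("<a href='www.something.com'>Home</a>"));
```

The backreference keeps a stray apostrophe inside a double-quoted URL from ending the match early. Unquoted attribute values are not handled by this sketch.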

[quote author=ToonMariner link=topic=106459.msg425873#msg425873 date=1157110986]
$tag = '<a href="www.url.com/index.html">';
$url = ereg_replace('/<a href="(?!")">/','\1',$tag);
[/quote]

ereg does not support negative lookaheads, and even if you switch to preg, the expression will not work. You're telling it to match a double quote, then assert that the next character is not a double quote, then match a double quote: an impossibility.
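
To illustrate the point, here is a small sketch that runs the quoted pattern through preg_match and then swaps the lookahead for a negated character class. The variable names `$matched` and `$url` are only for the example.

```php
<?php
$tag = '<a href="www.url.com/index.html">';

// The quoted idea, moved to preg: a quote, "not followed by a quote", a quote.
// The lookahead and the literal quote after it contradict each other.
$matched = preg_match('/<a href="(?!")">/', $tag); // 0: can never match

// A negated character class captures everything up to the closing quote.
$url = preg_replace('/<a href="([^"]*)">/', '$1', $tag);
echo $url, "\n";
```

Here `[^"]*` consumes characters until the next double quote, which is probably what the lookahead was meant to express.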

Thanks for the help, but let me explain in detail exactly what I need.
  I want to search for a domain, say www.domainname.com, in the pages of other sites, say www.ab.com and www.cc.com.
  When I find that the domain name is present, I want to retrieve the link as well as its text,
i.e. <a href="http://www.domainname.com">My domain</a>
The anchor tag may also look like
<a href="http://www.domainname.com" target="_blank">My domain</a>
or
<a href="http://www.domainname.com"><font size="2">My domain</font></a>
and the output should be [b]www.domainname.com[/b] and [b]My domain[/b].

Please help!


Here's how I understood it:
you want to get
    [B]<a href="an_absolute_path/optional/folder/file">an_absolute_path</a> <strong>label_name</strong>[/B]
from something like
    [B]<a href="an_absolute_path/optional/folder/file"><span>label_name</span></a>[/B]


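[CODE]

// The /e modifier was deprecated in PHP 5.5 and removed in PHP 7, so a callback does the replacement.
$rex = '/(<a.+?href="https?:\/\/([^"]+?)(?:\/[^"]+)*".*?>)(.+?)<\/a>/i' ;
$res = preg_replace_callback( $rex, function ($m) {
    // $m[1] = opening tag, $m[2] = host part of the URL, $m[3] = inner HTML of the link
    return $m[1] . $m[2] . '</a> <strong>' . strip_tags($m[3]) . '</strong>' ;
}, $string ) ;

[/CODE]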
But I probably didn't get it right  :)
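
As an alternative to the regex above, here is a rough sketch that uses PHP's DOM extension to pull out the host and link text the original poster asked for. It assumes the page HTML has already been fetched into a string and that the DOM extension is enabled. The function name `find_links_to` and the sample `$page` are invented for the example.

```php
<?php
// Hypothetical helper: collects [host, text] pairs for links pointing at $domain.
function find_links_to(string $html, string $domain): array
{
    // Collect parser warnings internally instead of printing them.
    libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();

    $found = [];
    foreach ($doc->getElementsByTagName('a') as $a) {
        // parse_url() only yields a host when the href includes a scheme.
        $host = parse_url($a->getAttribute('href'), PHP_URL_HOST);
        if (is_string($host) && strcasecmp($host, $domain) === 0) {
            // textContent drops nested tags such as <font>, keeping the text.
            $found[] = [$host, trim($a->textContent)];
        }
    }
    return $found;
}

$page = '<p><a href="http://www.domainname.com" target="_blank">My domain</a> '
      . '<a href="http://www.domainname.com"><font size="2">My domain</font></a> '
      . '<a href="http://www.ab.com">Other</a></p>';
print_r(find_links_to($page, 'www.domainname.com'));
```

Because the parser handles attribute order, quoting style, and nested tags, the extra `target` attribute and the `<font>` wrapper need no special cases. Links written without a scheme, such as `www.url.com/index.html`, are skipped by this sketch.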
