Replace internal links


pplexr

I have a large number of HTML files that contain many internal links, and I want to use some of them on another domain.

I want to replace internal links like these:

<a href="/folder/images/myimage.gif">my image</a>
<a href="myfile.html">my File</a>
<a href="javascript:submitSearch($('srchFormToolbar'));" class="hdSearchAlt">Search</a>

so that only the link text is left:

my image
my File
Search

 

External links like these should stay the same:

<a href="www.domain.com/folder/images/myimage.gif">my image</a>
<a href="http://www.domain.com/folder/images/myimage.gif">my image</a>
<a href="domain.com/folder/images/myimage.gif">my image</a>

 

Can I do this with regex?

 

Thanks


Something like this? This is only a base to work from--not a full-featured approach.

 

<pre>
<?php
$tests = array(
	'<a href="/folder/images/myimage.gif">my image</a>',
	'<a href="myfile.html">my File</a>',
	'<a href="javascript:submitSearch($(\'srchFormToolbar\'));" class="hdSearchAlt">Search</a>'
);
$domain = 'http://www.mydomain.com';
foreach ($tests as $test) {
	echo htmlspecialchars($test);
	echo '<br>';
	echo htmlspecialchars(
		preg_replace(
			// No /e modifier here: it was removed in PHP 7, and a plain replacement string does the same job.
			'#(?<=href=")(?!javascript:)(?:http://|/)?([^"]+)#',
			$domain . '/$1',
			$test
		)
	);
	echo '<hr>';
}
?>
</pre>

No, what I want is something like this:

 

(<a href="(?!(http://|ftp://|www.))(?:[\w\W])*?>([\w\W]*?)</a>)

<a href="www.domain.com/folder/images/myimage.gif">myimage.gif</a>         FAIL
<a href="http://www.domain.com/folder/images/myimage.gif">myimage.gif</a>  FAIL
<a href="ftp://www.domain.com/folder/images/myimage.gif">myimage.gif</a>     FAIL

<a href="javascript:submitSearch($(\'srchFormToolbar\'));" class="hdSearchAlt">Search</a>  PASS
<a href="/folder/images/myimage.gif">myimage.gif</a>         PASS

 


<?php
$pat = '/(<a href="(?!(http:\/\/|ftp:\/\/|www.))(?:[\w\W])*?>([\w\W]*?)<\/a>)/';
$content='
<a href="www.domain.com/folder/images/myimage.gif">myimage.gif</a>
<a href="http://www.domain.com/folder/images/myimage.gif">myimage.gif</a>
<a href="ftp://www.domain.com/folder/images/myimage.gif">myimage.gif</a>
<a href="javascript:submitSearch($(\'srchFormToolbar\'));" class="hdSearchAlt">Search</a>
<a href="/folder/images/myimage.gif">myimage.gif</a> 
';

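// Replace every matched internal link with its link text.
if (preg_match_all($pat, $content, $matches, PREG_SET_ORDER)) {
    foreach ($matches as $match) {
        // $match[1] is the whole <a>...</a> element, $match[3] is its link text.
        $content = str_replace($match[1], $match[3], $content);
        //echo $match[1].$match[2];
    }
    echo $content;
}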
?>

 

The result will be:

<a href="www.domain.com/folder/images/myimage.gif">myimage.gif</a>
<a href="http://www.domain.com/folder/images/myimage.gif">myimage.gif</a>
<a href="ftp://www.domain.com/folder/images/myimage.gif">myimage.gif</a>
Search
myimage.gif 

 

So I removed the internal links and replaced them with their link text, "Search" and "myimage.gif".

 

Any comments on improving it to be more accurate?

1. I'm not sure why you're grouping \w with \W.

2. Use non-capturing parentheses when you don't need to capture: (?: ... )

3. You can reduce (http://|ftp://) to (?:(?:ht|f)tp://).

4. Make sure you use \. to match a literal period.

5. The href value may not always use double quotes.
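
For what it's worth, the five points above could be pulled together roughly like this. It is only a sketch, not a full-featured approach: the function name stripInternalLinks is made up for illustration, treating https:// as external is my own assumption, and unquoted href values are still not handled. Also note that a bare "domain.com/..." href with no scheme and no "www." cannot be told apart from a relative path, so this pattern would treat it as internal.

```php
<?php
// Sketch only: remove <a> tags whose href is not external, keeping the link text.
// Assumption: "external" means the href starts with http://, https://, ftp:// or www.
function stripInternalLinks(string $html): string
{
    $pattern = '~
        <a\b[^>]*?\bhref\s*=\s*       # opening tag up to the href attribute
        (["\'])                       # group 1: opening quote, single or double
        (?!(?:https?|ftp)://|www\.)   # skip external URLs (note the escaped dot)
        (?:(?!\1).)*                  # href value, up to the matching quote
        \1[^>]*>                      # closing quote and the rest of the tag
        (.*?)                         # group 2: the link text
        </a>
    ~isx';

    // preg_replace returns null on failure; fall back to the untouched input.
    return preg_replace($pattern, '$2', $html) ?? $html;
}

$content = '<a href="http://www.domain.com/a.gif">a.gif</a>' . "\n"
         . "<a href='/folder/images/myimage.gif'>myimage.gif</a>";
echo stripInternalLinks($content);
```

With that input, the first link should come back unchanged and the second should be reduced to "myimage.gif". Doing it in a single preg_replace call also avoids the preg_match_all plus str_replace loop.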

Archived

This topic is now archived and is closed to further replies.
