pplexr Posted August 4, 2007

I have a large number of HTML files that contain many internal links, and I want to use some of them on another domain. I want to replace internal links such as these:

<a href="/folder/images/myimage.gif">my image</a>
<a href="myfile.html">my File</a>
<a href="javascript:submitSearch($('srchFormToolbar'));" class="hdSearchAlt">Search</a>

so that the result is just the link text:

my image
my File
Search

External links like these should stay the same:

<a href="www.domain.com/folder/images/myimage.gif">my image</a>
<a href="http://www.domain.com/folder/images/myimage.gif">my image</a>
<a href="domain.com/folder/images/myimage.gif">my image</a>

Can I do this with regex? Thanks.
effigy Posted August 6, 2007

Something like this? This is only a base to work from, not a full-featured approach.

<pre>
<?php
$tests = array(
    '<a href="/folder/images/myimage.gif">my image</a>',
    '<a href="myfile.html">my File</a>',
    '<a href="javascript:submitSearch($(\'srchFormToolbar\'));" class="hdSearchAlt">Search</a>'
);
$domain = 'http://www.mydomain.com';
foreach ($tests as $test) {
    // Show the original link, escaped so the browser displays the markup.
    echo htmlspecialchars($test);
    echo '<br>';
    // Prefix every href that is not a javascript: link with $domain.
    echo htmlspecialchars(
        preg_replace(
            '#(?<=href=")(?!javascript:)(?:http://|/)?([^"]+)#',
            $domain . '/$1',
            $test
        )
    );
    echo '<hr>';
}
?>
</pre>
pplexr (Author) Posted August 6, 2007

No, what I want is something like this:

(<a href="(?!(http://|ftp://|www.))(?:[\w\W])*?>([\w\W]*?)</a>)

<a href="www.domain.com/folder/images/myimage.gif">myimage.gif</a> FAIL
<a href="http://www.domain.com/folder/images/myimage.gif">myimage.gif</a> FAIL
<a href="ftp://www.domain.com/folder/images/myimage.gif">myimage.gif</a> FAIL
<a href="javascript:submitSearch($(\'srchFormToolbar\'));" class="hdSearchAlt">Search</a> PASS
<a href="/folder/images/myimage.gif">myimage.gif</a> PASS

<?php
$pat = '/(<a href="(?!(http:\/\/|ftp:\/\/|www.))(?:[\w\W])*?>([\w\W]*?)<\/a>)/';
$content='
<a href="www.domain.com/folder/images/myimage.gif">myimage.gif</a>
<a href="http://www.domain.com/folder/images/myimage.gif">myimage.gif</a>
<a href="ftp://www.domain.com/folder/images/myimage.gif">myimage.gif</a>
<a href="javascript:submitSearch($(\'srchFormToolbar\'));" class="hdSearchAlt">Search</a>
<a href="/folder/images/myimage.gif">myimage.gif</a>
';
if(preg_match_all($pat,$content,$matches,PREG_SET_ORDER))
{
    foreach ($matches as $match)
    {
        // $match[1] is the whole <a>...</a> element, $match[3] is its link text.
        $content=str_replace($match[1],$match[3],$content);
        //echo $match[1].$match[2];
    }
    echo $content;
}
?>

The result will be:

<a href="www.domain.com/folder/images/myimage.gif">myimage.gif</a>
<a href="http://www.domain.com/folder/images/myimage.gif">myimage.gif</a>
<a href="ftp://www.domain.com/folder/images/myimage.gif">myimage.gif</a>
Search
myimage.gif

So I removed the internal links and replaced them with "Search" and "myimage.gif". Any comments on improving it to be more accurate?
effigy Posted August 8, 2007

1. I'm not sure why you're grouping \w with \W.
2. Use non-capturing parentheses when you don't need to capture: (?: ... )
3. You can reduce (http://|ftp://) to (?:(?:ht|f)tp://).
4. Make sure you use \. to match a literal period.
5. The href value may not always use double quotes.
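Putting the five suggestions above together, a revised pattern might look something like the sketch below. This is only one possible reading of the advice, not a tested drop-in. The function name stripInternalLinks and the sample $content are made up for illustration, and the optional "s" after "tp" (to cover https and ftps) is an added assumption that nobody in the thread asked for. As in the earlier pattern, a bare href such as domain.com/folder/... is still treated as internal.

```php
<?php
// Hypothetical helper: replaces every <a> element whose href does not start
// with http(s)://, ftp(s):// or www. by its link text alone.
function stripInternalLinks(string $html): string
{
    $pattern = '~<a\s[^>]*?href=(["\'])'  // opening tag; group 1 = quote character (point 5)
             . '(?!(?:ht|f)tps?://|www\.)' // reject external hrefs (points 2, 3 and 4)
             . '(?:(?!\1).)*\1'            // href value up to the matching closing quote
             . '[^>]*>'                    // any remaining attributes
             . '(.*?)</a>~is';             // group 2 = link text; "." with /s replaces [\w\W] (point 1)

    // preg_replace returns null on a PCRE error; fall back to the input in that case.
    return preg_replace($pattern, '$2', $html) ?? $html;
}

$content = <<<'HTML'
<a href="www.domain.com/folder/images/myimage.gif">myimage.gif</a>
<a href="http://www.domain.com/folder/images/myimage.gif">myimage.gif</a>
<a href='ftp://www.domain.com/folder/images/myimage.gif'>myimage.gif</a>
<a href="javascript:submitSearch($('srchFormToolbar'));" class="hdSearchAlt">Search</a>
<a href="/folder/images/myimage.gif">myimage.gif</a>
HTML;

echo stripInternalLinks($content);
```

With this sample input the first three links should come through unchanged, and the last two should be reduced to Search and myimage.gif. Doing the replacement in a single preg_replace call also avoids the preg_match_all plus str_replace loop, which could touch identical markup elsewhere in the page. For badly formed HTML a real parser is likely to be more reliable than any single pattern.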