
[SOLVED] finding and replacing a url


tgavin


<?php
$string = 'This is a string with a url in it <a href="http://www.google.co.uk/search?q=php&num=100&hl=en&safe=off&start=200&sa=N">http://www.google.co.uk/search?q=php&num=100&hl=en&safe=off&start=200&sa=N</a>';
echo preg_replace('/(?<=<a href=")(.*?")/', 'http://www.site.com/parse.php?url=\\1', $string);

if(isset($_GET['url'])) {
echo $_GET['url'];
// produces 'http://www.google.co.uk/search?q=php'
// instead of the ENTIRE url
}
?>

 

I need a way to return the ENTIRE url. Is there a way to do this?

https://forums.phpfreaks.com/topic/87941-solved-finding-and-replacing-a-url/

this is a little more clear

 

preg_replace('/<a href="(.*)">/', '<a href="http://www.site.com/parse.php?url=\\1">', $string);

 

Thank you.

 

That's only returning the url (and it's still chopped off). I need to be able to return the entire string, with the url modified. What I have above works, except that it's chopping off the url at the first ampersand.

my code got cut off, it's:


preg_replace('/<a href="(.*)">/', "<a href='http://www.site.com/parse.php?url=$1'>", $string);

Yep, that's cleaner :)

 

I'm still encountering a problem, and I'm thinking it's more in the receiving end. When I run that regex it outputs a link in the browser. When I click that link it goes to my parse.php script - which is exactly what it's supposed to do. In the script I echo $_GET['url'], which chops off the url at the first ampersand. How can I go about getting the entire url?
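
For what it's worth, the chopping can be reproduced without a browser. A minimal sketch, assuming parse_str() splits a query string the same way PHP does when it fills $_GET; the variable names are only for illustration:

```php
<?php
// The target URL from the first post, with its own query string.
$target = 'http://www.google.co.uk/search?q=php&num=100&hl=en&safe=off&start=200&sa=N';

// Unencoded: every "&" in the target starts a new parameter of parse.php.
parse_str('url=' . $target, $raw);
// $raw['url'] is now 'http://www.google.co.uk/search?q=php';
// 'num', 'hl', 'safe', 'start' and 'sa' arrive as separate keys.

// Encoded: urlencode() turns "&" into "%26", so the whole URL stays in one value.
parse_str('url=' . urlencode($target), $encoded);
// $encoded['url'] is identical to $target, and it is the only key.
```

So the regex is probably not at fault; the nested URL just has to be encoded before it goes into the href.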

What if you just base64_encode() it, then decode it in the target PHP script?

 

I originally tried url_encode() and received errors. However, it's entirely likely that I didn't know what I was doing and put it in the wrong place. Could you give me an example of base64? I'm slightly familiar with it, but have never used it. Also, would I have to change the document encoding as well?

 

If that doesn't work, I wrote this per laffin's suggestion about rebuilding - which is a pretty good suggestion.

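<?php
// Rebuild the original URL from the pieces PHP split into $_GET.
$param = ''; // initialise so the first .= does not raise a notice
foreach ($_GET as $name => $value) {
    $param .= $name . '=' . $value . '&';
}
// Drop the trailing '&' and the 'url=' prefix added by the link.
$url = str_replace('url=', '', rtrim($param, '&'));
echo $url;
?>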

 

See any (future) problems with that?
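
One likely problem: by the time $_GET is populated, PHP has already decoded the values (and rewritten dots and spaces in parameter names to underscores), so the rebuilt string may not match what was sent. A rough alternative, assuming the link always has the form parse.php?url=... and that url is the first and only parameter parse.php itself expects, is to read the raw query string instead. The function name is made up for illustration:

```php
<?php
// Return everything after the leading "url=" of a raw query string,
// or null when the query string does not start with "url=".
function url_from_raw_query(string $query): ?string
{
    if (strncmp($query, 'url=', 4) !== 0) {
        return null;
    }
    return substr($query, 4);
}

// In parse.php this would be called with the raw, undecoded query string:
// $url = url_from_raw_query($_SERVER['QUERY_STRING'] ?? '');
```

This only sidesteps the splitting; encoding the URL when the link is generated is still the more robust fix.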

my code got cut off, it's:


preg_replace('/<a href="(.*)">/', "<a href='http://www.site.com/parse.php?url=".base64_encode($1)."'>", $string);

 

then in your php

$realurl=base64_decode($_GET['url']);

 

 

whatever works for you.

 

That's basically what I tried before with url_encode and received an error similar to what I get with yours: Parse error: syntax error, unexpected T_LNUMBER, expecting T_VARIABLE or '$'


preg_replace('/<a href="(.*)">/', "<a href='http://www.site.com/parse.php?url=".base64_encode($1)."'>", $string);

 

I don't see how this works. preg_replace() needs the second parameter to be a finished string beforehand; it will not run base64_encode() on each match while performing the replacement.
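
To illustrate the point, here is a small sketch. It assumes the reference is quoted as '$1' so that the line parses at all, and it uses a simplified pattern of my own rather than the one from the posts above:

```php
<?php
$html = '<a href="http://www.aol.com">America Online</a>';

// base64_encode('$1') runs once, on the two literal characters "$1",
// before preg_replace() ever looks at $html.
$static = preg_replace(
    '/<a href="([^"]*)">/',
    '<a href="' . base64_encode('$1') . '">',
    $html
);
// $static is '<a href="JDE=">America Online</a>'

// With a callback, the function runs once per match and sees the captured URL.
$dynamic = preg_replace_callback(
    '/<a href="([^"]*)">/',
    function (array $m): string {
        return '<a href="' . base64_encode($m[1]) . '">';
    },
    $html
);
```

"JDE=" is just the base64 of the text "$1", which is why the encoding has to happen inside a callback.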

 

<?php
header("Content-type: text/plain");
$body='This is a string with a url in it <a href="http://www.google.co.uk/search?q=php&num=100&hl=en&safe=off&start=200&sa=N">http://www.google.co.uk/search?q=php&num=100&hl=en&safe=off&start=200&sa=N</a>';
echo $body."\n";
function urlencodebase64($matches)
{
$url=urlencode(base64_encode($matches[2]));
return "<A HREF=\"http://my.site.com/parse.php?url=". $url ."\">";
}

$body=preg_replace_callback('/(<a\s+.*href\s*=\s*"(.*)".*?>)/i','urlencodebase64',$body);
echo $body."\n";
?>

 

which outputs

This is a string with a url in it <a href="http://www.google.co.uk/search?q=php&num=100&hl=en&safe=off&start=200&sa=N">http://www.google.co.uk/search?q=php&num=100&hl=en&safe=off&start=200&sa=N</a>

This is a string with a url in it <A HREF="http://my.site.com/parse.php?url=aHR0cDovL3d3dy5nb29nbGUuY28udWsvc2VhcmNoP3E9cGhwJm51bT0xMDAmaGw9ZW4mc2FmZT1vZmYmc3RhcnQ9MjAwJnNhPU4%3D">http://www.google.co.uk/search?q=php&num=100&hl=en&safe=off&start=200&sa=N</a>

 

why use urlencode & base64?

base64 output may contain characters (+, / and =) that won't work as-is in a URL parameter.

function urldecodebase64($url)
{
       return base64_decode(urldecode($url));
}
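
A side note on the decoding step: PHP has already URL-decoded the values in $_GET, so calling urldecode() a second time can turn a "+" in the base64 text into a space. One way around both issues is a URL-safe base64 variant that needs no urlencode() at all. This is only a sketch, and the two function names are my own, not built-ins:

```php
<?php
// Encode with the URL-safe alphabet: "+" -> "-", "/" -> "_", padding dropped.
function base64url_encode(string $data): string
{
    return rtrim(strtr(base64_encode($data), '+/', '-_'), '=');
}

// Reverse the mapping and restore the "=" padding before decoding.
function base64url_decode(string $data): string
{
    $b64 = strtr($data, '-_', '+/');
    $b64 .= str_repeat('=', (4 - strlen($b64) % 4) % 4);
    return (string) base64_decode($b64);
}
```

The link would then carry base64url_encode($matches[2]) and parse.php would call base64url_decode($_GET['url']) directly.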

 

Thanks!

I was looking at preg_replace_callback last night, but had nothing like this :)

Good to know that about base64_encode!

 

I'm real close now. The only problem is the matching pattern. If I have more than one url in the string (which is *very* likely), it will only return the last one. The original pattern I had worked. It won't work with this though.
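
The "only the last one" behaviour comes from the greedy .* in the pattern. A quick sketch to show it, using a made-up two-link string and a negated character class as one possible fix (the example.com-style hosts are placeholders):

```php
<?php
$html = '<a href="http://a.example/?x=1&y=2">A</a> and <a href="http://b.example/">B</a>';

// Greedy ".*" runs to the end of the line, then backtracks to the LAST href and quote.
preg_match_all('/(<a\s+.*href\s*=\s*"(.*)".*?>)/i', $html, $greedy);
// One match only; $greedy[2] is ['http://b.example/'].

// A negated character class cannot run past the closing quote or the ">".
preg_match_all('/<a\s+[^>]*?href\s*=\s*"([^"]*)"[^>]*>/i', $html, $perLink);
// Two matches; $perLink[1] is ['http://a.example/?x=1&y=2', 'http://b.example/'].
```

With the greedy version, everything from the first "<a" up to the last link's ">" is swallowed as a single match, so the callback fires once.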

 

<?php
function urlencodebase64($matches){
$url = urlencode(base64_encode($matches[2]));
return "<a href=\"http://www.ballywhonews.com/dev/test/testlink.php?url=".$url."\">";
}
function urldecodebase64($url) {
return base64_decode(urldecode($url));
}

// put it together
$string = 'This is a string with a url in it <a href="http://www.google.co.uk/search?q=php&num=100&hl=en&safe=off&start=200&sa=N">http://www.google.co.uk/search?q=php&num=100&hl=en&safe=off&start=200&sa=N</a><p>Not only that, but here is another one <a href="http://www.aol.com">America Online</a></p>';
$new_string = preg_replace_callback('/(<a\s+.*href\s*=\s*"(.*)".*?>)/i','urlencodebase64',$string);

// echo it so we can click the links for testing
echo $new_string;

// after the link is clicked and the page is reloaded
if(isset($_GET['url'])) {
$url = urldecodebase64($_GET['url']);
echo '<p>'.$url;
}
?>

 

<?php
function urlencodebase64($matches){
$url = urlencode(base64_encode($matches[2]));
return "<a href=\"http://www.ballywhonews.com/dev/test/testlink.php?url=".$url."\">";
}
function urldecodebase64($url) {
return base64_decode(urldecode($url));
}

// put it together
$string = 'This is a string with a url in it <a href="http://www.google.co.uk/search?q=php&num=100&hl=en&safe=off&start=200&sa=N">http://www.google.co.uk/search?q=php&num=100&hl=en&safe=off&start=200&sa=N</a><p>Not only that, but here is another one <a href="http://www.aol.com">America Online</a></p>';
$new_string = preg_replace_callback('/(<a\s+.*?href\s*=\s*"([^\s"]*)".*?>))/i','urlencodebase64',$string);

// echo it so we can click the links for testing
header("Content-type: text/plain");
echo "$string\n";
echo "$new_string\n";

// after the link is clicked and the page is reloaded
if(isset($_GET['url'])) {
$url = urldecodebase64($_GET['url']);
echo '<p>'.$url;
}
?>

 

changed the preg statement a little

Thank you!

 

I received this error

preg_replace_callback() [function.preg-replace-callback]: Compilation failed: unmatched parentheses at offset 36

So I removed the last parenthesis and ended up with this

$new_string = preg_replace_callback('/(<a\s+.*?href\s*=\s*"([^\s"]*)".*?>)/i','urlencodebase64',$string);

 

If that looks good to you too, then I'm all set!
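
One thing that might still be worth adding on the testlink.php side: anyone can put arbitrary text in the url parameter, so it may be safer to check the decoded value before echoing it or redirecting to it. A rough sketch, assuming the parameter was produced by urlencodebase64() above and that only http and https targets are wanted; the function name is hypothetical:

```php
<?php
// Decode the "url" parameter and accept it only if it is a well-formed
// http(s) URL; return null for anything else.
function decode_target_url(string $param): ?string
{
    // Strict mode: fail on characters outside the base64 alphabet.
    $url = base64_decode($param, true);
    if ($url === false || filter_var($url, FILTER_VALIDATE_URL) === false) {
        return null;
    }
    $scheme = strtolower((string) parse_url($url, PHP_URL_SCHEME));
    return in_array($scheme, ['http', 'https'], true) ? $url : null;
}

// In the receiving script ($_GET values arrive already URL-decoded):
// $url = decode_target_url($_GET['url'] ?? '');
```

When echoing the result into a page, wrapping it in htmlspecialchars() would also keep the ampersands from being read as HTML entities.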

Archived

This topic is now archived and is closed to further replies.
