
[SOLVED] finding and replacing a url


tgavin


<?php
$string = 'This is a string with a url in it <a href="http://www.google.co.uk/search?q=php&num=100&hl=en&safe=off&start=200&sa=N">http://www.google.co.uk/search?q=php&num=100&hl=en&safe=off&start=200&sa=N</a>';
echo preg_replace('/(?<=<a href=")(.*?")/', 'http://www.site.com/parse.php?url=\\1', $string);

if(isset($_GET['url'])) {
echo $_GET['url'];
// produces 'http://www.google.co.uk/search?q=php'
// instead of the ENTIRE url
}
?>

 

I need a way to return the ENTIRE url. Is there a way to do this?


This is a little clearer:

 

preg_replace('/<a href="(.*)">/', '<a href="http://www.site.com/parse.php?url=\\1">', $string);

 

Thank you.

 

That's only returning the url (and it's still chopped off). I need to be able to return the entire string, with the url modified. What I have above works, except that it's chopping off the url at the first ampersand.


My code got cut off. It's:


preg_replace('/<a href="(.*)">/', "<a href='http://www.site.com/parse.php?url=$1'>", $string);

Yep, that's cleaner :)

 

I'm still encountering a problem, and I'm thinking it's more in the receiving end. When I run that regex it outputs a link in the browser. When I click that link it goes to my parse.php script - which is exactly what it's supposed to do. In the script I echo $_GET['url'], which chops off the url at the first ampersand. How can I go about getting the entire url?
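The truncation described here is what happens whenever a URL is placed in a query string without escaping: each `&` inside it starts a new parameter, so only the text before the first `&` ends up in `$_GET['url']`. A minimal sketch of the effect, using `parse_str()` as a stand-in for the way PHP fills `$_GET` (the shortened `$target` URL and the variable names are illustrative assumptions):

```php
<?php
$target = 'http://www.google.co.uk/search?q=php&num=100&hl=en';

// Unencoded: every "&" inside $target starts a new parameter.
parse_str('url=' . $target, $raw);
echo $raw['url'], "\n";   // the URL is cut at the first "&"
echo count($raw), "\n";   // url, num and hl arrive as separate keys

// Encoded: urlencode() turns "&" into "%26", so the whole URL
// stays inside the single "url" parameter.
parse_str('url=' . urlencode($target), $encoded);
echo $encoded['url'], "\n";
```

In other words, the receiving script is behaving normally; the link has to be escaped when it is generated.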


What if you just base64_encode() it, then decode it in the target PHP script?

 

I originally tried urlencode() and received errors. However, it's entirely likely that I didn't know what I was doing and put it in the wrong place. Could you give me an example of base64? I'm slightly familiar with it, but have never used it. Also, would I have to change the document encoding as well?

 

If that doesn't work, I wrote this per laffin's suggestion about rebuilding - which is a pretty good suggestion.

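<?php
// Rebuild the original url from the separate $_GET parameters
$param = ''; // initialise so the first .= does not hit an undefined variable
foreach($_GET as $name=>$value) {
$param .= $name.'='.$value;
$param .= '&';
}
// drop the leading "url=" and the trailing "&"
$url = rtrim(str_replace('url=','',$param), '&');
echo $url;
?>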

 

See any (future) problems with that?
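One likely problem with rebuilding: by the time the values reach `$_GET`, PHP has already decoded them and altered some parameter names, so the rebuilt string is not guaranteed to equal the original URL. A small sketch under stated assumptions: `parse_str()` stands in for `$_GET`, and the `example.com` URL is made up for illustration.

```php
<?php
$original = 'http://example.com/s?q=a+b&first.name=x';

// Stand-in for $_GET after a request to parse.php?url=<unescaped $original>
parse_str('url=' . $original, $get);

// Same rebuild idea as in the post: glue name=value pairs back together.
$param = '';
foreach ($get as $name => $value) {
    $param .= $name . '=' . $value . '&';
}
$rebuilt = rtrim(str_replace('url=', '', $param), '&');

echo $rebuilt, "\n";               // "+" became a space, "first.name" became "first_name"
var_dump($rebuilt === $original);  // bool(false)
```

The `str_replace()` call would also strip any other `url=` that happens to appear inside the target URL, so escaping the URL when the link is generated looks like the more robust route.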


My code got cut off. It's:


preg_replace('/<a href="(.*)">/', "<a href='http://www.site.com/parse.php?url=".base64_encode($1)."'>", $string);

 

then in your php

$realurl=base64_decode($_GET['url']);

 

 

whatever works for you.

 

That's basically what I tried before with urlencode(), and I received an error similar to what I get with yours: Parse error: syntax error, unexpected T_LNUMBER, expecting T_VARIABLE or '$'


preg_replace('/<a href="(.*)">/', "<a href='http://www.site.com/parse.php?url=".base64_encode($1)."'>", $string);

 

I don't see how this works. preg_replace() needs the second parameter to be a string beforehand; it will not run base64_encode() while performing the replacement.

 

<?php
header("Content-type: text/plain");
$body='This is a string with a url in it <a href="http://www.google.co.uk/search?q=php&num=100&hl=en&safe=off&start=200&sa=N">http://www.google.co.uk/search?q=php&num=100&hl=en&safe=off&start=200&sa=N</a>';
echo $body."\n";
function urlencodebase64($matches)
{
$url=urlencode(base64_encode($matches[2]));
return "<A HREF=\"http://my.site.com/parse.php?url=". $url ."\">";
}

$body=preg_replace_callback('/(<a\s+.*href\s*=\s*"(.*)".*?>)/i','urlencodebase64',$body);
echo $body."\n";
?>

 

which outputs

This is a string with a url in it <a href="http://www.google.co.uk/search?q=php&num=100&hl=en&safe=off&start=200&sa=N">http://www.google.co.uk/search?q=php&num=100&hl=en&safe=off&start=200&sa=N</a>

This is a string with a url in it <A HREF="http://my.site.com/parse.php?url=aHR0cDovL3d3dy5nb29nbGUuY28udWsvc2VhcmNoP3E9cGhwJm51bT0xMDAmaGw9ZW4mc2FmZT1vZmYmc3RhcnQ9MjAwJnNhPU4%3D">http://www.google.co.uk/search?q=php&num=100&hl=en&safe=off&start=200&sa=N</a>

 

Why use both urlencode and base64?

Base64 output can contain characters (+, / and =) that have a special meaning in a URL query string, so it still needs to be escaped.

function urldecodebase64($url)
{
       return base64_decode(urldecode($url));
}
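To make the point about unsafe characters concrete, here is a small round-trip sketch. The three input bytes are an assumption, picked only because their Base64 form contains `+` and `/`; `parse_str()` again stands in for the way PHP fills `$_GET`.

```php
<?php
// Bytes chosen so that the Base64 output contains "+" and "/".
$data = "\xfb\xff\xfe";
$b64  = base64_encode($data);   // "+//+"

// Sent unescaped, each "+" is read back as a space, so the value changes.
parse_str('url=' . $b64, $unsafe);
var_dump($unsafe['url'] === $b64);                 // bool(false)

// Escaped with urlencode(), the same payload survives the round trip.
parse_str('url=' . urlencode($b64), $safe);
var_dump(base64_decode($safe['url']) === $data);   // bool(true)
```

One caveat worth checking in your own setup: the values in `$_GET` are already URL-decoded by PHP, so calling `urldecode()` on them a second time can turn a literal `+` in the Base64 text into a space. It happens to be harmless for the Google URL in this thread because its encoded form contains no `+`.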

 


Thanks!

I was looking at preg_replace_callback last night, but had nothing like this :)

Good to know that about base64_encode!

 

I'm real close now. The only problem is the matching pattern. If I have more than one url in the string (which is *very* likely), it will only return the last one. The original pattern I had worked. It won't work with this though.

 

<?php
function urlencodebase64($matches){
$url = urlencode(base64_encode($matches[2]));
return "<a href=\"http://www.ballywhonews.com/dev/test/testlink.php?url=".$url."\">";
}
function urldecodebase64($url) {
return base64_decode(urldecode($url));
}

// put it together
$string = 'This is a string with a url in it <a href="http://www.google.co.uk/search?q=php&num=100&hl=en&safe=off&start=200&sa=N">http://www.google.co.uk/search?q=php&num=100&hl=en&safe=off&start=200&sa=N</a><p>Not only that, but here is another one <a href="http://www.aol.com">America Online</a></p>';
$new_string = preg_replace_callback('/(<a\s+.*href\s*=\s*"(.*)".*?>)/i','urlencodebase64',$string);

// echo it so we can click the links for testing
echo $new_string;

// after the link is clicked and the page is reloaded
if(isset($_GET['url'])) {
$url = urldecodebase64($_GET['url']);
echo '<p>'.$url;
}
?>

 


<?php
function urlencodebase64($matches){
$url = urlencode(base64_encode($matches[2]));
return "<a href=\"http://www.ballywhonews.com/dev/test/testlink.php?url=".$url."\">";
}
function urldecodebase64($url) {
return base64_decode(urldecode($url));
}

// put it together
$string = 'This is a string with a url in it <a href="http://www.google.co.uk/search?q=php&num=100&hl=en&safe=off&start=200&sa=N">http://www.google.co.uk/search?q=php&num=100&hl=en&safe=off&start=200&sa=N</a><p>Not only that, but here is another one <a href="http://www.aol.com">America Online</a></p>';
$new_string = preg_replace_callback('/(<a\s+.*?href\s*=\s*"([^\s"]*)".*?>))/i','urlencodebase64',$string);

// echo it so we can click the links for testing
header("Content-type: text/plain");
echo "$string\n";
echo "$new_string\n";

// after the link is clicked and the page is reloaded
if(isset($_GET['url'])) {
$url = urldecodebase64($_GET['url']);
echo '<p>'.$url;
}
?>

 

Changed the preg pattern a little.


Thank you!

 

I received this error

preg_replace_callback() [function.preg-replace-callback]: Compilation failed: unmatched parentheses at offset 36

So I removed the last parenthesis and ended up with this:

$new_string = preg_replace_callback('/(<a\s+.*?href\s*=\s*"([^\s"]*)".*?>)/i','urlencodebase64',$string);

 

If that looks good to you too, then I'm all set!
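For anyone comparing the two patterns, the difference comes down to greedy versus lazy matching. A minimal sketch with `preg_match_all()`; the two `example` links are made up for illustration, and the patterns are the ones from this thread with the outer capture group dropped:

```php
<?php
$html = '<a href="http://a.example/?x=1&y=2">A</a> and <a href="http://b.example/">B</a>';

// Greedy: the first ".*" runs as far as it can, so the match starts at the
// first <a but skips ahead to the last href and captures only that URL.
preg_match_all('/<a\s+.*href\s*=\s*"(.*)".*?>/i', $html, $greedy);

// Lazy ".*?" plus a character class that cannot cross a quote or a space:
// one match per link.
preg_match_all('/<a\s+.*?href\s*=\s*"([^\s"]*)".*?>/i', $html, $lazy);

print_r($greedy[1]);   // one entry: the second link's URL
print_r($lazy[1]);     // two entries, one per link
```

This is only a sketch for simple, well-formed markup like the sample string; for arbitrary HTML a parser such as `DOMDocument` is usually the safer choice.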

