
[SOLVED] Calling a User Function Within a Regex


Push Eject


Hi all.  I was unsure whether to post this in the Regex forum or here.  I hope I've chosen correctly.

 

I'm trying to call a routine to obfuscate email addresses from within a regular expression and am stumped.

 

The following code:

// $body is a post to a page that may include an email address.

$body = eregi_replace('([_\.0-9a-z-]+@([0-9a-z][0-9a-z-]+\.)+[a-z]{2,3})', encodeEmail('\\1'), $body);

function encodeEmail($email){
$t = "mailto:" . $email;
$r = "";

for( $i = 0; $i < strlen($t); $i++ ){
	$r .= "&#" . ord( substr($t,$i,1) );
}

$r = "<a href=\"" . $r . "\">".$r."</a>";

return $r;
}

 

Returns:

<a href="&#109&#97&#105&#108&#116&#111&#58&#49">&#109&#97&#105&#108&#116&#111&#58&#49</a>
OR
mailto:1

 

Can I not call a user function from within the regex?  Am I referencing "\\1" wrong?

 

More confusing to me is that if I return $email from encodeEmail() it passes the correct string (\\1) back.

 

Any help is greatly appreciated.

 

Thanks,

Charlie


Sorry I didn't format the code for color.  I'm such a n00b. :)  Here it is again in a more readable state:

 

<?php
$body = "blah blah john@example.com blah blah";

$body = eregi_replace('([_\.0-9a-z-]+@([0-9a-z][0-9a-z-]+\.)+[a-z]{2,3})', encodeEmail('\\1'), $body);

function encodeEmail($email){
$t = "mailto:" . $email;
$r = "";

for( $i = 0; $i < strlen($t); $i++ ){
	$r .= "&" . "#" . ord( substr($t,$i,1) );
}

$r = "<a href=\"" . $r . "\">".$r."</a>";

return $r;
}
?>

 

Returns:

blah blah <a href="&#109&#97&#105&#108&#116&#111&#58&#49">&#109&#97&#105&#108&#116&#111&#58&#49</a> blah blah

OR

mailto:1

 

Been playing with this again this morning and cannot figure out how to properly pass the "\\1" value...


Thanks Disco!

 

I *just* found that function myself and here is my solution:

<?php
$body = preg_replace_callback('([_\.0-9a-z-]+@([0-9a-z][0-9a-z-]+\.)+[a-z]{2,3})', 'encodeEmail', $body);

function encodeEmail($email){

$t = $email[0];
$r = "";

for( $i = 0; $i < strlen($t); $i++ ){
	$r .= "&" . "#" . ord( substr($t,$i,1) );
}

$r = "<a href=\"&#109a&#105l&#116o&#58" . $r . "\">".$r."</a>";

return $r;
}
?>
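
For anyone landing here later: eregi_replace() was removed in PHP 7.0, so the callback approach above is the one to keep. Below is a minimal sketch of the same idea with semicolon-terminated entities and a pattern with explicit delimiters. The helper name encodeEntities() and the {2,} quantifier on the TLD are my own assumptions, not part of the solution above.

```php
<?php
// Hypothetical helper: encodes every byte of $text as a decimal HTML entity.
function encodeEntities(string $text): string
{
    $out = '';
    for ($i = 0, $n = strlen($text); $i < $n; $i++) {
        $out .= '&#' . ord($text[$i]) . ';';
    }
    return $out;
}

// Callback for preg_replace_callback(): $matches[0] is the whole matched address.
function encodeEmail(array $matches): string
{
    $address = encodeEntities($matches[0]);
    $href    = encodeEntities('mailto:') . $address;
    return '<a href="' . $href . '">' . $address . '</a>';
}

// Assumed pattern: same shape as the thread's, with ~ delimiters and the i flag.
$pattern = '~[-_.0-9a-z]+@(?:[0-9a-z][-0-9a-z]+\.)+[a-z]{2,}~i';
$body    = 'blah blah john@example.com blah blah';
$result  = preg_replace_callback($pattern, 'encodeEmail', $body);

echo $result, "\n";
```

The plain address no longer appears in the page source, but a browser decodes the entities back into a working mailto: link.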


If you want to do it similarly to what you tried, the /e modifier treats the replacement as PHP code:

 

<?php
$body = "blah blah john@example.com blah blah";

$body = preg_replace('~([-_.0-9a-z]+@([0-9a-z][-0-9a-z]+\.)+[a-z]{2,3})~ei', "encodeEmail('\\1')", $body);

function encodeEmail($email){
$t = "mailto:" . $email;
$r = "";

for( $i = 0; $i < strlen($t); $i++ ){
	$r .= "&" . "#" . ord( substr($t,$i,1) );
}

$r = "<a href=\"" . $r . "\">".$r."</a>";

return $r;
}
?>

 

Also, in regular expressions you don't need to escape a dot inside a character class, so I removed the backslash (I don't know whether your code would treat the backslash as a literal character there, which we don't want). I also moved the hyphens to the front of the classes; otherwise they may define a range (when not escaped). Finally, I made the search case-insensitive by adding the /i modifier.

 

Also note that TLDs (the 'extension' of a domain) can be longer than 3 chars.
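
To illustrate the TLD point, here is a small sketch comparing the {2,3} quantifier with an open-ended {2,}. The sample address and the variable names are made up for the example; the patterns otherwise follow the one in this post.

```php
<?php
// Hypothetical sample address with a six-letter TLD.
$address = 'jane@example.museum';

// Same pattern twice; only the TLD quantifier differs.
$short = '~[-_.0-9a-z]+@(?:[0-9a-z][-0-9a-z]+\.)+[a-z]{2,3}~i';
$long  = '~[-_.0-9a-z]+@(?:[0-9a-z][-0-9a-z]+\.)+[a-z]{2,}~i';

preg_match($short, $address, $shortMatch);
preg_match($long, $address, $longMatch);

echo $shortMatch[0], "\n"; // jane@example.mus (address cut off after 3 letters)
echo $longMatch[0], "\n";  // jane@example.museum
```

With {2,3} the match stops in the middle of the TLD, so only part of the address would be obfuscated.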



preg_replace_callback is nicer than using the 'e' modifier IMO. Something about eval'ing generated data scares me ;)

Hyphens can also be at the end of a character class. [-a-z] == [a-z-]
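
A quick way to check the hyphen claim is to run both classes against a few characters. This is only a sketch; the sample characters are arbitrary.

```php
<?php
// Compare a class with the hyphen first against one with the hyphen last.
$samples = ['-', 'a', 'm', 'z', '_', '5'];
$front   = [];
$back    = [];

foreach ($samples as $char) {
    $front[$char] = preg_match('~^[-a-z]$~', $char);
    $back[$char]  = preg_match('~^[a-z-]$~', $char);
    printf("%s front=%d back=%d\n", $char, $front[$char], $back[$char]);
}
```

Both classes accept the literal hyphen and the letters, and both reject '_' and '5'.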



I think it's easier to control what goes where (concerning function parameters) when using the /e modifier, but I see what you mean. And yeah, you're right about the hyphens :)


The 'e' modifier is nice for plugging data into an existing function... but I'd still rather create/use a handler...

 

<?php

function existing_function ( $arg1, $arg2, $arg3 ) {
# Do something!?
}

function existing_function_handler ( $matches ) {
return existing_function( $matches[1], $matches[2], $matches[3] );
}

?>

 

Saves you from having to open up a potential hole ;)
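
Here is how such a handler might be wired up, as a rough sketch. formatDate() stands in for the "existing" function and is purely hypothetical, as is the date pattern. Worth knowing: the /e modifier was removed in PHP 7.0, so on current versions a callback is the way to go anyway.

```php
<?php
// Hypothetical "existing" function taking three separate arguments.
function formatDate(string $year, string $month, string $day): string
{
    return $day . '/' . $month . '/' . $year;
}

// Handler: adapts the $matches array to the existing function's signature.
function formatDateHandler(array $matches): string
{
    return formatDate($matches[1], $matches[2], $matches[3]);
}

$text   = 'Released on 2008-03-15.';
$result = preg_replace_callback('~(\d{4})-(\d{2})-(\d{2})~', 'formatDateHandler', $text);

echo $result, "\n"; // Released on 15/03/2008.
```

The matched text only ever reaches the function as string arguments, so nothing from the input is evaluated as code.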
