Push Eject, posted September 24, 2008

Hi all. I was unsure whether to post this in the Regex forum or here. I hope I've chosen correctly.

I'm trying to call a routine to obfuscate email addresses from within a regular expression and am stumped. In the following code, $body is a post to a page that may include an email address:

$body = eregi_replace('([_\.0-9a-z-]+@([0-9a-z][0-9a-z-]+\.)+[a-z]{2,3})', encodeEmail('\\1'), $body);

function encodeEmail($email){
    $t = "mailto:" . $email;
    $r = "";
    for( $i = 0; $i < strlen($t); $i++ ){
        $r .= "&#" . ord( substr($t,$i,1) );
    }
    $r = "<a href=\"" . $r . "\">".$r."</a>";
    return $r;
}

Returns: <a href="mailto:1">mailto:1</a> OR mailto:1

Can I not call a user function from within the regex? Am I referencing "\\1" wrong? More confusing to me is that if I return $email from encodeEmail() it passes the correct string (\\1) back.

Any help is greatly appreciated.

Thanks,
Charlie

Push Eject (author), posted September 25, 2008

bump

Push Eject (author), posted September 25, 2008

Sorry I didn't format the code for color. I'm such a n00b. Here it is again in a more readable state:

<?php
$body = "blah blah john@example.com blah blah";
$body = eregi_replace('([_\.0-9a-z-]+@([0-9a-z][0-9a-z-]+\.)+[a-z]{2,3})', encodeEmail('\\1'), $body);

function encodeEmail($email){
    $t = "mailto:" . $email;
    $r = "";
    for( $i = 0; $i < strlen($t); $i++ ){
        $r .= "&" . "#" . ord( substr($t,$i,1) );
    }
    $r = "<a href=\"" . $r . "\">".$r."</a>";
    return $r;
}
?>

Returns: blah blah <a href="mailto:1">mailto:1</a> blah blah OR mailto:1

I've been playing with this again this morning and cannot figure out how to properly pass the "\\1" value...

discomatt, posted September 25, 2008

http://php.net/preg_replace_callback

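For anyone wondering why the original call misbehaves: PHP evaluates encodeEmail('\\1') before the replace function ever runs, so the function receives the literal two-character string \1 rather than the matched address. Returning $email unchanged appears to work only because the placeholder survives and is expanded afterwards. The sketch below illustrates this with a hypothetical helper, showArg(), and uses preg_replace because the ereg functions were removed in PHP 7; the ordering issue is the same.

```php
<?php
// Hypothetical helper that reports what it actually receives.
function showArg($s)
{
    return '[' . strlen($s) . ':' . $s . ']';
}

// showArg('\\1') runs first and returns '[2:\1]'. Only afterwards does
// preg_replace see that string and expand \1 to the captured text.
$out = preg_replace('~(\w+)@~', showArg('\\1'), 'john@');
echo $out, "\n"; // [2:john]
```

The length of 2 shows the function only ever saw the placeholder. Once encodeEmail() turns those two characters into entities, no backreference is left for the replace function to expand.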
Push Eject (author), posted September 25, 2008

Thanks Disco! I *just* found that function myself and here is my solution:

<?php
$body = preg_replace_callback('([_\.0-9a-z-]+@([0-9a-z][0-9a-z-]+\.)+[a-z]{2,3})', 'encodeEmail', $body);

function encodeEmail($email){
    $t = $email[0];
    $r = "";
    for( $i = 0; $i < strlen($t); $i++ ){
        $r .= "&" . "#" . ord( substr($t,$i,1) );
    }
    $r = "<a href=\"mailto:" . $r . "\">".$r."</a>";
    return $r;
}
?>

discomatt, posted September 25, 2008

Just for efficiency's sake: you shouldn't need a capturing group in your regex, as index 0 will contain the entire match already.

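A minimal sketch of that suggestion follows. It differs from the posted solution in three ways, all of them assumptions. First, it uses explicit ~ delimiters: in the posted pattern the outer parentheses are taken by PCRE as the delimiter pair, not as a group. Second, it adds the i modifier to keep the case-insensitivity of the original eregi call. Third, it ends each numeric entity with a semicolon. The inner group is made non-capturing with (?:...), and the type declarations assume PHP 7 or later.

```php
<?php
// Encode every character of the matched address as a decimal HTML entity.
// $matches[0] always holds the full match, so no capturing group is needed.
function encodeEmail(array $matches): string
{
    $encoded = '';
    foreach (str_split($matches[0]) as $char) {
        $encoded .= '&#' . ord($char) . ';';
    }
    return '<a href="mailto:' . $encoded . '">' . $encoded . '</a>';
}

$body = 'blah blah john@example.com blah blah';
$body = preg_replace_callback(
    '~[_\.0-9a-z-]+@(?:[0-9a-z][0-9a-z-]+\.)+[a-z]{2,3}~i',
    'encodeEmail',
    $body
);
echo $body, "\n";
```

Decoding the entities in the result should give back the plain link, which is a quick way to check the output by eye.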
thebadbad, posted September 25, 2008

If you want to do it similarly to what you tried, the /e modifier treats the replacement as PHP code:

<?php
$body = "blah blah john@example.com blah blah";
$body = preg_replace('~([-_.0-9a-z]+@([0-9a-z][-0-9a-z]+\.)+[a-z]{2,3})~ei', "encodeEmail('\\1')", $body);

function encodeEmail($email){
    $t = "mailto:" . $email;
    $r = "";
    for( $i = 0; $i < strlen($t); $i++ ){
        $r .= "&" . "#" . ord( substr($t,$i,1) );
    }
    $r = "<a href=\"" . $r . "\">".$r."</a>";
    return $r;
}
?>

Also, in regular expressions you don't need to escape a dot in a character class, so I removed the backslash (I don't know if your version also allows a literal backslash; we don't want it to). And I moved the hyphens to the front of the classes, else it's defining a range (when not escaped). Lastly, I made the search case insensitive by adding the /i modifier.

Also note that TLDs (the 'extension' of a domain) can be longer than 3 chars.

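The pattern notes above can be checked directly. The sketch below is only an illustration: the ^ and $ anchors and the {2,} quantifier are assumptions added here to test whole strings and to allow longer TLDs, and the addresses are made up. Note also that the /e modifier itself was deprecated in PHP 5.5 and removed in PHP 7, so the checks use plain preg_match.

```php
<?php
// A dot inside a character class is literal whether or not it is escaped.
$dotChecks = [
    preg_match('~^[.]$~', '.'),   // literal dot matches
    preg_match('~^[.]$~', 'a'),   // any other character does not
    preg_match('~^[\.]$~', '.'),  // escaped form behaves the same
];

// The i modifier makes the match case-insensitive, and {2,} lifts the
// three-letter cap on the TLD (hypothetical relaxed pattern).
$pattern = '~^[-_.0-9a-z]+@([0-9a-z][-0-9a-z]+\.)+[a-z]{2,}$~i';
$addressChecks = [
    preg_match($pattern, 'John@Example.COM'),
    preg_match($pattern, 'curator@example.museum'),
];

var_dump($dotChecks, $addressChecks);
```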
Push Eject (author), posted September 25, 2008

Thanks, everybody, this is great!

discomatt, posted September 25, 2008

preg_replace_callback is nicer than using the 'e' modifier IMO. Something about eval'ing generated data scares me.

Hyphens can also be at the end of a character class. [-a-z] == [a-z-]

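A quick way to confirm the hyphen rule, as a sketch with made-up input: a hyphen placed first or last in a class is a literal hyphen, while one between two characters defines a range.

```php
<?php
// Hyphen first or last in the class: matched as a literal character.
$first = preg_match_all('~[-a-z]~', 'a-b', $m1);
$last  = preg_match_all('~[a-z-]~', 'a-b', $m2);

// Hyphen between two characters: a range, so "-" itself is not matched.
$range = preg_match_all('~[a-c]~', 'a-b', $m3);

var_dump($first, $last, $range);
```

The first two calls should count three matches each and the third only two, since the range a to c does not include the hyphen.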
thebadbad, posted September 25, 2008

I think it's easier to control what goes where (concerning function parameters) when using the /e modifier, but I see what you mean.

And yeah, you're right about the hyphens.

discomatt, posted September 25, 2008

The 'e' modifier is nice for plugging data into an existing function... but I'd still rather create/use a handler...

<?php
function existing_function ( $arg1, $arg2, $arg3 ) {
    # Do something!?
}

function existing_function_handler ( $matches ) {
    return existing_function( $matches[1], $matches[2], $matches[3] );
}
?>

Saves you from having to open up a potential hole.

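The handler idea can be sketched end to end as follows. The names formatDate() and $text and the date pattern are hypothetical, chosen only to give the three captured groups something to do. The handler is written as an anonymous function, which assumes PHP 5.3 or later, and the type declarations assume PHP 7.

```php
<?php
// Hypothetical existing function that takes three separate arguments.
function formatDate(string $year, string $month, string $day): string
{
    return $day . '/' . $month . '/' . $year;
}

$text = 'Posted 2008-09-25.';
$result = preg_replace_callback(
    '~(\d{4})-(\d{2})-(\d{2})~',
    function (array $matches): string {
        // Matches arrive as plain data; nothing is evaluated as code.
        return formatDate($matches[1], $matches[2], $matches[3]);
    },
    $text
);
echo $result, "\n"; // Posted 25/09/2008.
```

Because the captured text is passed as ordinary string arguments, input containing quotes or PHP syntax cannot change what gets executed, which is the hole the /e modifier risked opening.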