[SOLVED] Problem stripping special characters for email


obsidian

Well, I'm officially stumped. I've been given a form that simply takes some user input and relays a message. I've done this a hundred times before, but apparently, I've never taken into account all the special characters. I have tried any number of Google result suggestions to strip out the special characters that are being encoded in a strange way (specifically MS Word chars).

 

Here is the string I'm inputting (Copied straight out of Word):

[pre]“Checking doubles” … along with ‘singles’

Do the – dashes work?[/pre]

 

Here are the various outputs based on some of the different things I've attempted:

 

Plain text email, no adjustments:

[pre]“Checking doubles” … along with ‘singles’

Do the ΓÇô dashes work?[/pre]

 

Plain text email, filtered for special chars:

[pre]Γ£Checking doublesΓ¥ Γª along with ΓÿsinglesΓ

Do the Γ* dashes work?[/pre]

 

HTML email, no adjustments:

[pre]“Checking doubles†… along with ‘singles’ Do the – dashes work?[/pre]

 

HTML email, filtered for special chars:

[pre]âœChecking doublesâ ⦠along with â˜singlesâ Do the â* dashes work?[/pre]

 

Here is the encoding script that I settled on for the examples above. Keep in mind this is just one of many ASCII combinations I've tried:

<?php
$body = ereg_replace(128, '', $body); // Euro symbol
$body = ereg_replace(133, '...', $body); // ellipses
$body = ereg_replace(8226, '', $body); // double prime
$body = ereg_replace(8216, "'", $body); // left single quote
$body = ereg_replace(145, "'", $body); // left single quote
$body = ereg_replace(8217, "'", $body); // right single quote
$body = ereg_replace(146, "'", $body); // right single quote
$body = ereg_replace(8220, '"', $body); // left double quote
$body = ereg_replace(147, '"', $body); // left double quote
$body = ereg_replace(8221, '"', $body); // right double quote
$body = ereg_replace(148, '"', $body); // right double quote
$body = ereg_replace(8226, "*", $body); // bullet
$body = ereg_replace(149, "*", $body); // bullet
$body = ereg_replace(8211, "-", $body); // en dash
$body = ereg_replace(150, "-", $body); // en dash
$body = ereg_replace(8212, "-", $body); // em dash
$body = ereg_replace(151, "-", $body); // em dash
$body = ereg_replace(8482, '', $body); // trademark
$body = ereg_replace(153, '', $body); // trademark
$body = ereg_replace(169, '', $body); // copyright mark
$body = ereg_replace(174, '', $body); // registration mark
?>

 

Does anyone have any suggestions on how I can clean up the user-submitted text so that my plain text emails show straight quotes and hyphens in place of curly quotes and dashes, without running into this encoding problem?

 

I'm at a loss...

Link to comment
Share on other sites

Have you tried this?

 

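<?php
// Byte-wise approach: only works on single-byte text (e.g. Windows-1252), not UTF-8.
$string = "My bananas are \x801.00 each"; // \x80 is the euro sign in Windows-1252
$remove = array(128, 133); // byte values to blank out (euro sign, ellipsis)
for ($i = 0; $i < strlen($string); $i++) {
	if (in_array(ord($string[$i]), $remove)) {
		$string[$i] = chr(32); // overwrite the byte with a space
	}
}
echo $string;
?>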

Link to comment
Share on other sites

That works to remove the faulty characters that are being presented, but I'm not looking to just remove those characters. I'm looking at being able to recognize and replace the initial curly quotes accurately with straight quotes... :(

Link to comment
Share on other sites

My code could be easily modified though to replace with the correct character, mine just replaces all found characters with a space (chr(32))...

 

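<?php
$string = "My bananas are \x801.00 each"; // \x80 is the euro sign in Windows-1252
$remove  = array(128, 133);
$replace = array("E", "...");
$result = '';
for ($i = 0; $i < strlen($string); $i++) {
	$key = array_search(ord($string[$i]), $remove);
	// Append to a new string: assigning "..." to $string[$i] would keep only its first byte.
	$result .= ($key !== false) ? $replace[$key] : $string[$i];
}
echo $result;
?>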
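
The loop above compares one byte at a time, so it assumes single-byte text. If the form data arrives as UTF-8 (which is what the thread later turns out to be), each curly quote or dash is a three-byte sequence and no single byte will match 145, 147 and so on. A minimal sketch of a string-level replacement under that assumption follows. The function name `straighten_punctuation` is made up for illustration, and the `\u{...}` escapes need PHP 7 or later.

```php
<?php
// Hypothetical helper: map common Word punctuation (as UTF-8) to plain ASCII.
function straighten_punctuation(string $text): string
{
    $map = [
        "\u{2018}" => "'",   // left single quote
        "\u{2019}" => "'",   // right single quote
        "\u{201C}" => '"',   // left double quote
        "\u{201D}" => '"',   // right double quote
        "\u{2013}" => '-',   // en dash
        "\u{2014}" => '-',   // em dash
        "\u{2026}" => '...', // ellipsis
        "\u{2022}" => '*',   // bullet
    ];
    // strtr() replaces whole byte sequences, so multibyte characters are matched intact.
    return strtr($text, $map);
}

echo straighten_punctuation("\u{201C}Checking doubles\u{201D} \u{2026} along with \u{2018}singles\u{2019}");
// prints: "Checking doubles" ... along with 'singles'
```

This only makes sense if the input really is UTF-8. Running it on Windows-1252 bytes would leave the text unchanged.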

Link to comment
Share on other sites

You might want to look at Convert Smart Quotes with PHP and follow some of the links in some of the comments.

 

Thanks, Ken, that's actually the function I started with and went from there.

 

See this, and this. I'm trying to get the multibyte functions installed on my Windows machine to try a few things, but I cannot get it working...

Effigy, thanks for the links. They didn't directly solve my problem, but they definitely pointed me in the right direction. Turns out, since our entire site is encoded with UTF-8, I had to declare the email header charset to be the same. I added the header parameter with UTF-8 as the message charset, and all is right with the world once again.
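
For anyone landing here later, this is roughly what that fix looks like. The exact header line from the original script is not shown in the thread, so treat this as a sketch: the helper name `utf8_mail_headers` and the example address are assumptions, and the `mail()` call is left commented out.

```php
<?php
// Hypothetical helper: build headers that declare the message body as UTF-8 plain text.
function utf8_mail_headers(string $from): string
{
    $headers = [
        'From: ' . $from,
        'MIME-Version: 1.0',
        'Content-Type: text/plain; charset=UTF-8',
        'Content-Transfer-Encoding: 8bit',
    ];
    // Individual mail headers are separated by CRLF.
    return implode("\r\n", $headers);
}

$headers = utf8_mail_headers('webmaster@example.com');
// mail($to, $subject, $body, $headers); // $to, $subject and $body would come from the form
echo $headers;
```

With the charset declared, the mail client decodes the three-byte sequences as the curly quotes and dashes they are instead of showing them as separate characters. For an HTML email the same idea applies with `text/html` in place of `text/plain`.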

 

Thanks again to all those who helped me with this. You guys are the best ;)

Link to comment
Share on other sites
