Smart Quotes = ‘ ??


11 replies to this topic

#1 neoform

neoform
  • Members
  • Advanced Member
  • 241 posts
  • Location: Montreal

Posted 16 October 2006 - 01:38 PM

For some reason, when I submit a text field with this:

“spies” ‘ticket’

as the value through a POST form, then insert it into a MySQL table, this is what ends up in the table:

â€œspiesâ€ â€˜ticketâ€™


To sanitize my forms I use this:

function safe_string($str)
{
	// Undo magic_quotes_gpc escaping, if enabled, so slashes aren't doubled.
	if (get_magic_quotes_gpc())
		$str = stripslashes($str);

	//tried this to replace the smart quotes..  no good. :(
	// (str_replace accepts arrays of strings; ereg_replace does not)
	$search 	= array('‘','’','“','”');
	$replace 	= array("'","'",'"','"');
	$str 		= str_replace($search, $replace, $str);

	return mysql_real_escape_string(trim($str));
}

Anyone know why this might be happening? My table is using "latin1 -- cp1252 West European" as the charset, and I checked PHP, it's using UTF-8. It's breaking my brain since I've been messing with this for about 10 hours now and I still can't figure it out.
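
For anyone hitting the same symptom, here is a rough sketch of how one curly quote can turn into three characters. It assumes the form data arrives as UTF-8 and is later read back as cp1252, and that the mbstring extension is available. The helper name misread_as_cp1252() is made up for illustration.

```php
<?php
// Sketch only: shows what UTF-8 bytes look like when read as cp1252.
// misread_as_cp1252() is a hypothetical helper, not something from this thread.
function misread_as_cp1252(string $utf8Bytes): string
{
    // Treat every input byte as one cp1252 character and re-encode as UTF-8.
    return mb_convert_encoding($utf8Bytes, 'UTF-8', 'Windows-1252');
}

$leftDouble = "\xE2\x80\x9C";           // U+201C LEFT DOUBLE QUOTATION MARK in UTF-8
echo strlen($leftDouble), "\n";          // 3 (bytes, not characters)
echo bin2hex($leftDouble), "\n";         // e2809c
echo misread_as_cp1252($leftDouble . 'spies'), "\n"; // â€œspies
```

Each of the three bytes is a valid cp1252 character on its own, which is why the stored value looks like a run of accented letters and symbols rather than a single quote mark.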
Newsique.com Social News Network

#2 effigy

effigy
  • Staff Alumni
  • Advanced Member
  • 3,600 posts
  • Location: IL

Posted 16 October 2006 - 03:00 PM

If you print_r $_POST, do you see the smart quotes, or the combination of special characters?
Regexp | Unicode Article | Letter Database
/\A(e)?((1)?ff(?:(?:ig)?y)?|f(?:ig)?)\z/

#3 neoform

neoform
  • Members
  • PipPipPip
  • Advanced Member
  • 241 posts
  • LocationMontreal

Posted 16 October 2006 - 03:05 PM

I see the smart quotes.

If I print_r the query itself and then run it manually, it ends up looking right in the database, but if PHP runs the query, it's the 3 chars instead of the smart quotes.

#4 effigy

Posted 16 October 2006 - 03:23 PM

Latin 1 (cp1252) has:

Byte (hex)  Code point  Name
91          U+2018      LEFT SINGLE QUOTATION MARK
92          U+2019      RIGHT SINGLE QUOTATION MARK
93          U+201C      LEFT DOUBLE QUOTATION MARK
94          U+201D      RIGHT DOUBLE QUOTATION MARK


Try:

$str = preg_replace('/[\x91-\x92]/', "'", $str);
$str = preg_replace('/[\x93-\x94]/', '"', $str);


#5 neoform

Posted 16 October 2006 - 03:46 PM

Strangely, that didn't do the trick. :S

I get the impression PHP is somehow encoding the string strangely, or maybe the form is. Either way, I don't get it. :(
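
One possible explanation, sketched under the assumption that the form data reaches PHP as UTF-8: the curly quotes are then three-byte sequences (E2 80 98 through E2 80 9D), none of which contains a byte in the 0x91-0x94 range, so a byte-range pattern has nothing to match. Matching by code point with the /u modifier is one alternative. The function name straighten_quotes_utf8() is hypothetical.

```php
<?php
// Sketch only: assumes $str is UTF-8, as it would be on a charset=utf-8 page.
// straighten_quotes_utf8() is a hypothetical helper name.
function straighten_quotes_utf8(string $str): string
{
    // With /u, PCRE treats pattern and subject as UTF-8, so \x{2018}
    // matches the whole multi-byte character rather than a single byte.
    $out = preg_replace(
        array('/[\x{2018}\x{2019}]/u', '/[\x{201C}\x{201D}]/u'),
        array("'", '"'),
        $str
    );
    // preg_replace returns null if the subject is not valid UTF-8;
    // in that case hand the input back untouched.
    return $out ?? $str;
}

$utf8 = "\xE2\x80\x9Cspies\xE2\x80\x9D \xE2\x80\x98ticket\xE2\x80\x99";

// The single-byte pattern finds nothing to replace in UTF-8 input.
var_dump(preg_replace('/[\x91-\x94]/', "'", $utf8) === $utf8); // bool(true)

echo straighten_quotes_utf8($utf8), "\n"; // "spies" 'ticket'
```

Running a bare byte-range replacement over UTF-8 text is best avoided in any case, since bytes 0x91-0x94 can appear as the tail of other multi-byte characters.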

#6 kenrbnsn

kenrbnsn
  • Staff Alumni
  • Advanced Member
  • 8,235 posts
  • Location: Hillsborough, NJ, USA

Posted 16 October 2006 - 04:25 PM

Take a look at this article. It may be of some help.

Ken

#7 neoform

Posted 16 October 2006 - 04:41 PM

That was actually the first thing I tried. :(

I checked the query, and even with both those methods being used, “spies” ‘ticket’ still ends up being the resulting string.

this is perplexing..

#8 effigy

Posted 16 October 2006 - 04:49 PM

If PHP is using UTF-8, you may need to utf8_decode the $_POST string before running the replace.
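
A caveat worth sketching: utf8_decode targets ISO-8859-1, which has no curly quotes, so they are replaced with "?". Converting to Windows-1252 instead keeps them as bytes 0x91-0x94, where the byte-range replacement can find them. The sketch below assumes the mbstring extension is loaded and uses its ISO-8859-1 conversion to stand in for utf8_decode. utf8_to_cp1252() is a hypothetical helper name.

```php
<?php
// Sketch only; assumes the mbstring extension is loaded.
// utf8_to_cp1252() is a hypothetical helper name.
function utf8_to_cp1252(string $str): string
{
    return mb_convert_encoding($str, 'Windows-1252', 'UTF-8');
}

$utf8 = "\xE2\x80\x9Cspies\xE2\x80\x9D \xE2\x80\x98ticket\xE2\x80\x99";

// ISO-8859-1 cannot represent curly quotes, so each one becomes
// mbstring's substitute character, "?" by default.
echo mb_convert_encoding($utf8, 'ISO-8859-1', 'UTF-8'), "\n"; // ?spies? ?ticket?

// Windows-1252 can represent them, as single bytes 0x91-0x94.
$cp1252 = utf8_to_cp1252($utf8);

// Now the byte-range replacement from earlier in the thread has something to match.
$cp1252 = preg_replace('/[\x91-\x92]/', "'", $cp1252);
$cp1252 = preg_replace('/[\x93-\x94]/', '"', $cp1252);
echo $cp1252, "\n"; // "spies" 'ticket'
```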

#9 neoform

Posted 16 October 2006 - 05:01 PM

closer!!

haha

now it does:
?spies? ?ticket?

dagnabbit.

#10 neoform

Posted 16 October 2006 - 06:30 PM

I guess no one else has any ideas?

I've pretty much exhausted all my ideas. I tried toying around in php.ini and httpd.conf to see if it would do anything.

Maybe I should use mbstring?

#11 effigy

Posted 16 October 2006 - 07:08 PM

Are you specifying the charset in your meta? This works for me:

<meta charset="cp1252">
<pre>
<?php
	print_r($_POST);
	// Replace cp1252 curly double quotes (bytes 0x93, 0x94) with a straight quote.
	foreach ($_POST as $k => &$v) {
		$v = preg_replace('/[\x93-\x94]/', '"', $v);
	}
	unset($v); // break the reference left over from the loop
	print_r($_POST);
?>
</pre>
<form method="post" action="<?php echo htmlspecialchars($_SERVER['PHP_SELF']); ?>">
	<textarea name="test"><?php echo "\x93test\x94"; ?></textarea>
	<input type="submit"/>
</form>


#12 neoform

Posted 16 October 2006 - 07:27 PM

omg! that's it!!

THAT's what it was: the page's content type. Ack, I just copied/pasted this from another page:

<meta http-equiv="Content-Type" content="text/html; charset=utf-8">

every other site on my server uses

<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">

Which explains why this problem has never cropped up for me. :P
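
A rough sketch of one way to keep this from recurring: pick a single encoding for the application, declare it in the HTTP header (which takes precedence over a meta tag), and check that submitted values are valid in that encoding before storing them. APP_CHARSET and both function names are hypothetical, and the check assumes the mbstring extension is loaded.

```php
<?php
// Sketch only; names below are hypothetical and mbstring is assumed.
const APP_CHARSET = 'UTF-8'; // the one encoding the whole application agrees on

// Build the Content-Type header line that declares the page encoding.
function content_type_header(string $charset = APP_CHARSET): string
{
    return 'Content-Type: text/html; charset=' . strtolower($charset);
}

// True when the submitted bytes are valid in the expected encoding.
function is_expected_encoding(string $value, string $charset = APP_CHARSET): bool
{
    return mb_check_encoding($value, $charset);
}

// Declare the charset in the header instead of relying on a pasted meta tag.
if (!headers_sent()) {
    header(content_type_header());
}

var_dump(is_expected_encoding("\xE2\x80\x9Ctest\xE2\x80\x9D")); // bool(true): UTF-8 quotes
var_dump(is_expected_encoding("\x93test\x94"));                 // bool(false): cp1252 quotes
```

The database connection and table charset need to agree with the same choice, otherwise the mismatch just moves one layer down.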