gibs Posted May 1, 2007

Hello, I have been having some issues with preg_replace. For some reason the following code is turning all ' and " into ?. The code is below. Does anyone have any ideas as to why?

    function bbcode($str)
    {
        $simple_search = array(
            '/\[b\](.*?)\[\/b\]/is',
            '/\[i\](.*?)\[\/i\]/is',
            '/\[u\](.*?)\[\/u\]/is',
            '/\[url\=(.*?)\](.*?)\[\/url\]/is',
            '/\[url\](.*?)\[\/url\]/is',
            '/\[align\=(left|center|right)\](.*?)\[\/align\]/is',
            '/\[img\](.*?)\[\/img\]/is',
            '/\[mail\=(.*?)\](.*?)\[\/mail\]/is',
            '/\[mail\](.*?)\[\/mail\]/is',
            '/\[font\=(.*?)\](.*?)\[\/font\]/is',
            '/\[size\=(.*?)\](.*?)\[\/size\]/is',
            '/\[color\=(.*?)\](.*?)\[\/color\]/is',
        );

        $simple_replace = array(
            '<strong>$1</strong>',
            '<em>$1</em>',
            '<u>$1</u>',
            '<a href="$1">$2</a>',
            '<a href="$1">$1</a>',
            '<div style="text-align: $1;">$2</div>',
            '<img src="$1" />',
            '<a href="mailto:$1">$2</a>',
            '<a href="mailto:$1">$1</a>',
            '<span style="font-family: $1;">$2</span>',
            '<span style="font-size: $1;">$2</span>',
            '<span style="color: $1;">$2</span>',
        );

        // Do simple BBCode's
        $str = preg_replace($simple_search, $simple_replace, $str);
        $str = nl2br($str);

        return $str;
    }

Thanks!
Wildbug Posted May 2, 2007

That's a good question. I don't see anything in your code that would cause that to happen, but I'll try to help.

Have you narrowed this behavior down to a specific line? That is, can you check the contents of the variable before and after each instruction to isolate the precise instruction that causes this?

This sounds like a character encoding issue. Are you sure the HTML consists of "standard" ASCII single and double quotes, or are these "special" quotes (maybe UTF-8), say from MS Word, where the opening quote points one way and the closing quote points the other? Are you sure this is happening at the preg_replace line?
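One way to do the before-and-after check described above is to print the raw bytes of the string at each step, since a browser can hide the difference between an ASCII quote and a curly one. This is a minimal sketch; the helper name debug_bytes() is hypothetical and not from the thread, and the sample strings are made up for illustration.

```php
<?php
// Hypothetical helper (not from the thread): label a string and show
// its raw bytes as hex, so look-alike characters can be told apart.
function debug_bytes($label, $value)
{
    return sprintf("%s: %s [%s]", $label, $value, bin2hex($value));
}

// A curly apostrophe encoded as UTF-8 takes three bytes: e2 80 99.
$curly = "it\xE2\x80\x99s";
// A plain ASCII apostrophe is the single byte 27.
$plain = "it's";

echo debug_bytes('curly', $curly), "\n";
echo debug_bytes('plain', $plain), "\n";
```

Calling the helper after each step (form input, each function call, the database fetch, preg_replace) should show at which step the bytes of the quote characters change.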
gibs Posted May 2, 2007 (Author)

I checked it in the database, and it looks fine. Everything is correct. Here is what happens. The user inputs the code into the page, and then the page runs the following function:

    function inputtext($input)
    {
        $output = htmlentities($input);
        $output = strip_tags($output);
        $output = mysql_escape_string($output);

        return $output;
    }

I then take the output of that function and insert it into the MySQL DB. I know nothing is done wrong there, because when I view the TEXT field in the database with phpMyAdmin it looks perfect.

To recall the information, I do a MySQL query and fetch_array. I then run the function above to parse the bbcode and echo it out onto the page. I am fairly certain that it is the preg_replace line.
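One thing that may be worth checking in a pipeline like inputtext() is which character set htmlentities() assumes. Before PHP 5.4 its default was ISO-8859-1, so UTF-8 input was interpreted one byte at a time. The sketch below passes the charset explicitly. It assumes the form submits UTF-8, and the function name inputtext_utf8() is hypothetical. Escaping for the SQL query is a separate step and is left out here.

```php
<?php
// Hypothetical variant of inputtext() (not from the thread).
// Assumption: the submitted text is UTF-8.
function inputtext_utf8($input)
{
    // ENT_QUOTES also encodes ASCII single quotes; 'UTF-8' tells
    // htmlentities() how to decode multi-byte characters, so a curly
    // quote is turned into its named entity instead of stray bytes.
    return htmlentities($input, ENT_QUOTES, 'UTF-8');
}

// U+2019 (right single quotation mark) as UTF-8 bytes.
echo inputtext_utf8("it\xE2\x80\x99s"), "\n"; // it&rsquo;s
```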
Wildbug Posted May 2, 2007

> I checked it in the database, and it looks fine. Everything is correct. ...I then take the output of that function and insert it into the MySQL DB. I know nothing is done wrong there, because when I view the TEXT field in the database with phpMyAdmin it looks perfect. ...I am fairly certain that it is the preg_replace line.

You'll have to do more than that to convince me. If I were debugging this, I'd check the variable contents at every juncture, and not just as web browser output, but raw output.

Check out this user comment on the PHP manual page for htmlentities(). Maybe you can include that str_replace in your inputtext() function. I'm not sure this is the problem, but it really stinks like a character encoding issue.
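The manual comment itself is not reproduced in this thread, so the following is only a sketch of the general idea of a str_replace step: map typographic quotes to plain ASCII quotes before calling htmlentities(). The function name normalize_quotes() is hypothetical, and the sketch assumes the input is UTF-8. Windows-1252 input would use the single bytes 0x91 to 0x94 instead.

```php
<?php
// Hypothetical helper (not from the thread or the PHP manual).
// Assumption: $text is UTF-8; the keys are the UTF-8 byte sequences.
function normalize_quotes($text)
{
    $map = array(
        "\xE2\x80\x98" => "'", // U+2018 left single quotation mark
        "\xE2\x80\x99" => "'", // U+2019 right single quotation mark
        "\xE2\x80\x9C" => '"', // U+201C left double quotation mark
        "\xE2\x80\x9D" => '"', // U+201D right double quotation mark
    );

    return str_replace(array_keys($map), array_values($map), $text);
}

// Curly double quotes around "Hi" and a curly apostrophe in "it's".
echo normalize_quotes("\xE2\x80\x9CHi\xE2\x80\x9D it\xE2\x80\x99s"), "\n";
```

In inputtext(), a call like this would go first, so that the later steps only ever see ASCII quotes.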
obsidian Posted May 2, 2007

> I'm not sure this is the problem, but it really stinks like a character encoding issue.

I agree. I had a similar problem, and it turned out that the text was being inserted into the database from a UTF-8 encoded page, and I was outputting it to an ISO page. Double check that the encoding is the same on both pages. If it is not, you'll need to convert your string to the output page's encoding to get it to display properly.
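Converting a string from one encoding to another can be sketched with PHP's iconv() function. This assumes the stored text is UTF-8 and the output page is ISO-8859-1, which the thread suggests but does not confirm; the sample string is made up for illustration.

```php
<?php
// Assumption: stored text is UTF-8, output page is ISO-8859-1.
$utf8 = "caf\xC3\xA9"; // "café" as UTF-8: the é takes two bytes

// iconv(from, to, string) re-encodes the string.
$latin1 = iconv('UTF-8', 'ISO-8859-1', $utf8);

echo bin2hex($utf8), "\n";   // 636166c3a9
echo bin2hex($latin1), "\n"; // 636166e9
```

Note that curly quotes have no code point in ISO-8859-1, so a conversion like this cannot represent them. They would need to be mapped to ASCII quotes or entities first, or the page could be served as UTF-8 instead.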
gibs Posted May 2, 2007 (Author)

I have narrowed it down to the mysql_escape_string command. However, if I remove it, that will leave a large security hole. Any suggestions?
obsidian Posted May 3, 2007

> I have narrowed it down to it being the mysql_escape_string command. However, if I remove this it'll leave a large security hole. Any suggestions?

First off, according to the PHP manual, mysql_escape_string() is deprecated in favor of mysql_real_escape_string(). Switching will most likely not solve your problem on its own, though: both functions simply escape your string for use in the query and do nothing to translate characters into different ones.
Wildbug Posted May 3, 2007

http://en.wikipedia.org/wiki/Smart_quotes

And, to reiterate: http://us.php.net/manual/en/function.htmlentities.php#41152