
Impact of UTF-8 on Existing Code


doubledee


I would like to support UTF-8 on my website, but I am unsure (and quite fearful) whether it will break my existing code.

 

I looked at http://us3.php.net/mbstring, but worry that I'm going to miss something.

 

Here is some sample code where I think things could easily break...

// Trim all Form data.
$trimmed = array_map('trim', $_POST);


// ************************
// Validate Form Data.		*
// ************************

// Validate First Name.
if (empty($trimmed['firstName'])){
	// No First Name.
	$errors['firstName'] = 'Enter your First Name.';
}else{
	// First Name Exists.
	if (preg_match('#^[A-Z \'.-]{2,30}$#i', $trimmed['firstName'])){
		// Valid First Name.
		$firstName = $trimmed['firstName'];
	}else{
		// Invalid First Name.
		$errors['firstName'] = 'First Name must be 2-30 characters (A-Z \' . -)';
	}
}//End of VALIDATE FIRST NAME


// Validate Username.
if (empty($trimmed['username'])){
	// No Username.
	$errors['username'] = 'Enter your Username.';
}else{
	// Username Exists.
	if (preg_match('~(?x)							# Comments Mode
					^						# Beginning of String Anchor
					(?=.{8,30}$)				# Ensure Length is 8-30 Characters
					.*						# Match Anything
					$						# End of String Anchor
					~i', $trimmed['username'])){
		// Valid Username.


		// ******************************
		// Check Username Availability.	*
		// ******************************

		// Build query.
		$q1 = 'SELECT id
				FROM member
				WHERE username=?';

		// Prepare statement.
		$stmt1 = mysqli_prepare($dbc, $q1);

		// Bind variable to query.
		mysqli_stmt_bind_param($stmt1, 's', $trimmed['username']);

		// Execute query.
		mysqli_stmt_execute($stmt1);

		// Store results.
		mysqli_stmt_store_result($stmt1);

		// Check # of Records Returned.
		if (mysqli_stmt_num_rows($stmt1)>0){
			// Duplicate Username.
			$errors['username'] = 'This Username is taken.  Try again.';
		}else{
			// Unique Username.
			$username = $trimmed['username'];
		}
	}else{
		// Invalid Username.
		$errors['username'] = 'Username must be 8-30 characters.';
	}
}//End of VALIDATE USERNAME

 

 

Three areas where I could run into trouble are:

 

1.) array_map

 

2.) preg_match

 

3.) Prepared Statements

 

 

For #2, I see there is mb_ereg_match, but I am not sure whether I can simply swap it in for my current regex function or whether more is involved.
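
To make concern #2 concrete, here is a minimal sketch of how the First Name pattern behaves on a multibyte name, with and without PCRE's "u" modifier. The sample name, the \p{L} character class, and the helper matchesPattern() are assumptions for illustration only; which letters a name field should accept is a policy decision the original code does not settle.

```php
<?php
// Hypothetical sample input: "Zoë" encoded as UTF-8 is 3 characters but 4 bytes.
$name = "Zo\xC3\xAB";

// The original byte-oriented pattern: the two bytes of "ë" fall outside A-Z.
$asciiPattern = '#^[A-Z \'.-]{2,30}$#i';

// Assumed UTF-8 variant: the "u" modifier makes PCRE treat pattern and subject
// as UTF-8, and \p{L} matches any Unicode letter.
$utf8Pattern = '#^[\p{L} \'.-]{2,30}$#u';

// Hypothetical helper: true only on a match. With "u", preg_match() returns
// false for a subject that is not valid UTF-8, which this also treats as "no match".
function matchesPattern($pattern, $subject)
{
    return preg_match($pattern, $subject) === 1;
}

var_dump(matchesPattern($asciiPattern, $name)); // bool(false)
var_dump(matchesPattern($utf8Pattern, $name));  // bool(true)

// Length checks differ too: without "u" the dot matches one byte,
// with "u" it matches one character.
var_dump(matchesPattern('#^.{3}$#', $name));  // bool(false): 4 bytes
var_dump(matchesPattern('#^.{3}$#u', $name)); // bool(true): 3 characters
```

The same reasoning would apply to the Username check: its `.{8,30}` lookahead counts bytes unless the pattern carries the "u" modifier.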

 

I'm not sure if any problems would arise with #1 or #3 or elsewhere, and would really appreciate a second set of eyes on this code!!
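
For #1, a small sketch of what array_map('trim', ...) does to UTF-8 input may help. The form values and the helper trimUnicode() are hypothetical. By default trim() strips only ASCII whitespace and NUL bytes, and those byte values never occur inside a multibyte UTF-8 sequence, so it should not corrupt the text; it simply leaves Unicode-only whitespace in place.

```php
<?php
// Hypothetical form data: ASCII padding around "Zoë", and a UTF-8 non-breaking
// space (U+00A0, bytes C2 A0) in front of a username.
$post = array(
    'firstName' => "  Zo\xC3\xAB \t",
    'username'  => "\xC2\xA0debbie123",
);

$trimmed = array_map('trim', $post);

// ASCII whitespace is removed and the multibyte "ë" is left intact.
var_dump($trimmed['firstName'] === "Zo\xC3\xAB");       // bool(true)

// The non-breaking space is not in trim()'s default list, so it survives.
var_dump($trimmed['username'] === "\xC2\xA0debbie123"); // bool(true)

// Hypothetical helper: strip ASCII whitespace plus Unicode separators (\p{Z})
// from both ends. Assumes valid UTF-8 input; preg_replace() returns null otherwise.
function trimUnicode($value)
{
    return preg_replace('/^[\s\p{Z}]+|[\s\p{Z}]+$/u', '', $value);
}

var_dump(trimUnicode($trimmed['username']) === 'debbie123'); // bool(true)
```

For #3, the prepared statement sends the bound string as bytes. mysqli_set_charset($dbc, 'utf8') is the call that declares the connection encoding, assuming the table columns are stored as UTF-8 as well.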

 

requinix said switching is a good idea, but I'm pretty freaked out that I'm going to break things and create a big security hole?!  :(

 

Any help would be appreciated!!

 

Thanks,

 

 

Debbie

 

https://forums.phpfreaks.com/topic/258170-impact-of-utf-8-on-existing-code/