Strange phenomena needs explanation


Logical1

I have a simple form with a text box (Var1) which can receive 10 characters entered by users. They can enter either numbers or letters. However, depending on their keyboard, they may type in English or in another language (UTF-8 characters).

I pass the entered value to the next page and try to read the entered data with something like:

$L1=substr($Var1,0,2);

 

If they have entered English characters or numbers, I see the first two characters or numbers. But if they have entered UTF-8 characters, I only see one!

 

I did some tests, and it seems like each UTF-8 character counts for 2! (So substr($Var1,2,2) will show the second such character.)

Can somebody explain why this is, and also how I can separate each entered character regardless of whether it is UTF-8 or not?


Each UTF-8 character counts as one character, but it may occupy one or more bytes (up to four). The substr() function works on bytes. Take a look at the mb_substr() or iconv_substr() functions, which work on characters.
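To make the byte/character difference concrete, here is a minimal sketch. It assumes the script file is saved as UTF-8 and that the mbstring extension is loaded; the variable names are only for illustration.

```php
<?php
// Assumes this file is saved as UTF-8 and the mbstring extension is available.
$word = "آزمایش"; // six Persian letters, each two bytes long in UTF-8

$byteLength = strlen($word);             // strlen() counts bytes
$charLength = mb_strlen($word, 'UTF-8'); // mb_strlen() counts characters

$firstTwoBytes = substr($word, 0, 2);             // two bytes = one whole letter here
$firstTwoChars = mb_substr($word, 0, 2, 'UTF-8'); // two characters

echo "bytes: $byteLength, characters: $charLength\n"; // bytes: 12, characters: 6
echo "substr(0,2): $firstTwoBytes\n";
echo "mb_substr(0,2): $firstTwoChars\n";
```

With plain ASCII input both functions agree, because every ASCII character is exactly one byte in UTF-8.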

 

I tried them both and am getting exactly the same result.

You can try it yourself:

 

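<?php

// mb_substr() is called without an encoding argument here,
// so it falls back to the internal encoding.
$Var1="آزمایش";

$L1=mb_substr($Var1,0,1);
print "<p align=left>1<br><font color=blue><b>$L1</b></font><br>";

$L2=mb_substr($Var1,0,2);
print "<p align=left>2<br><font color=blue><b>$L2</b></font><br>";

?>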

Tell the mb_substr() function what character encoding it should be expecting:

 

<html>
<head>
<meta http-equiv="Content-Type" content="text/html;charset=UTF-8" />
</head>
<body>
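<?php

// Default to an empty string so the form below renders cleanly on the first visit.
$Var1 = isset($_POST['Var1']) ? $_POST['Var1'] : '';

if ($Var1 !== '') {
    // Escape user input before echoing it back into the page.
    $safe = htmlspecialchars($Var1, ENT_QUOTES, 'UTF-8');
    print "<p align=left>Text<br><font color=blue><b>$safe</b></font><br>";

    // Passing "UTF-8" makes mb_substr() count characters instead of bytes.
    $L1 = htmlspecialchars(mb_substr($Var1, 0, 1, "UTF-8"), ENT_QUOTES, 'UTF-8');
    print "<p align=left>1<br><font color=blue><b>$L1</b></font><br>";

    $L2 = htmlspecialchars(mb_substr($Var1, 0, 2, "UTF-8"), ENT_QUOTES, 'UTF-8');
    print "<p align=left>2<br><font color=blue><b>$L2</b></font><br>";
}

?>
<form id="cellGrid" action="crap.php" method="post">
<b>Test Value:</b> <input name="Var1" type="text" size="40" value="<?php echo htmlspecialchars($Var1, ENT_QUOTES, 'UTF-8'); ?>"/>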

<br /><input name="submit" type="submit" value="Submit">
</form>


</body>
</html>
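For the other half of the original question (separating each entered character whether it is ASCII or multi-byte), one possible approach is sketched below. The helper name utf8_chars() is made up for this example and is not from the thread; it relies on preg_split() with the /u modifier and assumes the input is UTF-8.

```php
<?php
// Hypothetical helper (not from the thread): split a UTF-8 string into characters.
function utf8_chars($text)
{
    // The /u modifier makes PCRE treat the subject as UTF-8, so the empty
    // pattern matches between code points rather than between bytes.
    $chars = preg_split('//u', $text, -1, PREG_SPLIT_NO_EMPTY);

    // preg_split() returns false when the input is not valid UTF-8.
    return $chars === false ? array() : $chars;
}

// Two ASCII letters followed by two Persian letters (file saved as UTF-8).
$mixed = utf8_chars("ab" . "آز");

echo count($mixed), "\n";          // 4
echo implode(' | ', $mixed), "\n";
```

On PHP 7.4 or later, mb_str_split($text, 1, 'UTF-8') should give the same result. Note that both split by code point, so a letter followed by a combining mark comes back as two elements.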
