Logical1 Posted June 5, 2009

I have a simple form with a text box (Var1) which can receive 10 characters entered by users. They can enter either numbers or letters. However, depending on their keyboard, they may enter English or another language (UTF-8 characters). I pass the entered value to the next page and try to read the entered data with something like: $L1=substr($Var1,0,2); If they have entered English characters or numbers, I see the first two characters or numbers. But if they have entered UTF-8 characters, I only see one! I did some tests and it seems like each of the UTF-8 characters counts for 2! (so substr($Var1,4,2) will show the second such character). Can somebody explain why this is, and also how I can separate each entered character regardless of whether it is UTF-8 or not?
Mark Baker Posted June 5, 2009

"I did some tests and it seems like each of UTF8 characters counts for 2! (so substr($Var1,4,2) will show the second such character. Can somebody explain why this is and also how can I separete each entered character regardless of being UTF8 or not?"

Each UTF-8 character is one character in length, but may be one or more bytes in length. The substr() function works on bytes. Take a look at the mb_substr() or iconv_substr() functions, which work on characters.
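To make the byte/character distinction above concrete, here is a minimal sketch. It assumes the mbstring extension is loaded and that the script file itself is saved as UTF-8; the variable names are illustrative only and do not come from the thread.

```php
<?php
// Assumption: mbstring is available and this file is saved as UTF-8.
$word = "آزمایش"; // six Persian letters, each encoded as 2 bytes in UTF-8

// strlen() and substr() operate on bytes.
$byteCount = strlen($word);               // number of bytes
$firstTwoBytes = substr($word, 0, 2);     // the bytes of the first letter only

// mb_strlen() and mb_substr() operate on characters when told the encoding.
$charCount = mb_strlen($word, 'UTF-8');   // number of characters
$firstChar = mb_substr($word, 0, 1, 'UTF-8');
$firstTwoChars = mb_substr($word, 0, 2, 'UTF-8');

echo "$byteCount bytes, $charCount characters\n";
```

For this string, two bytes from substr() cover exactly one letter, which is why the original substr($Var1,0,2) call appeared to return a single character.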
Logical1 Posted June 5, 2009

I tried them both and am getting the exact same result. You can try it yourself:

    <?php
    $Var1="آزمایش";
    $L1=mb_substr($Var1,0,1);
    print "<p align=left>1<br><font color=blue><b>$L1</b></font><br>";
    $L2=mb_substr($Var1,0,2);
    print "<p align=left>2<br><font color=blue><b>$L2</b></font><br>";
    ?>
Mark Baker Posted June 8, 2009

Tell the mb_substr() function what character encoding it should be expecting:

    <html>
    <head>
    <meta http-equiv="Content-Type" content="text/html;charset=UTF-8" />
    </head>
    <body>
    <?php
    // Default to an empty string so the form below works before the first submit.
    $Var1 = isset($_POST['Var1']) ? $_POST['Var1'] : '';
    if (isset($_POST['Var1'])) {
        // Escape user input before echoing it back into the page.
        print "<p align=left>Text<br><font color=blue><b>" . htmlspecialchars($Var1, ENT_QUOTES, "UTF-8") . "</b></font><br>";
        // The fourth argument makes mb_substr() count UTF-8 characters, not bytes.
        $L1 = mb_substr($Var1, 0, 1, "UTF-8");
        print "<p align=left>1<br><font color=blue><b>" . htmlspecialchars($L1, ENT_QUOTES, "UTF-8") . "</b></font><br>";
        $L2 = mb_substr($Var1, 0, 2, "UTF-8");
        print "<p align=left>2<br><font color=blue><b>" . htmlspecialchars($L2, ENT_QUOTES, "UTF-8") . "</b></font><br>";
    }
    ?>
    <form id="cellGrid" action="crap.php" method="post">
    <b>Test Value:</b> <input name="Var1" type="text" size="40" value="<?php echo htmlspecialchars($Var1, ENT_QUOTES, "UTF-8"); ?>"/>
    <br /><input name="submit" type="submit" value="Submit">
    </form>
    </body>
    </html>
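When the encoding argument is left out, mb_substr() falls back to mbstring's internal encoding, which may or may not be UTF-8 depending on the PHP version and configuration. The sketch below shows one possible alternative to passing "UTF-8" on every call, plus one way to split a string into individual characters, which is what the original question asked for. It assumes mbstring and PCRE with UTF-8 support are available; splitChars() is a hypothetical helper name, not something from the thread.

```php
<?php
// Assumption: mbstring is loaded. Sets the encoding used by mb_* calls
// that omit their encoding argument.
mb_internal_encoding('UTF-8');

// Hypothetical helper: split a UTF-8 string into an array of characters.
function splitChars($text)
{
    // The /u modifier makes PCRE treat the subject as UTF-8, so the empty
    // pattern matches between code points rather than between bytes.
    // preg_split() returns false if $text is not valid UTF-8.
    return preg_split('//u', $text, -1, PREG_SPLIT_NO_EMPTY);
}

$chars = splitChars("aآb");           // mixed ASCII and non-ASCII input
$first = mb_substr("آزمایش", 0, 1);   // relies on the internal encoding set above

echo count($chars) . " characters\n";
```

Passing the encoding explicitly, as in the post above, is arguably the safer choice because it does not depend on global state; setting the internal encoding once is mainly a convenience when many mb_* calls are involved.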