
Does strlen() get CRIPPLED when operating on strings of odd chars (like FF, 0E, etc.)?



Hello experts and others on the way there,

 

I get an AWFUL, deficient count from strlen() when I apply it to a string of chars whose values fall near the ends of the 0-255 byte range, like 1, 14, 216 and 255. Take a look at the discrepancy of FOUR below. My PHP version is 5.0.4. I'll be glad to read any feedback on how to get an accurate count, if that is possible.

 

------------ Output, the source of the resulting browser page: -----------

text_editor_view.jpg

 

--------------------------------The script: --------------------------------

<?php

$nontext1 = file_get_contents("C:/Scratch/oddChars.txt");
$ordinary1 = file_get_contents("C:/Scratch/allPlain.txt");

// List the first $count chars of $str, one "Ind N: char" line each.
// $count is fixed rather than taken from strlen(), so that every expected
// index is printed even when strlen() reports fewer.
// $str[$i] replaces the older $str{$i} offset syntax, which PHP 8 removed.
function listChars($str, $count)
{
    $out = "";
    for ($i = 0; $i < $count; $i++) {
        $out .= "Ind $i: " . $str[$i] . "\n";
    }
    return $out;
}

echo "

php takes up a 34-byte file of non-text chars with file_get_contents, sees it as having " .
strlen($nontext1) . " bytes; an echo shows that the var contains all 34 even though strlen
says " . strlen($nontext1) . " (" . mb_strlen($nontext1) . " for mb_strlen). The results are the same
with fread() on a handle opened by fopen() in r mode, and in rb mode.
The entire string:
$nontext1 (the 4th from the end is a cr, 0x0D)

" . listChars($nontext1, 34) . "

More expectedly, php takes up a 34-byte file of ordinary chars with file_get_contents,
sees it as having " . strlen($ordinary1) . " bytes; an echo shows that the var
contains all 34, agreeing with what strlen says as " . strlen($ordinary1) . ". The entire string:
$ordinary1

" . listChars($ordinary1, 34) . "\n\n";

?>
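One way to narrow this down is to compare the size of the file on disk with the length of the string PHP actually holds, and to dump the bytes as hex so that nothing depends on how a browser or editor renders control characters. The following is only a sketch: describeFile() is a hypothetical helper, not something from the script above, and the path is simply the one the script uses. It assumes nothing beyond filesize(), file_get_contents(), strlen() and bin2hex(), all of which count or show raw bytes.

```php
<?php
// Hypothetical helper (not part of the original script): report the on-disk
// size, the length of the string returned by file_get_contents(), and a hex
// dump of that string.
function describeFile($path)
{
    $data = file_get_contents($path);
    clearstatcache(); // make sure filesize() is not served from a stale cache
    return array(
        'filesize' => filesize($path),  // bytes on disk
        'strlen'   => strlen($data),    // bytes in the PHP string
        'hex'      => bin2hex($data),   // two hex digits per byte
    );
}

// Path taken from the script above; adjust it for your own machine.
$path = "C:/Scratch/oddChars.txt";
if (is_file($path)) {
    print_r(describeFile($path));
}
```

If 'filesize' and 'strlen' agree, the string in memory has the same number of bytes as the file, and the hex dump shows exactly which bytes those are. If they agree on a number other than 34, the file itself is probably not the 34 bytes it is assumed to be.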

 

Sure, in decimal the values are:

 

Ind 0: 255;  Ind 1: 216;  Ind 2: 255;  Ind 3: 224;  Ind 4: 32; 

Ind 5: 16;  Ind 6: 74;  Ind 7: 70;  Ind 8: 73;  Ind 9: 70; 

Ind 10: 32;  Ind 11: 1;  Ind 12: 1;  Ind 13: 1;  Ind 14: 32; 

Ind 15: 1;  Ind 16: 32;  Ind 17: 1;  Ind 18: 32;  Ind 19: 32; 

Ind 20: 255;  Ind 21: 219;  Ind 22: 32;  Ind 23: 67;  Ind 24: 32; 

Ind 25: 20;  Ind 26: 14;  Ind 27: 15;  Ind 28: 18;  Ind 29: 15; 

Ind 30: 13;  Ind 31: 20;  Ind 32: 18;  Ind 33: 16.
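As a cross-check, the 34 values listed above can be turned back into a string with chr() and measured. This is a minimal sketch, assuming the listing is an accurate record of the bytes; byteValues() is a hypothetical helper introduced only for illustration. strlen() counts bytes regardless of their value, so a string built from these 34 values has length 34. That does not by itself explain the count reported from the file, which may hold different bytes than this listing.

```php
<?php
// Byte values copied from the decimal listing above (Ind 0 to Ind 33).
$values = array(
    255, 216, 255, 224, 32, 16, 74, 70, 73, 70,
    32, 1, 1, 1, 32, 1, 32, 1, 32, 32,
    255, 219, 32, 67, 32, 20, 14, 15, 18, 15,
    13, 20, 18, 16,
);

// Rebuild the string one byte at a time.
$bytes = '';
foreach ($values as $v) {
    $bytes .= chr($v);
}

// Hypothetical helper: return the decimal value of every byte in $str.
function byteValues($str)
{
    $out = array();
    for ($i = 0; $i < strlen($str); $i++) {
        $out[] = ord($str[$i]);
    }
    return $out;
}

echo strlen($bytes), "\n"; // prints 34

// Print the same "Ind N: value" listing, generated instead of typed by hand.
foreach (byteValues($bytes) as $i => $v) {
    echo "Ind $i: $v\n";
}
```

Running byteValues() on the string read from the file and comparing the result with this listing would show at which index, if any, the two differ.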

 

 

 
