
array utf8 display problem


vladspy


Hi,

 
So I have this code...
The problem is that $data is correctly UTF-8: I echo it and can see the correct string. However, when I pass it to $name, the encoding is broken.
If I now display this new array, it shows question marks inside black diamond shapes.
 
Could you please maybe help me debug this code?
The PHP file has this in the header:
<meta charset="UTF-8">
<meta http-equiv="Content-type" content="text/html; charset=UTF-8">
<?php
$data = array();
$inc = 0;
$handle = @fopen("content_realisations.php", "r");
if ($handle) {
    while (($buffer = fgets($handle, 4096)) !== false) {
        $data[$inc] = ($buffer);
$inc = $inc+1;
    }
    if (!feof($handle)) {
        echo "Error: unexpected fgets() fail\n";
    }
    fclose($handle);
}

?>



<?php

//for ($i=0; $i<$inc; $i++){
$name = get_data($data, $inc);
//echo utf8_encode($name[5][2]);
echo $name[5][2];
// echo $values[1] . "<br>";
// echo $values[2] . "<br>";
  echo '<pre>'; print_r($name); echo '</pre>';
//}

//echo $inc; $length = strlen(utf8_decode($data[22])); echo $length . "<br>"; echo $data[22][$length-3];

function get_data($data, $inc){
for ($row=0; $row<$inc; $row++){
if ($mode == 0){
$z=0; $y=0; $w=0;
$data2 = array();
for($i=0 ; $i< utf8_decode(strlen($data[$row])) ; $i++){
if (($data[$row][$i] == '>') and ($z < 3)){
$z++;
$data_start = $i;
//echo $i . "<br>";
}
if (($data[$row][$i] == '<') and ($y < 4)){
$y++;
$data_end = $i;
//echo $i . "<br>";
} 
if ($data[$row][$i] == '"'){
$data2[$w] = $i; 
$w++;
} 
}
$file = substr($data[$row], $data2[2]+1, $data2[3]-$data2[2]-1);
$thumb = substr($data[$row], $data2[6]+1, $data2[7]-$data2[6]-1);
$id = substr($data[$row], $data2[0]+1, $data2[1]-$data2[0]-1);

//echo $id . "<br>";echo $file . "<br>";echo $thumb . "<br>";echo '<pre>'; print_r($data2); echo '</pre>';
$s = $data_start+1;
//echo $z . "<br>"; echo $y . "<br>"; echo $row . "<br>";
if ($s < $data_end and $z!=0 and $y!=0 and $id == "im")  {
//$name[$row][0] = $row; //must change the index !!!!
while($s != $data_end){
$name[$row][$s-$data_start] = $data[$row][$s];
echo $data[$row][$s];
$s++;
} 
}
}

$length = strlen(utf8_decode($data[$row])); 
$a1=$data[$row][0] . $data[$row][1] . $data[$row][2]; 
//echo '<pre>'; print_r($a1); echo '</pre>';
//echo gettype($a1[0]), "\n";
$is_match = (similar_text($a1, "<!-") == 3) ;
if ($is_match == 1){
//echo "1";
$mode = 1;
}else{
if (similar_text($a1, "-->") == 3 or similar_text($data[$row][$length-3] . $data[$row][$length-2], "->") == 2){
$mode = 0;
}
//echo "0"; 
}
//echo $file . "<br>";
//echo $thumb . "<br>"; 
}
    return $name; 
}


?>

 


$i < utf8_decode(strlen($data[$row]))

That does not make sense. strlen() gives you the number of bytes in the string, and then you utf8_decode() that number?

 

With multibyte strings you cannot use functions like strlen() or even use offsets, like [$i]. You also should not be utf8_decode()ing the string because what actually happens is PHP converts it from UTF-8 to ISO 8859-1 and you'll lose characters.
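
To make the byte/character difference concrete, here is a small sketch. It assumes the mbstring extension is loaded, and the sample string is only an illustration (written with byte escapes so it does not depend on how the file is saved).

```php
<?php
// "Hôtel" spelled with explicit UTF-8 bytes: "ô" is 0xC3 0xB4.
$s = "H\xC3\xB4tel";

echo strlen($s), "\n";              // 6: strlen() counts bytes
echo mb_strlen($s, 'UTF-8'), "\n";  // 5: mb_strlen() counts characters

// A byte offset lands in the middle of "ô" ...
echo bin2hex($s[1]), "\n";          // c3 (half a character, invalid on its own)
// ... while mb_substr() returns the whole character.
echo bin2hex(mb_substr($s, 1, 1, 'UTF-8')), "\n"; // c3b4
```

A lone 0xC3 byte is not valid UTF-8, which is why the browser draws the replacement character for it.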

 

The whole function needs to be rewritten. Can't use strlen, substr, offsets, utf8_decode... Think you can handle that?

Thanks for the feedback. I am quite lost then... why do all the manipulations on the strings work, but when I get to that particular array $name everything gets ruined? Below is a copy-paste of what I am seeing. The first string is echoed from $data and is shown correctly; the last item, with the question mark, is $name[5][2].

$name is the array that is printed below.

 

I agree about the utf8_decode(strlen($data[$row]))... it was a leftover from some tests.

 

C WhitePure Club Med Gym BastilleCitadines Louvre SuiteHôtel N'vy GenèveSofitel Casablanca Tour BlancheRestaurant L'instant d'Or, ParisClub Med ValmorelThalazur Cabourg�

 

Array
(
[2] => Array
(
[1] => C
[2] =>
[3] => W
[4] => h
[5] => i
[6] => t
[7] => e
)

[3] => Array
(
[1] => P
[2] => u
[3] => r
[4] => e
[5] =>
[6] => C
[7] => l
[8] => u
[9] => b
[10] =>
[11] => M
[12] => e
[13] => d
[14] =>
[15] => G
[16] => y
[17] => m
[18] =>
[19] => B
[20] => a
[21] => s
[22] => t
[23] => i
[24] => l
[25] => l
[26] => e
)

[4] => Array
(
[1] => C
[2] => i
[3] => t
[4] => a
[5] => d
[6] => i
[7] => n
[8] => e
[9] => s
[10] =>
[11] => L
[12] => o
[13] => u
[14] => v
[15] => r
[16] => e
[17] =>
[18] => S
[19] => u
[20] => i
[21] => t
[22] => e
)

[5] => Array
(
[1] => H
[2] => �
[3] => �
[4] => t
[5] => e
[6] => l
[7] =>
[8] => N
[9] => '
[10] => v
[11] => y
[12] =>
[13] => G
[14] => e
[15] => n
[16] => �
[17] => �
[18] => v
[19] => e
)

[6] => Array
(
[1] => S
[2] => o
[3] => f
[4] => i
[5] => t
[6] => e
[7] => l
[8] =>
[9] => C
[10] => a
[11] => s
[12] => a
[13] => b
[14] => l
[15] => a
[16] => n
[17] => c
[18] => a
[19] =>
[20] => T
[21] => o
[22] => u
[23] => r
[24] =>
[25] => B
[26] => l
[27] => a
[28] => n
[29] => c
[30] => h
[31] => e
)

[7] => Array
(
[1] => R
[2] => e
[3] => s
[4] => t
[5] => a
[6] => u
[7] => r
[8] => a
[9] => n
[10] => t
[11] =>
[12] => L
[13] => '
[14] => i
[15] => n
[16] => s
[17] => t
[18] => a
[19] => n
[20] => t
[21] =>
[22] => d
[23] => '
[24] => O
[25] => r
[26] => ,
[27] =>
[28] => P
[29] => a
[30] => r
[31] => i
[32] => s
)

[8] => Array
(
[1] => C
[2] => l
[3] => u
[4] => b
[5] =>
[6] => M
[7] => e
[8] => d
[9] =>
[10] => V
[11] => a
[12] => l
[13] => m
[14] => o
[15] => r
[16] => e
[17] => l
)

[9] => Array
(
[1] => T
[2] => h
[3] => a
[4] => l
[5] => a
[6] => z
[7] => u
[8] => r
[9] =>
[10] => C
[11] => a
[12] => b
[13] => o
[14] => u
[15] => r
[16] => g
)

)

You're asking why everything seems to be working up until the point that it doesn't work?

 

As I said, offsets and those functions work on individual bytes. Characters in UTF-8 strings can be one byte (like in most of those names) but they could be up to four bytes. The code will break on strings that contain any multi-byte characters, such as the ô in Hôtel and the è in Genève, which are two bytes each.
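
One possible way to keep multi-byte characters whole, if you really need one array element per character, is to split with preg_split() and the u modifier. This is only a sketch, not a drop-in replacement for get_data(): the helper name utf8_chars is made up, and the sample string is just one of the names from the output above.

```php
<?php
// Hypothetical helper: split a UTF-8 string into an array of whole characters.
function utf8_chars(string $s): array
{
    // The "u" modifier makes PCRE treat the subject as UTF-8, so the empty
    // pattern matches between characters rather than between bytes.
    $chars = preg_split('//u', $s, -1, PREG_SPLIT_NO_EMPTY);
    // preg_split() returns false on failure (for example invalid UTF-8 input).
    return $chars === false ? [] : $chars;
}

// "Hôtel N'vy Genève" written with explicit UTF-8 byte escapes.
$name = utf8_chars("H\xC3\xB4tel N'vy Gen\xC3\xA8ve");

echo $name[1], "\n";      // the complete two-byte "ô"
echo count($name), "\n";  // 17 characters (the string is 19 bytes long)
```

If the goal is just to display the name, it may be simpler to keep it as one string and not split it at all.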
