inzania Posted May 9, 2008

I seem to be having some sort of character encoding problem. The basic situation is this: I'm using CURL to retrieve data from an external webpage and then parse the results. The result of this is a Chinese string. So far so good. When I echo the results, though, they come out as �s.

Confused, I tried comparing the output to a hard-coded string which is known to exactly match the results. Please see below, assuming $definition has already been set by the aforementioned CURL function.

    // Print a string followed by the hex value of each of its bytes
    function OrdOut($str) {
        $out = array();
        for ($i = 0; $i < strlen($str); $i++) {
            $out[] = dechex(ord($str[$i]));
        }
        echo($str."=>".implode(":", $out)."<br>");
    }

    $correct = "歌曲";
    OrdOut($correct);
    OrdOut($definition);

This outputs the following:

    歌曲=>e6:ad:8c:e6:9b:b2
    ����=>b8:e8:c7:fa

$definition SHOULD exactly match $correct, but it doesn't seem to. I'm afraid this might have something to do with CURL and parsing Chinese text, and with the headers on the page being retrieved not being correct. The difference in string lengths between the two (6 bytes versus 4) strikes me as curious; it seems to indicate a different encoding, but I could be completely off. I appreciate any help.

inzania Posted May 9, 2008 (Author)

I should mention that I tried printing the results of the curl_exec() command without any parsing at all, and had the same problem: all Chinese characters appear as question marks.

conker87 Posted May 9, 2008

Are you encoding the page correctly?

thebadbad Posted May 9, 2008

Yeah, it sounds like the character encoding of the page you're retrieving the data from is different from the character encoding of your own page.
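
To illustrate the mismatch described above: the four bytes b8:e8:c7:fa from the first post appear to be 歌曲 in GB2312/GBK (two bytes per character), while the hard-coded string is UTF-8 (three bytes per character). Below is a minimal sketch of converting one to the other. It assumes the remote page really is served as GBK, which the thread never confirms, and that the mbstring extension is loaded. The variable names and the hard-coded bytes are only for illustration and stand in for the real curl_exec() result.

```php
<?php
// Bytes reported in the thread for $definition: b8:e8:c7:fa
$definition = "\xb8\xe8\xc7\xfa";

// Bytes reported in the thread for the hard-coded UTF-8 string: e6:ad:8c:e6:9b:b2
$correct = "\xe6\xad\x8c\xe6\x9b\xb2";

// ASSUMPTION: the source encoding is GBK. "CP936" is mbstring's name for the
// GBK code page, a superset of GB2312. Replace it if the remote page declares
// a different charset in its headers or <meta> tag.
$utf8 = mb_convert_encoding($definition, 'UTF-8', 'CP936');

// Show the converted bytes and whether they now match the hard-coded string.
echo bin2hex($utf8), "\n";
var_dump($utf8 === $correct);
```

If mbstring is not installed, iconv('GBK', 'UTF-8', $definition) is an alternative on most systems. Either way, the page that echoes the result also needs to be declared as UTF-8, or the browser will still show the wrong characters.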