mvfreelance Posted September 29, 2009

Hey folks! I'm no newbie, but I've got a problem here that is driving me nuts...

My environment:
* PHP 5.2.10
* Apache 2.2
* OS: Windows Vista
* RewriteEngine On
* DefaultCharset UTF-8

When I request any page and pass values through $_GET, everything works fine, except when passing "French strings" (e.g. Chloé / Chlo%E9 becomes Chlo, or Élle becomes lle).

So I have page.php:

<?php
header("Content-type: text/html; charset=ISO-8859-1");
// Print the byte length of 'var' followed by its value
print("string(".strlen($_GET['var']) .") ".$_GET['var'] ."\n");
// Print the decoded request URI for comparison
print(urldecode($_SERVER['REQUEST_URI']));
?>

request : page.php?var=ABC%25DE
expected: string(6) ABC%DE page.php?var=ABC%25DE
output  : string(6) ABC%DE page.php?var=ABC%25DE
------------------------------------------------------
request : page.php?var=ABC%DE
expected: string(6) ABC%DE page.php?var=ABC%DE
output  : string(6) ABC%DE page.php?var=ABC%DE
------------------------------------------------------
request : page.php?var=ABCDÉF
expected: string(6) ABCDÉF page.php?var=ABCDÉF
output  : string(5) ABCDF page.php?var=ABCDÉF
------------------------------------------------------
request : page.php?var=ABCD%E9F
expected: string(6) ABCDÉF page.php?var=ABCD%E9F
output  : string(5) ABCDF page.php?var=ABCD%E9F
------------------------------------------------------

So the "É" (and its urlencoded equivalent, %E9) was simply ignored by PHP.

I got the same results with this code:

<?php
header("Content-type: text/html; charset=ISO-8859-1");
print("string(".mb_strlen(utf8_encode($_GET['var'])) .") ".$_GET['var'] ."\n");
print(urldecode($_SERVER['REQUEST_URI']));
?>

Anyone, please???
Alex Posted September 29, 2009

Use UTF-8:

<?php
header("Content-type: text/html; charset=UTF-8");
print("string(".strlen($_GET['var']) .") ".$_GET['var'] ."\n");
print(urldecode($_SERVER['REQUEST_URI']));
?>
mvfreelance (Author) Posted September 29, 2009

My mistake when posting... I did indeed try using header(... UTF-8 ...):

<?php
header("Content-type: text/html; charset=UTF-8");
print("string(".mb_strlen(utf8_encode($_GET['var'])) .") ".$_GET['var'] ."\n");
print(urldecode($_SERVER['REQUEST_URI']));
?>

It returns the same results... I'm pretty sure this is a configuration problem, because the code works fine when Apache is running under Linux (CentOS)...

Thanks for the reply anyway... Any more suggestions, please??
Alex Posted September 29, 2009

What browser are you using? Because I know that IE has problems with UTF-8.
cags Posted September 29, 2009

I have practically the same setup as you. The only difference is that I use XP 64-bit rather than Vista. I just tried your code, and it worked perfectly on my computer.

%E9 is actually é
%C9 is É

That's not really relevant though. Sorry I can't be of any help, I just wanted to let you know it works for me (on all modern browsers), so it sounds like a configuration issue.
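To make the %E9 / %C9 point concrete, here is a small sketch that shows the raw bytes behind each escape and what the same characters look like when percent-encoded as UTF-8. The helper name latin1_to_utf8_escape is made up for this example, and it assumes the mbstring extension is loaded.

```php
<?php
// Decode each escape to its raw byte and show it in hex.
echo bin2hex(urldecode('%E9')), "\n"; // e9: é in ISO-8859-1
echo bin2hex(urldecode('%C9')), "\n"; // c9: É in ISO-8859-1

// Hypothetical helper: re-encode one ISO-8859-1 byte as UTF-8 and
// percent-encode the result. In UTF-8 these characters take two bytes each.
function latin1_to_utf8_escape($byte)
{
    return rawurlencode(mb_convert_encoding($byte, 'UTF-8', 'ISO-8859-1'));
}

echo latin1_to_utf8_escape("\xE9"), "\n"; // %C3%A9
echo latin1_to_utf8_escape("\xC9"), "\n"; // %C3%89
```

So a request carrying %E9 is only "é" if the server side treats the query string as ISO-8859-1; a UTF-8 client would send %C3%A9 instead.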
mvfreelance (Author) Posted September 29, 2009

No matter which browser I use, the results are the same... I have tried:

IE6
IE7
IE8
FF 3.5.3
FF 3.5
curl.exe 7.19.3

Plus, the very same code works just fine on a different server, so it's definitely not a browser issue. Any more tips, please???
redarrow Posted September 29, 2009

/**
 * Encodes HTML safely for UTF-8. Use instead of htmlentities.
 *
 * @param string $var
 * @return string
 */
function html_encode($var)
{
    return htmlentities($var, ENT_QUOTES, 'UTF-8');
}

Check through your code for any text-based content-type headers, and append the UTF-8 charset, so the browser knows what it's working with:

header('Content-type: text/html; charset=UTF-8');

You should also repeat this at the top of HTML pages:

<meta http-equiv="Content-type" content="text/html; charset=UTF-8" />
mvfreelance (Author) Posted September 29, 2009

Redarrow, thanks for your input, but... as I explained in the original post, I'm testing this simple request:

page.php?var=abcdeéf

The expected result:

string(7) abcdeéf page.php?var=abcdeéf

The code in page.php simply prints the length of the parameter 'var' followed by its value, then prints the requested URL. No matter whether the header is ISO-8859-1 or UTF-8, the output I get is the same, as follows:

string(7) abcdef page.php?var=abcdeéf

The PHP variable $_SERVER['REQUEST_URI'] works FINE, but $_GET does NOT. Really weird.

<?php
header("Content-type: text/html; charset=ISO-8859-1");
print("string(".strlen($_GET['var']) .") ".$_GET['var'] ."\n");
print(urldecode($_SERVER['REQUEST_URI']));
?>

and

<?php
header("Content-type: text/html; charset=UTF-8");
print("string(".strlen($_GET['var']) .") ".$_GET['var'] ."\n");
print(urldecode($_SERVER['REQUEST_URI']));
?>
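Since the symptom is that the request URI is intact while $_GET is not, one way to narrow it down is to decode the parameter by hand from the raw query string and compare its bytes with what $_GET holds. This is only a diagnostic sketch: raw_query_param is a hypothetical helper written for this example, and it assumes the parameter is called var as in the posts above. It avoids newer syntax so it should also run on PHP 5.2.

```php
<?php
// Hypothetical helper: pull one parameter out of a raw query string by hand,
// bypassing whatever input processing filled $_GET.
function raw_query_param($queryString, $name)
{
    foreach (explode('&', $queryString) as $pair) {
        $parts = explode('=', $pair, 2);
        if (urldecode($parts[0]) === $name) {
            return isset($parts[1]) ? urldecode($parts[1]) : '';
        }
    }
    return null; // parameter not present
}

$query = isset($_SERVER['QUERY_STRING']) ? $_SERVER['QUERY_STRING'] : '';
$raw   = raw_query_param($query, 'var');
$get   = isset($_GET['var']) ? $_GET['var'] : null;

// bin2hex() shows the actual bytes, so a dropped character is visible
// regardless of the charset the browser uses to render the page.
echo 'raw : ' . ($raw === null ? '(missing)' : bin2hex($raw)) . "\n";
echo 'get : ' . ($get === null ? '(missing)' : bin2hex($get)) . "\n";
```

If the two hex dumps differ, the bytes were changed somewhere between the web server and $_GET, which points at PHP's input handling rather than at the browser or the output header.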
cags Posted September 29, 2009

I suggest you take a look at this article. It seems that many of the built-in PHP functions, such as strlen, work on bytes and so do not handle multi-byte characters correctly. I'm not sure if this is what's causing part of your problem, as your code works on my computer, but it's an interesting read nonetheless and may give you some inspiration.
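A quick illustration of the byte-versus-character point about strlen, as a sketch that assumes the mbstring extension is available. The strings are written with explicit byte escapes so the result does not depend on how the file itself is saved.

```php
<?php
$latin1 = "Chlo\xE9";      // "Chloé" as ISO-8859-1: é is one byte
$utf8   = "Chlo\xC3\xA9";  // "Chloé" as UTF-8: é is two bytes

echo strlen($latin1), "\n";           // 5
echo strlen($utf8), "\n";             // 6 (bytes, not characters)
echo mb_strlen($utf8, 'UTF-8'), "\n"; // 5 (characters)
```

Note that this explains a length that is too large for UTF-8 input; it would not by itself explain a character vanishing from $_GET.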
mvfreelance (Author) Posted September 29, 2009

I'm fully aware of PHP's lack of support and its bugs where UTF-8 is concerned (though it seems that PHP 6 will be fully UTF-8).

As you said, the code works fine on your box (by the way, many thanks for taking the time to try it). It also works fine on a LAMP box I've got, and on a friend's WAMP box. But my main WAMP box doesn't respond as expected. [dammit]

I'm fairly familiar with "internationalization", which is covered so well in the article you recommended (www.phpwact.org/php/i18n/charsets). That makes me 99% confident that the problem is some setting in either php.ini or httpd.conf.
mvfreelance (Author) Posted September 30, 2009

Many thanks to AlexWD, cags & redarrow for their input!

As I suspected at first, the problem was configuration related.

I've got Multibyte String (aka mbstring) enabled, which is a great extension for internationalization jobs. BUT when mbstring.encoding_translation is enabled (mbstring.encoding_translation = On), it tries to translate all incoming HTTP request data before handing it to the PHP engine, roughly speaking. And because it sometimes can't translate certain characters, it simply removes those characters from the request. Dunno why...

From the manual: "Enables the transparent character encoding filter for the incoming HTTP queries, which performs detection and conversion of the input encoding to the internal character encoding."

http://www.php.net/manual/en/mbstring.configuration.php#ini.mbstring.encoding-translation

Anyway, I really recommend mbstring.encoding_translation = Off.

All the best!
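For anyone hitting the same symptom, a short check like the following can report whether the setting described above is active on a given server. It is a sketch, not an official diagnostic: encoding_translation_status is a hypothetical helper name, while extension_loaded() and ini_get() are standard PHP functions.

```php
<?php
// Hypothetical helper: report the state of mbstring's transparent input filter.
function encoding_translation_status()
{
    if (!extension_loaded('mbstring')) {
        return 'mbstring not loaded';
    }
    // ini_get() returns the raw configured string (or false if unknown),
    // so "1" and "On" can both mean enabled.
    $value   = strtolower((string) ini_get('mbstring.encoding_translation'));
    $enabled = in_array($value, array('1', 'on', 'true', 'yes'), true);
    return $enabled ? 'on' : 'off';
}

echo 'mbstring.encoding_translation: ' . encoding_translation_status() . "\n";

// To turn it off, set this in php.ini and restart Apache:
//   mbstring.encoding_translation = Off
```

As far as I know the directive is changeable per directory (php.ini, httpd.conf, .htaccess) but not at runtime, so calling ini_set() from the script should not be expected to help.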