[SOLVED] PHP $_GET characters disappear or ignored


mvfreelance

Hey folks!

 

I'm no newbie, but I've got a problem here that is driving me nuts...

My environment:

* php 5.2.10

* apache 2.2

* OS Windows Vista

* RewriteEngine On

* DefaultCharset UTF-8

 

When requesting any page and passing values through $_GET, everything works fine, except when passing "French strings" (e.g. Chloé / Chlo%E9 becomes Chlo, and Élle becomes lle).

So I have:

 

page.php

 

<?php
header("Content-type: text/html; charset=ISO-8859-1");
print("string(".strlen($_GET['var']) .") ".$_GET['var'] ."\n");
print(urldecode($_SERVER['REQUEST_URI']));
?>

 

request : page.php?var=ABC%25DE

expected:

  string(6) ABC%DE

  page.php?var=ABC%25DE 

 

output  :

  string(6) ABC%DE

  page.php?var=ABC%25DE 

 

 

------------------------------------------------------

 

request : page.php?var=ABC%DE

expected:

  string(6) ABC%DE

  page.php?var=ABC%DE 

 

output  :

  string(6) ABC%DE

  page.php?var=ABC%DE 

 

------------------------------------------------------

 

request : page.php?var=ABCDÉF

expected:

  string(6) ABCDÉF

  page.php?var=ABCDÉF

 

output  :

  string(5) ABCDF

  page.php?var=ABCDÉF

 

------------------------------------------------------

 

request : page.php?var=ABCD%E9F

expected:

  string(6) ABCDÉF

  page.php?var=ABCD%E9F

 

output  :

  string(5) ABCDF

  page.php?var=ABCD%E9F

 

------------------------------------------------------

 

So, the "É" (and its urlencode equivalent %E9) were simply ignored by PHP.

 

I got the same results with this code:

<?php
header("Content-type: text/html; charset=ISO-8859-1");
print("string(".mb_strlen(utf8_encode($_GET['var'])) .") ".$_GET['var'] ."\n");
print(urldecode($_SERVER['REQUEST_URI']));
?>

 

 

Anyone please???

 


My mistake when posting...

I did indeed try header(... UTF-8) as well:

 

<?php
header("Content-type: text/html; charset=UTF-8");
print("string(".mb_strlen(utf8_encode($_GET['var'])) .") ".$_GET['var'] ."\n");
print(urldecode($_SERVER['REQUEST_URI']));
?>

 

It returns the same results. I'm pretty sure this is a configuration problem, because the code works fine when Apache is running under Linux (CentOS).

 

Thanks for the reply anyway.

Any more suggestions please??

 


I have practically the same setup as you. The only difference is that I use XP 64-bit rather than Vista. I just tried your code, and it worked perfectly on my computer.

 

%E9 is actually é

%C9 is É

 

That's not really relevant though. Sorry I can't be of any help, I just wanted to let you know it works for me (on all modern browsers), so it sounds like a configuration issue.
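To make the %E9 / %C9 point concrete, here is a small sketch of what those escapes are at the byte level. It assumes the mbstring extension is loaded; the variable names are only for illustration and are not from the original code.

```php
<?php
// %E9 decodes to the single byte 0xE9, which is "é" in ISO-8859-1.
$latin1 = rawurldecode('%E9');

// The same character re-encoded as UTF-8 takes two bytes: 0xC3 0xA9.
$utf8 = mb_convert_encoding($latin1, 'UTF-8', 'ISO-8859-1');

printf("ISO-8859-1 bytes: %s\n", bin2hex($latin1));    // e9
printf("UTF-8 bytes:      %s\n", bin2hex($utf8));      // c3a9
printf("UTF-8 URL form:   %s\n", rawurlencode($utf8)); // %C3%A9

// A lone 0xE9 byte is not a valid UTF-8 sequence on its own.
var_dump(mb_check_encoding($latin1, 'UTF-8')); // bool(false)
var_dump(mb_check_encoding($utf8, 'UTF-8'));   // bool(true)
```

So a request carrying %E9 is only meaningful if whatever reads it treats the input as ISO-8859-1 (or a compatible single-byte charset); anything that expects UTF-8 sees an invalid byte.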


 

 

/**
 * Encodes HTML safely for UTF-8. Use instead of htmlentities.
 *
 * @param string $var
 * @return string
 */
function html_encode($var)
{
    return htmlentities($var, ENT_QUOTES, 'UTF-8');
}

 

 

Check through your code for any text-based content-type headers, and append the UTF-8 charset, so the browser knows what it's working with:

 

header('Content-type: text/html; charset=UTF-8') ;

 

You should also repeat this at the top of HTML pages:

 

<meta http-equiv="Content-type" content="text/html; charset=UTF-8" />


Redarrow, thanks for your input, but as I explained in the original post, I'm testing this simple request:

page.php?var=abcdeéf

 

the expected result:

string(7) abcdeéf

page.php?var=abcdeéf

 

The code in page.php simply prints the length of the parameter 'var' followed by its value, then prints the requested URL.

No matter whether the header is ISO-8859-1 or UTF-8, the output I get is the same:

string(7) abcdef

page.php?var=abcdeéf

 

$_SERVER['REQUEST_URI'] works fine, but $_GET does not.

Really weird.

 

<?php
header("Content-type: text/html; charset=ISO-8859-1");
print("string(".strlen($_GET['var']) .") ".$_GET['var'] ."\n");
print(urldecode($_SERVER['REQUEST_URI']));
?>

 

and

 

<?php
header("Content-type: text/html; charset=UTF-8");
print("string(".strlen($_GET['var']) .") ".$_GET['var'] ."\n");
print(urldecode($_SERVER['REQUEST_URI']));
?>
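Since REQUEST_URI arrives intact while $_GET does not, one way to narrow this down is to decode the raw query string by hand and compare it byte for byte with what $_GET holds. The sketch below does that; parse_raw_query() is a hypothetical helper written for this illustration, not a PHP built-in, and it only handles simple name=value pairs.

```php
<?php
// Hypothetical helper: decode a raw query string by hand, without
// going through the machinery that fills $_GET.
function parse_raw_query($query)
{
    $params = array();
    foreach (explode('&', $query) as $pair) {
        if ($pair === '') {
            continue; // skip empty segments such as "a=1&&b=2"
        }
        $parts = explode('=', $pair, 2);
        $name  = urldecode($parts[0]);
        $value = isset($parts[1]) ? urldecode($parts[1]) : '';
        $params[$name] = $value;
    }
    return $params;
}

// Compare the hand-decoded values with $_GET, as hex so that lost
// bytes are visible whatever charset the browser assumes.
$raw = isset($_SERVER['QUERY_STRING']) ? $_SERVER['QUERY_STRING'] : '';
foreach (parse_raw_query($raw) as $name => $value) {
    $fromGet = isset($_GET[$name]) ? $_GET[$name] : '';
    printf(
        "%s: raw=%s (%d bytes), \$_GET=%s (%d bytes)\n",
        $name,
        bin2hex($value),
        strlen($value),
        bin2hex($fromGet),
        strlen($fromGet)
    );
}
```

If the two hex dumps differ for page.php?var=ABCD%E9F, the bytes are being dropped somewhere between the web server and $_GET, which points at PHP's input handling rather than at the output header.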


I suggest you take a look at this article. It seems that many of the built-in PHP functions, such as strlen, count bytes rather than characters and so do not work as expected with multi-byte characters. I'm not sure if this is what's causing part of your problem, as your code works on my computer, but it's an interesting read nonetheless and may give you some inspiration.
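As a quick illustration of the strlen point, here is a minimal sketch comparing byte counts and character counts for the same word. It assumes the mbstring extension is available; the strings are written with explicit byte escapes so the result does not depend on how the source file is saved.

```php
<?php
// "Chloé" encoded as UTF-8: the é takes two bytes (0xC3 0xA9).
$utf8 = "Chlo\xC3\xA9";

// "Chloé" encoded as ISO-8859-1: the é takes one byte (0xE9).
$latin1 = "Chlo\xE9";

echo strlen($utf8), "\n";                  // 6, strlen counts bytes
echo mb_strlen($utf8, 'UTF-8'), "\n";      // 5, mb_strlen counts characters
echo strlen($latin1), "\n";                // 5, one byte per character here
```

Note that this explains a length that is too large for UTF-8 input; it would not by itself explain a character vanishing from the value.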


I'm fully aware of PHP's lack of support and its UTF-8 related bugs (though it seems that PHP 6 will be fully UTF-8).

 

As you said, the code works fine on your box (many thanks, by the way, for taking the time to try it). It also works fine on a LAMP box I've got, and on a friend's WAMP box.

But my main WAMP box doesn't respond as expected. [dammit]

 

 

I'm fairly familiar with the "internationalization" issues described so well in the article you recommended (www.phpwact.org/php/i18n/charsets), which makes me 99% confident that the problem is some configuration setting in either php.ini or httpd.conf.

 

 

 

 


Many thanks to AlexWD, cags & redarrow for their input!

 

As I suspected at first, the problem was configuration related.

I've got Multibyte String (aka mbstring) enabled, which is a great extension for internationalization jobs.

BUT when mbstring.encoding_translation is enabled (mbstring.encoding_translation = On), it tries to translate all HTTP request input before handing it to the PHP engine, roughly speaking. And because it sometimes can't translate certain characters, it simply removes those characters from the request. I don't know why.

The manual describes the setting like this:

"Enables the transparent character encoding filter for the incoming HTTP queries, which performs detection and conversion of the input encoding to the internal character encoding."

http://www.php.net/manual/en/mbstring.configuration.php#ini.mbstring.encoding-translation

 

Anyway, I really recommend mbstring.encoding_translation = Off.
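For anyone wanting to check their own box, here is a minimal sketch that reports whether the filter is active. encoding_translation_enabled() is a hypothetical helper name made up for this example. As far as I know the directive can only be set per directory (php.ini, or something like "php_flag mbstring.encoding_translation Off" in httpd.conf or .htaccess under mod_php), not with ini_set() at runtime, so treat that part as an assumption and check the manual for your version.

```php
<?php
// Hypothetical helper: report whether mbstring's transparent input
// translation is switched on for the current request.
function encoding_translation_enabled()
{
    $value = ini_get('mbstring.encoding_translation');

    // ini_get() returns false when the directive does not exist,
    // e.g. when the mbstring extension is not loaded at all.
    if ($value === false) {
        return false;
    }

    // Accepts "1", "On", "true", "yes"; anything else counts as off.
    return filter_var($value, FILTER_VALIDATE_BOOLEAN);
}

if (encoding_translation_enabled()) {
    echo "mbstring.encoding_translation is On: request input is being converted before it reaches \$_GET\n";
} else {
    echo "mbstring.encoding_translation is Off (or mbstring is not loaded)\n";
}
```

Running this on both the working and the misbehaving box should show whether this setting is the difference between them.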

 

best 4 all!

 

 

