Allowing Special Chars But No HTML?


chaseman

I would like to allow special characters in my script like:

 

ஆஇஊஎஐஓ

 

But I don't want to allow HTML like:

<b>bold</b>

 

At first I was using htmlentities() when PRINTING OUT (NOT when or before inserting in the query).

 

Data shows up in MySQL as &#4555;&#4548;&#4573;&#4576;&#4580;&#4578;&#4582;

 

And it was also printed out that way. I'd like to have it translated into its actual form (ஆஇஊஎஐஓ), yet I don't want <b>bold</b> to be rendered as HTML as well; I want it to stay as the literal text <b>bold</b>.

 

Is there any way I can do this?

 

I also tried htmlspecialchars() but that didn't do the trick either.

 

If you need script examples let me know.

 

 

EDIT: To bring it to the point, I want it to be just like in this forum post.

 

The special chars are showing up, and the html is just being printed as <b></b>.

Link to comment
Share on other sites

The problem with strip_tags is that it even strips away this kind of thing (depending on how you use the tags):

 

</-.-\>

 

Which can end up becoming this:

 

/-.-\

 

Just as an example; I'm not on my desktop so I haven't tried it.

 

But I want to ALLOW that too.

 

Does anybody know what type of method this forum uses for the forum posts, so I can simply do the same?
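For comparison, here is a rough sketch of the two approaches on that kind of input. The sample string is made up for illustration. strip_tags() throws away whatever it parses as a tag, while htmlspecialchars() keeps every character and only escapes the HTML metacharacters, so the text can be shown literally:

```php
<?php
// Made-up sample input: special characters, an ASCII "face", and real HTML.
$input = 'ஆஇஊ </-.-\> <b>bold</b>';

// strip_tags() removes anything it parses as a tag.
$stripped = strip_tags($input);

// htmlspecialchars() keeps every character and only escapes & < > " '.
$escaped = htmlspecialchars($input, ENT_QUOTES, 'UTF-8');

echo $stripped, "\n";
echo $escaped, "\n";
```

Viewed in a browser, the escaped version displays exactly what the user typed, because the browser turns the entities back into visible characters without treating them as markup.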

Link to comment
Share on other sites

htmlpurifier allows pretty much everything

What is so funny about that?

Ignore this comment. At first I thought that was a bad joke I wasn't getting, lol. I googled it and found information about HTML Purifier; I'll read into it and let you know if it did the trick.

Link to comment
Share on other sites

I went to the website of the makers of this forum software and asked in the forum how they do it, this was the response:

 

SMF uses htmlspecialchars: $func['htmlspecialchars'] in SMF 1.1.x and $smcFunc['htmlspecialchars'] in SMF 2.0.

 

But as the php.net documentation showcases:

 

<?php
$new = htmlspecialchars("<a href='test'>Test</a>", ENT_QUOTES);
echo $new; // &lt;a href=&#039;test&#039;&gt;Test&lt;/a&gt;
?>

 

The HTML gets converted to entities and is printed as text. So I must be doing something wrong in my implementation; I will investigate this more.

 

I'd appreciate any tips to implement it like they did it, I will have a look into HTML purifier as well.

 

 

EDIT:

Ok, it seems that SMF has its own built-in function for htmlspecialchars and that is how it handles special characters.

 

I haven't learned working with functions so well yet, I'm still a beginner lol :(

Link to comment
Share on other sites

The stuff htmlspecialchars() does you can only see in the source of the page, not when actually looking at it in the browser. So people can use < > etc. but it wouldn't do any harm.

 

Test it out and look in your source.

the above exactly.  except for the word would, that should be wouldn't

Link to comment
Share on other sites

I understand what you're saying. In that sense htmlspecialchars works great: HTML does not get executed and the entities are in the source. But the same does not work for ஆஇஊஎஐஓ, because that gets displayed as something like &#876&#989

 

I think it has to do with the character set.

 

You can try it out for yourself, here's some example code:

 

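<?php
// Only touch the POST data once the form has actually been submitted.
if (isset($_POST['submit'], $_POST['data'])) {
    // Escape HTML metacharacters before printing the user input.
    echo htmlspecialchars($_POST['data']);
}
?>
<form method="POST" action="<?php echo htmlspecialchars($_SERVER['PHP_SELF']); ?>">
<textarea name="data"></textarea>
<input type="submit" name="submit" />
</form>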

 

If you type in <b>bold</b> you get the literal text <b>bold</b>.

 

But if you now go in Windows to: Start / All Programs / Accessories / System Tools / Character Map

 

and choose some special Chinese/Japanese or other characters, enter them and press submit, you'll get something like &#3534&#4353 etc.

 

 

Link to comment
Share on other sites

Is your page, MySQL database and PHP all encoded as UTF-8?

The MySQL database charset is UTF-8, the collation is utf8_general_ci.

 

By page, I think you mean the encoding setting I have in Notepad++? That is UTF-8 without BOM.

 

I checked in phpinfo() what setting PHP has and it's the default: ISO-8859-1,utf-8;q=0.7,*;q=0.3

 

 

fortnox, thanks for the link, I will read into it.

Link to comment
Share on other sites

The charset suggestion didn't do the trick either. Keep in mind I took the MySQL database out of the chain to keep it simple and experimented with the example script I posted above, meaning I experimented with the encoding settings of Notepad++.

 

It doesn't really make sense for my situation anyway: I don't want to allow a specific charset, e.g. only Chinese or Japanese. Instead I want to allow all characters, unless there's a charset called UNIVERSAL which I don't know of.

 

I guess the suggestion on the developer forum of this forum software was the most on point: they told me they wrote their own function, $func['htmlspecialchars']. I just wish I knew what such a function would look like. I'm not at the point where I can write my own function yet; I need a bit more time for that. I just started 2 months ago, with no programming background.

 

I'll just take a book and read into functions then, maybe I'll figure it out one day.

 

Thanks for all the help though, this forum stands above all.
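For what it's worth, a wrapper of the kind described above might look roughly like the sketch below. This is NOT SMF's actual code; the function name escape_keep_numeric_entities is made up for illustration. The idea is to escape everything with htmlspecialchars() and then undo the escaping only for numeric entities such as &#2950;, so those still render as characters while real HTML stays harmless:

```php
<?php
// Hypothetical helper, not taken from SMF: escape HTML, but leave numeric
// character references (&#1234; or &#x4D2;) intact so they still render.
function escape_keep_numeric_entities(string $text): string
{
    // Step 1: escape & < > " ' as usual, treating the input as UTF-8.
    $escaped = htmlspecialchars($text, ENT_QUOTES, 'UTF-8');

    // Step 2: turn "&amp;#1234;" back into "&#1234;" (decimal or hex forms only).
    return preg_replace('/&amp;(#\d{1,7}|#x[0-9a-fA-F]{1,6});/', '&$1;', $escaped);
}

// The tags are escaped, the numeric entity survives.
echo escape_keep_numeric_entities('<b>&#2950;</b>');
```

Whether this is needed at all depends on the page encoding: if the page is served as UTF-8, the browser should submit the characters themselves rather than numeric entities, and plain htmlspecialchars() is enough.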

Link to comment
Share on other sites

You have to open your file in notepad, and re-save it with UTF-8 encoding.

 

Use htmlspecialchars().

 

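<?php
// Example: the source file itself must be saved as UTF-8.
// (The forum may display this string as numeric entities; what you actually write is the literal characters.)
$string = 'ஆஇஊஎஐஓ';

// Escape HTML metacharacters, telling PHP the input is UTF-8.
echo htmlspecialchars($string, ENT_QUOTES, 'UTF-8');

// Show which characters htmlspecialchars() actually converts.
$html = get_html_translation_table(HTML_SPECIALCHARS, ENT_QUOTES);

echo '<pre>';
print_r($html);
echo '</pre>';
?>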

Link to comment
Share on other sites

Thank you a lot for this solution. Somebody else gave me an alternate solution, which is:

header("Content-type: text/html; charset=utf-8");  

 

and/or

 

<meta http-equiv="Content-type" content="text/html; charset=utf-8" />

 

This works too.

 

I *thought* everything was UTF-8 by default, but the PHP manual page for htmlspecialchars says this about charset: "The default character set is ISO-8859-1."

 

I read that many times but it didn't register so clearly.

 

ISO-8859-1 is not enough for what I'm trying to accomplish, thus I have to manually set the correct charset.
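Putting the two fixes together, a minimal sketch could look like this. The helper name render_comment is made up for illustration. The likely reason the header matters: when a page does not declare UTF-8, browsers tend to submit characters the page's charset cannot represent as numeric entities like &#2950;, and htmlspecialchars() then escapes the & so the entity is shown literally. Note also that since PHP 5.4 the default charset of htmlspecialchars() is UTF-8, but passing it explicitly does no harm:

```php
<?php
// Declare the page encoding before any output is sent, so the browser
// both displays the page and submits form data as UTF-8.
if (!headers_sent()) {
    header('Content-Type: text/html; charset=utf-8');
}

// Hypothetical helper: escape user text for display, treating it as UTF-8.
function render_comment(string $raw): string
{
    return htmlspecialchars($raw, ENT_QUOTES, 'UTF-8');
}

// Special characters pass through untouched; the tags are shown as text.
echo render_comment('ஆஇஊ <b>bold</b>');
```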

 

 

-SOLVED-

 

 
