How to remove everything but standard alphanumerics and some extra Latin letters?


milosh

Hello, everybody, this is my first post to this great forum.

 

I have tried to solve this problem by myself but failed.

 

A user can enter a string in my HTML form. I want to "clean" it so that it contains only a-z, A-Z, 0-9, space, and these Latin letters: U+010C, U+010D, U+0106, U+0107, U+0110, U+0111, U+0160, U+0161, U+017D, U+017E (ČčĆćĐđŠšŽž).

 

1. I thought I knew how to deal with the first part of my problem:

$cleanstring = preg_replace('/[^a-zA-Z0-9 ]/', '', $originalstring);

but unfortunately, if for example $originalstring = 'Đakče', then $cleanstring will be '272ak269e'. I do not understand why preg_replace does not simply drop the 'Đ' and 'č' characters from the returned string, and where the digits come from.

 

2. Even if I knew how to solve the previous problem, I would still not know how to include ČčĆćĐđŠšŽž in my preg_replace search pattern. Is there a way to include them one by one as hex values, or is there some other solution?

 

Edit: For test purposes I tried str_replace('Đ', 'Ok', $originalstring), but it does not work: the returned string is the same as the original. Why?

 

Can anybody help?

Thank you in advance!

milosh


This problem requires learning a bit about UTF-8 (which you seem to have done) and how that affects Regular Expressions.

Since you're working with UTF-8 strings, you'll need to add the u modifier to the pattern so that it is matched character by character rather than byte by byte. Then you can use Unicode escape sequences (\x{...}) to specify which characters from the UTF-8 set you want to allow.
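To make that concrete, here is a minimal sketch of the cleaning step, assuming PHP 7 or later and that the input really is UTF-8. The function name clean_string is made up for illustration; the pattern uses only the code points listed in the question.

```php
<?php
// Hypothetical helper (not from the thread): keeps a-z, A-Z, 0-9, space,
// and the ten extra Latin letters listed in the question.
function clean_string(string $input): string
{
    // The trailing "u" makes PCRE treat pattern and subject as UTF-8.
    // Each \x{....} names one allowed code point; ^ negates the class.
    $pattern = '/[^a-zA-Z0-9 '
        . '\x{010C}\x{010D}\x{0106}\x{0107}\x{0110}'
        . '\x{0111}\x{0160}\x{0161}\x{017D}\x{017E}]/u';

    // preg_replace returns null on failure, e.g. when the subject is not
    // valid UTF-8. Falling back to '' here is an assumption of this sketch.
    return preg_replace($pattern, '', $input) ?? '';
}

// This file must itself be saved as UTF-8 for the literal below.
echo clean_string('Đakče, šta ima?'), "\n"; // prints "Đakče šta ima"
```

A side note on the '272ak269e' result: 272 and 269 are the decimal values of U+0110 and U+010D, which suggests (this is a guess, not something the thread confirms) that the string reached PHP as numeric HTML entities such as &#272;, and the original pattern then stripped the &, # and ; around the digits. That would also explain why str_replace('Đ', ...) found nothing to replace. If that is the case, making sure the form page is served and submitted as UTF-8, or passing the value through html_entity_decode($value, ENT_QUOTES, 'UTF-8') before cleaning, should give the expected characters.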

 

That said, normally you do not want to manipulate user input before storing it in the database. That can lead to problems for the user, who has no idea what's happening in the background. What you do want to do, however, is validate the input. If the validation fails, show an error message telling them exactly why it failed. That'll allow the user to fix the problem themselves, and thus know exactly what is happening (and why).
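The validate-instead-of-clean approach could look roughly like this. It is only a sketch: the function name is_allowed_string and the wording of the error message are assumptions, and the allowed set is the same one given in the question.

```php
<?php
// Hypothetical validator (not from the thread): returns true only when
// every character of $input is in the allowed set.
function is_allowed_string(string $input): bool
{
    // \A and \z anchor the match to the whole string; "u" enables UTF-8.
    $pattern = '/\A[a-zA-Z0-9 '
        . '\x{010C}\x{010D}\x{0106}\x{0107}\x{0110}'
        . '\x{0111}\x{0160}\x{0161}\x{017D}\x{017E}]+\z/u';

    // preg_match returns 1 on a match, 0 on no match, false on error
    // (for example invalid UTF-8), so only 1 counts as valid.
    return preg_match($pattern, $input) === 1;
}

// Example use: report the problem instead of silently changing the input.
$value = 'Đakče!';
if (!is_allowed_string($value)) {
    echo "Only letters, digits, spaces and ČčĆćĐđŠšŽž are allowed.\n";
}
```

The original value stays untouched, so the form can be shown again with the user's text and the message next to it.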
