milosh Posted September 3, 2012

Hello, everybody, this is my first post to this great forum. I have tried to solve this problem by myself but failed.

A user can input a string in my HTML form. I want to "clean" it so it contains only a-z, A-Z, 0-9, space, and only these Latin letters: U+010C, U+010D, U+0106, U+0107, U+0110, U+0111, U+0160, U+0161, U+017D, U+017E (ČčĆćĐđŠšŽž).

1. I thought I knew how to deal with the first part of my problem:

$cleanstring = preg_replace('/[^a-zA-Z0-9 ]/', '', $originalstring);

Unfortunately if, for example, $originalstring = 'Đakče', $cleanstring will be '272ak269e'. I do not understand why preg_replace leaves digits behind in place of the 'Đ' and 'č' chars instead of simply removing them.

2. Even if I knew how to solve the previous problem, I would still not know how to include ČčĆćĐđŠšŽž in my preg_replace search pattern. Is there a way to include them one by one as hex values, or some other solution?

Edit: For test purposes I tried using str_replace('Đ', 'Ok', $originalstring), but it does not work; the returned string is the same as the original. Why?

Can anybody help? Thank you in advance!

milosh
Christian F. Posted September 3, 2012

This problem requires learning a bit about UTF-8 (which you seem to have done) and how that affects Regular Expressions. Since you're working with UTF-8 strings, you'll need to use the u modifier for the pattern. Then you'll need to use Unicode Escape Sequences to specify which characters from the UTF-8 set you want to allow.

That said, normally you do not want to manipulate user input before storing it in the database. That can lead to problems for the user, seeing as s/he has no idea what's happening in the background. What you do want to do, however, is to validate the input. If the validation fails, show an error message telling exactly why it failed. That'll allow the user to fix the problem themselves, and thus know exactly what is happening (and why).
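
As a rough sketch of what that could look like: the code below uses the u modifier together with \x{...} escapes for the ten extra letters listed in the question, once for cleaning and once for validating. The names ALLOWED_CHARS, clean_string() and is_valid_string() are made up for illustration, and the sketch assumes the input really is UTF-8. A result like '272ak269e' suggests the form data arrived as numeric entities (&#272;ak&#269;e), which browsers tend to send when the page is not served as UTF-8, so the page charset is worth checking first.

```php
<?php
// Hypothetical constant: the character class body allowed by the question.
// Single quotes keep the \x{...} escapes intact so PCRE interprets them.
const ALLOWED_CHARS = 'a-zA-Z0-9 '
    . '\x{010C}\x{010D}'   // Č č
    . '\x{0106}\x{0107}'   // Ć ć
    . '\x{0110}\x{0111}'   // Đ đ
    . '\x{0160}\x{0161}'   // Š š
    . '\x{017D}\x{017E}';  // Ž ž

// Hypothetical helper: strips every character outside the allowed set.
// Returns null if PCRE rejects the input (for example, invalid UTF-8).
function clean_string(string $input): ?string
{
    return preg_replace('/[^' . ALLOWED_CHARS . ']/u', '', $input);
}

// Hypothetical helper: true only if the whole string uses allowed characters.
// \z anchors at the very end, so a trailing newline is not accepted.
function is_valid_string(string $input): bool
{
    return preg_match('/^[' . ALLOWED_CHARS . ']+\z/u', $input) === 1;
}
```

With validation, the form handler would call is_valid_string() and show an error message when it returns false, rather than silently storing the output of clean_string().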