thehigherentity Posted April 23, 2007
I'm having even more problems with my £ sign. I'm trying to pass a document through the following code, hoping it will stop any strange symbols from being passed on to the page:
$document = eregi_replace("[^[:space:][:punct:]a-zA-Z0-9]"," ",$document);
I had to use <META http-equiv=Content-Type content="text/html; charset=UTF-8"> in my header to get my £ signs to show correctly, and now the £ gets removed when the document is passed through the line above. Everything else I want to allow through is getting there with no problems, and I really don't want to convert anything to HTML at this point.
Can anyone help with this, please? I don't care if it's done through another bit of code, as long as it will still pass everything this line will.
dsaba Posted April 23, 2007
When you set a charset header, the browser does the character conversion for you, but it's important that the source itself is in that encoding as well. You can do that by saving the file in UTF-8 format, and you can encode strings with utf8_encode() (which converts ISO-8859-1 text to UTF-8). Maybe your source isn't encoded in UTF-8 and so isn't being read correctly? Hope that helps.
thehigherentity Posted April 23, 2007 (Author)
Thanks for that. My document is saved in UTF-8 format, and I tried utf8_encode(), but it caused even more problems. So for now I have taken out the following:
$document = eregi_replace("[^[:space:][:punct:]a-zA-Z0-9]"," ",$document);
and replaced it with the line below. I know this does not do what the original does, but it seems to fix many of the errors on my site and my £ signs are no longer being removed, so it will have to do for now, I think.
$document = str_replace("Â", "", htmlentities($document, ENT_QUOTES));
Thanks for your help, though.
dsaba Posted April 23, 2007
The only thing I can add about why it was being removed before: maybe the pattern is being applied to the individual bytes of the UTF-8 encoded character rather than to the character as a whole. It may not seem like it, but I would look the character up in the Windows character map and see what it is made of. For example, to reverse a UTF-8 string I have used this function before. If you understand this function, maybe you will see what I'm getting at:
function utf8_strrev($str){
    preg_match_all('/./us', $str, $ar);
    return join('', array_reverse($ar[0]));
}
The "us" after the closing slash is a pair of pattern modifiers: u makes the pattern treat the string as UTF-8, so "." matches a whole multi-byte character, and s lets "." match newlines as well.
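The byte-level guess above can be checked with a short sketch. It assumes PHP 7 or later, where eregi_replace() no longer exists, so preg_replace() without the u modifier stands in for it here. That substitution is an assumption for illustration, not the original code. The £ is written as its two UTF-8 bytes so the result does not depend on how the file is saved.

```php
<?php
// "£" (U+00A3) encoded as UTF-8 is two bytes: 0xC2 0xA3.
$pound = "\xC2\xA3";

echo strlen($pound), "\n";   // number of bytes, not characters
echo bin2hex($pound), "\n";  // the raw bytes in hex

// Without the u modifier the pattern is applied byte by byte.
// Neither 0xC2 nor 0xA3 is in the allowed set, so each byte
// is replaced by a space and the £ disappears.
$bytewise = preg_replace(
    '/[^[:space:][:punct:]a-zA-Z0-9]/',
    ' ',
    "Price: " . $pound . "5"
);
var_dump($bytewise);

// With the u modifier, "." matches one whole UTF-8 character,
// which is what utf8_strrev() relies on.
preg_match_all('/./us', $pound . "5", $chars);
var_dump(count($chars[0]));
```

So the filter is probably not matching [:space:] inside the character. It is more likely seeing two unknown bytes and replacing both.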
thehigherentity Posted April 23, 2007 (Author)
All I can tell you is that the £ shows up as Â£ in the code, yet I can't even get a str_replace("Â£","£", $whatever) to work. But once I have used htmlentities() on it, I can simply remove the Â and then display it. It just means I have had to change my code a little later on, but that's no real problem; it seems to work OK now. I would have liked to block out all the strange symbols, but they all seem to be displaying correctly now, so it's all good, until I find something else that doesn't show correctly. Thanks for everyone's help, though.
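One plausible explanation for the stray Â: before PHP 5.4, htmlentities() assumed ISO-8859-1 input unless told otherwise, so the two UTF-8 bytes of £ were converted as two separate characters. The sketch below passes the charset explicitly in both directions to show the difference. It is an illustration under that assumption, not a diagnosis of the exact setup in this thread.

```php
<?php
$pound = "\xC2\xA3"; // "£" as UTF-8 bytes

// Telling htmlentities() the input is UTF-8 yields a single entity.
$asUtf8 = htmlentities($pound . "5", ENT_QUOTES, 'UTF-8');

// Treating the same bytes as ISO-8859-1 converts each byte on its
// own: 0xC2 becomes &Acirc; and 0xA3 becomes &pound;.
$asLatin1 = htmlentities($pound . "5", ENT_QUOTES, 'ISO-8859-1');

echo $asUtf8, "\n";   // &pound;5
echo $asLatin1, "\n"; // &Acirc;&pound;5
```

With the third argument set to 'UTF-8' there is no Â to strip afterwards.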
Guest prozente Posted April 23, 2007
For future reference, you could have done something like this:
$document = preg_replace('/[^a-zA-Z0-9\x{00A3}[:space:][:punct:]]/u', '', $document);
\x{00A3} is £. The u modifier makes preg_replace() treat the document as UTF-8, so the £ is matched as one character instead of two separate bytes.
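A minimal sketch of that approach wrapped in a function, assuming PHP 7 or later. The helper name strip_unlisted_chars is hypothetical and not from the thread. It replaces with a space, as the original eregi_replace() line did. The sample input is written as byte escapes so it is UTF-8 regardless of how the file is saved.

```php
<?php
// Hypothetical helper name, chosen for this example only.
function strip_unlisted_chars(string $document): string
{
    // u: pattern and subject are UTF-8, so \x{00A3} refers to the
    // character U+00A3 (£) rather than to a single byte.
    $clean = preg_replace(
        '/[^a-zA-Z0-9\x{00A3}[:space:][:punct:]]/u',
        ' ',
        $document
    );

    // preg_replace() returns null on failure, for example when the
    // subject is not valid UTF-8.
    if ($clean === null) {
        throw new RuntimeException('Input is not valid UTF-8');
    }
    return $clean;
}

// "Café £5.00 €6": the é and the € are replaced, the £ is kept.
$input = "Caf\xC3\xA9 \xC2\xA35.00 \xE2\x82\xAC6";
echo strip_unlisted_chars($input), "\n";
```

How [:punct:] and [:space:] treat non-ASCII characters under the u modifier can depend on the PHP and PCRE version, so it is worth testing against your own documents.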