kurbsdude
Posted August 28, 2009

OK, here's my problem: I need to read text directly from a website (say, the contents of a certain div). The value sometimes contains Unicode characters (other languages) and sometimes umlauts, and I can't get both to display correctly. If I use htmlentities(), the Unicode characters display fine but the umlauts turn into ����. If I use utf8_encode(), the umlauts display fine but the Unicode characters turn into ����. Any suggestions? Cheers!
Mark Baker
Posted August 28, 2009

If the website that you're reading isn't UTF-8, then you'll need to convert its content from whatever charset it does use to UTF-8.
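To illustrate the point above, here is a minimal sketch of why the same umlaut can come out right or garbled depending on the source charset. It assumes the mbstring extension is loaded and uses ISO-8859-1 purely as an example source charset; the thread does not say which charset the remote site actually uses.

```php
<?php
// Assumption: the mbstring extension is available.
$latin1 = "\xFC";      // "ü" as a single ISO-8859-1 byte
$utf8   = "\xC3\xBC";  // "ü" as two UTF-8 bytes

// The ISO-8859-1 byte on its own is not valid UTF-8, so a UTF-8 page
// shows it as a replacement character.
var_dump(mb_check_encoding($latin1, 'UTF-8'));
var_dump(mb_check_encoding($utf8, 'UTF-8'));

// Converting from the real source charset gives the correct UTF-8 bytes.
$converted = mb_convert_encoding($latin1, 'UTF-8', 'ISO-8859-1');
echo bin2hex($converted), "\n";      // c3bc

// Converting text that is already UTF-8 a second time double-encodes it.
$doubleEncoded = mb_convert_encoding($utf8, 'UTF-8', 'ISO-8859-1');
echo bin2hex($doubleEncoded), "\n";  // c383c2bc
```

The takeaway is that a conversion only helps when it is applied once and from the charset the bytes are really in; applying it blindly to mixed input fixes one case and breaks the other.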
kurbsdude
Posted August 28, 2009

Mark Baker wrote: "If the website that you're reading isn't utf-8, then you'll need to convert it from whatever charset it does use to utf-8"

That sounds logical, but is there a way to convert to UTF-8 from PHP itself, without changing the website?
Mark Baker
Posted August 28, 2009

    // Return the charset declared in a <meta ... content="...; charset=..."> tag,
    // upper-cased, or false if no such declaration is found.
    function getCharSetFromMetaTags($strBody) {
        // Collapse newlines, tabs and runs of whitespace into single spaces
        $returns = array ( "/\n/", "/\r/", "/\t/", "/\s+/" );
        $nullReturns = array( ' ', ' ', ' ', ' ' );
        preg_match("/<meta\s?[^>]*content\s?=\s?\".*charset\s?=\s?(.*)\"\s?\/?>/Ui",
                   preg_replace($returns, $nullReturns, $strBody),
                   $strCharSet);
        return (isset($strCharSet[1])) ? strtoupper(trim($strCharSet[1])) : false;
    } // function getCharSetFromMetaTags()

    // Read $content from the remote website first, then:
    $charset = getCharSetFromMetaTags($content);
    if ($charset === false) {
        // no idea what charset the remote site is using
    } elseif ($charset !== 'UTF-8') {
        // The charset was upper-cased above, so compare against 'UTF-8',
        // and assign the result: iconv() returns the converted string.
        $content = iconv($charset, 'UTF-8', $content);
    }
    // $content is now UTF-8 (or unknown)

Do with it as thou wilt.
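For the case where getCharSetFromMetaTags() returns false, one possible fallback is sketched below. The helper name toUtf8() and the ISO-8859-1 default are assumptions made for illustration, not something from this thread, and the sketch assumes the mbstring and iconv extensions are loaded. The idea is to leave the bytes alone when they already form valid UTF-8 and only guess a source charset otherwise.

```php
<?php
// Hypothetical helper, not from the thread. Assumes mbstring and iconv are loaded.
// $declaredCharset is the result of getCharSetFromMetaTags(): a string or false.
function toUtf8(string $content, $declaredCharset, string $fallbackCharset = 'ISO-8859-1'): string
{
    // Prefer the charset the page declares, if there is one.
    if (is_string($declaredCharset) && $declaredCharset !== '') {
        if (strtoupper($declaredCharset) === 'UTF-8') {
            return $content;
        }
        // iconv() returns false for an unknown charset or invalid input.
        $converted = @iconv($declaredCharset, 'UTF-8', $content);
        if ($converted !== false) {
            return $converted;
        }
    }

    // No usable declaration: keep bytes that are already valid UTF-8.
    if (mb_check_encoding($content, 'UTF-8')) {
        return $content;
    }

    // Last resort: a guess. ISO-8859-1 is only an assumed default here.
    return mb_convert_encoding($content, 'UTF-8', $fallbackCharset);
}
```

It could be called as `$content = toUtf8($content, getCharSetFromMetaTags($content));`. The validity check is a heuristic: some non-UTF-8 text happens to be valid UTF-8, so a declared charset, when present, is the more reliable signal.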
kurbsdude
Posted August 28, 2009

Thanks, I'll check that out.
This topic is now archived and is closed to further replies.