dsaba
Posted March 27, 2007

I have another question related to my main goal with this string manipulation.

TERMS:
number-thingy = &#5491;
number-unit = &#5491;&#5491;&#5491;

There may be an easier way to do this, but I had no success doing it my way. I'll tell you what I think the solution is, what I tried, and how it didn't work.

&#1497;&#1510;&#1493;&#1512; &#1502;&#1493;&#1513;&#1500;&#1501; Perfect Creature

This was originally Hebrew and English. The English remains intact, while each Hebrew word has become a number-unit.

I want to write this string onto an image with the imagettftext() function. I have successfully done that when the string looks like this:

יצור מושלם Perfect Creature

As you can see, the string has now become:

&#1497;&#1510;&#1493;&#1512; &#1502;&#1493;&#1513;&#1500;&#1501; Perfect Creature

So I'm thinking I need to "decode" the number-units back into their Hebrew characters, as that's the only form imagettftext() will accept. I researched this encoding format and found this:

Name: HEBREW LETTER FINAL MEM
Block: Hebrew
Category: Letter, Other [Lo]
Combine: 0
BIDI: Right-to-Left [R]
Mirror: N
Version: Unicode 1.1.0 (June, 1993)

Encodings
HTML Entity (decimal): &#1501;
HTML Entity (hex): &#x5dd;
How to type in Microsoft Windows: Alt +05DD
UTF-8 (hex): 0xD7 0x9D (d79d)
UTF-8 (binary): 11010111:10011101
UTF-16 (hex): 0x05DD (05dd)
UTF-16 (decimal): 1,501
UTF-32 (hex): 0x000005DD (05dd)
UTF-32 (decimal): 1,501
C/C++/Java source code: "\u05DD"
Python source code: u"\u05DD"

So now I know that the number-thingy is in fact an HTML entity (decimal). I then tried this:

html_entity_decode($string); - does not work
htmlspecialchars_decode($string); - does not work

My question is: how do I convert the entity-encoded Hebrew words back into the UTF-8 Hebrew characters they started out as?

If you're going to suggest that a browser will interpret and decode these characters for me and then display them, I am well aware of that. However, it is not in the browser that I need the Hebrew characters to display. It is in the actual PHP script, because I need to feed them to the imagettftext() function, which is not a browser and does not interpret entity-encoded Hebrew characters.

Any thoughts? Thanks a bunch.

**EDIT NOTE: If you're testing this out, you need to view the source of the .php page. If you see the HTML entities in the source, it DID NOT WORK. If you see some kind of gibberish or Hebrew characters in the source, it did work.
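For anyone testing along the lines of the edit note above, here is a minimal sketch of that check done in PHP instead of by eye. `has_numeric_entities()` is a hypothetical helper name, not a PHP built-in, and the only data it relies on is the table quoted in the question (`&#1501;` is FINAL MEM, UTF-8 bytes 0xD7 0x9D).

```php
<?php
// Hypothetical helper (not a PHP built-in): true if the string still
// contains decimal (&#1501;) or hex (&#x5DD;) numeric entities.
function has_numeric_entities(string $s): bool
{
    return preg_match('/&#(?:[0-9]+|x[0-9a-f]+);/i', $s) === 1;
}

$encoded = '&#1501;';   // what the script currently holds
$decoded = "\xD7\x9D";  // the same letter as raw UTF-8 bytes

var_dump(has_numeric_entities($encoded)); // bool(true): not decoded yet
var_dump(has_numeric_entities($decoded)); // bool(false): raw UTF-8 bytes
echo bin2hex($decoded), "\n";             // d79d
```

Dumping the bytes with `bin2hex()` sidesteps the question of how a browser or editor chooses to display them.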
per1os
Posted March 27, 2007

Found this on php.net in the user contributions. Maybe this will work?

<?php
// Also try get_html_translation_table(HTML_ENTITIES) instead and see if that works.
function my_htmlspecialchars_decode($text)
{
    // Flip the character => entity table, then translate entities back to characters.
    return strtr($text, array_flip(get_html_translation_table(HTML_SPECIALCHARS)));
}
?>

Worth a shot.
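One caveat worth checking before relying on this: the translation table only lists a handful of special characters and their entities, so `strtr()` has nothing to match a numeric entity like `&#1501;` against. A quick sketch to see it; the function is the one quoted above, and the sample strings are assumptions chosen for illustration.

```php
<?php
function my_htmlspecialchars_decode($text)
{
    return strtr($text, array_flip(get_html_translation_table(HTML_SPECIALCHARS)));
}

// Show what the table actually covers.
print_r(get_html_translation_table(HTML_SPECIALCHARS));

// Entities listed in the table are translated back to characters...
echo my_htmlspecialchars_decode('&lt;b&gt;'), "\n";

// ...but a decimal entity for a Hebrew letter passes through unchanged.
echo my_htmlspecialchars_decode('&#1501;'), "\n";
```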
dsaba (Author)
Posted March 27, 2007

I need another function in order to use that. Could you link me to where you got it? I didn't find it in the list of notes for html_entity_decode().
per1os
Posted March 27, 2007

http://us2.php.net/manual/en/function.htmlspecialchars-decode.php

That is a PHP 5 only function, but people have created their own variations.
dsaba (Author)
Posted March 27, 2007

I'm completely and utterly confused from reading the various notes and comments on php.net.

I do understand that decoding HTML entities into UTF-8 cannot be done simply by calling html_entity_decode() the way I did earlier. It involves more complications and exceptions. I've tried at least three different custom functions people posted on php.net and none of them work for me: the string is still in HTML entities after I send it through the function.

Before I can begin to understand HOW to properly decode HTML entities into UTF-8, I need to understand WHY you can't simply use html_entity_decode() by itself, and WHY such custom functions (which do not work for me!) are necessary.

Can anyone shed some light? If it matters, the language that was encoded in HTML entities is Hebrew, and it was UTF-8 originally.
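On the "why": as far as I understand it, `html_entity_decode()` decodes entities into a target character set. Older PHP versions defaulted to ISO-8859-1, which has no Hebrew letters, so entities it cannot represent are left untouched. A sketch under that assumption, passing the charset explicitly as the third argument; the sample string is the one from the first post.

```php
<?php
$encoded = '&#1497;&#1510;&#1493;&#1512; &#1502;&#1493;&#1513;&#1500;&#1501; Perfect Creature';

// Target charset that cannot represent Hebrew: the entities stay as they are.
$latin1 = html_entity_decode($encoded, ENT_QUOTES, 'ISO-8859-1');

// Target charset UTF-8: each entity becomes that letter's UTF-8 bytes.
$utf8 = html_entity_decode($encoded, ENT_QUOTES, 'UTF-8');

var_dump($latin1 === $encoded);                    // entities survived
var_dump(strpos($utf8, '&#') === false);           // entities are gone
echo bin2hex(html_entity_decode('&#1501;', ENT_QUOTES, 'UTF-8')), "\n"; // d79d
```

If this holds on your PHP version, `$utf8` is what you would hand to imagettftext(). Whether the third argument accepts 'UTF-8' at all depends on the PHP version, which may be why the custom functions exist.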
btherl
Posted March 29, 2007

I have had success with this one (from the html_entity_decode() manual page comments) for decoding Greek characters:

<?php
// Convert one matched decimal entity (e.g. &#1501;) into its UTF-8 bytes.
function utf8_replaceEntity($result)
{
    $value = (int)$result[1];

    // ASCII code points are a single byte.
    if ($value < 0x80) {
        return chr($value);
    }

    // Number of UTF-8 bytes needed for this code point.
    if ($value < 0x800) {
        $len = 2;
    } elseif ($value < 0x10000) {
        $len = 3;
    } else {
        $len = 4;
    }

    $string = '';
    for ($i = $len; $i > 0; $i--) {
        // Low 6 bits plus the 10xxxxxx continuation marker.
        $part = ($value & 0x3F) | 0x80;
        // The lead byte also carries $len high marker bits.
        if ($i == 1) {
            $part |= (0xFF << (8 - $len)) & 0xFF;
        }
        $string = chr($part) . $string;
        $value >>= 6;
    }
    return $string;
}

// Replace every decimal entity in $string with its UTF-8 encoding.
function utf8_html_entity_decode($string)
{
    return preg_replace_callback('/&#([0-9]+);/u', 'utf8_replaceEntity', $string);
}

$string = '&#8217;&#8216; &#8211; &#8220; &#8221;'
    . ' &#263; &#324; &#345;';
$string = utf8_html_entity_decode($string);
header('Content-Type: text/html; charset=UTF-8');
echo '<li>' . $string;
?>
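For comparison, here is a shorter variant of the same idea, assuming a newer PHP where `mb_chr()` is available (PHP 7.2+ with the mbstring extension); it falls back to `html_entity_decode()` otherwise. `decode_numeric_entities()` is a made-up name for this sketch, and unlike the function above it also accepts hex entities such as `&#x5DD;`.

```php
<?php
// Hypothetical helper: decode decimal and hex numeric entities to UTF-8.
function decode_numeric_entities(string $text): string
{
    $decoded = preg_replace_callback(
        '/&#(x[0-9a-f]{1,6}|[0-9]{1,7});/i',
        function (array $m): string {
            $ref = $m[1];
            // Hex references start with "x"; everything else is decimal.
            $cp = ($ref[0] === 'x' || $ref[0] === 'X')
                ? (int) hexdec(substr($ref, 1))
                : (int) $ref;

            if (function_exists('mb_chr')) {
                $char = mb_chr($cp, 'UTF-8');
                // Leave the entity as-is if the code point is invalid.
                return $char === false ? $m[0] : $char;
            }
            // Fallback without mbstring.
            return html_entity_decode($m[0], ENT_QUOTES, 'UTF-8');
        },
        $text
    );

    // preg_replace_callback() returns null on a regex error.
    return $decoded ?? $text;
}

echo bin2hex(decode_numeric_entities('&#1501;')), "\n"; // d79d
```

Named entities such as `&amp;` are deliberately left alone here, so the result can still be passed through `html_entity_decode()` afterwards if those matter.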