muddy Posted February 21, 2007

Dear Freaks,

Any help on this would be greatly appreciated.

I have one file: dailymenu.php (the template for the daily menu). All I need to do is include the file "menu.html" (the actual menu contents, exported from MS Word, which is a requirement) in the dailymenu.php template.

However, the menu.html file has all sorts of special characters, because French, Spanish, and Italian food names contain them. Examples: í, è, ó, â, é, ñ, etc.

So when I do a simple <?php include("includes/menu.html"); ?> in the dailymenu.php file, the output is garbled because the special characters are not converted.

Do you know of any way to include the menu.html file in the dailymenu.php template and fix the special characters all at once?

Thanks a million

Link: https://forums.phpfreaks.com/topic/39504-convert-special-html-characters-on-include/
Ninjakreborn Posted February 21, 2007

You could run the actual Microsoft file through htmlentities when it is uploaded (save a new copy of it). Then, when it comes time to show it, create the file again and decode it.

The last time I had problems with Microsoft Word, I just used http://www.byte.com/documents/s=9502/byt1125943459937/0905_pournelle.html and I didn't really have any problems after that.
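The "convert once at upload and save a new copy" idea above could be sketched roughly as follows. This is only a sketch under assumptions: the function name encodeUploadedHtml, the file paths, and the guess that Word exports Windows-1252 are all hypothetical and not from the thread. Because plain htmlentities() would also escape the markup itself, the sketch decodes <, > and & again afterwards so that only the accented characters stay encoded.

```php
<?php
// Sketch only: the function name, the charset default and the paths are assumptions.

/**
 * Encode the non-ASCII characters of an HTML fragment as HTML entities
 * while leaving the markup itself (<, >, & and quotes) untouched.
 */
function encodeUploadedHtml(string $raw, string $sourceCharset = 'Windows-1252'): string
{
    // htmlentities() also escapes <, > and &, which would turn tags into visible text...
    $encoded = htmlentities($raw, ENT_NOQUOTES, $sourceCharset);

    // ...so undo just those three; the accented characters remain entities.
    return htmlspecialchars_decode($encoded, ENT_NOQUOTES);
}

// At upload time (hypothetical paths): read the Word export, save an encoded copy.
// $raw = file_get_contents('uploads/menu.html');
// file_put_contents('includes/menu.html', encodeUploadedHtml($raw));

// Quick check with Windows-1252 bytes: \xE9 is "é", \xF1 is "ñ".
echo encodeUploadedHtml("<p>Saut\xE9 jalape\xF1o</p>"), "\n";
```

With the entities already stored in the saved copy, the existing include() in dailymenu.php should be able to output it as-is, so this sketch assumes no decode step is needed at display time.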
dbrimlow Posted February 21, 2007

There is a handy custom PHP function I found on the php.net site. It is a kind of uber-htmlentities:

<?php
// Convert str to UTF-8 (if not already), then convert that to HTML named entities
// and numbered references. Compare to native htmlentities() function.
// Unlike that function, this will skip any already existing entities in the string.
// mb_convert_encoding() doesn't encode ampersands, so use makeAmpersandEntities to convert those.
// mb_convert_encoding() won't usually convert to illegal numbered entities (128-159) unless
// there's a charset discrepancy, but just in case, correct them with correctIllegalEntities.
function makeSafeEntities($str, $convertTags = 0, $encoding = "") {
    if (is_array($arrOutput = $str)) {
        foreach (array_keys($arrOutput) as $key)
            // Pass $convertTags along so that $encoding lands in the right parameter
            $arrOutput[$key] = makeSafeEntities($arrOutput[$key], $convertTags, $encoding);
        return $arrOutput;
    } else if (!empty($str)) {
        $str = makeUTF8($str, $encoding);
        $str = mb_convert_encoding($str, "HTML-ENTITIES", "UTF-8");
        $str = makeAmpersandEntities($str);
        if ($convertTags)
            $str = makeTagEntities($str);
        $str = correctIllegalEntities($str);
        return $str;
    }
}

// Convert str to UTF-8 (if not already), then convert to HTML numbered decimal entities.
// If selected, it first converts any illegal chars to safe named (and numbered) entities
// as in makeSafeEntities(). Unlike mb_convert_encoding(), mb_encode_numericentity() will
// NOT skip any already existing entities in the string, so use a regex to skip them.
function makeAllEntities($str, $useNamedEntities = 0, $encoding = "") {
    if (is_array($str)) {
        $arrOutput = array();
        foreach ($str as $s)
            // Pass $useNamedEntities along so that $encoding lands in the right parameter
            $arrOutput[] = makeAllEntities($s, $useNamedEntities, $encoding);
        return $arrOutput;
    } else if (!empty($str)) {
        $str = makeUTF8($str, $encoding);
        if ($useNamedEntities)
            $str = mb_convert_encoding($str, "HTML-ENTITIES", "UTF-8");
        $str = makeTagEntities($str, $useNamedEntities);
        // Fix backslashes so they don't screw up following mb_ereg_replace
        // Single quotes are fixed by makeTagEntities() above
        $str = mb_ereg_replace('\\\\', "&#92;", $str);
        mb_regex_encoding("UTF-8");
        // Note: the pattern as posted had one closing parenthesis missing; it is assumed
        // here to close the atomic group at the end. The "e" (eval) option was removed
        // in PHP 8.0, where mb_ereg_replace_callback() has to be used instead.
        $str = mb_ereg_replace("(?>(&(?:[a-z]{0,4}\w{2,3};|#\d{2,5}))|(\S+?))",
            "'\\1'.mb_encode_numericentity('\\2',array(0x0,0x2FFFF,0,0xFFFF),'UTF-8')", $str, "ime");
        $str = correctIllegalEntities($str);
        return $str;
    }
}

// Convert common characters to named or numbered entities
function makeTagEntities($str, $useNamedEntities = 1) {
    // Note that we should use &apos; for the single quote, but IE doesn't like it
    $arrReplace = $useNamedEntities
        ? array('&#39;', '&quot;', '&lt;', '&gt;')
        : array('&#39;', '&#34;', '&#60;', '&#62;');
    return str_replace(array("'", '"', '<', '>'), $arrReplace, $str);
}

// Convert ampersands to named or numbered entities.
// Use regex to skip any that might be part of existing entities.
function makeAmpersandEntities($str, $useNamedEntities = 1) {
    return preg_replace("/&(?![A-Za-z]{0,4}\w{2,3};|#[0-9]{2,5};)/m",
        $useNamedEntities ? "&amp;" : "&#38;", $str);
}

// Convert illegal HTML numbered entities in the range 128 - 159 to legal counterparts
function correctIllegalEntities($str) {
    $chars = array(
        128 => '&#8364;', 130 => '&#8218;', 131 => '&#402;',  132 => '&#8222;',
        133 => '&#8230;', 134 => '&#8224;', 135 => '&#8225;', 136 => '&#710;',
        137 => '&#8240;', 138 => '&#352;',  139 => '&#8249;', 140 => '&#338;',
        142 => '&#381;',  145 => '&#8216;', 146 => '&#8217;', 147 => '&#8220;',
        148 => '&#8221;', 149 => '&#8226;', 150 => '&#8211;', 151 => '&#8212;',
        152 => '&#732;',  153 => '&#8482;', 154 => '&#353;',  155 => '&#8250;',
        156 => '&#339;',  158 => '&#382;',  159 => '&#376;');
    foreach (array_keys($chars) as $num)
        $str = str_replace("&#".$num.";", $chars[$num], $str);
    return $str;
}

// Compare to native utf8_encode function, which will re-encode text that is already UTF-8
function makeUTF8($str, $encoding = "") {
    if (!empty($str)) {
        if (empty($encoding) && isUTF8($str))
            $encoding = "UTF-8";
        if (empty($encoding))
            $encoding = mb_detect_encoding($str, 'UTF-8, ISO-8859-1');
        if (empty($encoding))
            $encoding = "ISO-8859-1"; // if charset can't be detected, default to ISO-8859-1
        return $encoding == "UTF-8" ? $str : @mb_convert_encoding($str, "UTF-8", $encoding);
    }
}

// Much simpler UTF-8-ness checker using a regular expression created by the W3C:
// Returns true if $string is valid UTF-8 and false otherwise.
// From http://w3.org/International/questions/qa-forms-utf-8.html
function isUTF8($str) {
    return preg_match('%^(?:
          [\x09\x0A\x0D\x20-\x7E]            # ASCII
        | [\xC2-\xDF][\x80-\xBF]             # non-overlong 2-byte
        | \xE0[\xA0-\xBF][\x80-\xBF]         # excluding overlongs
        | [\xE1-\xEC\xEE\xEF][\x80-\xBF]{2}  # straight 3-byte
        | \xED[\x80-\x9F][\x80-\xBF]         # excluding surrogates
        | \xF0[\x90-\xBF][\x80-\xBF]{2}      # planes 1-3
        | [\xF1-\xF3][\x80-\xBF]{3}          # planes 4-15
        | \xF4[\x80-\x8F][\x80-\xBF]{2}      # plane 16
    )*$%xs', $str);
}
?>

businessman332211 had the right idea by having you run the uploaded file through htmlentities. But instead, run it through the makeSafeEntities function above. It was created to catch most of the holes in the native PHP htmlentities function.
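Running a file through makeSafeEntities, as suggested above, could look roughly like the sketch below. It rests on assumptions: renderConvertedFile() is a hypothetical helper, includes/entities.php is a made-up location for the posted functions, and 'Windows-1252' is only a guess at the charset Word exports. The underlying point is that include() prints the file's bytes untouched, whereas file_get_contents() returns them as a string that can be converted before it is echoed.

```php
<?php
// Sketch only: renderConvertedFile() and the fallback comment are hypothetical,
// not part of the code posted in this thread.

/**
 * Read a file and pass its contents through a converter before output.
 */
function renderConvertedFile(string $path, callable $convert): string
{
    $raw = is_readable($path) ? file_get_contents($path) : false;
    if ($raw === false) {
        // Keep the page usable even when the menu file is missing or unreadable
        return '<!-- menu not available -->';
    }
    // Cast because the posted makeSafeEntities() returns null for empty input
    return (string) $convert($raw);
}

// In dailymenu.php, in place of include("includes/menu.html"), assuming the
// posted functions were saved in includes/entities.php (hypothetical path):
//
// require_once 'includes/entities.php';
// echo renderConvertedFile('includes/menu.html', function ($html) {
//     // 'Windows-1252' is a guess; pass "" to let makeUTF8() detect the charset
//     return makeSafeEntities($html, 0, 'Windows-1252');
// });
```

Leaving $convertTags at 0 matters here: the menu is HTML, so the tags have to survive and only the special characters should be turned into entities.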
muddy Posted February 22, 2007

Dear Businessman and DBrimlow,

Thanks for the reply!

Businessman, when I click on your link, I get sent to some article about Microsoft Activation gripes. Was there another link related to this topic, or do I need to register on that site to read more?

DBrimlow, this looks like what I'm looking for. So, how do I "run it through" the makeSafeEntities function? I'm assuming I put this function in the dailymenu.php file, but how do I tell the dailymenu.php file to execute this function on the menu.html file?

Thanks again.
muddy Posted February 27, 2007

Oh man, if I could just get this figured out, I could retire! I'll pay if someone can help me figure out how to "run my html file through the makeSafeEntities function" that dbrimlow posted above. It seems so simple, but I don't have the knowledge or the time to figure it out.

Thanks
muddy Posted March 1, 2007

So, do I need to post my question in a different location to be able to pay someone and get this simple question answered? What location?

Thanks
muddy Posted March 4, 2007

Man, free help isn't what it used to be.