kristen Posted May 1, 2008

I know this is not strictly a PHP question, but I hope someone can help me nonetheless. I have an RSS feed here: http://www.childcareaware.org/feeds/aya_sp.rss. The content comes from a database, so the .rss file has some PHP in it to make that happen. It sounds a little weird, but it works fine. The problem is that the incoming content has Spanish characters (e.g. é), which XML parsers cannot read and which cause the feed to fail. I think that I need to declare a DTD, but after extensive googling and experimentation, I can't find anything that works. The code is below (minus the function cca_areyouaware_db_query, for security purposes):

    <? header('Content-type: text/xml'); ?>
    <? echo "<?"; ?>xml version="1.0" encoding="ISO-8859-1"<? echo "?>"; ?>
    <rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
    <channel>
    <title>Are You Aware? En Español</title>
    <description>A Bi-weekly Feature, Presented By Child Care Aware</description>
    <link>http://www.childcareaware.org/sp/subscriptions/areyouaware/</link>
    <atom:link href="http://www.childcareaware.org/feeds/aya_sp.rss" rel="self" type="application/rss+xml" />
    <?
    $query = "SELECT * FROM sp_articles ORDER BY id DESC LIMIT 10";
    $result = cca_areyouaware_db_query($query);
    while ($row = mysql_fetch_array($result)){
        $title_output = preg_replace('/[^\x20-\x7F]+/', '', $row[title]);
        $body_output = preg_replace('/[^\x20-\x7F]+/', '', htmlentities($row[body]));
    ?>
    <item>
    <title><?= $title_output; ?></title>
    <description><?= $body_output; ?></description>
    <link>http://www.childcareaware.org/sp/subscriptions/areyouaware/article.php?id=<?= $row['id']; ?></link>
    <guid>http://www.childcareaware.org/sp/subscriptions/areyouaware/article.php?id=<?= $row['id']; ?></guid>
    </item>
    <? } ?>
    </channel>
    </rss>

Thanks for any help you can give!
Quote Link to comment https://forums.phpfreaks.com/topic/103773-solved-spanish-characters-in-rss-cause-validation-failure/ Share on other sites More sharing options...
effigy Posted May 1, 2008

Is this needed since you've specified ISO-8859-1?

    $body_output = preg_replace('/[^\x20-\x7F]+/', '', htmlentities($row[body]));
kristen Posted May 1, 2008

Probably not... I was just trying a bunch of different combinations to see if I could get anything to work. This doesn't hurt, but doesn't help either.
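To make the effect of that line concrete, here is a minimal sketch of what the two calls do to a single accented character. It assumes the stored text is ISO-8859-1 bytes (the sample string is made up), and it passes the charset to htmlentities() explicitly, because the default charset differs between PHP versions.

```php
<?php
// Made-up sample: "café" as ISO-8859-1 bytes, with é stored as the byte 0xE9.
$latin1 = "caf\xE9";

// Stripping everything outside 0x20-0x7F silently drops the accented letter.
$stripped = preg_replace('/[^\x20-\x7F]+/', '', $latin1);

// htmlentities() first rewrites the byte as a named entity, which is plain
// ASCII, so the same regex then has nothing left to remove.
$entities  = htmlentities($latin1, ENT_QUOTES, 'ISO-8859-1');
$unchanged = preg_replace('/[^\x20-\x7F]+/', '', $entities);

echo $stripped, "\n";   // caf
echo $unchanged, "\n";  // caf&eacute;
```

So on the title the regex loses the accent, and on the body it is a no-op after htmlentities(). The resulting `&eacute;` is an HTML named entity, which plain XML does not define.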
colombian Posted May 1, 2008

Maybe I am missing something - does the RSS feed need to be ISO-8859-1? You can try UTF-8, which handles Spanish accents and many other international characters.
kristen Posted May 2, 2008

If I switch to UTF-8, I just get different errors. You can see them here: http://feedvalidator.org/check.cgi?url=http%3A%2F%2Fwww.childcareaware.org%2Ffeeds%2Faya_sp.rss
effigy Posted May 2, 2008

ISO-8859-1 should work as long as you don't encode anything. UTF-8 should work as long as everything has been passed through utf8_encode.
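The point here is that the declared encoding has to match the bytes actually sent. A small sketch of the UTF-8 route follows, assuming the source text is ISO-8859-1 and the mbstring extension is available. The helper name latin1_to_utf8 is hypothetical; it uses mb_convert_encoding() rather than utf8_encode(), which is deprecated as of PHP 8.2.

```php
<?php
// Hypothetical helper: convert an ISO-8859-1 string to UTF-8.
function latin1_to_utf8(string $text): string
{
    return mb_convert_encoding($text, 'UTF-8', 'ISO-8859-1');
}

// "Español" as ISO-8859-1 bytes: ñ is the single byte 0xF1.
$latin1 = "Espa\xF1ol";
$utf8   = latin1_to_utf8($latin1);

// The raw Latin-1 bytes are not valid UTF-8, so a feed that declares
// encoding="UTF-8" but sends them unconverted will be rejected.
var_dump(mb_check_encoding($latin1, 'UTF-8')); // bool(false)
var_dump(mb_check_encoding($utf8, 'UTF-8'));   // bool(true)

// After conversion, ñ is the two-byte sequence C3 B1.
echo bin2hex($utf8), "\n"; // 45737061c3b16f6c
```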
kristen Posted May 2, 2008

Can you be a bit more specific? I changed the encoding to UTF-8 and added utf8_encode, and it just changes the set of errors. I think that is because the Spanish characters in my content are in the format "&Eacute;", not É. Thank you for the help, I really appreciate it... this has been on my list of to-dos for over a year, and I just keep getting frustrated and giving up. Hopefully this time I'll get it! New code:

    <? header('Content-type: text/xml'); ?>
    <? echo "<?"; ?>xml version="1.0" encoding="UTF-8"<? echo "?>"; ?>
    <rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
    <channel>
    <title><? echo utf8_encode($feedtitle); ?></title>
    <description>A Bi-weekly Feature, Presented By Child Care Aware</description>
    <link>http://www.childcareaware.org/sp/subscriptions/areyouaware/</link>
    <atom:link href="http://www.childcareaware.org/feeds/aya_sp.rss" rel="self" type="application/rss+xml" />
    <?
    $query = "SELECT * FROM sp_articles ORDER BY id DESC LIMIT 10";
    $result = cca_areyouaware_db_query($query);
    while ($row = mysql_fetch_array($result)){
        $title_output = utf8_encode($row[title]);
        $body_output = utf8_encode($row[body]);
    ?>
    <item>
    <title><?= $title_output; ?></title>
    <description><?= $body_output; ?></description>
    <link>http://www.childcareaware.org/sp/subscriptions/areyouaware/article.php?id=<?= $row['id']; ?></link>
    <guid>http://www.childcareaware.org/sp/subscriptions/areyouaware/article.php?id=<?= $row['id']; ?></guid>
    </item>
    <? } ?>
    </channel>
    </rss>

Again, validation is here: http://feedvalidator.org/check.cgi?url=http%3A%2F%2Fwww.childcareaware.org%2Ffeeds%2Faya_sp.rss
effigy Posted May 2, 2008

Ahh. So the entities (&Eacute;, e.g.) are in your database?
kristen Posted May 2, 2008

Yup
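Stored HTML entities would explain the validation failures: XML predefines only `&lt;`, `&gt;`, `&amp;`, `&quot;` and `&apos;`, so a named entity such as `&Eacute;` is undefined unless a DTD declares it. The sketch below checks three variants with PHP's DOM extension (assumed to be installed). The helper is_well_formed and the sample strings are illustrative only.

```php
<?php
// Collect libxml errors internally instead of emitting PHP warnings.
libxml_use_internal_errors(true);

// Hypothetical helper: true if the fragment parses as well-formed XML.
function is_well_formed(string $xml): bool
{
    libxml_clear_errors();
    $doc = new DOMDocument();
    return $doc->loadXML($xml) !== false;
}

$named   = '<title>&Eacute;xito</title>';     // HTML named entity, undefined in XML
$numeric = '<title>&#201;xito</title>';       // numeric character reference
$escaped = '<title>&amp;Eacute;xito</title>'; // ampersand escaped, entity kept as text

var_dump(is_well_formed($named));   // bool(false)
var_dump(is_well_formed($numeric)); // bool(true)
var_dump(is_well_formed($escaped)); // bool(true)
```

The third form is well-formed XML, but a reader only shows É if it treats the element content as HTML and decodes the entity itself.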
darkfreaks Posted May 2, 2008

htmlspecialchars_decode() might convert all the HTML entities back into characters.
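One caveat on this suggestion: htmlspecialchars_decode() only reverses the handful of entities that htmlspecialchars() produces (`&amp;`, `&lt;`, `&gt;` and the quote entities), so it leaves `&Eacute;` alone. html_entity_decode() is the function that handles the full set of named entities. A rough sketch follows, assuming the feed is served as UTF-8; the sample string is made up.

```php
<?php
// Made-up sample of text as it might be stored in the database.
$stored = '&Eacute;xito &amp; m&aacute;s';

// Only the "special" entities are decoded; the accented ones stay as-is.
$partial = htmlspecialchars_decode($stored);
// $partial is now '&Eacute;xito & m&aacute;s'

// All named HTML entities become real UTF-8 characters.
$decoded = html_entity_decode($stored, ENT_QUOTES | ENT_HTML401, 'UTF-8');
// $decoded is now 'Éxito & más'

// Re-escape only what XML itself requires before writing it into the feed.
$xmlSafe = htmlspecialchars($decoded, ENT_XML1 | ENT_QUOTES, 'UTF-8');
// $xmlSafe is now 'Éxito &amp; más'

echo $partial, "\n", $decoded, "\n", $xmlSafe, "\n";
```

Decoding first and then escaping for XML keeps the accented letters as real characters, so the XML declaration would need to say encoding="UTF-8" to match.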
kristen Posted May 2, 2008

For anyone interested, I did end up figuring it out (kind of). It works now; I'm just not sure it is the best way to do it. Here is my final code:

    <? header('Content-type: application/xml'); ?>
    <? echo "<?";?>xml version="1.0" encoding="iso-8859-1"<? echo "?>";?>
    <rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
    <channel>
    <title>Are You Aware En Español</title>
    <description>A Bi-weekly Feature, Presented By Child Care Aware</description>
    <link>http://www.childcareaware.org/sp/subscriptions/areyouaware/</link>
    <atom:link href="http://www.childcareaware.org/feeds/aya_sp.rss" rel="self" type="application/rss+xml" />
    <?
    $query = "SELECT * FROM sp_articles ORDER BY id DESC LIMIT 10";
    $result = cca_areyouaware_db_query($query);
    while ($row = mysql_fetch_array($result)){
        $title_output = $row[title];
        $badchars = array("<", ">", "&", "“", "’", "”", "–", "—");
        $goodchars = array("<", ">", "&", "'", "'", "'", "-", "-");
        $body_output = str_replace($badchars, $goodchars, $row[body]);
    ?>
    <item>
    <title><?= $title_output; ?></title>
    <description><?= $body_output; ?></description>
    <link>http://www.childcareaware.org/sp/subscriptions/areyouaware/article.php?id=<?= $row['id']; ?></link>
    <guid>http://www.childcareaware.org/sp/subscriptions/areyouaware/article.php?id=<?= $row['id']; ?></guid>
    </item>
    <? } ?>
    </channel>
    </rss>

Thanks for the help!
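For readers who land here with the same problem, one alternative to hand-written replacement lists is to let an XML writer do the escaping. The sketch below is not the poster's method. It assumes the stored text is ASCII with HTML named entities, uses a hypothetical $rows array in place of the sp_articles query, emits UTF-8, and omits the atom:link element for brevity. The helpers to_utf8_text and build_feed are illustrative names.

```php
<?php
// Hypothetical rows standing in for the sp_articles query result; the
// stored text is assumed to be ASCII with HTML named entities.
$rows = [
    [
        'id'    => 2,
        'title' => 'Gu&iacute;a para padres',
        'body'  => '<p>M&aacute;s informaci&oacute;n &amp; recursos</p>',
    ],
];

// Hypothetical helper: turn stored entities into real UTF-8 characters.
function to_utf8_text(string $stored): string
{
    return html_entity_decode($stored, ENT_QUOTES | ENT_HTML401, 'UTF-8');
}

// Hypothetical helper: build the RSS document as a UTF-8 string.
function build_feed(array $rows): string
{
    $base = 'http://www.childcareaware.org/sp/subscriptions/areyouaware/';

    $w = new XMLWriter();
    $w->openMemory();
    $w->startDocument('1.0', 'UTF-8');
    $w->startElement('rss');
    $w->writeAttribute('version', '2.0');
    $w->startElement('channel');
    $w->writeElement('title', "Are You Aware? En Espa\u{00F1}ol");
    $w->writeElement('description', 'A Bi-weekly Feature, Presented By Child Care Aware');
    $w->writeElement('link', $base);

    foreach ($rows as $row) {
        $link = $base . 'article.php?id=' . (int) $row['id'];
        $w->startElement('item');
        // writeElement() escapes <, > and & in the text it is given.
        $w->writeElement('title', to_utf8_text($row['title']));
        $w->writeElement('description', to_utf8_text($row['body']));
        $w->writeElement('link', $link);
        $w->writeElement('guid', $link);
        $w->endElement(); // item
    }

    $w->endElement(); // channel
    $w->endElement(); // rss
    $w->endDocument();

    return $w->outputMemory();
}

$xml = build_feed($rows);
```

To serve it, something like `header('Content-Type: application/rss+xml; charset=UTF-8'); echo $xml;` should work. The design choice is to keep accented letters as real characters and leave the escaping of markup in the body to the writer, so no entity list has to be maintained by hand.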