kristen Posted May 1, 2008

I know this is not strictly a PHP question, but I hope someone can help me nonetheless. I have an RSS feed here: http://www.childcareaware.org/feeds/aya_sp.rss. The content comes from a database, so the .rss file has some PHP in it to make that happen. It sounds a little weird, but it works fine.

The problem is that the content coming in has Spanish characters (e.g. é), which XML parsers cannot read, so the feed fails. I think that I need to declare a DTD, but after extensive googling and experimentation, I can't find anything that works. Code is below (minus the function cca_areyouaware_db_query, for security purposes):

<? header('Content-type: text/xml'); ?>
<? echo "<?"; ?>xml version="1.0" encoding="ISO-8859-1"<? echo "?>"; ?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
<channel>
<title>Are You Aware? En Español</title>
<description>A Bi-weekly Feature, Presented By Child Care Aware</description>
<link>http://www.childcareaware.org/sp/subscriptions/areyouaware/</link>
<atom:link href="http://www.childcareaware.org/feeds/aya_sp.rss" rel="self" type="application/rss+xml" />
<?
$query = "SELECT * FROM sp_articles ORDER BY id DESC LIMIT 10";
$result = cca_areyouaware_db_query($query);
while ($row = mysql_fetch_array($result)){
$title_output = preg_replace('/[^\x20-\x7F]+/', '', $row[title]);
$body_output = preg_replace('/[^\x20-\x7F]+/', '', htmlentities($row[body]));
?>
<item>
<title><?= $title_output; ?></title>
<description><?= $body_output; ?></description>
<link>http://www.childcareaware.org/sp/subscriptions/areyouaware/article.php?id=<?= $row['id']; ?></link>
<guid>http://www.childcareaware.org/sp/subscriptions/areyouaware/article.php?id=<?= $row['id']; ?></guid>
</item>
<? } ?>
</channel>
</rss>

Thanks for any help you can give!
Link to comment: https://forums.phpfreaks.com/topic/103773-solved-spanish-characters-in-rss-cause-validation-failure/
effigy Posted May 1, 2008

Is this needed since you've specified ISO-8859-1?

$body_output = preg_replace('/[^\x20-\x7F]+/', '', htmlentities($row[body]));
kristen Posted May 1, 2008

Probably not... I was just trying a bunch of different combinations to see if I could get anything to work. This doesn't hurt, but doesn't help either.
colombian Posted May 1, 2008

Maybe I am missing something: does the RSS feed need to be ISO-8859-1? You can try UTF-8, which handles Spanish accents and many other characters used worldwide.
kristen Posted May 2, 2008

If I switch to UTF-8, I just get different errors. You can see them here: http://feedvalidator.org/check.cgi?url=http%3A%2F%2Fwww.childcareaware.org%2Ffeeds%2Faya_sp.rss
effigy Posted May 2, 2008

ISO-8859-1 should work as long as you don't encode anything. UTF-8 should work as long as everything has been passed through utf8_encode.
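To make the UTF-8 option concrete, here is a minimal sketch of converting ISO-8859-1 text to UTF-8 before output. It assumes the database really returns ISO-8859-1 bytes, which the thread does not confirm, and it uses iconv() because utf8_encode() is deprecated as of PHP 8.2. The helper name latin1_to_utf8 is hypothetical.

```php
<?php
// Hypothetical helper (not from the thread): convert ISO-8859-1 bytes to UTF-8.
function latin1_to_utf8(string $text): string
{
    $converted = iconv('ISO-8859-1', 'UTF-8', $text);
    if ($converted === false) {
        throw new RuntimeException('Conversion to UTF-8 failed');
    }
    return $converted;
}

$latin1 = "Espa\xF1ol";            // "Español" as ISO-8859-1 bytes (ñ is 0xF1)
$utf8   = latin1_to_utf8($latin1); // ñ becomes the two bytes 0xC3 0xB1

echo bin2hex($utf8), "\n";         // prints 45737061c3b16f6c
```

Running text that is already UTF-8 through a conversion like this double-encodes it, so it only applies if the source really is ISO-8859-1.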
kristen Posted May 2, 2008

Can you be a bit more specific? I changed the encoding to UTF-8 and added utf8_encode, and it just changes the set of errors. I think that is because the Spanish characters in my content are stored in the format "&Eacute;", not É.

Thank you for the help, I really appreciate it... this has been on my to-do list for over a year, and I just keep getting frustrated and giving up. Hopefully this time I'll get it! New code:

<? header('Content-type: text/xml'); ?>
<? echo "<?"; ?>xml version="1.0" encoding="UTF-8"<? echo "?>"; ?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
<channel>
<title><? echo utf8_encode($feedtitle); ?></title>
<description>A Bi-weekly Feature, Presented By Child Care Aware</description>
<link>http://www.childcareaware.org/sp/subscriptions/areyouaware/</link>
<atom:link href="http://www.childcareaware.org/feeds/aya_sp.rss" rel="self" type="application/rss+xml" />
<?
$query = "SELECT * FROM sp_articles ORDER BY id DESC LIMIT 10";
$result = cca_areyouaware_db_query($query);
while ($row = mysql_fetch_array($result)){
$title_output = utf8_encode($row[title]);
$body_output = utf8_encode($row[body]);
?>
<item>
<title><?= $title_output; ?></title>
<description><?= $body_output; ?></description>
<link>http://www.childcareaware.org/sp/subscriptions/areyouaware/article.php?id=<?= $row['id']; ?></link>
<guid>http://www.childcareaware.org/sp/subscriptions/areyouaware/article.php?id=<?= $row['id']; ?></guid>
</item>
<? } ?>
</channel>
</rss>

Again, validation is here: http://feedvalidator.org/check.cgi?url=http%3A%2F%2Fwww.childcareaware.org%2Ffeeds%2Faya_sp.rss
effigy Posted May 2, 2008

Ahh. So the entities (&Eacute;, e.g.) are in your database?
kristen Posted May 2, 2008

Yup
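Some background on why stored HTML entities break the feed: XML predefines only five named entities (&lt;, &gt;, &amp;, &quot; and &apos;), so a named HTML entity such as &eacute; is undefined unless a DTD declares it. The sketch below illustrates this with PHP's SimpleXML extension, assuming that extension is installed. The helper is_well_formed is hypothetical and not part of the thread's code.

```php
<?php
// Hypothetical helper: returns true when $xml parses as well-formed XML.
function is_well_formed(string $xml): bool
{
    // Collect libxml errors internally instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);
    $doc = simplexml_load_string($xml);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);
    return $doc !== false;
}

// Named HTML entity: not defined in XML, so the parser rejects it.
var_dump(is_well_formed('<title>Caf&eacute;</title>'));
// Numeric character reference: always valid in XML.
var_dump(is_well_formed('<title>Caf&#233;</title>'));
// Escaped ampersand: the entity travels as text and is left for the feed reader.
var_dump(is_well_formed('<title>Caf&amp;eacute;</title>'));
```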
darkfreaks Posted May 2, 2008

htmlspecialchars_decode() might turn all the HTML entities back into characters.
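A note on that suggestion: htmlspecialchars_decode() only converts &amp;, &quot;, &#039;, &lt; and &gt;, so named entities such as &Eacute; would pass through unchanged. html_entity_decode() covers the full HTML entity table. Below is a minimal sketch of decoding entities and then re-escaping for XML. It assumes the feed is declared as UTF-8 and that the field is plain text (a title, for example) rather than HTML markup. The helper name xml_text is hypothetical.

```php
<?php
// Hypothetical helper: turn entity-encoded text from the database into XML-safe UTF-8.
function xml_text(string $html): string
{
    // Decode named and numeric HTML entities (&Eacute;, &ldquo;, ...) into UTF-8 characters.
    $decoded = html_entity_decode($html, ENT_QUOTES | ENT_HTML401, 'UTF-8');
    // Re-escape only the characters that XML treats as markup.
    return htmlspecialchars($decoded, ENT_QUOTES | ENT_XML1, 'UTF-8');
}

echo xml_text('&Eacute;xito &amp; &ldquo;calidad&rdquo;'), "\n";
// prints: Éxito &amp; “calidad”
```

With this approach the accented characters are sent as real UTF-8 characters, so no DTD is needed.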
kristen Posted May 2, 2008

For anyone interested, I did end up figuring it out (kind of). It works now, I'm just not sure it is the best way to do it. Here is my final code:

<? header('Content-type: application/xml'); ?>
<? echo "<?";?>xml version="1.0" encoding="iso-8859-1"<? echo "?>";?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
<channel>
<title>Are You Aware En Español</title>
<description>A Bi-weekly Feature, Presented By Child Care Aware</description>
<link>http://www.childcareaware.org/sp/subscriptions/areyouaware/</link>
<atom:link href="http://www.childcareaware.org/feeds/aya_sp.rss" rel="self" type="application/rss+xml" />
<?
$query = "SELECT * FROM sp_articles ORDER BY id DESC LIMIT 10";
$result = cca_areyouaware_db_query($query);
while ($row = mysql_fetch_array($result)){
$title_output = $row[title];
$badchars = array("<", ">", "&", "“", "’", "”", "–", "—");
$goodchars = array("<", ">", "&", "'", "'", "'", "-", "-");
$body_output = str_replace($badchars, $goodchars, $row[body]);
?>
<item>
<title><?= $title_output; ?></title>
<description><?= $body_output; ?></description>
<link>http://www.childcareaware.org/sp/subscriptions/areyouaware/article.php?id=<?= $row['id']; ?></link>
<guid>http://www.childcareaware.org/sp/subscriptions/areyouaware/article.php?id=<?= $row['id']; ?></guid>
</item>
<? } ?>
</channel>
</rss>

Thanks for the help!
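Since the post leaves open whether this is the best approach, one commonly used alternative is to wrap the HTML body in a CDATA section, so the XML parser does not interpret & or < at all and feed readers that treat the description as HTML can resolve the entities themselves. This is a sketch under that assumption, not what the thread settled on. The helper name cdata is hypothetical.

```php
<?php
// Hypothetical helper: wrap HTML in a CDATA section for use inside <description>.
function cdata(string $html): string
{
    // A literal "]]>" would end the section early, so split it across two sections.
    return '<![CDATA[' . str_replace(']]>', ']]]]><![CDATA[>', $html) . ']]>';
}

echo '<description>', cdata('<p>&iquest;Qu&eacute; es calidad?</p>'), '</description>', "\n";
// prints: <description><![CDATA[<p>&iquest;Qu&eacute; es calidad?</p>]]></description>
```

This keeps the stored markup untouched, which avoids maintaining a hand-written list of characters to replace.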
Archived
This topic is now archived and is closed to further replies.