kennumen Posted May 30, 2009

I did a search first, but it seems either nobody has had this problem, or they had a similar problem and fixed it by doing something I'm already doing. Of course, with my luck, there's a good chance I simply overlooked the solution :-\

This is how I store the titles used in the RSS (I also store other text this way, but none of it is used in the RSS):

    function htmlprocess($s){
        // Undo magic-quotes escaping if the server applies it
        if(get_magic_quotes_gpc()) $s = stripslashes($s);
        // Trim, then convert special characters to HTML entities
        return (htmlentities(trim($s),ENT_QUOTES,'UTF-8'));
    }

Here's the RSS code:

    <?xml version="1.0" encoding="ISO-8859-1" ?>
    <rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel>
    <title>GASP Comic.com</title>
    <link>http://www.gaspcomic.com/</link>
    <atom:link href="http://www.gaspcomic.com/rss.php" rel="self" type="application/rss+xml" />
    <category>Webcomic</category>
    <copyright>Copyright 2009 GASPcomic.com, Gamesphere.com, Kirsten Vandaele.</copyright>
    <language>en</language>
    <generator>GSC CMS</generator>
    <description>RSS page of GASPcomic.com (listing the last 5 comics published).</description>
    <item>
    <title>The first grind</title>
    <link>http://www.gaspcomic.com/comic.php?id=60</link>
    <guid>http://www.gaspcomic.com/comic.php?id=60</guid>
    <description>GASPcomic.com 60, published 19 hours 50 minutes ago.</description>
    </item>
    <item>
    <title>Everyone loves a good grindfest</title>
    <link>http://www.gaspcomic.com/comic.php?id=59</link>
    <guid>http://www.gaspcomic.com/comic.php?id=59</guid>
    <description>GASPcomic.com 59, published 2 days 20 hours 38 minutes ago.</description>
    </item>
    <item>
    <title>I&#039;m listening</title>
    <link>http://www.gaspcomic.com/comic.php?id=58</link>
    <guid>http://www.gaspcomic.com/comic.php?id=58</guid>
    <description>GASPcomic.com 58, published 7 days 23 hours 48 minutes ago.</description>
    </item>
    </channel></rss>

That part actually works just fine. A while back, though, I had a comic titled "Ninja pinata", with a squiggly Spanish N.
htmlentities turned this into "pi&ntilde;ata". Apparently, &ntilde; is an undefined entity. Go figure. As you can see above, (some) numeric entities like &#039; are no problem whatsoever. I've looked at the PHP manual, though, and found no function that works like htmlentities but converts to purely numeric entities.

A different topic here prompted me to change the first line's encoding from ISO-8859-1 to UTF-8, with no effect.

The MySQL code is straightforward, but it might (I don't know) interest you that I use the utf8_unicode_ci collation to store the data. Do note that I run htmlentities on my text before putting it into the database, so I'm storing the 8 characters "&ntilde;", not one squiggly "n" character.

I have no way to predict what I (or others, as I might end up opening up this CMS) might input in the future, so I'd like a "catchall" solution.

Thanks,
Kirsten
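For what it's worth, here is a minimal sketch of one possible catchall. It is not taken from this thread, and it assumes the feed's first line declares UTF-8 and that the stored text only ever went through htmlentities as described above. The idea is to decode the stored entities back to real characters with html_entity_decode, then escape only the characters XML itself reserves with htmlspecialchars. The helper name rss_text is hypothetical and not part of the CMS.

```php
<?php
// Hypothetical helper (an assumption, not from the original post): turns
// stored HTML-entity text such as "pi&ntilde;ata" into text that is safe
// to place inside an XML element of a UTF-8 feed.
function rss_text($stored)
{
    // Step 1: decode named and numeric HTML entities back to UTF-8 characters.
    $raw = html_entity_decode($stored, ENT_QUOTES, 'UTF-8');

    // Step 2: escape only the characters XML reserves (&, <, >, " and ').
    return htmlspecialchars($raw, ENT_QUOTES, 'UTF-8');
}

echo rss_text('Ninja pi&ntilde;ata'), "\n"; // named entity becomes a literal character
echo rss_text('I&#039;m listening'), "\n";  // the quote is re-escaped numerically
```

This should work because XML predefines only five named entities (&amp;, &lt;, &gt;, &quot; and &apos;), which is why &ntilde; is rejected while a numeric reference such as &#039; passes. The literal character only validates if the declared encoding matches the bytes actually sent.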