arandam Posted November 24, 2010
What is wrong with this code? I want to parse RSS content, and it gives an error saying that the input is not correct.
    $xmlDoc = new DOMDocument();
    $xml = "http://www.pharmamanufacturing.com/index.html?mode=rss";
    $xmlDoc->loadxml($xml);
    //$xmlDoc->load($xml);
Link to comment: https://forums.phpfreaks.com/topic/219690-rss-parsing/
ManiacDan Posted November 24, 2010
Per the PHP manual, loadXML()'s first argument is the XML itself, not a URL. You want to eliminate that line and just use the load() function. -Dan
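To illustrate the distinction Dan describes, here is a minimal sketch. The inline feed text and the temporary file are assumptions made so the example runs offline; with a real feed, the URL would be passed to load() instead of the file path.

```php
<?php
// loadXML() parses a string that already contains XML markup.
// This tiny feed is made up for the example.
$xml = '<?xml version="1.0" encoding="UTF-8"?>'
     . '<rss version="2.0"><channel><title>Example feed</title></channel></rss>';

$fromString = new DOMDocument();
$okString = $fromString->loadXML($xml);

// load() takes a filename or URL and reads the XML from that location.
// A temporary file stands in for the remote feed here.
$path = tempnam(sys_get_temp_dir(), 'rss');
file_put_contents($path, $xml);

$fromFile = new DOMDocument();
$okFile = $fromFile->load($path);
unlink($path);

$titleFromString = $fromString->getElementsByTagName('title')->item(0)->textContent;
$titleFromFile   = $fromFile->getElementsByTagName('title')->item(0)->textContent;

// Handing a URL to loadXML() fails, because the URL text itself is not XML.
libxml_use_internal_errors(true); // collect parse errors instead of printing warnings
$wrong = new DOMDocument();
$okWrong = $wrong->loadXML('http://www.pharmamanufacturing.com/index.html?mode=rss');
libxml_clear_errors();
```

Both successful calls end up with the same document; only the kind of argument differs.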
arandam (Author) Posted November 24, 2010
I also tried using load(), but that gives an error as well.
ManiacDan Posted November 24, 2010
Use SimpleXML instead of the DOM. -Dan
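For reference, a minimal SimpleXML sketch might look like the following. The feed text is made up so the example runs without network access; for a live feed, simplexml_load_file() accepts the URL in place of simplexml_load_string().

```php
<?php
// Made-up feed used only for this sketch.
$feed = <<<'XML'
<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0">
  <channel>
    <title>Example feed</title>
    <item><title>First story</title><link>http://example.com/1</link></item>
    <item><title>Second story</title><link>http://example.com/2</link></item>
  </channel>
</rss>
XML;

// Returns a SimpleXMLElement on success, or false if the XML cannot be parsed.
$rss = simplexml_load_string($feed);

$titles = [];
if ($rss !== false) {
    // Child elements are reachable as properties; repeated ones can be iterated.
    foreach ($rss->channel->item as $item) {
        $titles[] = (string) $item->title;
    }
}
```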
arandam (Author) Posted November 24, 2010
I also tried that, but it gives an error too. If I use the link http://news.google.com/news?ned=us&topic=h&output=rss, all methods work. Only the link http://www.pharmamanufacturing.com/index.html?mode=rss does not work. Can you have a look? It only takes 5 lines of code. Maybe I am missing something.
ManiacDan Posted November 24, 2010
That RSS feed isn't properly encoded, so simplexml_load_file is throwing a UTF-8 encoding error. The errors I'm getting on that feed are either encoding related or CDATA related. It seems your target feed is poorly formed. Parse it by hand, or contact the publisher. The apostrophe in "Genentech's" is not actually an apostrophe; that seems to be the issue. I don't know why SimpleXML is failing on a CDATA field, but it is. -Dan
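One way to see such errors, and possibly work around them, is sketched below. It rests on an assumption the thread does not confirm: that the odd apostrophe is a Windows-1252 curly quote (byte 0x92) in a feed that is otherwise treated as UTF-8. The helper name parseFeedLeniently and the sample text are hypothetical.

```php
<?php
// Hypothetical helper: parse feed text and, if it is not valid UTF-8,
// retry once after converting it from Windows-1252 (an assumed source encoding).
function parseFeedLeniently(string $raw): array
{
    // Collect libxml errors instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);

    $rss = simplexml_load_string($raw);
    $errors = array_map(
        function ($e) { return trim($e->message); },
        libxml_get_errors()
    );
    libxml_clear_errors();

    // preg_match with the /u flag returns 1 only when $raw is valid UTF-8.
    if ($rss === false && preg_match('//u', $raw) !== 1) {
        $converted = iconv('CP1252', 'UTF-8', $raw);
        if ($converted !== false) {
            $rss = simplexml_load_string($converted);
            libxml_clear_errors();
        }
    }

    libxml_use_internal_errors($previous);
    return [$rss, $errors];
}

// Sample text with a raw 0x92 byte where the apostrophe should be.
$raw = "<rss version=\"2.0\"><channel><item>"
     . "<title>Genentech\x92s pipeline</title>"
     . "</item></channel></rss>";

[$rss, $errors] = parseFeedLeniently($raw);
$title = ($rss !== false) ? (string) $rss->channel->item->title : null;
```

The $errors array holds the messages from the first parse attempt, which helps confirm whether encoding is really the cause. If the feed carries an XML declaration naming an encoding, the declaration would need to agree with the converted text.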
arandam (Author) Posted November 24, 2010
Thanks for your investigation. Now I have a bit of peace, because I was thinking something was wrong in my code. What do you mean by 'parse by hand'?
ManiacDan Posted November 24, 2010
Use the string parsing functions to actually break it up by hand rather than trying a pre-built solution. preg_match_all can handle an RSS feed with a regular tag structure like this one. Or you could use strpos and substr to manually pull each tag into a variable. It will be tedious, but if the XML is so weird that both XML parsing functions choke on it, it might be your only way. -Dan
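Both approaches Dan mentions can be sketched roughly as follows. The sample feed and the helper names tagText and between are hypothetical, and the patterns assume simple, non-nested tags; this is not a general XML parser.

```php
<?php
// Made-up feed text; one title is wrapped in CDATA.
$raw = '<rss><channel>'
     . '<item><title>First story</title><link>http://example.com/1</link></item>'
     . '<item><title><![CDATA[Second story]]></title><link>http://example.com/2</link></item>'
     . '</channel></rss>';

// Option 1: regular expressions.
// Hypothetical helper returning the text of the first <$tag> inside $block.
function tagText(string $block, string $tag): ?string
{
    $name = preg_quote($tag, '#');
    $pattern = '#<' . $name . '\b[^>]*>(.*?)</' . $name . '>#s';
    if (preg_match($pattern, $block, $m) !== 1) {
        return null;
    }
    // Drop an optional CDATA wrapper around the text.
    $text = preg_replace('#^\s*<!\[CDATA\[(.*?)\]\]>\s*$#s', '$1', $m[1]);
    return trim($text);
}

// Grab every <item>...</item> block, then pull the tags out of each one.
preg_match_all('#<item\b[^>]*>(.*?)</item>#s', $raw, $matches);
$items = [];
foreach ($matches[1] as $block) {
    $items[] = [
        'title' => tagText($block, 'title'),
        'link'  => tagText($block, 'link'),
    ];
}

// Option 2: strpos and substr.
// Hypothetical helper returning the text between two markers, or null.
function between(string $haystack, string $open, string $close, int $offset = 0): ?string
{
    $start = strpos($haystack, $open, $offset);
    if ($start === false) {
        return null;
    }
    $start += strlen($open);
    $end = strpos($haystack, $close, $start);
    if ($end === false) {
        return null;
    }
    return substr($haystack, $start, $end - $start);
}

$firstLink = between($raw, '<link>', '</link>');
```

Because the patterns do not use the /u modifier, they work on raw bytes and are not stopped by the invalid UTF-8 that trips up the XML parsers.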
This topic is now archived and is closed to further replies.