arandam Posted November 24, 2010
What is wrong with this code? I want to parse RSS content, and it gives an error saying that the input is not correct.
$xmlDoc = new DOMDocument();
$xml = "http://www.pharmamanufacturing.com/index.html?mode=rss";
$xmlDoc->loadxml($xml);
//$xmlDoc->load($xml);
Link to comment: https://forums.phpfreaks.com/topic/219690-rss-parsing/
ManiacDan Posted November 24, 2010
Per the PHP manual, loadXML()'s first argument is the XML source itself, not a URL. You want to eliminate that line and just use the load() function, which accepts a file path or URL. -Dan
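To illustrate the difference between the two methods, here is a minimal sketch. The sample feed string and the temporary file are assumptions made for the example so it runs without network access; with a real feed you would pass the URL to load() instead of a local path.

```php
<?php
// Sample feed used only for illustration (not the feed from this thread).
$rss = '<?xml version="1.0"?><rss version="2.0"><channel><title>Example feed</title></channel></rss>';

// loadXML(): the argument is the XML markup itself.
$fromString = new DOMDocument();
$okString = $fromString->loadXML($rss);

// load(): the argument is a file path or URL that points at the XML.
$path = tempnam(sys_get_temp_dir(), 'rss');
file_put_contents($path, $rss);
$fromFile = new DOMDocument();
$okFile = $fromFile->load($path);
unlink($path);

// Passing a URL to loadXML() fails, because the URL text is not XML.
libxml_use_internal_errors(true);   // collect parse errors instead of emitting warnings
$wrong = new DOMDocument();
$okWrong = $wrong->loadXML('http://www.pharmamanufacturing.com/index.html?mode=rss');
libxml_clear_errors();
libxml_use_internal_errors(false);

$title = $fromFile->getElementsByTagName('title')->item(0)->textContent;
echo $title, "\n";
```

Both methods return false on failure, so checking the return value is a quick way to see which call is rejecting the input.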
arandam (Author) Posted November 24, 2010
I also tried using load(), but that gives an error too.
ManiacDan Posted November 24, 2010
Use SimpleXML instead of the DOM. -Dan
arandam (Author) Posted November 24, 2010
I also tried that, but it gives an error as well. However, if I use the link http://news.google.com/news?ned=us&topic=h&output=rss, all methods work. Only the link http://www.pharmamanufacturing.com/index.html?mode=rss fails. Can you have a look? It takes just 5 lines of code. Maybe I am missing something.
ManiacDan Posted November 24, 2010
That RSS feed isn't properly encoded, so simplexml_load_file is throwing a UTF-8 encoding error. The errors I'm getting on that feed are either encoding related or CDATA related. It seems your target feed is poorly formed. Parse it by hand, or contact the publisher. The apostrophe in "Genentech's" is not actually an ASCII apostrophe; that seems to be the issue. I don't know why SimpleXML is failing on a CDATA field, but it is. -Dan
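One possible workaround, sketched below, is to normalize the raw bytes to UTF-8 before handing them to SimpleXML. This assumes the stray apostrophe is a Windows-1252 "smart quote" (byte 0x92), which the thread does not confirm. The helper name toUtf8() and the sample headline are hypothetical, and the iconv extension is assumed to be available.

```php
<?php
// Hypothetical helper: returns $raw unchanged if it is already valid UTF-8,
// otherwise converts it on the ASSUMPTION that the bytes are Windows-1252.
function toUtf8(string $raw): string
{
    // preg_match() with the /u modifier returns false for invalid UTF-8 subjects.
    if (preg_match('//u', $raw) === 1) {
        return $raw;
    }
    $converted = iconv('Windows-1252', 'UTF-8', $raw);
    // iconv() returns false if a byte has no Windows-1252 mapping; keep the input then.
    return $converted === false ? $raw : $converted;
}

// Sample data for illustration: 0x92 is the Windows-1252 right single quote.
$raw = "<rss version=\"2.0\"><channel><item>"
     . "<title><![CDATA[Genentech\x92s news]]></title>"
     . "</item></channel></rss>";

libxml_use_internal_errors(true);   // collect parse errors instead of emitting warnings
$xml = simplexml_load_string(toUtf8($raw));

if ($xml === false) {
    // Print what the parser objected to.
    foreach (libxml_get_errors() as $error) {
        echo trim($error->message), "\n";
    }
    libxml_clear_errors();
} else {
    echo (string) $xml->channel->item->title, "\n";
}
```

For a live feed, the raw bytes could be fetched with file_get_contents() and passed through the same helper. If the feed turns out to use a different encoding, the source encoding passed to iconv() would need to change accordingly.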
arandam (Author) Posted November 24, 2010
Thanks for your investigation. Now I have a bit of peace, because I was thinking something was wrong in my code. What do you mean by 'parse by hand'?
ManiacDan Posted November 24, 2010
Use the string parsing functions to break it up by hand rather than relying on a pre-built solution. preg_match_all can handle a feed with a regular structure like this one, even when the XML parsers reject it. Or you could use strpos and substr to manually pull each tag into a variable. It will be tedious, but if the XML is so weird that both XML parsing functions choke on it, it might be your only way. -Dan
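As a rough sketch of what parsing by hand with preg_match_all could look like, the code below pulls the title and link out of each item. The function names extractItems() and extractTag() and the sample feed are hypothetical, and the patterns assume simple, non-nested tags. The /u modifier is deliberately left out so the patterns still match when the input contains invalid UTF-8 bytes.

```php
<?php
// Hypothetical helper: returns the text of the first <$tag> in $block, or null.
function extractTag(string $block, string $tag): ?string
{
    $name = preg_quote($tag, '~');
    if (!preg_match('~<' . $name . '\b[^>]*>(.*?)</' . $name . '>~is', $block, $m)) {
        return null;
    }
    $value = trim($m[1]);
    // Unwrap a CDATA section and return its content as-is.
    if (preg_match('~^<!\[CDATA\[(.*)\]\]>$~s', $value, $cdata)) {
        return trim($cdata[1]);
    }
    // Otherwise decode XML entities such as &amp;.
    return html_entity_decode($value, ENT_QUOTES | ENT_XML1, 'UTF-8');
}

// Hypothetical helper: returns one ['title' => ..., 'link' => ...] array per <item>.
function extractItems(string $rss): array
{
    $items = [];
    preg_match_all('~<item\b[^>]*>(.*?)</item>~is', $rss, $blocks);
    foreach ($blocks[1] as $block) {
        $items[] = [
            'title' => extractTag($block, 'title'),
            'link'  => extractTag($block, 'link'),
        ];
    }
    return $items;
}

// Sample data for illustration; \x92 stands in for the badly encoded apostrophe.
$rss = "<rss><channel><title>Feed</title>"
     . "<item><title><![CDATA[Genentech\x92s news]]></title>"
     . "<link>http://example.com/a?x=1&amp;y=2</link></item>"
     . "<item><title>Second</title><link>http://example.com/b</link></item>"
     . "</channel></rss>";

$items = extractItems($rss);
echo count($items), " items found\n";
```

This approach returns the bytes exactly as they appear in the feed, so any encoding cleanup still has to happen separately before the text is displayed.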