XML Problem - loading in XML


ttam

I'm pulling my hair out with this one!

 

We've got a script that pulls in a few XML feeds from various places. It has always worked fine: it uses a DOMDocument with the standard $xml->load('http://www.feedhere.com') syntax.

 

We've just had a new feed which we need to add, it fails straight away at this point.

 

Looking into it, the failure appears to be caused by invalid characters in the XML. I've tried to convince the supplier to fix the data; they are aware of the problem but say tough!

 

To get round this I used file_get_contents() to fetch the XML into a string, then tried a variety of things to clean it up, with limited or no success. I then call $xml->loadXML($xmlstring), which quite often still fails because the XML is invalid.
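Before cleaning anything, it can help to see exactly what the parser objects to. The following is a minimal sketch, assuming the feed is already in a string; the function name loadXmlOrErrors is made up for illustration. It uses libxml_use_internal_errors() to collect parse errors with line and column numbers instead of emitting PHP warnings.

```php
<?php
// Hypothetical helper: returns a DOMDocument on success, or an array of
// readable parse errors when the string is not well-formed XML.
function loadXmlOrErrors(string $xmlString)
{
    // Collect libxml errors internally instead of raising PHP warnings.
    $previous = libxml_use_internal_errors(true);
    libxml_clear_errors();

    $doc = new DOMDocument();
    $ok = $doc->loadXML($xmlString);

    $errors = [];
    foreach (libxml_get_errors() as $error) {
        $errors[] = sprintf(
            'line %d, column %d: %s',
            $error->line,
            $error->column,
            trim($error->message)
        );
    }

    // Restore the previous error-handling mode.
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    return $ok ? $doc : $errors;
}
```

Printing the returned array for a failing feed shows which line and character each clean-up step needs to target.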

 

Has anyone else had this same problem?

 

I've searched the web and found numerous solutions. Some partially worked, so some feeds would load but others still wouldn't. Here is some of what I've tried:

 

$strxml = htmlentities($strxml, ENT_QUOTES);

$strxml = iconv("ISO-8859-1", "ISO-8859-1//IGNORE", $strxml);

$strxml = iconv("UTF-8","UTF-8//IGNORE",$strxml);

 

$strxml = str_replace("&", "&amp;", $strxml);

$strxml = str_replace("<", "&lt;", $strxml);

$strxml = str_replace(">", "&gt;", $strxml);

$strxml = str_replace('"', "&quot;", $strxml);
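One likely reason these attempts only partly work is that htmlentities() and the blanket str_replace() calls escape the markup itself (the tag brackets and attribute quotes) along with the bad characters. A narrower approach is to escape only the ampersands that do not already start an entity or character reference. This is a rough sketch under that assumption; escapeBareAmpersands is a hypothetical name, and the pattern will also rewrite ampersands inside CDATA sections, where they are actually legal.

```php
<?php
// Hypothetical helper: escape "&" only where it is not already the start of
// a named (&amp;), decimal (&#163;) or hexadecimal (&#xA3;) reference.
function escapeBareAmpersands(string $xml): string
{
    return preg_replace(
        '/&(?![A-Za-z][A-Za-z0-9]*;|#[0-9]+;|#x[0-9A-Fa-f]+;)/',
        '&amp;',
        $xml
    );
}
```

Note that named entities other than the five XML built-ins (for example &pound;) are left alone by this sketch and would still be rejected by the parser unless the feed declares them.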

 

 

Any further ideas? Maybe a combination?

 

The first thing I spotted that is throwing it out: some elements have a description attribute which sometimes contains quotes, so I get:

 

<Result title="This is a Title" description="This is a quote " inside a description">

 

Which you can see obviously throws it out.
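A stray quote like that cannot be repaired reliably, but a heuristic can cover the common case. The sketch below (escapeStrayAttributeQuotes is a hypothetical name) treats a double quote as genuine only if it directly follows "=" or is followed by the next attribute name, ">", "/>" or "?>". Every other double quote becomes &quot;. It assumes double-quoted attributes with no space between "=" and the opening quote, and it will guess wrong if a value itself contains text that looks like name=.

```php
<?php
// Hypothetical heuristic: escape double quotes that neither open an attribute
// value (preceded by "=") nor plausibly close one (followed by another
// attribute, ">", "/>" or "?>").
function escapeStrayAttributeQuotes(string $xml): string
{
    return preg_replace(
        '~(?<!=)"(?!\s*(?:[\w:.-]+\s*=|/?>|\?>))~',
        '&quot;',
        $xml
    );
}
```

Applied to the element above, only the quote after "This is a quote" is rewritten; the four quotes that delimit the two attribute values are left as they are. Quotes in element text are also escaped, which is harmless there.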

 

Also & is throwing it out, as is £.
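The £ failure may be an encoding problem rather than an escaping one: if the feed is really ISO-8859-1 but does not declare it, the parser assumes UTF-8 and rejects the lone 0xA3 byte. A minimal sketch, assuming exactly that situation (the name toUtf8 and the ISO-8859-1 default are guesses, not something the feed confirms):

```php
<?php
// Hypothetical helper: leave valid UTF-8 alone, otherwise convert from an
// assumed single-byte source encoding (ISO-8859-1 here is a guess).
function toUtf8(string $xml, string $assumedSource = 'ISO-8859-1'): string
{
    // With the /u modifier, preg_match() fails on input that is not valid UTF-8.
    if (preg_match('//u', $xml) === 1) {
        return $xml;
    }
    return iconv($assumedSource, 'UTF-8', $xml);
}
```

Unlike iconv() with //IGNORE, which silently drops the offending bytes, this keeps the £. If the feed's XML declaration names a non-UTF-8 encoding, the declaration would need to be updated to match after conversion.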

 

 

HELP!

 

Many thanks

 

 

 

 


If you can provide a shortish example feed that contains as many of the problems as possible, I'll see what I can do. I don't know what data the feeds contain, so if it's private just swap it out; the actual content isn't important to me, just the layout.
