mtorbin Posted March 19, 2009
I have a process that I've built which loads multiple XML documents into memory using simplexml_load_file. The problem is that some of the documents contain data that is screwy. Specifically, we have file URLs that contain +, ˙ and &, which obviously cause simplexml_load_file to fail. Any suggestions on this? Modifying the XML before simplexml_load_file is not an option because we're downloading it from a vendor site.
Thanks,
- MT
Link to comment https://forums.phpfreaks.com/topic/150184-simplexml_load_file-errors-how-do-plan-for-bad-xml/
JonnoTheDev Posted March 19, 2009
Modifying it is an option, as the XML can be read into a variable before it is used with simpleXML. For this you will not use simplexml_load_file() but simplexml_load_string().
Read the contents of the XML file into a variable.
Scan for any bad characters and remove them.
Pass the cleaned XML into simplexml_load_string().
Also, XML that contains HTML markup should have that markup wrapped in a CDATA section. You can control how CDATA is handled through the options parameter of the simpleXML function, using the following constants: http://uk3.php.net/manual/en/libxml.constants.php
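The steps above could be sketched roughly as follows. This is only an illustration under some assumptions: the helper names escapeBareAmpersands() and loadVendorXml() are made up for this example, and the cleanup escapes stray & characters rather than removing them, since a bare & is what actually breaks the parser (a + is legal in XML text and is left alone). LIBXML_NOCDATA is one of the libxml constants from the page linked above; it makes CDATA sections come back as ordinary text.

```php
<?php
// Hypothetical helper: escape every "&" that does not already start an
// entity reference (&name;, &#123; or &#x1F;). Valid references are kept.
function escapeBareAmpersands($xml)
{
    $pattern = '/&(?!(?:[A-Za-z_][\w.-]*|#[0-9]+|#x[0-9A-Fa-f]+);)/';
    return preg_replace($pattern, '&amp;', $xml);
}

// Hypothetical helper: read the document into a string, clean it, then parse.
// Returns a SimpleXMLElement on success or false on failure.
function loadVendorXml($source, $options = LIBXML_NOCDATA)
{
    // Step 1: read the contents of the XML file (or URL) into a variable.
    $raw = file_get_contents($source);
    if ($raw === false) {
        return false;
    }

    // Step 2: clean the characters that break the parser.
    $clean = escapeBareAmpersands($raw);

    // Step 3: parse the cleaned string, collecting libxml errors
    // instead of letting them surface as PHP warnings.
    $previous = libxml_use_internal_errors(true);
    $xml = simplexml_load_string($clean, 'SimpleXMLElement', $options);
    if ($xml === false) {
        foreach (libxml_get_errors() as $error) {
            error_log(trim($error->message) . ' (line ' . $error->line . ')');
        }
    }
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    return $xml;
}
```

Calling loadVendorXml() with the vendor's file URL would then return the parsed document, or false with the parser errors written to the error log. A document that uses named entities XML does not define (for example &nbsp;) would still fail and need its own handling.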
mtorbin Posted March 19, 2009 Author
Neil,
Thanks for this! I'll give it a go now.
- MT