mtorbin Posted March 19, 2009
I have a process that I've built which loads multiple XML documents into memory using simplexml_load_file. The problem is that some of the documents contain data that is screwy. Specifically, we have file URLs that contain +, ˙ and &, which obviously causes simplexml_load_file to fail. Any suggestions on this? Modifying the XML before simplexml_load_file is not an option because we're downloading it from a vendor site. Thanks, - MT
JonnoTheDev Posted March 19, 2009
Modifying it is an option, since the XML can be read into a variable before it is used with SimpleXML. For this you will not use simplexml_load_file() but simplexml_load_string(). Read the contents of the XML file into a variable. Scan for any bad characters and remove them. Pass the cleaned XML into simplexml_load_string(). Also, XML that contains HTML markup should be contained in a CDATA section. You could specify how this is handled in the options parameter of the SimpleXML function with the following constants: http://uk3.php.net/manual/en/libxml.constants.php
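A minimal sketch of the read, clean, then parse approach described in this reply. The helper name load_vendor_xml, the ampersand regex, and the sample URL are assumptions made for illustration; they are not from the thread. The sketch only escapes bare ampersands, so other bad characters in the vendor data would need their own cleanup rule.

```php
<?php
// Hypothetical helper: the name and the cleanup rule are illustrative assumptions.
function load_vendor_xml(string $raw)
{
    // Escape any "&" that does not already start an entity such as &amp; or &#38;.
    $clean = preg_replace(
        '/&(?!(?:[A-Za-z][A-Za-z0-9]*|#[0-9]+|#x[0-9A-Fa-f]+);)/',
        '&amp;',
        $raw
    );

    // Collect parser errors instead of letting them surface as PHP warnings.
    $previous = libxml_use_internal_errors(true);

    // LIBXML_NOCDATA merges CDATA sections into ordinary text nodes.
    $xml = simplexml_load_string($clean, 'SimpleXMLElement', LIBXML_NOCDATA);

    if ($xml === false) {
        // Log what went wrong so a bad document can be skipped and inspected later.
        foreach (libxml_get_errors() as $error) {
            error_log(trim($error->message) . ' on line ' . $error->line);
        }
        libxml_clear_errors();
    }

    libxml_use_internal_errors($previous);
    return $xml; // SimpleXMLElement on success, false on failure
}

// Made-up sample input containing a raw "&" inside a URL attribute.
$raw = '<files><file url="http://example.com/a+b.php?x=1&y=2"/></files>';
$xml = load_vendor_xml($raw);
```

For a downloaded file, the raw string would come from file_get_contents() on the saved path before being handed to the helper. Checking the return value against false lets the loading process skip a document that is still malformed instead of aborting the whole run.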
mtorbin Posted March 19, 2009
Neil, thanks for this! I'll give it a go now. - MT