Snowmiser Posted January 23, 2009

Hi, first post. I don't belong to any PHP communities; this would be my first. Normally I just hang out in the PHP channel on freenode, but there doesn't seem to be a lot of knowledge about XML going around there, and Google searches have been unsuccessful. My question is: why would this consume 93 MB of memory parsing just a 4 MB XML file, and is there any way to optimize it?

    function xml_into_assoc($xml) {
        $result = array();
        $i = 0;
        while ($xml->read()) {
            switch ($xml->nodeType) {
                case XMLReader::ELEMENT: {
                    $result[$i]['name'] = $xml->name;
                    $result[$i]['value'] = $xml->isEmptyElement ? '' : xml_into_assoc($xml);
                    if ($xml->hasAttributes) {
                        while ($xml->moveToNextAttribute()) {
                            $result[$i]['attributes'][$xml->name] = $xml->value;
                        }
                    }
                    $i++;
                    break;
                }
                case XMLReader::END_ELEMENT: {
                    return $result;
                }
                case XMLReader::TEXT: {
                    // no break: falls through to the CDATA case
                }
                case XMLReader::CDATA: {
                    $result = $xml->value;
                    break;
                }
            }
        }
        return $result;
    }
btherl Posted January 23, 2009

PHP data structures are not very memory efficient. You can use memory_get_usage() to measure the usage at various points and see how it grows during parsing. When I'm dealing with enormous data sets, I often pack things into strings. I regularly save 80-90% of the memory used by packing an array of associative arrays into an array of strings (each string can be reconstituted into an associative array when required).
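To make the suggestion above concrete, here is a minimal sketch of measuring with memory_get_usage() and packing each associative array into a string. The helper names pack_row() and unpack_row(), the use of JSON as the packing format, and the sample data are all assumptions made for illustration; the post does not say which packing format was used, and the savings you see will depend on the PHP version and the shape of the data.

```php
<?php
// Hypothetical helpers (not from the thread): pack one associative
// array into a single string, and turn it back into an array on demand.
function pack_row(array $row): string
{
    return json_encode($row);
}

function unpack_row(string $packed): array
{
    return json_decode($packed, true);
}

// Build sample rows shaped like the output of xml_into_assoc(),
// measuring how much memory the array-of-arrays form takes.
$before = memory_get_usage();
$rows = [];
for ($i = 0; $i < 10000; $i++) {
    $rows[] = [
        'name' => "item$i",
        'value' => "value $i",
        'attributes' => ['id' => (string) $i],
    ];
}
$asArrays = memory_get_usage() - $before;

// Store the same data as one string per row and measure again.
$before = memory_get_usage();
$packed = [];
foreach ($rows as $row) {
    $packed[] = pack_row($row);
}
$asStrings = memory_get_usage() - $before;

printf("array of arrays:  %d bytes\n", $asArrays);
printf("array of strings: %d bytes\n", $asStrings);
```

A row is only expanded when it is actually needed, for example with unpack_row($packed[42]), so at any moment only a few rows exist in their larger array form.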
Snowmiser Posted January 23, 2009

What I gather so far is that you're correct. It appears that consuming large amounts of memory isn't uncharacteristic of any of PHP's XML APIs. I will just parse it manually. I can't justify increasing the memory_limit for a 4 MB file.
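One alternative worth noting before parsing by hand: XMLReader is a pull parser that reads the document one node at a time, so the memory in the original function is most likely held by the nested result array rather than by the reader. The sketch below handles each element as it is read and keeps only a small summary. The function name summarize_xml() and the inline sample document are hypothetical, and whether this approach fits depends on what the application needs from the file.

```php
<?php
// Sketch: count elements and attributes per tag name while streaming,
// keeping only a small summary in memory instead of the full tree.
function summarize_xml(XMLReader $xml): array
{
    $counts = [];
    while ($xml->read()) {
        if ($xml->nodeType !== XMLReader::ELEMENT) {
            continue; // skip text, whitespace, end tags, and so on
        }
        $name = $xml->name;
        if (!isset($counts[$name])) {
            $counts[$name] = ['elements' => 0, 'attributes' => 0];
        }
        $counts[$name]['elements']++;
        $counts[$name]['attributes'] += $xml->attributeCount;
    }
    return $counts;
}

$reader = new XMLReader();
// Hypothetical inline sample; a real file would be opened with $reader->open().
$reader->XML('<items><item id="1">a</item><item id="2" x="y"/></items>');
$summary = summarize_xml($reader);
$reader->close();

print_r($summary);
```

Memory use here grows with the number of distinct tag names rather than with the size of the document, which is the trade-off of not building the whole tree.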
printf Posted January 23, 2009

Actually, PHP by design handles NODE TYPES (4x) better when you access those types via ARRAY elements. It's a well-known fact that PHP converts all XML documents to its optimized array structure. So when accessing node types, even just to ask a simple question, it's better to access each node with the array TYPE element than to use the XML::TYPE constant, because PHP only performs the lookup when that constant is encountered, while the type is already in the array structure as $xml[$i]['TYPE']! All I am saying is that PHP hacks XML; it doesn't follow the standards. So it's better to learn how they hacked the XML standard and design your application to take advantage of their hack instead of doing it the right way, because the right way will only use ridiculously large amounts of memory and leave you shaking your head, wondering what you're doing wrong!