brm5017 Posted January 5, 2009 Author

I got the PHP script to run with a 12 MB XML file (I deleted 75% of the file). This is a problem: I get a feed every day with housing information, so now I'll have to manually edit this XML every day and chop it into smaller parts. Is there any way for SimpleXML to open each line of the file one by one as it needs to? Or maybe even this idea:

    do {
        open Media.xml as a text file
        $variable1 = (first 100k lines of file)
        $variable2 = (2nd 100k lines of file)
    } until end of file

Then insert the variables into the SimpleXML parser with loops. This way the server doesn't get overloaded.

Link to comment: https://forums.phpfreaks.com/topic/139291-problem-loading-40meg-xml-file-with-simplexml/page/2/#findComment-730367
premiso Posted January 6, 2009

You can try this; add it to the top of the script:

    ini_set('memory_limit', '50M');

See if you are allowed to do that. If not, you can try it in a .htaccess file:

    php_value memory_limit 50M

Or, if possible, you can modify php.ini and change:

    memory_limit = 16M ; Maximum amount of memory a script may consume

to:

    memory_limit = 50M ; Maximum amount of memory a script may consume

And see what that does for you. I would change it back afterwards, however, if you do not do this on a regular basis.
brm5017 Posted January 6, 2009 Author

The PHP memory limit is already set to 64M — weird! This does make me realize that the core.* dumps I was seeing earlier were all 64M. Does SimpleXML inflate the file when reading from it?
premiso Posted January 6, 2009

Maybe — not sure. Try increasing that to 112M and see what happens.
brm5017 Posted January 6, 2009 Author

Internal Server Error 500 — I don't even know if I'm editing the .htaccess right; I just pasted the code right in the middle:

    php_value memory_limit 112M
brm5017 Posted January 6, 2009 Author

None of this matters anyway — the XML feed I'm downloading right now is at 520 MB and still going (it doesn't tell me how big the file will be until it's finished downloading)... ah, I'm clueless with this RETS stuff.
brm5017 Posted January 6, 2009 Author

Ay ay ay... now where do I freaking start... 5.5 MILLION records:

    <RETS ReplyCode="0" ReplyText="V2.3.3 590: Success">
    <COUNT Records="5457011" />
    <DELIMITER value="09" />
    <COLUMNS>	PropItemNumber	PropMediaURL	ListingID	</COLUMNS>
PFMaBiSmAd Posted January 6, 2009

where do I freaking start

By filling in the blanks: what the data is, what control you have over what data is in the download or what different downloads there are, how often the data changes (is each download a completely separate set of data, only new data, all data to date...), and what you are trying to accomplish.

SimpleXML assumes that you have a file or a string with the XML in it (I don't know for sure, but I don't think you can use it on an open stream). Something else might be a better match, such as the XMLReader class.
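As a rough illustration of the XMLReader idea (a minimal sketch — the DATA element name is taken from the RETS snippet above, and an inline string stands in for the real feed file):

```php
<?php
// Minimal sketch: stream over <DATA> elements with XMLReader instead of
// loading the whole document into memory the way SimpleXML does.
function count_data_rows($xml)
{
    $reader = new XMLReader();
    $reader->XML($xml); // for the real feed, use $reader->open('Media-Trend.xml')
    $count = 0;
    while ($reader->read()) {
        if ($reader->nodeType === XMLReader::ELEMENT && $reader->name === 'DATA') {
            // $reader->readString() would give the tab-delimited row here
            $count++;
        }
    }
    $reader->close();
    return $count;
}

$sample = '<RETS><DATA>1	url1	L1</DATA><DATA>2	url2	L2</DATA></RETS>';
echo count_data_rows($sample); // prints 2
?>
```

Because XMLReader only keeps the current node in memory, peak usage stays roughly constant no matter how large the feed is.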
brm5017 Posted January 6, 2009 Author

OK, here's a rundown of what I'm trying to accomplish. There's a real-estate listing server that stores listing information (known as the RETS server). RETS has every house that's currently for sale, along with all the details for each house. There are a few different tables, but mainly I'll be using three: PropertyDescription, PropertyMedia, and ListingAgent.

The only control I have when selecting what data to download is the column names. For ListingAgent and PropertyDescription, I simply select all from their server and export to separate XML files. For the PropertyMedia table, I select the three columns I need for the database (imgURL, ListingID, PictureOrderNumber). The PropertyMedia table is tricky, because each house could have anywhere from 1 picture to 20 pictures. Multiply that by 30,000 houses and the amount of data increases pretty quickly.

As far as using SimpleXML goes, I've already had success with both the PropertyDescription and ListingAgent tables and have imported those into my database. Perhaps XMLReader will work with larger files? Any tips for starting this?
rhodesa Posted January 6, 2009

OK... let's try parsing line by line instead of loading the entire thing as a SimpleXML object:

    <?php
    ini_set("display_errors", "1");
    error_reporting(E_ALL);
    set_time_limit(0);

    $dbhost = 'localhost';
    $dbuser = 'mulhol';
    $dbpass = 'multeam';
    $dbname = 'mulhol_listings';

    $debug = true;
    $table = 'listing_pictures'; // Change this to the name of the table

    // Connect to database
    $conn = mysql_connect($dbhost, $dbuser, $dbpass) or die('Error connecting to mysql');
    mysql_select_db($dbname);
    print "Connected to $dbname <br>";

    // Open XML file
    $handle = @fopen('Media-Trend.xml', 'r');
    if (!$handle) die('Failed to open Media-Trend.xml');
    print "XML File Opened<br />";

    // Loop over lines
    $columns = null;
    $count = 0;
    while (!feof($handle)) {
        $line = fgets($handle, 4096);
        // Check for COLUMNS
        if (!is_array($columns)) {
            if (preg_match('/<COLUMNS>(.+)<\/COLUMNS>/', $line, $matches)) {
                $columns = preg_split('/\s+/', trim($matches[1]));
                // The following dynamically builds the INSERT statement from the columns
                $insert = "INSERT INTO $table (".implode(',', $columns).") VALUES (".implode(',', array_fill(0, count($columns), "'%s'")).")";
                if ($debug) print '<table border="1"><tr><th>'.implode('</th><th>', $columns).'</th></tr>';
            }
            continue;
        }
        // Check for DATA
        if (preg_match('/<DATA>(.+)<\/DATA>/', $line, $matches)) {
            $parts = preg_split('/\s+/', trim($matches[1]));
            if ($debug) {
                print '<tr><td>'.implode('</td><td>', $parts).'</td></tr>';
            } else {
                array_walk($parts, create_function('&$v', '$v = mysql_real_escape_string($v);')); // Run all parts through mysql_real_escape_string()
                while (count($parts) < count($columns)) $parts[] = ''; // Make sure we have enough values
                array_unshift($parts, $insert); // Add the INSERT format string to the beginning
                $sql = call_user_func_array('sprintf', $parts); // Put it all together
                mysql_query($sql) or die(mysql_error()); // Run the query
            }
            $count++;
        }
    }
    fclose($handle);

    if ($debug) {
        print "</table>";
        print "Found $count rows to go into $table";
    } else {
        print "Inserted $count rows into $table";
    }
    ?>

edit: added set_time_limit(0);
rhodesa Posted January 6, 2009

FYI, I created a 400+ MB file with over 3 million records and everything seems to work OK... took a few minutes to finish, though.
brm5017 Posted January 6, 2009 Author

Wow, thanks Aaron. Now I've just got to wait 20 minutes for this file to upload, then I'll give it a go... looks like it's going to work, though!
brm5017 Posted January 6, 2009 Author

I get an out-of-memory error after about 5 seconds. I checked the table in phpMyAdmin and it seems that 45k records made it in... hmm.
brm5017 Posted January 6, 2009 Author

    Fatal error: Out of memory (allocated 39059456) (tried to allocate 4864 bytes) in /home/mulhol/public_html/xml/media_linebyline.php(48) : runtime-created function on line 1
brm5017 Posted January 6, 2009 Author

I can run it in debug mode fine; when I start dealing with the insert, it comes back with the memory error.
brm5017 Posted January 6, 2009 Author

Update: the script hangs on line 49:

    array_walk($parts, create_function('&$v', '$v = mysql_real_escape_string($v);')); // Run all parts through mysql_real_escape_string()

Possible memory leak here?
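That line is a plausible culprit: create_function() compiles a brand-new lambda on every loop iteration and never frees it, so memory grows with every row processed. A minimal sketch of a leak-free replacement — addslashes() is used here only as a connection-free stand-in for mysql_real_escape_string(), which needs a live database link:

```php
<?php
// Escape every value with a plain foreach instead of create_function(),
// which allocates (and leaks) a new function on each call.
function escape_parts(array $parts)
{
    foreach ($parts as $k => $v) {
        // In the real script this would be mysql_real_escape_string($v);
        // addslashes() stands in so the sketch runs without a DB connection.
        $parts[$k] = addslashes($v);
    }
    return $parts;
}

print_r(escape_parts(array("O'Hare", 'plain'))); // quote gets backslash-escaped
?>
```

The same fix applies inside the loop in the script above: swap the array_walk/create_function line for a plain foreach (or a named callback defined once, outside the loop).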
brm5017 Posted January 6, 2009 Author

I found this in the manual on PHP.net — can anyone help me out with implementing it?

For large files, consider using stream_get_line rather than fgets — it can make a significant difference.

    $ time yes "This is a test line" | head -1000000 | php -r '$fp=fopen("php://stdin","r"); while($line=stream_get_line($fp,65535,"\n")) { 1; } fclose($fp);'
    real 0m1.482s
    user 0m1.616s
    sys  0m0.152s

    $ time yes "This is a test line" | head -1000000 | php -r '$fp=fopen("php://stdin","r"); while($line=fgets($fp,65535)) { 1; } fclose($fp);'
    real 0m7.281s
    user 0m7.392s
    sys  0m0.136s
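Applied to the read loop, swapping fgets() for stream_get_line() is essentially a one-line change. A small sketch, using an in-memory stream as a stand-in for the real feed file (note the !== false check, so an empty line doesn't end the loop early):

```php
<?php
// Read a stream line by line with stream_get_line(), which can be
// noticeably faster than fgets() on large files.
function read_lines($handle)
{
    $lines = array();
    while (($line = stream_get_line($handle, 65535, "\n")) !== false) {
        $lines[] = $line; // delimiter is stripped, unlike fgets()
    }
    return $lines;
}

// In-memory stand-in for fopen('Media-Trend.xml', 'r')
$fp = fopen('php://memory', 'r+');
fwrite($fp, "first\nsecond\nthird");
rewind($fp);
print_r(read_lines($fp)); // three elements: first, second, third
fclose($fp);
?>
```

One difference to keep in mind: stream_get_line() strips the "\n" delimiter, whereas fgets() keeps it; the regex matching in the script above works either way.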