ravi_sdl Posted June 23, 2012

Hi, I want to insert a large XML feed into MySQL. The XML feed is 450 MB, and inserting all of its records takes roughly 4-5 hours. I have written code for this, but it is not working properly: it stops with a maximum execution time error after inserting only 20-25 thousand records into the MySQL table. I have already set max_execution_time with ini_set(), but it is not working. Can anyone suggest any tips? Please note that the script takes 4-5 hours to execute. Is there a way to execute PHP code in the background, or some MySQL setting that would help? Please advise me on this issue. Regards, Ravi
Ricky. Posted June 23, 2012

Do not run it all at once; instead, run it in parts. For example, process 10K records at a time, then write to the database which record was processed last. On the next run, the script checks the DB to see where to start and processes the next 10K records. We do this sort of thing regularly in PHP on large data sets. I hope you get the idea.
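A minimal sketch of the resumable, batched approach Ricky describes, assuming a feed.xml file of <item> elements and a hypothetical import_progress checkpoint table (all file, table, and column names are illustrative):

[code]
<?php
// Process BATCH_SIZE records per run, then save a checkpoint so the
// next run (e.g. from cron) can resume where this one stopped.
// feed.xml, import_progress, items, title and price are all assumptions.
const BATCH_SIZE = 10000;

$pdo = new PDO('mysql:host=localhost;dbname=mydb;charset=utf8', 'user', 'pass',
               array(PDO::ATTR_ERRMODE => PDO::ERRMODE_EXCEPTION));

// Where did the previous run stop? (0 on the first run.)
$offset = (int)$pdo->query('SELECT last_offset FROM import_progress WHERE id = 1')
                   ->fetchColumn();

$reader = new XMLReader();
$reader->open('feed.xml');

$insert = $pdo->prepare('INSERT INTO items (title, price) VALUES (?, ?)');

$seen = 0;  // <item> elements encountered so far in this run
$done = 0;  // records actually inserted in this run

while ($reader->read()) {
    // Assumes <item> elements do not nest inside each other.
    if ($reader->nodeType !== XMLReader::ELEMENT || $reader->name !== 'item') {
        continue;
    }
    if ($seen++ < $offset) {
        continue;  // already handled by an earlier run
    }
    $node = simplexml_load_string($reader->readOuterXML());
    $insert->execute(array((string)$node->title, (string)$node->price));

    if (++$done >= BATCH_SIZE) {
        break;     // stop here; the next run resumes from the checkpoint
    }
}
$reader->close();

$pdo->prepare('UPDATE import_progress SET last_offset = ? WHERE id = 1')
    ->execute(array($offset + $done));
[/code]

Run from cron, each invocation handles one batch and exits, so no single run ever comes near max_execution_time.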
silkfire Posted June 23, 2012

You should read up a bit on how to split large XML files:

chunk - Break A Large XML File Into Manageable Chunks
chunk - Read a large (XML) file a chunk at a time
How-to Handle Large XML Files in PHP
PHP - Extract a chunk of XML from a larger XML document
Parsing Huge XML Files in PHP
PFMaBiSmAd Posted June 23, 2012

Also, it is very inefficient to run any kind of query inside a loop. The time it takes PHP to send the query statement/data (when using prepared statements) to the database server is usually longer than the time it actually takes the database server to insert the data into the table. Forming a multi-value INSERT query, where you insert as many rows of data as possible at once (limited by the maximum packet size for one query statement), is the quickest PHP-based way of doing this. You would typically insert 2k-5k rows at a time. An even more efficient way is to parse the XML data into a CSV file and use a LOAD DATA LOCAL INFILE query to get the data into the table at the highest speed possible. If this is your own database server, where you can place the file in a location the database server can read directly without going through the client (PHP), a LOAD DATA INFILE query (without the LOCAL keyword) will produce the fastest possible result. From the MySQL documentation: "When loading a table from a text file, use LOAD DATA INFILE. This is usually 20 times faster than using INSERT statements."
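A minimal sketch of the XML-to-CSV plus LOAD DATA LOCAL INFILE route (file, table, and column names are illustrative assumptions, not the OP's actual schema):

[code]
<?php
// Step 1: stream the XML and write each record as one CSV line.
// feed.xml, items, title and price are assumptions.
$reader = new XMLReader();
$reader->open('feed.xml');

$csv = fopen('/tmp/items.csv', 'w');
while ($reader->read()) {
    if ($reader->nodeType === XMLReader::ELEMENT && $reader->name === 'item') {
        $node = simplexml_load_string($reader->readOuterXML());
        fputcsv($csv, array((string)$node->title, (string)$node->price));
    }
}
fclose($csv);
$reader->close();

// Step 2: bulk-load the CSV. LOCAL requires the client to allow it,
// hence PDO::MYSQL_ATTR_LOCAL_INFILE at connect time.
$pdo = new PDO('mysql:host=localhost;dbname=mydb;charset=utf8', 'user', 'pass',
               array(PDO::MYSQL_ATTR_LOCAL_INFILE => true));

$pdo->exec("LOAD DATA LOCAL INFILE '/tmp/items.csv'
            INTO TABLE items
            FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '\"'
            LINES TERMINATED BY '\\n'
            (title, price)");
[/code]

If the file can be placed on the database server itself, dropping the LOCAL keyword (and adjusting the path) is faster still, as PFMaBiSmAd notes.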
silkfire Posted June 23, 2012

PFMaBi, I've tried the multi-value insert query with large XML files and it is ghastly. PHP chokes when it has to deal with an array of more than 2,000 items, and imagine each item itself being an array of about 30 elements. Building the array that generates the query string took something like 5 minutes, so I guess the CSV option is much more viable.
PFMaBiSmAd Posted June 23, 2012

The most efficient way of using PHP with a multi-value INSERT query is to get as much data as possible into an array, where each entry is a formatted string of data ready to go between the () in the query. Then use array_chunk() to replace that array with a chunked version of itself (which uses the least amount of available memory and gets PHP to do the work instead of looping over each piece of data again); the chunk size is the number of rows you want to place into one query. You then loop over the chunks and implode() each chunk to produce the data for the query statement. If you exceed the maximum packet size in one query, you will get a "server has gone away" error from your error checking logic. For test data with only a few fields, I have inserted 50k rows in one query. For typical things posted in the forum, I use a chunk size of 10k rows.
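A minimal sketch of the array_chunk()/implode() pattern described above (the items table, its columns, and the $records source array are assumptions):

[code]
<?php
// $records is assumed to hold the parsed XML data, e.g. from XMLReader.
$mysqli = new mysqli('localhost', 'user', 'pass', 'mydb');

// One pre-escaped "(value, value)" string per row.
$rows = array();
foreach ($records as $rec) {
    $rows[] = "('" . $mysqli->real_escape_string($rec['title']) . "', "
            . (float)$rec['price'] . ")";
}

// Replace $rows with a chunked version of itself: 10k rows per query.
$rows = array_chunk($rows, 10000);

foreach ($rows as $chunk) {
    $sql = 'INSERT INTO items (title, price) VALUES ' . implode(',', $chunk);
    if (!$mysqli->query($sql)) {
        // A "server has gone away" error here usually means the query
        // exceeded max_allowed_packet; use a smaller chunk size.
        die('Insert failed: ' . $mysqli->error);
    }
}
[/code]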
ravi_sdl Posted June 29, 2012 (Author)

Thanks for the replies. I will use array_chunk() and try to fix my issue, and I will report back after using it. Thanks.
xyph Posted June 29, 2012

I remember reading something about this as well: unless you've designed your filesystem specifically for it, you don't really gain performance beyond a certain number of rows per insert. I don't think the number was much over a couple thousand rows at a time, where queries per second kind of flat-lined. It might be worth doing a similar benchmark on your system. You want to find the sweet spot, where you're inserting as few rows as possible at a time while still getting the highest number of queries per second (it should be very roughly around 1,000 rows per query). As PFMaB said, LOAD DATA INFILE is by far the fastest solution.
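A rough sketch of how such a benchmark might look (the bench table and the fake data are illustrative assumptions):

[code]
<?php
// Time multi-row INSERTs at several chunk sizes to find the sweet spot.
$mysqli = new mysqli('localhost', 'user', 'pass', 'mydb');

// 100k identical fake rows; real feed data would be more representative.
$rows = array_fill(0, 100000, "('sample title', 9.99)");

foreach (array(100, 500, 1000, 5000, 10000) as $size) {
    $mysqli->query('TRUNCATE bench');
    $start = microtime(true);
    foreach (array_chunk($rows, $size) as $chunk) {
        $mysqli->query('INSERT INTO bench (title, price) VALUES '
                       . implode(',', $chunk));
    }
    printf("%5d rows/query: %.2f sec total\n", $size, microtime(true) - $start);
}
[/code]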