phpjim Posted January 17, 2013

I have a text file that I wrote a script to read line by line; each line becomes a new record in a table containing only 5 fields, the longest of which is 17 characters. It's a simple take-a-line, substr-it-apart, insert-a-record job. I used to be a Visual FoxPro programmer and this would take 2 minutes there. This simple task takes 1 1/2 hours, or about 1000 records added per minute. I am running this on my development laptop using Windows IIS. Other information I can pass along: the storage engine is InnoDB, the file is about 5 megs, the collation is utf8_general_ci, and the row format is compact. The table has two BTREE indexes: one on a 5-character zip-code field, and the other on a 2-char state field plus a 3-char code field, for a total of 5 characters. Everything about these tables is small, and I would think the process in MySQL wouldn't take so long. Any ideas or thoughts would be appreciated. Thanks
kicken Posted January 17, 2013

You should temporarily disable the indexes on the table before trying to insert all the records. You can do this using ALTER TABLE.

Also, if you are not already doing so, you should be using the bulk insert syntax rather than doing one insert per row:

INSERT INTO table VALUES (...), (...), (...)

Lastly, you might check your configuration for MySQL. It may be configured to not use very much memory (I think this is typical for a development install), which means it might be doing a lot of disk I/O.
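A sketch of the index side of this advice. Note that ALTER TABLE ... DISABLE KEYS only affects non-unique indexes on MyISAM tables; since this table is InnoDB, dropping the indexes before the load and re-adding them afterwards is the reliable route. The index names below are hypothetical (check SHOW INDEX FROM zipdb for the real ones), and the column names are assumed from the description in the first post:

```sql
-- Drop the two indexes before the bulk load (names are assumptions).
ALTER TABLE zipdb DROP INDEX idx_zip, DROP INDEX idx_state_county;

-- ... run the bulk inserts here, e.g.:
-- INSERT INTO zipdb (...) VALUES (...), (...), (...);

-- Re-create the indexes once, after the load finishes.
ALTER TABLE zipdb
    ADD INDEX idx_zip (zip_code),
    ADD INDEX idx_state_county (state_code, county_code);
```

Rebuilding each index once over the full table is far cheaper than updating both indexes 61,000 times, once per inserted row.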
Christian F. Posted January 17, 2013

If you post the code you wrote, plus a (small) example of the syntax of the source file, we'll be able to provide you with lots more information on where the bottlenecks might be.
Muddy_Funster Posted January 17, 2013

Yeah, let's see what you've got. That is taking waaaaaaayyyy too long, and to be honest I would have expected something to time out by then.
phpjim (Author) Posted January 17, 2013

Here is a sample of text from the first few lines of the 61,000-line file. In the file the positions line up, though it doesn't appear that way from the cut and paste below. As you will notice, it is a very small, simple script that is taking forever; removing the debug lines to the screen made no difference.

0236005010000HOLTSVILLE NY103SUFFOLK N
0446006010000ADJUNTAS PR153YAUCO N
0446006020000AGUADILLA PR005AGUADILLA N
0446006020000AGUADA PR005AGUADILLA N
0446006020000AGUADA PR003AGUADA N
0446006030000AGUADILLA PR005AGUADILLA N
0446006040000AGUADILLA PR005AGUADILLA N

<?php
$filename = 'c:\inetpub\wwwroot\text\zip.txt';   // single quotes: in a double-quoted string "\t" is a tab
$logfile  = fopen($filename, "r");

$hostname = "localhost";
$username = "xxxxxx";
$password = "xxxxxxx";
$dsn      = 'mysql:host=localhost;dbname=zipdb';  // closing quote was missing here
$db       = new PDO($dsn, $username, $password);
print "<p>Login Successful</p>";

$i = 0;
while (!feof($logfile)) {
    $line = fgets($logfile);

    // every 1000 lines, reset the timer (I added this to stop timing out)
    $i++;
    if ($i > 1000) {
        ini_set('max_execution_time', 600);
        $i = 1;
    }

    $f_office    = substr($line, 0, 4);
    $zip_code    = substr($line, 4, 5);
    $zip_plus    = substr($line, 9, 4);
    $city_name   = substr($line, 13, 17);
    $state_code  = substr($line, 30, 2);
    $county_code = substr($line, 32, 3);
    $county_name = substr($line, 35, 15);
    $l_option    = substr($line, 50, 1);

    $query = "INSERT INTO zipdb (f_office, zip_code, zip_plus, city_name, state_code, county_code, county_name, l_option)
              VALUES ('$f_office', '$zip_code', '$zip_plus', '$city_name', '$state_code', '$county_code', '$county_name', '$l_option')";
    $db->exec($query);

    // debug output, commented out (made no difference to the speed):
    // print "$i...<br>";
    // print "$line";
    // print "$f_office <br>";
    // echo "$zip_code <br>";  // ... and one echo per remaining field
}
ini_set('max_execution_time', '550');
if (fclose($logfile)) {
    print "<br>File closed<br>";
}
?>
phpjim (Author) Posted January 17, 2013

I also tried it this way, with the same results:

$lines = file($filename);
foreach ($lines as $line_num => $line) {
    $f_office    = substr($line, 0, 4);
    $zip_code    = substr($line, 4, 5);
    $zip_plus    = substr($line, 9, 4);
    $city_name   = substr($line, 13, 17);
    $state_code  = substr($line, 30, 2);
    $county_code = substr($line, 32, 3);
    $county_name = substr($line, 35, 15);
    $l_option    = substr($line, 50, 1);

    $query = "INSERT INTO zip (f_office, zip_code, zip_plus, city_name, state_code, county_code, county_name, l_option)
              VALUES ('$f_office', '$zip_code', '$zip_plus', '$city_name', '$state_code', '$county_code', '$county_name', '$l_option')";
    $db->exec($query);
}
Christian F. Posted January 17, 2013 (edited)

Please use the [code][/code] tags around your code, as it helps make both your post and your code a lot easier to read. Thanks.

That said, your main problem is that you're running one query per line. Instead you should do what Kicken outlined above: build the query up into a multi-insert, and then run it once. That'll cut down the time a lot, at least in a case like this, with so many lines.

You might want to check the length as well, and cut the query up into smaller segments. If the query becomes too big, it might be refused by the DB server and/or driver. Keep it below 1 MB and you should be safe. (strlen() can be used to get the size, since it counts every character as one byte.)

Also, using file_get_contents() instead of fopen(), fgets(), fclose() and the loop might help shave some extra time off. Edited January 17, 2013 by Christian F.
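A connection-free sketch of that length check: group the row tuples into multi-insert statements, starting a new statement whenever the next tuple would push the query past a byte budget. The function name and the budget value are illustrative assumptions; the real ceiling is the server's max_allowed_packet setting.

```php
<?php
// Sketch: split row tuples like "('a','b')" into multi-insert queries,
// flushing whenever the next tuple would push the query past $maxBytes.
// strlen() counts bytes, which is exactly what the packet limit cares about.
function chunkByBytes(array $tuples, string $prefix, int $maxBytes): array
{
    $queries = [];
    $current = '';
    foreach ($tuples as $tuple) {
        $candidate = ($current === '')
            ? $prefix . $tuple            // first tuple of a new query
            : $current . ',' . $tuple;    // append to the query in progress
        if ($current !== '' && strlen($candidate) > $maxBytes) {
            $queries[] = $current;            // flush the finished query
            $candidate = $prefix . $tuple;    // start a new one with this tuple
        }
        $current = $candidate;
    }
    if ($current !== '') {
        $queries[] = $current;    // flush whatever is left
    }
    return $queries;
}

// Usage sketch (a cap comfortably below a typical max_allowed_packet):
// $queries = chunkByBytes($tuples, "INSERT INTO zipdb (...) VALUES ", 900000);
// foreach ($queries as $q) { $db->exec($q); }
```

A single tuple larger than the budget is still emitted as its own query; with 50-odd-byte rows like these, that case never arises.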
phpjim (Author) Posted January 17, 2013

I forgot to say that I did disable the indexes and it improved a little, but not much. It was refreshing to hear there was something wrong with my programming and it's not MySQL. For a moment I was wondering how super-fast Visual FoxPro died out while MySQL gave me the slow speeds I was experiencing. I have not run this at my hosting website, to save on bandwidth; I have been working locally on my notebook using XP, the same notebook on which I can do this in FoxPro in 10 seconds. I am not plugging FoxPro - it's what I know and the only DB software I can compare my results to. Thanks.
phpjim (Author) Posted January 17, 2013

Thanks Christian. This is my first post, and I will use the code tags in the future. Your time clock says it's almost 1 in the afternoon where you are, which is 6 hours ahead of me. Are you in Europe?
Christian F. Posted January 17, 2013

FoxPro would probably be this slow too if you ran one query per line. The only difference is that you know FoxPro well enough to avoid that happening. So it's only a question of experience, which is easy enough to gain.

Yes, I'm in Europe. More specifically, Norway, as stated in my profile over <- there.
phpjim (Author) Posted January 17, 2013

Thanks Christian for the help. Time to go dig in and figure this out.
Muddy_Funster Posted January 17, 2013

If that is indeed a tab-delimited file, I still think using the LOAD DATA INFILE syntax within the MySQL database is the best way to do this.
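A minimal sketch of that approach, assuming the file has first been converted to tab-delimited form (the posted sample is fixed-width) and that LOCAL infile loading is enabled on the server. The file name and the Windows line endings are assumptions:

```sql
LOAD DATA LOCAL INFILE 'c:/inetpub/wwwroot/text/zip_tabs.txt'
INTO TABLE zipdb
FIELDS TERMINATED BY '\t'
LINES TERMINATED BY '\r\n'
(f_office, zip_code, zip_plus, city_name, state_code, county_code, county_name, l_option);
```

This moves the whole load into the server in one statement, avoiding per-row round trips entirely.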
PFMaBiSmAd Posted January 17, 2013

I would do the following (since you can read the whole file into an array using file()):

1) Write a function that takes one line and converts it to the data string that would go in the VALUES (...) portion of the query statement.
2) Use array_map to apply the function from #1 to the array of lines, overwriting the same array so as not to consume more memory than needed.
3) Use array_chunk to split the array of lines into an array of arrays of lines (5k-10k lines per chunk), again overwriting the same array.
4) Iterate over the resulting array of chunks, implode each sub-array to make a multi-value insert query, and execute it.

Only one loop is used, and PHP's array functions do most of the work.
PFMaBiSmAd Posted January 17, 2013

Building on what Muddy_Funster posted, you could use steps #1 and #2 from what I suggested, but produce tab-delimited data that you write to a file, then use a LOAD DATA INFILE query to get the data into the database.
Christian F. Posted January 17, 2013

Small correction to my post above: I do, of course, recommend using file() for reading the file as well, as it'll give you an array straight away. file_get_contents() was just force of habit. Sorry.
PFMaBiSmAd Posted January 17, 2013

The following takes ~2.3-2.4 seconds on my development system for 61,000 rows made up of repeated blocks of the sample data you posted:

<?php
// database connection here...
$filename   = "zip.txt";
$block_size = 10000; // how many lines of data per multi-value query

$lines = file($filename, FILE_IGNORE_NEW_LINES);

function make_line($line) {
    $f_office    = substr($line, 0, 4);
    $zip_code    = substr($line, 4, 5);
    $zip_plus    = substr($line, 9, 4);
    $city_name   = substr($line, 13, 17);
    $state_code  = substr($line, 30, 2);
    $county_code = substr($line, 32, 3);
    $county_name = substr($line, 35, 15);
    $l_option    = substr($line, 50, 1);
    return "('$f_office','$zip_code','$zip_plus','$city_name','$state_code','$county_code','$county_name','$l_option')";
}

$lines = array_map('make_line', $lines);
$lines = array_chunk($lines, $block_size);

$i = 1;
$start = microtime(true);
foreach ($lines as $chunk) {
    $query = "INSERT INTO zipdb (f_office,zip_code,zip_plus,city_name,state_code,county_code,county_name,l_option) VALUES " . implode(',', $chunk);
    // run query here...
    $result = $mysqli->query($query);
    echo "$i - " . count($chunk) . " - " . strlen($query) . " - ";
    var_dump($result);
    echo "<br />";
    $i++;
}
$end  = microtime(true);
$diff = number_format($end - $start, 2);
echo "Done, in $diff seconds.";

Some of your speed issues might be due to using PDO: what it does to stay general-purpose isn't going to produce the fastest execution. Also, using a prepared query for a repeated simple query like this doesn't help much (5%-10% time savings), because you must still transfer the data that goes into each query to the database server, which takes significantly longer than it takes the server to run the actual query.
phpjim (Author) Posted January 17, 2013

PFMaBiSmAd - that was amazing: 31,000 records added in less than 2 seconds. Only half the records were added, though, so I need to figure out why. Thank you.
phpjim Posted January 17, 2013 Author Share Posted January 17, 2013 1 - 10000 - 770117 - int(10000) 2 - 10000 - 770117 - bool(false) 3 - 10000 - 770117 - int(10000) 4 - 10000 - 770117 - int(10000) 5 - 10000 - 770117 - bool(false) 6 - 10000 - 770117 - bool(false) 7 - 1097 - 84586 - int(1097) Done, in 1.56 seconds.; Quote Link to comment https://forums.phpfreaks.com/topic/273262-inserting-from-a-61000-record-text-file-does-1-12-hours-sound-right/#findComment-1406474 Share on other sites More sharing options...
PFMaBiSmAd Posted January 17, 2013

If you use the error reporting for the mysql/mysqli/PDO extension you are using, it will tell you why the query failed. Most likely the data needs to be escaped.
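A connection-free illustration of the likely failure: a value containing an apostrophe ends the SQL string literal early, so the whole multi-insert chunk fails. The place name below is a made-up example; in real code you would escape with $db->quote() or use a prepared statement, but addslashes() is enough to show the mechanism:

```php
<?php
// A county name with an apostrophe in it (hypothetical example value).
$county_name = "O'FALLON";

// Interpolating it raw produces broken SQL -- the quote after the O
// terminates the string literal, and the rest is a syntax error:
$broken = "INSERT INTO zipdb (county_name) VALUES ('$county_name')";
// → INSERT INTO zipdb (county_name) VALUES ('O'FALLON')

// Escaping the value keeps the literal intact. addslashes() is only for
// illustration; with a live connection, prefer $db->quote($county_name)
// or a prepared statement with a bound parameter.
$escaped = "INSERT INTO zipdb (county_name) VALUES ('" . addslashes($county_name) . "')";
// → INSERT INTO zipdb (county_name) VALUES ('O\'FALLON')
```

This also explains why only some chunks failed: only the 10,000-row blocks that happened to contain an apostrophe returned bool(false).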