sixlead Posted November 26, 2008 Share Posted November 26, 2008 I have to import a csv change its structure and re-export it. csv in example - user1,2005 user1,2006 user2,2005 user2,2006 user2,2007 export structure user1,2005,2006 user2,2005,2006,2007 I have thought about using a loop while reading in the file, matching each unique name value in turn and copying data into a new csv However I am not certain that this is the best answer especially given the size of the csv import. I am looking forward your comments TIA Quote Link to comment https://forums.phpfreaks.com/topic/134403-to-use-a-db-a-file-or-an-array-that-is-the-question/ Share on other sites More sharing options...
zq29 Posted November 26, 2008 Share Posted November 26, 2008 I'd go with arrays, only because it is less code... Quote Link to comment https://forums.phpfreaks.com/topic/134403-to-use-a-db-a-file-or-an-array-that-is-the-question/#findComment-699706 Share on other sites More sharing options...
sixlead Posted November 27, 2008 Author Share Posted November 27, 2008 There may be less code involved in using the arrays, but what about the performance of the process. the csv file contains 15,000 + records with 25 fields. of course I think this is a large file to process, but that may be a mis-conception on my part. Quote Link to comment https://forums.phpfreaks.com/topic/134403-to-use-a-db-a-file-or-an-array-that-is-the-question/#findComment-700013 Share on other sites More sharing options...
.josh Posted November 27, 2008 Share Posted November 27, 2008 Okay so I was bored. Here's what I came up with. Idea is to loop through source and destination file 1 line at a time, instead of making a giant array. <?php // open source file for reading $sourceFile = fopen("data.txt", "r"); // while not end of source file while (!feof($sourceFile)) { // get the current line and explode it into an array $sourceLine = explode(',', fgets($sourceFile, 4096)); // set found to false because we haven't found a username match yet $found = false; // open destination file for reading and writing $destFile = fopen("data2.txt", "r+"); // while not end of destination file while (!feof($destFile)) { // find out what byte the file pointer is currently at (so we know what line to change) $lineLoc = ftell($destFile); // trim the current line (to get rid of the \n) and explode into array $destLine = explode(',', trim(fgets($destFile, 4096))); // check to see if the current user from source is same as // the current user from destination. If it's a match.... if ($sourceLine[0] == $destLine[0]) { // remove the username from the source array array_shift($sourceLine); // append source info to the destination array and implode to a string $newLine = implode(',', array_merge($destLine, $sourceLine)); // move file pointer to the right line (because fgets moved it forward to the next line) fseek($destFile, $lineLoc); // overwrite the old line with the new line fwrite($destFile,$newLine); // set found to true $found = true; // since match was found, no need to keep checking lines, so break out of loop break; } // end if match found } // end while destFile // if we did not find a match... if ($found == false) { // move pointer to end of the file fseek($destFile, 0, SEEK_END); // add the source line to end of file as new entry fwrite($destFile, implode(',',$sourceLine)); } // end if !found // close the destination file fclose($destFile); } // end while sourceFile // close source file fclose($sourceFile); ?> Okay so I tested it out. It seems to from formatA to formatB just like you wanted...except for one small bug. In order for this code to work, the usernames must be sorted. In your example, they are sorted. But if you have for instance, this: user1,2006 user2,2005 user2,2006 user2,2007 user1,2005 for some reason it bugs and ends up doing this: user1,2006, 2005 ,2005,2006,2007 another example, this: user1,2005 user1,2006 user2,2005 user2,2006 user2,2007 user3,2001 user3,2000 user3,1999 user2,1111 turns into this: user1,2005,2006 user2,2005,2006,2007,1111 ,2001,2000,1999 I'm not entirely sure why. Maybe someone else can spot the bug, or find a better solution; I'm tired, going to bed. Quote Link to comment https://forums.phpfreaks.com/topic/134403-to-use-a-db-a-file-or-an-array-that-is-the-question/#findComment-700090 Share on other sites More sharing options...
.josh Posted November 27, 2008 Share Posted November 27, 2008 Okay so I found the bug. I was fairly certain of what it was last night. Just too tired to confirm it. Problem is that when you go back to a previous user, and add more to it, it overwrites part of the next entry, since the next entry starts at the byte position after the previous one. One way to fix this is to create a fixed line length by padding the line with spaces. Basically same principle as like in a database char type column where if you have for example char(20) and the data is 10 chars, it will pad it with spaces to make it 20 chars. So for instance if you want each line to be a total of 50 chars long, you would just do 50-strlen(data to insert) to find out how many spaces to pad the data with. Problem with this method is if you're unsure how long each user's line can get. If it can get longer than the max length, then we're back to square one. Another way is to make use of a 3rd, temporary file. Basically the idea would be that every time there is a match on user (not a new entry), you would start a loop that loops through each line of the destination file and writes each line to a new file (basically copying destFile to tempFile, line by line. Throw in a condition to write the altered line in at the right time, finish writing the rest of the lines, save the temp file as destFile and delete the tempfile. wash, rinse and repeat. Advantage of this is that you don't have to worry about how long each user's line will get, because each line would be written to the end of the file. Disadvantage of it is that it's throwing another file and another loop into the mix. More resources, longer processing time, etc.. but..you can always break the process down to a couple hundred lines at a time, to avoid possible timing out. Anyways, if your source files are already sorted by user, the already posted code should work fine. If it is not, then you can consider adding onto the posted code, with one of those 2 options. Or maybe you or someone else can think of a better option. Lord knows, I do have a tendency to over complicate things. Or...another option that involves using a database: Make a table with 2 columns, one for user, another for value. Read the file line by line, exploding the line to have user/val separated. Insert the line into the table with user in user column, val in value column. Then you can select the rows ordered by user, run a loop that builds the string until user changes, fwrites the line, wash rinse and repeat. Or hell, it may very likely be possible to write a sql statement that will format the output for you. Maybe doing an inner join and a substringed concat or something. Not sure though. I'm not a sql expert. Quote Link to comment https://forums.phpfreaks.com/topic/134403-to-use-a-db-a-file-or-an-array-that-is-the-question/#findComment-700393 Share on other sites More sharing options...
sixlead Posted November 29, 2008 Author Share Posted November 29, 2008 WOW That is so much more than I was looking for, thank you very much. I was really just looking for a discussion on the best method to employ but you went an extra mile. I will try your code on the actual file and see how it does. As to getting the file sorted that can be done during the data export by using an ORDER BY clause in the select statement. Thanks Again Quote Link to comment https://forums.phpfreaks.com/topic/134403-to-use-a-db-a-file-or-an-array-that-is-the-question/#findComment-701702 Share on other sites More sharing options...
Recommended Posts
Join the conversation
You can post now and register later. If you have an account, sign in now to post with your account.