Jump to content

Importing large excel file into MySQL


macattack

Recommended Posts

I've looked around a bit for some info on this, but I've yet to find anything solid.

 

What I have is an excel file with ~14,000 rows. The columns are as follows: ID, Poll, Last name, first name, middle name, unit, st no, street name, type, community, postal, phone 1, sign, volunteer. ID is blank in every row. On the old version of the file, there is a record under phone number. Every week, I am delivered a new copy of the sheet, with rows inserted randomly. What I need to do is take this file and import the changes into a MySQL database. The MySQL table has a few more rows. I'm wondering what the best way to go about this is.

 

My thoughts are that I export it as a .csv file to start. But how do I import only the changes into the MySQL database? It has to check each row to see if the info is already there (and a 100% match). If the name is there but the address is different, that has to be changed, if no info is there, the entire record has to be added. The important thing is that this gets done without altering any other data in the MySQL table, because other identifying information about each person is in the table, but not in the excel sheets that I get.

 

If someone can give me a hand with this it would be greatly appreciated. I'd prefer to not have to check 14,000 records by hand every week.

Link to comment
Share on other sites

I'd definitely export to CSV file first with that amount of data.

 

PHP has some nice functions to parse CSV files (fgetcsv).

 

If no data is ever missing from an updated Excel document you can go the easy way and truncate (empty) the table before importing then just import everything from scratch.

 

If some data is missing and you need to keep that then you'll have to write a routine to examine each piece of data line by line. The easiest way I can think of is to have an extra field in the MySQL database called checksum - have this as a 32 byte VARCHAR and store the MD5 hash of the original line from the Excel document in this field. Instead of checking loads and loads of data you only need to match something unique to each line (line an ID number) with the MD5 hash - if the two match you can ignore it and check the next line.

 

If there's no match but the ID exists then perform an UPDATE. If there's no match and ID doesn't exist then you need to perform an INSERT.

 

This is how I'd handle this although I'd be VERY tempted to write my own CSV parsing routine as I've encountered a bug with fgetcsv I've not found a solution for.

Link to comment
Share on other sites

I have no idea why ID is blank. I don't make the Excel file. It comes from another organization. The ID shouldn't be blank, so I'm guessing the current version is just missing the ID for whatever reason.

 

I can't dump the data on the table for that very reason – without an ID, I can't handle data in a separate table. I have to check the data and import any changes. That's the part that I don't know how to do. If the ID column wasn't blank, it'd be a walk in the park.

Link to comment
Share on other sites

The excel file has the first, middle and last names. The likelihood of two names matching up is pretty low. It also has addresses, but those are subject to change. The rows are listed in my first post. ID, Phone 1, Volunteer and Sign are all blank.

Link to comment
Share on other sites

I saw the columns listed but not knowing the content of the data I've not idea what is what.

 

The only thing I can suggest is go back to the source of the data and ask why the ID field is empty and ask if it can be populated with something to make things a little easier.

Link to comment
Share on other sites

This thread is more than a year old. Please don't revive it unless you have something important to add.

Join the conversation

You can post now and register later. If you have an account, sign in now to post with your account.

Guest
Reply to this topic...

×   Pasted as rich text.   Restore formatting

  Only 75 emoji are allowed.

×   Your link has been automatically embedded.   Display as a link instead

×   Your previous content has been restored.   Clear editor

×   You cannot paste images directly. Upload or insert images from URL.

×
×
  • Create New...

Important Information

We have placed cookies on your device to help make this website better. You can adjust your cookie settings, otherwise we'll assume you're okay to continue.