Jump to content

Excel Data to MySQL


dweb

Recommended Posts

Hi everyone

 

I wonder if you can help me.

 

Excuse me if this is in the wrong section, but it's kind of a mixed question.

 

My dad runs an online shop and I am taking care of the website (while he deals with orders), currently i'm trying to enhance it with a simple product search

 

We received our latest database of products in an excel file, in total there are around 300,000 rows of data in the excel file, attached to this post is an image showing you how the data is presented to us, we're expecting another database soon, which will have around 600,000 rows, and maybe another 400,000 next year.

 

What I need to do is get the data into a MySQL database and then write a small PHP page to which displays drop downs for

 

Company

Type

Range

 

and then list products associated to the selected options, along with the description (and a few extra fields ie: price, color).

 

I may need help with the PHP script, but will post that in another tread once I have managed to deal with this issue.

 

I guess i'll need to import all this data into separate tables (company, type, range, products) so to make sure it's not slow at getting data, as I said, we have 300,000 rows of data and we're getting another 600,000 rows later, I guess with almost 1 million records it would be to much to hold in one table, especially if we get the 400,000 next year.

 

my problem is, that i'm stuck with how to get all this data from the excel file into the a MySQL database with relevant ID numbers linking data.

 

We have asked the company that supplies the data, but they can only supply it in an excel file and we're at a bit of a panic point at the moment.

 

can anyone help?

 

thank you

post-133714-13482403776686_thumb.jpg

Link to comment
Share on other sites

I guess with almost 1 million records it would be to much to hold in one table, especially if we get the 400,000 next year.

No: that's perfectly fine. Don't underestimate how well database systems can deal with large amounts of (properly indexed and normalized) data.

 

When I've done work like this I've ended up using cursors. Export the spreadsheet as a CSV and import that into a temporary table, then loop through a SELECT and do whatever work you need. You'll have to clean up the spreadsheet first so it looks like a proper table: duplicate values across rows and whatnot.

 

If you're lucky you might be able to INSERT into a view that pieces together the right tables, but it depends on stuff that I haven't looked into too much.

 

[edit] Of course, as Psycho mentioned you can do this in PHP just as well. Surprised that wasn't my first thought...

Link to comment
Share on other sites

EDIT: Requinix beat me to it, but including this anyway as there is additional info.

 

I think you need to step back a moment. There is nothing wrong with a table with a million records in it. You should not break up data into separate tables because you think it might be "too much". You need to focus on creating a good database schema and properly indexing the tables to get appropriate performance.

 

You will need to create a PHP script to read the data line by line and process the data appropriately. But, processing that many records could poise a timeout issue.

 

Anyway, looking at the data, here is what I think you would need for the DB:

 

1. A table for companies. It should have two fields for "company_id" and "company_name"

2. A table for type. It should have two fields for type_id (primary) and type_name

3. A table for range. It should have two fields for range_id (primary) and "range_name" (not too sure about this one as I don't fully understand it)

4. Lastly a table for products which would have fields as follows: prod_id (primary), prod_name, prod_details, company_id, type_id, range_id

 

Now, when you process the data you will need to take several steps to find and enter new companies, types and ranges before entering the products. because you need to get the ids of those values for the products table. So, map out the process on paper before you start coding anything.

All of the primary keys in the tables should be auto-increment fields. The ID fields at the end of the products table will be foreign keys.

Link to comment
Share on other sites

OK, I like these types of problems, so I decided to write some code. This is by no means optimized code but it works. So, you will want to find where you can make it more efficient if you are going to run lost of products. The code will not create duplicates of the companies, types or ranges. But, I don't see anything that would be unique for the actual products. If there is a product ID that you are not sharing you could use that as a unique field in the DB. Or you could set (company_id + type_id + range_id + product) as a unique combination in the product table.

 

I used a Excel sheet in the exact same format you you showed above. You must save the file as a CSV file before processing it with the code. I also provided the table structures as they require that you set up the company, type and range fields to be unique for this code to work.

 

Code

mysql_connect('localhost', 'root', '') or die(mysql_error());
mysql_select_db('products') or die(mysql_error());

//Open the file
if (($handle = fopen("test.csv", "r")) !== FALSE)
{
    //Get line 1 and do nothing with it
    $line1 = fgetcsv($handle, 1000, ",");

    //Create flag vars
    $current_company = false;
    $current_type    = false;
    $current_range   = false;

    //Create arrays to hold data
    $companies = array();
    $types = array();
    $ranges = array();
    $products = array();

    while (($data = fgetcsv($handle, 1000, ",")) !== FALSE)
    {
        //trim and escape the data
        $data = array_map('trim', $data);
        $data = array_map('mysql_real_escape_string', $data);

        //See if a new company exists
        if($data[0] != '')
        {
            $current_company = $data[0];
            if(!in_array($current_company, $companies)) { $companies[] = $current_company; }
        }
        //See if a new type exists
        if($data[1] != '')
        {
            $current_type = $data[1];
            if(!in_array($current_type, $types)) { $types[] = $data[1]; }
        }
        //See if a new range exists
        if($data[2] != '')
        {
            $current_range = $data[2];
            if(!in_array($current_range, $ranges)) { $ranges[] = $current_range; }
        }
        //See if this is a product
        if($data[3] != '')
        {
            $products[] = array(
                'product' => $data[3],
                'details' => $data[4],
                'company' => $current_company,
                'type'    => $current_type,
                'range'   => $current_range
            );
        }
    }

    fclose($handle);
}

//Insert the new companies
$query = "INSERT IGNORE INTO companies (company) VALUES ('" . implode("'), ('", $companies) . "')";
$result = mysql_query($query) or die(mysql_error());
//Get list of all companies with their IDs and put into array
$query = "SELECT company_id, company FROM companies";
$result = mysql_query($query) or die(mysql_error());
$companies = array();
while($row = mysql_fetch_assoc($result))
{
    $companies[$row['company']] = $row['company_id'];
}

//Insert the new types then get list of all companies for the IDs
$query = "INSERT IGNORE INTO types (type) VALUES ('" . implode("'), ('", $types) . "')";
$result = mysql_query($query) or die(mysql_error());
//Get list of all types with their IDs and put into array
$query = "SELECT type_id, type FROM types";
$result = mysql_query($query) or die(mysql_error());
$types = array();
while($row = mysql_fetch_assoc($result))
{
    $types[$row['type']] = $row['type_id'];
}

//Insert the new companies then get list of all companies for the IDs
$query = "INSERT IGNORE INTO ranges (range) VALUES ('" . implode("'), ('", $ranges) . "')";
$result = mysql_query($query) or die(mysql_error());
//Get list of all ranges with their IDs and put into array
$query = "SELECT range_id, range FROM ranges";
$result = mysql_query($query) or die(mysql_error());
$ranges = array();
while($row = mysql_fetch_assoc($result))
{
    $ranges[$row['range']] = $row['range_id'];
}

//Process product data into insert values
$prodValues = array();
foreach($products as $product)
{
    //Set company, type and range to their respective IDs
    $companyID = $companies[$product['company']];
    $typeID = $types[$product['type']];
    $rangeID = $ranges[$product['range']];
    //Add single insert record to array
    $prodValues[] = "('{$product['product']}', '{$product['details']}', '{$companyID}', '{$typeID}', '{$rangeID}')";
}

//Create & run one insert query for all records
$query = "INSERT INTO products (product, details, company_id, type_id, range_id) VALUES " . implode(', ', $prodValues);
$result = mysql_query($query) or die(mysql_error());

 

Tables

-- 
-- Table structure for table `companies`
-- 

CREATE TABLE `companies` (
  `company_id` int(11) NOT NULL auto_increment,
  `company` varchar(50) collate latin1_general_ci NOT NULL,
  PRIMARY KEY  (`company_id`),
  UNIQUE KEY `company` (`company`)
) ENGINE=MyISAM AUTO_INCREMENT=19 DEFAULT CHARSET=latin1 COLLATE=latin1_general_ci AUTO_INCREMENT=19 ;

-- --------------------------------------------------------

-- 
-- Table structure for table `products`
-- 

CREATE TABLE `products` (
  `prod_id` int(11) NOT NULL auto_increment,
  `product` varchar(50) collate latin1_general_ci NOT NULL,
  `details` varchar(50) collate latin1_general_ci NOT NULL,
  `company_id` int(11) NOT NULL,
  `type_id` int(11) NOT NULL,
  `range_id` int(11) NOT NULL,
  PRIMARY KEY  (`prod_id`)
) ENGINE=MyISAM AUTO_INCREMENT=17 DEFAULT CHARSET=latin1 COLLATE=latin1_general_ci AUTO_INCREMENT=17 ;

-- --------------------------------------------------------

-- 
-- Table structure for table `ranges`
-- 

CREATE TABLE `ranges` (
  `range_id` int(11) NOT NULL auto_increment,
  `range` varchar(50) collate latin1_general_ci NOT NULL,
  PRIMARY KEY  (`range_id`),
  UNIQUE KEY `range` (`range`)
) ENGINE=MyISAM AUTO_INCREMENT=4 DEFAULT CHARSET=latin1 COLLATE=latin1_general_ci AUTO_INCREMENT=4 ;

-- --------------------------------------------------------

-- 
-- Table structure for table `types`
-- 

CREATE TABLE `types` (
  `type_id` int(11) NOT NULL auto_increment,
  `type` varchar(50) collate latin1_general_ci NOT NULL,
  PRIMARY KEY  (`type_id`),
  UNIQUE KEY `type` (`type`)
) ENGINE=MyISAM AUTO_INCREMENT=5 DEFAULT CHARSET=latin1 COLLATE=latin1_general_ci AUTO_INCREMENT=5 ;

Link to comment
Share on other sites

This thread is more than a year old. Please don't revive it unless you have something important to add.

Join the conversation

You can post now and register later. If you have an account, sign in now to post with your account.

Guest
Reply to this topic...

×   Pasted as rich text.   Restore formatting

  Only 75 emoji are allowed.

×   Your link has been automatically embedded.   Display as a link instead

×   Your previous content has been restored.   Clear editor

×   You cannot paste images directly. Upload or insert images from URL.

×
×
  • Create New...

Important Information

We have placed cookies on your device to help make this website better. You can adjust your cookie settings, otherwise we'll assume you're okay to continue.