Dodgy CSV to MySql


letmein


Hello,

Could someone please help me with this? I've been racking my poor brains on it for ages and I just can't do it...

My site basically relies on a product data feed from another company. The feed is a CSV file hosted on their website; I need to download it, unzip it, and import it into a MySQL database, and I need to do this every 2 hours or my site and products go out of date. The problem is that the CSV files (there are 2 main ones I need) aren't really formatted properly, so I'm currently having to:

1. Download & unzip
2. Open in Excel, save as an Excel worksheet
3. Go to Control Panel > Languages, change the list separator from , to |
4. Open the worksheet in Excel
5. Remove the first 2 rows (they contain rubbish)
6. Save as CSV
7. Go to phpMyAdmin
8. Select the DB and table
9. Import the CSV file, upload it, separator |

There you go, it's done, every 2 hours...

Anyway, the problem with the files is that most fields are separated by commas, but some fields also contain commas; those fields are enclosed in " quotes. Also, on some rows where not all fields are used, the unused fields are just left off entirely, with no trailing commas.

e.g. the product file contains 100 fields
e.g. the product spec file contains 51 fields

Spec file - Fields: Model, S1, S2, S3, S4,...
Content
M1, s, s, s, s, s, s, s, s...
M2, r, t
M3, d, d, r, ddds, "dsd, sssfd", 43

So basically I'm trying to write a script that downloads the file from their website to my site, extracts the CSV files from the zip, then reads them and formats them, adding trailing commas if a line falls short of the minimum field count for that CSV (i.e. 100 or 51), then splits each line on commas, trying to ignore the ones inside quotes, and finally adds all that to MySQL...
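A rough sketch of that normalisation step (not the poster's actual script): `str_getcsv` (PHP 5.3+, which postdates this thread) already copes with commas inside quoted fields, and `array_pad` can supply the missing trailing fields. The sample lines and field count here are illustrative only.

```php
<?php
// Sketch: parse dodgy feed lines and pad short rows to a fixed field count.
// str_getcsv() handles commas inside quoted fields for us.
$expected = 5; // e.g. 100 for the product file, 51 for the spec file

$lines = array(
    'M1,s,s,s,s',
    'M2,r,t',                // short row: only 3 fields
    'M3,d,"dsd, sssfd",43',  // quoted field containing a comma
);

$rows = array();
foreach ($lines as $line) {
    $fields = str_getcsv($line);
    // Pad missing trailing fields with empty strings.
    $rows[] = array_pad($fields, $expected, '');
}

foreach ($rows as $r) {
    echo implode('|', $r), "\n";
}
```

Every row then has exactly `$expected` fields, so a fixed-column INSERT (or a `|`-separated import) can't fall short.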

I have managed to get it to download and unzip the files, and I can read the file, but odd things keep happening with the reads and the CSV formatting, and I just can't get my head around the import-to-MySQL part. Either it doesn't read the file fully, even though I know the file is downloaded and extracted properly (e.g. it stops at 52 records when I know there are over 2,000 items), or it doesn't import to MySQL, or it imports one line and then dies.

Please, someone help; I'm seeing commas flying around everywhere and I'm going nuts...




Ok, you know how a CSV file has stuff like "this is field 1, i like peas","field2, ooo look there was another comma","field3","ahhh, a comma!" and so on? If you'll notice, when it separates the fields there's always a "," between them, so you can explode on "," and then strip the extra " off the first and last fields.

[code=php:0]
<?php
$text = "\"This is, some text\",\"this is some more, text\",\"this is the last text\"";

// Split on the "," that sits between quoted fields.
$text_a = explode("\",\"", $text);

// Strip the leftover quote off the first and last fields.
$text_a[0] = str_replace("\"", "", $text_a[0]);
$text_a[count($text_a) - 1] = str_replace("\"", "", $text_a[count($text_a) - 1]);

print_r($text_a);
?>
[/code]

I don't know about the unzipping and all that stuff (maybe a scheduled task and command-line WinRAR or something?), but that'll split the CSV file once you have it in the right place and whatnot...

Actually, that sounds like a properly formatted CSV file to me. You can do something like so (not tested!):

LOAD DATA INFILE 'data.csv' INTO TABLE my_table
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
LINES TERMINATED BY '\r\n'
IGNORE 2 LINES;

Look up the LOAD DATA INFILE command in the MySQL manual. I think MySQL might even be able to handle gzipped files; you'll have to look it up. At the end of the day, I doubt you'll need to do any tinkering from the PHP end at all.

Thanks all, I will look into the LOAD DATA INFILE thing. I don't know if it can cope without the CSV file being formatted first, though, as some of the lines don't have the remaining commas at the end. Would I have to do something like a loop to add the missing commas when the field count is lower than the required count, or would it work without having to add them?

Corbin, I did all the download and zip-extraction part; I'm just stuck on the supposedly simpler CSV-reading part :D The problem with your example is that my data feed is more like

$text = "\"This is, some text\",this is some more text,\"this is the last text\", I hate CSVs";
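For what it's worth, a later addition to PHP handles exactly this mixed quoted/unquoted case: `str_getcsv` (PHP 5.3+, so not available when this thread was written). Shown purely as an illustration:

```php
<?php
// str_getcsv copes with quoted and unquoted fields in the same line,
// keeping the commas inside the quoted fields intact.
$text = "\"This is, some text\",this is some more text,\"this is the last text\", I hate CSVs";
$fields = str_getcsv($text);
print_r($fields);
```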

Hmm, just looked through some PHP functions (I didn't know MySQL had a command for doing this), and if things don't work out with the MySQL approach, maybe consider using
[code=php:0]
<?php
$row = 1;
$handle = fopen("book1.csv", "r+");
while (($data = fgetcsv($handle, 1000, ",")) !== FALSE) {
    $num = count($data);
    echo "<p> $num fields in line $row: <br /></p>\n";
    $row++;
    for ($c = 0; $c < $num; $c++) {
        echo $data[$c] . "<br />\n";
    }
}
fclose($handle); // all this taken from the PHP manual
?>
[/code]

And with book1.csv being
[code]
firstname,lastname,field3,csv sucks,
corbin,hi,"ooo, a comma,","o,o look, another, comma,",
fg,dfghdf,dfgh,fghd,ghdfghdf
"hd,fg,","df,ghdf,","fgh,df","dfg,hd","fg,h"
"dfg,hdfgh","dfgh,df","dfg,h","df,ghd","fgh,dfgh"
[/code]

It would return
[code]
5 fields in line 1:
firstname
lastname
field3
csv sucks


5 fields in line 2:
corbin
hi
ooo, a comma,
o,o look, another, comma,


5 fields in line 3:
fg
dfghdf
dfgh
fghd
ghdfghdf

5 fields in line 4:
hd,fg,
df,ghdf,
fgh,df
dfg,hd
fg,h

5 fields in line 5:
dfg,hdfgh
dfgh,df
dfg,h
df,ghd
fgh,dfgh
[/code]

Hi,

I used fgetcsv on my first attempt at this script; the problem was that for some reason it stopped reading the CSV halfway through, after a certain number of records. I know the DB has over 2,600 records; the first CSV contains 100 fields, the second contains 51 fields and 2,000 records, and both files are under 8 MB in size... I don't know why, but it just decided it had hit the end of the file after a few records...

I thought it was something to do with the 1000 length limit, so I tried increasing that too, but it did the same...

I'm trying to use LOAD DATA INFILE, but I can't seem to get it working; it won't run from a PHP script or even through phpMyAdmin. I've chmodded the file to make sure it's readable and tried everything I can think of, but I don't think it's finding the file... I've used the full file path and tried all sorts of different things, but it still won't read. I've emailed the hosting company to make sure I'm using the right path to the file...

I tried the following as well

[code]
$fields = 51; // expected field count for this feed (51 for the spec file)

$fs = fopen('data/myfile.csv', 'r');

while (!feof($fs)) {
    $tmp = fgets($fs);

    // Protect commas that are not field separators with a placeholder,
    // assuming all non-separator commas have a space after them...
    $tmp = str_replace(", ", "comma; ", $tmp);

    // Replace the separator commas with |
    $tmp = str_replace(",\"", "|", $tmp);
    $tmp = str_replace("\",", "|", $tmp);
    $tmp = str_replace(",", "|", $tmp);
    $tmp = str_replace("comma;", ",", $tmp); // put the real commas back

    $tmp = rtrim($tmp, " \n.");
    $tmp = rtrim($tmp, " \r.");

    // explode(), not split('|'): split() treats | as a regex alternation
    $test = explode("|", $tmp);

    // Pad short lines out to the full field count
    if (count($test) < $fields && trim($test[0]) != "MODEL") {
        for ($counter = count($test); $counter < $fields; $counter += 1) {
            $tmp .= "|";
        }
    }
    if (count($test) > $fields) {
        echo "Error Here ModelNo: " . $test[0] . " Number of fields: " . count($test) . "<br>";
    }

    $test = explode("|", $tmp); // split again after formatting

    if (trim($test[0]) != "MODEL") {
        $query = "INSERT INTO deals VALUES ('" . implode("','", $test) . "')";
        $query = @mysql_query($query);
    }
}

fclose($fs);
[/code]

Ok cool, I got it to work. I made some mods to my original script, and it turned out the problem with it only reading a couple of records and then quitting wasn't the script at all: it was the extraction. It doesn't display any error messages, but halfway through extraction I was running out of disk space on my hosting account, so it was only extracting half the file... All that trouble; I increased the space on the account, deleted unused files, and it worked :) There were some dumb things in the CSV preventing it too, e.g. one of the records for some reason had \ at the end of one of its fields, so adding it to MySQL failed ('field1','field2\','field3'); I stripped the \.

[code]
$fields   = 51; // expected field count (51 for the spec file, 100 for the product file)
$counter2 = 0;  // line counter; line 0 is the header row

$fs = fopen("data/mycsv.csv", 'r');
if ($fs) {
    while (!feof($fs)) {
        $added = 0;
        $tmp = fgets($fs);

        // Protect commas that are not field separators,
        // assuming all non-separator commas have a space after them...
        $tmp = str_replace(", ", "comma; ", $tmp);

        // Replace the separator commas with |
        $tmp = str_replace(",\"", "|", $tmp);
        $tmp = str_replace("\",", "|", $tmp);
        $tmp = str_replace(",", "|", $tmp);
        $tmp = str_replace("\"|", "|", $tmp);
        $tmp = str_replace("comma;", ",", $tmp); // put the real commas back

        $tmp = rtrim($tmp, " \n.");
        $tmp = rtrim($tmp, " \r.");
        $tmp = rtrim($tmp, "\"");
        $tmp = str_replace("\|", "|", $tmp); // strip the stray \ before a separator
        $tmp = str_replace("'", "\'", $tmp); // escape quotes for the INSERT

        $test = explode("|", $tmp);

        // Pad short lines out to the full field count
        if (count($test) < $fields && trim($test[0]) != "ITEMCODE") {
            for ($counter = count($test); $counter < $fields; $counter += 1) {
                $tmp .= "|";
            }
        }

        $test = explode("|", $tmp);
        if (count($test) != $fields) {
            echo "Error Model No: " . $test[0] . " Number of fields: " . count($test) . "<br>";
        }

        $tmp = str_replace("|", "','", $tmp);

        if (trim($test[0]) != "ITEMCODE" && $counter2 != 0) {
            $query = "INSERT INTO mytable VALUES ('$tmp')";
            $query = @mysql_query($query);
            $added = 1;
        }

        if (($added == 0 && trim($test[0]) != "ITEMCODE") || $query == "") {
            if ($counter2 != 0) {
                $test = explode("','", $tmp);
                echo "<br>ERROR: " . $test[0] . " field count: " . count($test) . "<br><br>";
                for ($counter = 0; $counter < $fields; $counter += 1) {
                    echo $counter . " - " . $test[$counter] . "<br>";
                }
            }
        }

        $counter2 = $counter2 + 1;
    }
    fclose($fs);
}

mysql_close($conn); // $conn is the existing mysql_connect() link
[/code]

Now I just need to figure out how to run cron jobs to get the script running automatically every 2 hours; I don't think I have access to wget on my account... :(
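For the scheduling part, a crontab sketch: if the import script handles the download itself, cron only needs to call PHP; if not, and curl happens to be available where wget isn't, cron can fetch the feed first. All paths, script names, and the URL below are hypothetical placeholders.

```shell
# crontab entry (edit with `crontab -e`): run the import every 2 hours.
# Paths and names below are hypothetical placeholders.
0 */2 * * * /usr/bin/php /home/mysite/import_feed.php >> /home/mysite/import.log 2>&1

# Or, if the script doesn't do the download itself, fetch with curl first:
0 */2 * * * curl -s -o /home/mysite/data/feed.zip http://example.com/feed.zip && /usr/bin/php /home/mysite/import_feed.php
```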

[quote author=letmein link=topic=116535.msg475389#msg475389 date=1164756562]
Thanks all, I will look into the LOAD DATA INFILE thing. I don't know if it can cope without the CSV file being formatted first, though, as some of the lines don't have the remaining commas at the end. Would I have to do something like a loop to add the missing commas when the field count is lower than the required count, or would it work without having to add them?
[/quote]

I'm pretty sure that once it reaches an end of line, it'll go to the next row and just assume the rest of the fields are blank. Like I said, I suspect you won't have to do any PHP editing at all. I think you might not even have to unzip the file for it.

I can't seem to get LOAD DATA INFILE to work; it doesn't even work through phpMyAdmin... I think it has something to do with the hosting plan. I've asked the hosting company but they haven't commented. I've tried every file path I can think of and chmodded the directory and files, and it still won't work. I think the site hosting might be on a different server to MySQL???
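If the web server and the MySQL server really are separate machines, that would explain the failure: plain LOAD DATA INFILE reads from the MySQL server's filesystem, while the LOCAL variant reads the file from the client (web server) side. A sketch, not tested: the table and file names are placeholders, and both the server and the PHP client must have local_infile enabled.

```sql
-- LOCAL makes MySQL read data.csv from the connecting client's side,
-- not from the MySQL server's own disk.
LOAD DATA LOCAL INFILE 'data.csv' INTO TABLE my_table
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
LINES TERMINATED BY '\r\n'
IGNORE 2 LINES;
```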
