Jump to content

importing a csv file with complications


Go to solution Solved by Ch0cu3r,

Recommended Posts

I have a csv file that i'm trying to import.  I've looked at a few tutorials, but so far, nothing really seems to help me with these issues.

 

Issue 2 - getting the start and stop date into the csv array.  I was simply merging the two, but ran into other problems due to how i was exploding the content.  I had this solved and working great using sscanf and a switch.

 

list($Y, $m) = sscanf($filen, '%s %s - %s');

    switch($m) { // i have 1 for each month, just not listed here, to long
        case 'Jan':
            $start = date('Y-m-d', strtotime($Y.'-01-01'));
            $end =  date('Y-m-d', strtotime($Y.'-01-31'));
            break;
    }

 

Issue 3 - changing string numbers to int numbers.  I have 2 that are saved like "4,855" and 1 that's saved liked $0.05.  The datatype for regular numbers are simple int, but the last one is float.  I had fixed those issues before by changing the strings to numbers using (int) and floatval(),

 

Issue 4 - doing the actual insert using 1 query.  I'm hoping there's a way to do just 1 query instead of first inserting the dates and then running an update query with the additional info.

 

The old script i had was exploding the contents on the , but thanks to how the file was saved, the numbers i mentioned above were messing things up horribly, and some one put tags in the title column for some entries.  So, doing it that way is out unless there's some way of selectively exploding content.  I had the other issues fixed too.

 

 

I'm hoping someone has some ideas on how i can get this thing working.

Link to comment
https://forums.phpfreaks.com/topic/288397-importing-a-csv-file-with-complications/
Share on other sites

No need for the switch for each month. For example

$filenames = array(
        '2014 Jan - yadayada.csv',
        '2014 Feb - yadayada.csv',
        '2014 Mar - yadayada.csv',
        '2014 Apr - yadayada.csv'
    );

printf ("%-30s | %-15s | %-15s\n\n", "Filename", "Start Date", "End Date");
foreach ($filenames as $filen) {    
    list($Y, $m) = sscanf($filen, '%s %s - %s');
    $start = date('Y-m-d', strtotime("01 $m $Y"));
    $end = date('Y-m-t', strtotime("01 $m $Y"));
    printf ("%-30s | %-15s | %-15s\n", $filen, $start, $end);

}

Gives

Filename                       | Start Date      | End Date       

2014 Jan - yadayada.csv        | 2014-01-01      | 2014-01-31     
2014 Feb - yadayada.csv        | 2014-02-01      | 2014-02-28     
2014 Mar - yadayada.csv        | 2014-03-01      | 2014-03-31     
2014 Apr - yadayada.csv        | 2014-04-01      | 2014-04-30 

As for your other problems, intval and floatval will only convert as far as the first non-numeric character

echo intval('4,567');    // ==> 4
echo floatval('$5.999'); // ==> 0

so you need to strip out the '$' and the commas (str_replace).

 

Could help more if I knew what your input looks like and what you are trying to insert into

 

Edited by Barand

I have to keep the start and end dates dynamic so i don't have to create an array.  I don't know that i'll be there another year and the person taking my spot may not know to go into the code and change file names in the array.  This way, i don't have to worry about it

 

This is what i'm using to get rid of the $

floatval(ereg_replace("[^-0-9\.]","", $arr[13]));

 

I know ereg is depreciated, but i don't know the preg_replace equivalent.  I tried str_replace and it didn't seem to work.

 

Sample inputs that would give this script issues

 

Stuart Street Timelapse 28/05/2013,http://www.youtube.com/watch?v=8WTelFJgNrQ,j2yanlol,,Limbo (7th Hour Instrumental)-Noah Becker,USA2P1265247,Marta Krupa,VG Group,"Dance,electric,pop,club,great,female,zedd,dea",0,2,2,Sound Recording,$0.01

 

NIGHT CLUB GROOV3 - DANCE FITNESS WORKOUT - WITH BENJAMIN ALLEN,http://www.youtube.com/watch?v=#NAME?,dancesweatlive,,Dance Sweat Live,USQY51161867,,VG Group,,30,"4,814","4,844",Composition,$6.44

 

It would have an issue with both of those because of what's inbetween the quotes.  In the first one, the person put tags where the video title is supposed to be and with the 2nd one, the column is formatted

 

The first one would get split up like (this is after i combined the dates and content arrays, the start and stop dates aren't included in the file contents).

 

Array
(
    [0] => 2014-01-01
    [1] => 2014-01-31
    [2] => '[Devpro Speed Duel] War Gods/Bujin vs Various Decks'
    [3] => 'http://www.youtube.com/watch?v=uZQDiQAlAzo'
    [4] => 'UnlimitedEchoe'
    [5] => ''
    [6] => 'Limbo (7th Hour Instrumental)-Noah Becker'
    [7] => 'USA2P1265247'
    [8] => 'Marta Krupa'
    [9] => 'VG Group'
    [10] => '"Dance'
    [11] => 'electric'
    [12] => 'pop'
    [13] => 'club'
    [14] => 'great'
    [15] => 0
    [16] => 'zedd'
    [17] => 'dea"'
    [18] => '0'
    [19] => '44'
    [20] => '44'
    [21] => 'Sound Recording'
    [22] => '$0.01 '
)

and the 2nd would be

Array
(
    [0] => 2014-01-01
    [1] => 2014-01-31
    [2] => 'NIGHT CLUB GROOV3 - DANCE FITNESS WORKOUT - WITH BENJAMIN ALLEN'
    [3] => 'http://www.youtube.com/watch?v=#NAME?'
    [4] => 'dancesweatlive'
    [5] => ''
    [6] => 'Dance Sweat Live'
    [7] => 'USQY51161867'
    [8] => ''
    [9] => 'VG Group'
    [10] => ''
    [11] => '30'
    [12] => '"4'
    [13] => '814"'
    [14] => '"4'
    [15] => 844
    [16] => 'Composition'
    [17] => '$6.44 '
)

 

In total, there's only supposed to be 15 keys

Edited by richei
  • Solution

 

This is what i'm using to get rid of the $

floatval(ereg_replace("[^-0-9\.]","", $arr[13]));

 

I know ereg is depreciated, but i don't know the preg_replace equivalent.  I tried str_replace and it didn't seem to work.

str_replace should work just fine, example

floatval(str_replace('$', '', $arr[13]));

 

 

It would have an issue with both of those because of what's inbetween the quotes.  In the first one, the person put tags where the video title is supposed to be and with the 2nd one, the column is formatted

If the data is coming from a csv file, then read the data using fgetcsv

ok, using that function, this is what i came up with

 

$row = 1;
if(($handle = fopen($report, "r")) !== FALSE) {
  while(($lines = fgetcsv($handle, 1000, ",")) !== FALSE) {
   $num = count($lines);
   $row++;
   for ($c=0; $c < $num; $c++) {
    $lines[$c] = trim(mysql_real_escape_string($lines[$c]));
    $lines[$c] = "'".$lines[$c]."'";
    $lines[10] = floatval(str_replace(',', '', $lines[10]));
    $lines[11] = floatval(str_replace(',', '', $lines[11]));
    $lines[13] = floatval(str_replace('$', '', $lines[13]));
   }
   $na[0] = $start;
   $na[1] = $end;

   $new = array_merge($na, $lines);
 
   $inst = "INSERT INTO {$companyTbl} (" . implode(', ', $sqlFields) . ") VALUES (".implode(", ", $new).")";
   echo $inst."<br>";
  }
  fclose($handle);
}

The only thing i need to figure out is how to ignore the 1 line.  There's a header row that i need don't need to import.  I tried changing $c to 1 and doing $num-1 in the for loop, but that didn't work.

try

$filen = '2014 Jan - yadayada.csv';

list($Y, $m) = sscanf($filen, '%s %s - %s');
$start = date('Y-m-d', strtotime("01 $m $Y"));
$end = date('Y-m-t', strtotime("01 $m $Y"));

$fp = fopen($filen,'r');
$data = array();
$rec = fgetcsv($fp, 1024); // read and ignore header line
while ($rec = fgetcsv($fp, 1024)) {
    array_unshift($rec, $start, $end);
    $rec[11] = filter_var($rec[11], FILTER_SANITIZE_NUMBER_INT);
    $rec[12] = filter_var($rec[12], FILTER_SANITIZE_NUMBER_INT);
    $rec[14] = filter_var($rec[14], FILTER_SANITIZE_NUMBER_FLOAT,FILTER_FLAG_ALLOW_FRACTION);
    $data[] = $rec;
}
echo '<pre>',print_r($data, true),'</pre>';

Gives

Array
(
    [0] => Array
        (
            [0] => 2014-01-01
            [1] => 2014-01-31
            [2] => Stuart Street Timelapse 28/05/2013
            [3] => http://www.youtube.c...j2yanlol
            [4] => 
            [5] => Limbo (7th Hour Instrumental)-Noah Becker
            [6] => USA2P1265247
            [7] => Marta Krupa
            [8] => VG Group
            [9] => Dance,electric,pop,club,great,female,zedd,dea
            [10] => 0
            [11] => 2
            [12] => 2
            [13] => Sound Recording
            [14] => 0.01
        )

    [1] => Array
        (
            [0] => 2014-01-01
            [1] => 2014-01-31
            [2] => NIGHT CLUB GROOV3 - DANCE FITNESS WORKOUT - WITH BENJAMIN ALLEN
            [3] => http://www.youtube.c...weatlive
            [4] => 
            [5] => Dance Sweat Live
            [6] => USQY51161867
            [7] => 
            [8] => VG Group
            [9] => 
            [10] => 30
            [11] => 4814
            [12] => 4844
            [13] => Composition
            [14] => 6.44
        )

)

those indexes aren't correct though, that's being based off of the combined array.  I got it fixed any working though.  I didn't actually have to tell it what the numbers were, just had to remove the quotes, which i accomplished with an if statement.

 

for ($c=0; $c < $num; $c++) {
    $lines[$c] = trim(mysql_real_escape_string($lines[$c]));
    if(!is_numeric($lines[$c])) {
     $lines[$c] = "'".$lines[$c]."'";
    }
    $lines[10] = str_replace(',', '', $lines[10]);
    $lines[11] = str_replace(',', '', $lines[11]);
    $lines[13] = str_replace('$', '', $lines[13]);
   }

This script is completely working now.  Thanks for all the help :)

This thread is more than a year old. Please don't revive it unless you have something important to add.

Join the conversation

You can post now and register later. If you have an account, sign in now to post with your account.

Guest
Reply to this topic...

×   Pasted as rich text.   Restore formatting

  Only 75 emoji are allowed.

×   Your link has been automatically embedded.   Display as a link instead

×   Your previous content has been restored.   Clear editor

×   You cannot paste images directly. Upload or insert images from URL.

×
×
  • Create New...

Important Information

We have placed cookies on your device to help make this website better. You can adjust your cookie settings, otherwise we'll assume you're okay to continue.