bmmayer Posted July 20, 2007

hey all,

i am trying to parse a CSV file which contains elements that have line breaks as well as quotes. Right now, i am using this function:

    function file_breakdown($content, $del) {
        ini_set('auto_detect_line_endings', '1');
        $file = fopen($content, "r");
        $row = 1;
        while (($data = fgetcsv($file, 0, $del)) !== FALSE) {
            //loop content here
        }
        fclose($file);
    }

the problem i'm having is that the function works fine when the cells being imported are normal (no line breaks or quotes), but when it reaches a cell that has a line break, it creates a new "row" instead of staying on the current row. the quotes within the cells confuse the program further. in addition, i cannot treat every line break that is preceded by a double quote (") as the end of a row, because some quotes typed inside the cells are also followed by line breaks.

i need help telling the program to distinguish between the quotes that enclose columns and the quotes that are contained within the cells, and to stop treating a line break within a cell as a new row.

can anyone help me write a function that will take a csv file and return an ACCURATE array, with each value of the array being a $del-delimited line, like the information returned by the file() function? this has been driving me crazy.

thanks for your help,
-b

https://forums.phpfreaks.com/topic/61036-parse-csv-file-with-line-breaks-and-quotes/
jorgep Posted July 20, 2007

Can you post an example with part of the CSV file?
bmmayer Posted July 20, 2007

i can't post exact data for privacy reasons. this is the type of data:

Column 1,Column 2,Column 3,Column 4
Data 1,Data 2,Data 3,"This is a text field into which has been entered lots of data, including line breaks:
like this.
Also, there are quotes like this: "This is a quote, quote, quote."
Notice how the quotation mark preceded a line break? This is the end of column 4."
Data 6,Data 7,Data 8,"Data
Data
Data"
...and so on
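Note on the sample above: the inner quotes are not escaped, so it is not well-formed CSV and fgetcsv() has no reliable way to tell a closing quote from one typed inside the cell. If the exporter can be made to follow the usual convention (RFC 4180 style: wrap the field in double quotes and double every quote inside it), fgetcsv() handles the embedded line breaks on its own. A minimal sketch under that assumption; the sample text and the php://memory stream are only illustrative:

```php
<?php
// Assumption: fields containing commas, quotes or line breaks are wrapped in
// double quotes, and every quote INSIDE such a field is doubled ("").
$csv = "Column 1,Column 2\n"
     . "Data 1,\"Line one\nHe said \"\"quote, quote.\"\"\nEnd of field.\"\n"
     . "Data 2,plain\n";

// Put the sample into an in-memory stream so fgetcsv() can read it.
$fh = fopen('php://memory', 'r+');
fwrite($fh, $csv);
rewind($fh);

$rows = array();
// Length 0 means "no line length limit"; delimiter, enclosure and escape
// are passed explicitly.
while (($data = fgetcsv($fh, 0, ',', '"', '\\')) !== false) {
    $rows[] = $data;
}
fclose($fh);

// $rows[1][1] now holds the whole multi-line cell, with single quotes restored.
print_r($rows);
```

With input like this, the line break inside the quoted cell does not start a new row, because the parser is still inside the enclosure when it sees it.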
jorgep Posted July 20, 2007

That's messed up, hehehe. Let's see... the only solution I can see is using ereg or something similar. Let me think about it and write you later.
bmmayer Posted July 20, 2007

thank you, i look forward to your reply
-b
ss32 Posted July 21, 2007

It's kinda simple really... this is assuming, though, that it is a true CSV with a fixed number of columns, and that you are reading it byte by byte.

First off, you are going to want a variable, $word, which stores the current 'word chunk' that you are parsing. Then set a flag variable, $quoted. When you hit a quotation mark, you negate this variable; that tells you whether you are inside or outside of a quote.

Next, you are going to want a counter for the commas read. When you hit a comma (outside of a quote), store the current word chunk to your array and increment the $commas variable. When $commas reaches the number of columns, reset it and increment your $rows variable.

That's just the idea... it's too long for me to type out here; you should be able to figure it out.

...or you could wait for the regular expressions response to show up, which would save you the time of doing this. (Though literally, you could bang your head on the keyboard and something would work with regex....)
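The steps described above can be sketched roughly as follows. This is only an illustration, not ss32's code: the function name splitByQuoteFlag is made up, it takes the CSV text as a string rather than a file, and it adds two assumptions that are not in the post: a doubled quote ("") inside a quoted field stands for one literal quote, and an unquoted "\n" also ends a field.

```php
<?php
// Hypothetical helper; name and signature are illustrative only.
// $text:    the whole CSV as a string (assumes "\n" line endings)
// $columns: the fixed number of fields per row
function splitByQuoteFlag($text, $columns)
{
    $rows   = array();
    $row    = array();
    $word   = '';
    $quoted = false;
    $length = strlen($text);

    for ($i = 0; $i < $length; $i++) {
        $char = $text[$i];

        if ($char === '"') {
            // Assumption: "" inside a quoted field means one literal quote.
            if ($quoted && $i + 1 < $length && $text[$i + 1] === '"') {
                $word .= '"';
                $i++;
                continue;
            }
            // Otherwise flip the inside/outside flag and drop the quote.
            $quoted = !$quoted;
            continue;
        }

        // Outside quotes, a comma or a line break ends the current field.
        if (!$quoted && ($char === ',' || $char === "\n")) {
            $row[] = $word;
            $word  = '';
            if (count($row) >= $columns) {
                $rows[] = $row;
                $row    = array();
            }
            continue;
        }

        $word .= $char;
    }

    // Flush the last field when the input does not end with a delimiter.
    if ($word !== '' || count($row) > 0) {
        $row[]  = $word;
        $rows[] = $row;
    }

    return $rows;
}

print_r(splitByQuoteFlag("a,b,\"x, \"\"y\"\"\nz\"\nc,d,e", 3));
```

Keep in mind that a plain on/off flag only works when the quotes are balanced around whole fields. Unescaped quotes inside a cell flip the flag in the middle of the cell, so any commas that follow are treated as column separators.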
clearstatcache Posted July 21, 2007

Are you going to return the data for column 4 exactly as it is written? For your example, like this?

"This is a text field into which has been entered lots of data, including line breaks:
like this.
Also, there are quotes like this: "This is a quote, quote, quote."
Notice how the quotation mark preceded a line break? This is the end of column 4."
bmmayer Posted July 21, 2007

clearstatcache wrote: "are u going to return the data for column 4 exactly as how it is written?"

yes, that's correct.

as for ss32's suggestion (the $word / $quoted / $commas approach): it would be great if you could write something out for me; i have tried something like this and it didn't really work.

thanks a lot,
-b
keeB Posted July 21, 2007

I'll write it for $20
bmmayer Posted July 21, 2007

free would be better, thanks
Barand Posted July 21, 2007

Why are you creating a CSV file where the cell data contains the same characters that are used as field and row delimiters in the first place?
keeB Posted July 21, 2007

I hope you realize no one is going to write a CSV parser for you for nothing. We're all happy to help if you have specific problems, but if you want an implementation piece to be done from scratch, be willing to pay for it.
ss32 Posted July 21, 2007

This is rough, though it should work. There may be a couple of bugs, and the limitation is that every value must have a comma after it; newlines don't mean a new line in the CSV. So I guess it is not truly a CSV parser, but you can modify it for your needs.

<?php
function csvToArray($csvFile, $linelen) {
    if (($contents = file_get_contents($csvFile)) === false) {
        return false;
    }

    $result = array();
    $tarray = array();   //temporary array holding the fields of the current row
    $quoted = false;
    $word = "";

    for ($i = 0; $i < strlen($contents); $i++) {
        //get the current character
        $char = substr($contents, $i, 1);
        //debug output, left disabled:
        //var_dump($quoted);
        //echo count($tarray) . "\r\n";

        //check for the start/end of a quoted section
        if ($char == '"') {
            $quoted = !$quoted;
        }

        //if we are not in quote mode...
        if ($quoted == false) {
            //check for commas
            if ($char == ',') {
                //print_r($tarray);
                $tarray[] = $word;
                $word = "";
                //once the row holds $linelen fields, add the temporary array to the result
                if (count($tarray) >= $linelen) {
                    $result[] = $tarray;
                    $tarray = array(); //reset the temporary array
                }
            }
        }

        //keep every character except quotes and unquoted commas
        if ($char != '"') {
            if ($char != ',' || $quoted) {
                $word .= $char;
            }
        }
    }

    return $result;
}
?>
ss32 Posted July 22, 2007

Although, you could try a different approach. (Where is my edit button?!)

When you store the data, store the line break character as a different character, and then translate it back when you read it. That way it is a true CSV file, and you don't need a complex script to parse it.
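A minimal sketch of that idea, assuming you control the code that writes the file. The '<br>' placeholder and the helper names writeFlatRow / readFlatRow are made up for illustration, and the approach only works if the placeholder never appears in the real data:

```php
<?php
// Assumption: this token never occurs in the actual cell contents.
define('NEWLINE_TOKEN', '<br>');

// Hypothetical helper: write one row with every line break replaced.
function writeFlatRow($handle, array $fields)
{
    $flat = array();
    foreach ($fields as $field) {
        $flat[] = str_replace(array("\r\n", "\r", "\n"), NEWLINE_TOKEN, $field);
    }
    // fputcsv() adds the enclosing quotes and doubles any quotes inside a field.
    return fputcsv($handle, $flat, ',', '"', '\\');
}

// Hypothetical helper: read one row and restore the line breaks.
function readFlatRow($handle)
{
    $fields = fgetcsv($handle, 0, ',', '"', '\\');
    if ($fields === false) {
        return false;
    }
    foreach ($fields as $i => $field) {
        $fields[$i] = str_replace(NEWLINE_TOKEN, "\n", $field);
    }
    return $fields;
}

// Round trip through an in-memory stream.
$fh = fopen('php://memory', 'r+');
writeFlatRow($fh, array('Data 1', "first line\nsecond \"quoted\" line"));
rewind($fh);
$row = readFlatRow($fh);
fclose($fh);

print_r($row);
```

Because each record then sits on one physical line, even line-based tools such as file() see one row per line; the quotes are taken care of by fputcsv() and fgetcsv().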
sasa Posted July 23, 2007

try

<?php
//note: $csvFile is the CSV text itself here, not a file name
function csvToArray($csvFile, $linelen) {
    if (($contents = $csvFile) === false) {
        return false;
    }

    $fi_co = 0;          //running field counter
    $result = array();
    $tarray = array();

    while ($contents) {
        $word = "";
        //the last field of a row ends at a newline, all others at a comma
        $delim = (++$fi_co % $linelen) ? ',' : "\n";
        $pos = -1;
        do {
            if (($pos = strpos($contents, $delim, ++$pos)) === false) $pos = strlen($contents);
            $word = substr($contents, 0, $pos);
            //an odd number of quotes means the delimiter sits inside a quoted cell
            $x = substr_count($word, '"') % 2;
        } while ($x && $pos < strlen($contents)); //stop at end of input if quotes never balance

        if (($fi_co % $linelen) == 1) $tarray = array($word);
        else $tarray[] = $word;
        if ($fi_co % $linelen == 0) $result[] = $tarray;

        $contents = substr($contents, $pos + 1);
    }

    //keep a trailing, incomplete row
    if ($fi_co % $linelen != 0) $result[] = $tarray;
    return $result;
}

// parse CSV file with line breaks and quotes
$a = 'Column 1,Column 2,Column 3,Column 4
Data 1,Data 2,Data 3,"This is a text field into which has been entered lots of data, including line breaks:
like this.
Also, there are quotes like this: "This is a quote, quote, quote."
Notice how the quotation mark preceded a line break? This is the end of column 4."
Data 6,Data 7,"Data "sasa" 8","Data
Data
Data"';

$m = csvToArray($a, 4);
print_r($m);
?>