Jump to content


Photo

Line Delimiter


  • Please log in to reply
8 replies to this topic

#1 RabPHP

RabPHP
  • Members
  • PipPipPip
  • Advanced Member
  • 38 posts

Posted 27 October 2006 - 06:34 PM

Greetings,

I have a tough issue.  I use text files extensively to read data and then enter it into MySQL.  I use semi-colons for field delimination and then typically a new line between records.

Recently I found that I also need to add another field to one of the text files, but this field has hard returns in it.  This causes PHP to read each return in this field as a new line and thus a separate record

I believe I either need to figure out how to specify the record ending delimiter or use text qualifiers.  Does anyone have any information on either of these?

Rab

#2 Psycho

Psycho
  • Moderators
  • Move along, nothing to see here
  • 11,892 posts
  • LocationCanada

Posted 27 October 2006 - 06:41 PM

There are some easy solutions depending upon the type of data this is. If this is data to be displayed you could replace all the line breaks with \n or <br> before you save to the database.

Or, if that would not work for this data, you could still replace the returns with a "key" before inserting into the database and then replace the "key" with hard returns when you read the data back out from the database.
The quality of the responses received is directly proportional to the quality of the question asked.

I do not always test the code I provide, so there may be some syntax errors. In 99% of all cases I found the solution to your problem here: http://www.php.net

#3 RabPHP

RabPHP
  • Members
  • PipPipPip
  • Advanced Member
  • 38 posts

Posted 27 October 2006 - 08:22 PM

The problem is differentiating bewteeen a new line.  Here is some example data.

Joe;Smith;09/07/06;
10/15/06 - Some Entry
10/15/06 - another entry
10/16/06 - Third Entry Log;
Jane;Doe;09/08/08;
10/12/06 - Some Entry
10/12/06 - another entry;

After the first name and lastname there is date.  PHP sees this as the end of the record because of the new line even though the entries after them belong to the same record.  I cannot remove the returns in the last field, because PHP thinks they are new lines, not just a field.

Anyone have an example of how to parse text files with text qualifiers? For example..

"Joe";"Smith";"09/07/06;"
"10/15/06 - Some Entry
10/15/06 - another entry
10/16/06 - Third Entry Log";
"Jane";"Doe";"09/08/08";
"10/12/06 - Some Entry
10/12/06 - another entry";

Help appreciated.  Thanks

Rab

#4 effigy

effigy
  • Staff Alumni
  • Advanced Member
  • 3,600 posts
  • LocationIL

Posted 27 October 2006 - 08:54 PM

Are you using MySQL's LOAD DATA? If so, you can specify the surrounding text delimiters--the double quotes.
Regexp | Unicode Article | Letter Database
/\A(e)?((1)?ff(?:(?:ig)?y)?|f(?:ig)?)\z/

#5 RabPHP

RabPHP
  • Members
  • PipPipPip
  • Advanced Member
  • 38 posts

Posted 27 October 2006 - 09:24 PM

I am pulling the data from a text file and then updating the appropriate records in MySQL.

#6 Psycho

Psycho
  • Moderators
  • Move along, nothing to see here
  • 11,892 posts
  • LocationCanada

Posted 27 October 2006 - 10:59 PM

Using you example data above, this should work:

<?php
$fileAry = file('file.txt');

$fname = "";
$lname = "";
$date = "";
$entries = "";

foreach ($fileAry as $line) {
  if ($fname=="") {
    //This is the first line of a new record
    $pieces = explode(";", $line);
    $fname = $pieces[0];
    $lname = $pieces[1];
    $date = $pieces[2];
  }
  else {
    if (substr($line,strlen($line))!=";") {
      //Not the last line of the record
      $entries .= $line
    } else {
      //last line of the record
      $entries .= substr($line,0,strlen($line)-1)

      //Do the sql statement to insert the values

      //Reset the values
      $fname = "";
      $lname = "";
      $date = "";
      $entries = "";

    }
  }
}
?>

The quality of the responses received is directly proportional to the quality of the question asked.

I do not always test the code I provide, so there may be some syntax errors. In 99% of all cases I found the solution to your problem here: http://www.php.net

#7 RabPHP

RabPHP
  • Members
  • PipPipPip
  • Advanced Member
  • 38 posts

Posted 07 November 2006 - 05:35 PM

I appreciate the help mjdamato.  I've been working with your suggestion for the past couple of days and have not been able to get it to work.  I've changed how I'm exporting the data in those attempts to try and simplify.

Here is a real example of the data now...

121222;10/05/2006 8:37AM - Notes for this loan are in the 1st file- John Doe|121223;11/02/2006 8:41PM - was told would have UW update today by asst--still didn't receive--phoned the UW--she left at 4:00-sent an email to her and the asst asking for update--told we are trying to close next week and needed an uw update on Friday urgently. Will follow up again on Friday. - John Doe
11/01/2006 11:09PM - no updates from UW yet on items sent over--will follow up on Thursday. - John Doe|121224; 11/04/2006 10:14AM Notes for this loan are in the 2nd File file- John Doe|

Here you can see the following...

Records are seperated by PIPE "|"
Fields are seperated by Semi-Colon
Entries in the last field are seperated by a New Line if there are more then 1 entry.

I can explode the records seperated by a | without a problem, but then how do I explode the parts in the record by a semi-colon and then in the last field, by a new line if one is present?

I'm in a pickle, if anyone can help please let me know.

Rab

#8 roopurt18

roopurt18
  • Staff Alumni
  • Advanced Member
  • 3,749 posts
  • LocationCalifornia, southern

Posted 07 November 2006 - 06:51 PM

If the fields that contain the newlines can also contain a semi-colon then you're really in trouble.

I recently had to battle importing dbase memo fields from Visual FoxPro into MySQL text fields.  The solution there was to use more complicated delimiters.  I separated fields with [<>] and records with <[]>.

So:
fld1[<>]fld2[<>]fld3[<>]memo field
that has
hard returns in it<[]>
fld1[<>]fld2[<>]fld3[<>]another memo
field with hard
returns
<[]>

For my import code, I did something like:
<?php
$file = file_get_contents('the/file/path.txt');
if(strlen($file)){
  $Records = explode( '<[]>', $file );
  if(is_array($Records) && count($Records)){
    foreach($Records as $Key => $Record){
      $Fields = explode( '[<>]', $Record );
      if(is_array($Fields) && count($Fields)){
        foreach($Fields as $FieldKey => $FieldVal){
          $Fields[$FieldKey] = trim($FieldVal);
        }
        $Records[$Key] = $Fields;
      }else{
        unset($Records[$Key]);
      }
    }
  }
}
?>

The fun part about that was our FoxPro programmer, who has at least 5 years experience, couldn't figure out how to export the memo fields intact.  I sat down and figured it out in 3 hours on a language I'd never even heard of until I started working here.  Go figure!

Hope that helps you out.
PHP Forms : Part I | Part II

JavaScript: Singleton

http://www.rbredlau.com

#9 Psycho

Psycho
  • Moderators
  • Move along, nothing to see here
  • 11,892 posts
  • LocationCanada

Posted 07 November 2006 - 11:49 PM

This works with your sample data. It just puts the data into an array, but you can do with it what you want.
<?php

//Open the file to be parsed
$filename = "data.txt";
if (!($fp = fopen($filename, "r"))) { die("cannot open ".$filename); }

$cleanData = array();
$currentRecord = "";

//parse the data
while ($line = fgets($fp)){

    $line = str_replace("|","|EOR|",$line);
    $data = explode("|", $line);

    foreach ($data as $value) {

        if ($value!="EOR") {
            if ($currentRecord) { $currentRecord .= ";"; }
            $currentRecord .= $value;
        } else {
            processRecord($currentRecord);
            $currentRecord = "";
	}
    } 
}

echo "<pre>";
print_r($cleanData);
echo "<pre>";


function processRecord($record) {

    global $cleanData;

    $record = explode(";", $record);
    $acctNo = $record[0];
    $cleanData[$acctNo] = array();

    for ($i=1; $i<count($record); $i++) {
        $cleanData[$acctNo][] = $record[$i];
    }
}

?>

The quality of the responses received is directly proportional to the quality of the question asked.

I do not always test the code I provide, so there may be some syntax errors. In 99% of all cases I found the solution to your problem here: http://www.php.net




0 user(s) are reading this topic

0 members, 0 guests, 0 anonymous users