[SOLVED] Pulling data out of web forms


brettpower

My company has charged me with the task of creating a web site that will allow contractors to "copy" and "paste" data into a web form and then submit the data to us.  The contractors will be copying data off of another web site and then pasting the data into a web form that I have created for them.

 

Is there a way in PHP to have a phone number extracted out of the pasted data and populated into its own field in the database when the form is submitted?  The format of the phone number will be (xxx)xxx-xxxx .  My goal is to have a table with a phone number and details (the pasted data). 

 

Would it also be possible to have the first name, last name, and phone number all automatically picked out of the text area on submission?  There are two instances of both first and last name.  We would want to only pull the first instance of first name and last name.

 

 

Hi,

 

This sounds like a tricky problem. The easiest part is the extraction of the phone number, as this can be done with a relatively simple regular expression, as long as the format of the phone number doesn't vary at all from how you expect it. The following example will extract a phone number of that format:

 

<?php
$text = "My phone number is (123)456-7890. Please dial carefully.";

$phone = preg_match ("/\([0-9]{3}\)[0-9]{3}\-[0-9]{4}/", $text, $matches);
if ($phone == 1)
{
   echo "Matched phone number: " . $matches[0] . "<br/>";
}
?>

 

If the contractors paste the phone numbers in differently (leaving out the parentheses, or adding extra whitespace), the above code will fail to find a match. You'll then have to loosen the pattern to allow for those variations.
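
A looser pattern could look something like the sketch below. It assumes PHP 7.1 or later, and `findPhone` is a hypothetical helper, not part of the original answer. The lookarounds `(?<!\d)` and `(?!\d)` are there so that a run of digits inside a longer number, such as an account number, is not mistaken for a phone number. Because `\s` also matches newlines, a number split across lines would still be accepted, which may or may not be what you want.

```php
<?php
// Hypothetical helper (not from the thread): find the first US-style phone
// number, tolerating missing parentheses and stray spaces, dots or dashes.
function findPhone(string $text): ?string
{
    $pattern = '/(?<!\d)\(?\s*(\d{3})\s*\)?[\s.-]*(\d{3})[\s.-]*(\d{4})(?!\d)/';
    if (preg_match($pattern, $text, $m) !== 1) {
        return null; // nothing that looks like a phone number
    }
    // Normalise to the (xxx)xxx-xxxx format used elsewhere in this thread.
    return sprintf('(%s)%s-%s', $m[1], $m[2], $m[3]);
}

echo findPhone("Call ( 123 ) 456 7890 today"), "\n";
echo findPhone("BTN: 386-257-2026"), "\n";
```

Normalising on the way out means the database column always holds one format, whatever the contractor pasted.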

 

The rest sounds almost impossible. Unless you are imposing a strict format for name details, like a particular form field only containing names and phone number, I don't reckon you could arbitrarily extract name info. Could you post an example of what you expect this info to look like?

 

Cheers,

Darren.

That code worked out great!  I am able to extract out only the phone number now and can search via it.  Thanks.

 

Below is the data in its exact form as copied via Ctrl+A and Ctrl+C.  Somehow I need to get more of the data extracted, but I am not sure how.  Line numbers won't work because every client setup is different.

 

 
  



  
       





  

  
Customer Information

  
Getting Started 
One Time Fees: 
Digital Home Advantage Activation Fee
$49.99

One Time Credits: 
Digital Home Advantage (Retail) $49.99



† One Time Cost: $49.99


Monthly Equipment Fees: 
DISH Network DVR Service Fee $5.98


Monthly Programming: 
America's Top 100 with Local Channels $34.99

Showtime Unlimited $12.99

DISH Home Protection Plan $5.99

Promotional Credits: 
DISH Home Protection Plan $5.99



†‡ Monthly Total $53.96



†Taxes are not included 
  
  




Click here for DISH Network DSL Sales.  
      





Select the appropriate link from the left side to modify this account. 
Status: CANCELLED NEW CONNECT
DISH Bill Date: 00
BTN: (386)257-2026 Customer Code: N/A 
Can be Reached Number: (360)555-5555
DISH Account Number: XXXXX09557587562

Equipment  
No Equipment Available. 
  

Service Address
Last Name: SCHMO
First Name: DARRELL
Address Line 1: 5162 ANYWHERE STREET
Address Line 2: 
City, State, ZIP: ANYTOWN, USA 98277-9607

Current Services  
.D DHA COMMIT
+9 PREMIUMPICK
?? SALE PARTNER
AA RECEIVER ACTIVATION
D- DISHPRO TWIN LNBF
}{ RETAIL DHA
}} DISH HOME PROTECTION
K: DHA18
QN ACTIVATION FEE
T( INSTALL
T$ DISH NETWORK SYSTEM
XD DISH500KIT
ZH DISH 500
Z8 110W ORBIT
2U 119W ORBIT
4W 2ND TUNR INS
$= DHA LEASED RECEIVER
AB AMERICA'S TOP 100
A6 SEATTLE WA LOCALS
*1 DISH NETWORK DVR SERVICE
D2 SHOWTIME UNLIMITED
  

Billing Address
Last Name: SCHMO
First Name: DARRELL
Address Line 1: 5162 ANYWHERE STREET
Address Line 2: 
City, State, ZIP: ANYTOWN, USA 98277-9607

Note History  
06/22/07-18:34 MST ***CMO-REFUNDS*** CONF CC REFUND FOR $49.99 WAS SENT ON BATCH #VP025607A PLEASE ALLOW   BUSINESS DAYS FOR BANK TO PROCESS

06/20/07-14:34 MST REFUND SUBMITTED DUE TO CANCELLED NEW CONNECT WORK ORDER

06/18/07-15:37 MST CANCEL WORK ORDER XXXXXX20200011003 VIA E*CONNECT CUST REQUEST TOCANCEL W/O NEEDS TO GET LANDLORD PERMISSION WILL CB IF ABLE TO GET IT AND REBUILD W/O no problem-KENT EC352380

06/14/07-18:34 MST PAYMENT OF $49.99 POSTED TO ACCOUNT ON (DEBIT) CARD:6XX4.AUTHORIZATION NUMBER IS 222851.BATCH NUMBER IS:E1215

06/14/07-18:34 MST EMAN161 - PARTNER WEB DHA SALE

  




  




©2004 EchoStar Satellite, LLC. All Rights Reserved.  

" 

Is there a way to pull the second phone number as well?

Hi,

 

To extract all phone numbers, just replace preg_match with preg_match_all, then access each one from the $matches array. See http://uk3.php.net/manual/en/function.preg-match-all.php for more examples.
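
As a minimal sketch of that change (the sample text is lifted from the paste earlier in the thread; the variable names are only illustrative):

```php
<?php
// Sample text taken from the pasted account page earlier in the thread.
$details = "BTN: (386)257-2026 Customer Code: N/A\n"
         . "Can be Reached Number: (360)555-5555\n";

// preg_match_all returns the number of matches (or false on error).
$count = preg_match_all('/\([0-9]{3}\)[0-9]{3}-[0-9]{4}/', $details, $matches);

// With the default flags, $matches[0] is the list of full matches.
$phones = $count ? $matches[0] : [];

foreach ($phones as $i => $phone) {
    echo "Phone " . ($i + 1) . ": " . $phone . "\n";
}
```

Note that the matches sit one level down, in `$matches[0]`, rather than directly in `$matches`.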

 

As for extracting more info, based on the extract you supplied, there is quite a bit of scope for interpreting data. It all depends on how much the data presented to you will vary. For the data presented in the extract, here is a simple bit of code that will extract the first and last names from the service address section.

 

<?php
/**
* Set up the info. This is an extract of what would be posted in the form.
* Lines are separated by the newline character, as they would be if typed or
* pasted in.
*/
$info = "... blah ...\n";
$info .= "\n";
$info .= "Service Address\n";
$info .= "Last Name: SCHMO\n";
$info .= "First Name: DARRELL\n";
$info .= "Address Line 1: 5162 ANYWHERE STREET\n";

/* Split the info string into an array of strings, one for each line */
$lines = explode ("\n", $info);
$currLine = 0;
$totalLines = count ($lines);

/* Iterate over each line, extracting the interesting stuff. Here, we'll just
* extract the service address, but we could do the same for any other sections
* such as "Billing Address" or "Current Services". I have separated the finer
* grained processing into functions to make the main loop easier to read.
*/
while ($currLine < $totalLines)
{
   /* trim() also removes the "\r" that browsers add to each pasted line */
   $line = trim ($lines[$currLine++]);
   if ($line == "Service Address")
   {
      getServiceAddress ($firstName, $lastName);
      echo "First name = " . $firstName . "<br/>";
      echo "Last name = " . $lastName . "<br/>";
   }
}

/**
* This function extracts the first and last name from the service address 
* block, and returns those values to the caller. This function is called
* when the line containing "Service Address" has been encountered.
*/
function getServiceAddress (&$firstName, &$lastName)
{
   /* Need access to the info lines, where we currently are and how many
    * lines there are in total. This info is shared with the main code so
    * the current location can be updated.
    */
   global $lines, $currLine, $totalLines;
   
   while ($currLine < $totalLines)
   {
      $line = trim ($lines[$currLine]);
      /* Search for the bits we're interested in. Here we're just using substr
       * instead of the more expensive preg_match. Less expressive, but should
       * be fine if the format is this strict.
       */
      if (substr ($line, 0, 11) == "First Name:")
      {
         /* The first name is just the last part of the line. */
         $firstName = trim (substr ($line, 12));
      }
      /* Do the same for last name */
      else if (substr ($line, 0, 10) == "Last Name:")
      {
         $lastName = trim (substr ($line, 11));
      }
      else
      {
         /* We're not interested in anything else in the block, so just return
          * to the main section.
          */
         break;
      }
      $currLine ++;
   }
}
?>

 

There are some limitations with this code - the getServiceAddress function isn't great - it won't handle the case where the 2 bits of info it wants are separated by something it doesn't like. There's no support for handling missing info. However, with a fairly well-defined structure to the data, this approach could be honed to pick out most of what you're looking for.
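
One possible way to hone it is sketched below, assuming PHP 7.1 or later. `parseSections` is a hypothetical helper, and the list of section headings is an assumption based on the sample paste. It collects every "Label: value" line under a known heading, in whatever order the lines appear, and leaves missing fields simply absent from the result.

```php
<?php
// Hypothetical helper: collect "Label: value" pairs under each section
// heading. $sections lists the headings we care about (an assumption based
// on the sample paste; adjust to the real data).
function parseSections(string $info, array $sections): array
{
    $result  = [];
    $current = null;
    // Split on \r\n, \r or \n so browser-submitted text works too.
    foreach (preg_split('/\r\n|\r|\n/', $info) as $line) {
        $line = trim($line);
        if (in_array($line, $sections, true)) {
            $current = $line;          // entering a known section
            $result[$current] = [];
        } elseif ($current !== null && strpos($line, ':') !== false) {
            // Split only on the first colon; the value may be empty.
            list($label, $value) = explode(':', $line, 2);
            $result[$current][trim($label)] = trim($value);
        } elseif ($line !== '') {
            $current = null;           // some other heading or free text
        }
    }
    return $result;
}

$info = "Service Address\r\nLast Name: SCHMO\r\nFirst Name: DARRELL\r\n"
      . "Address Line 2: \r\n\r\nCurrent Services\r\nK: DHA18\r\n"
      . "Billing Address\r\nLast Name: SCHMO\r\n";
$data = parseSections($info, ['Service Address', 'Billing Address']);
echo $data['Service Address']['First Name'] ?? '(missing)', "\n";
```

Because the result is keyed by section, the caller can tell a service address name from a billing address name, and `??` gives a fallback when a field was not in the paste.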

 

Regards,

Darren.

 

Thanks for your help, recklessgeneral!  I have been reading over tutorials related to preg_match_all for a bit now so I should have some good results soon.

 

As far as the code goes that extracts out the names and such, where exactly does that code go?  It has to extract out of the details text area.  I am going to try and figure it out.  I will reply back if I get stuck.

 

Thanks again.

I got the first instance of First Name to pull, however, I can't get the first instance of Last Name to pull.  My code is below.  Anyone see any mistakes?

 

<?PHP
//BEGIN FIRST LAST NAME EXTRACTION
$details = $_POST['details'];
$lname = preg_match("/\Last Name: [a-zA-Z]{1,300}/", $details, $matches);
if ($lname == 1)
{
   $matches[0];
}

$lnamedata = $matches[0];
$lname = $lnamedata; 
$_POST['lastname'] = $lnamedata;
///END FIRST LAST NAME EXTRACTION

//BEGIN FIRST NAME EXTRACTION
$details = $_POST['details'];
$fname = preg_match("/\First Name: [a-zA-Z]{1,30}/", $details, $matches);
if ($fname == 1)
{
   $matches[0];
}

$fnamedata = $matches[0];
$fname = $fnamedata; 
$_POST['fname'] = $fnamedata;
//END FIRST NAME EXTRACTION

//BEGIN FIRST PHONE NUMBER EXTRACTION
$phone = preg_match("/\([0-9]{3}\)[0-9]{3}\-[0-9]{4}/", $details, $matches);
if ($phone == 1)
{
   $matches[0];
}

$data = $matches[0];
$phone1 = $data; 
$_POST['phone'] = $data;
//END FIRST PHONE NUMBER EXTRACTION
?>

Hi,

 

It looks like you have an error in your patterns for last name and first name (although the first name pattern isn't fatal). Remove the first backslash from the patterns, and that should get you further:

 

$lname = preg_match("/Last Name: [a-zA-Z]{1,300}/", $details, $matches);

 

The backslash is an escape character. I used it in the regular expression for phone numbers because I wanted to match literal parentheses, which are otherwise interpreted as grouping syntax. '\L' also has a special meaning: it is a Perl case-conversion escape that PCRE does not support, so the pattern fails to compile. '\F' has no such meaning, which is why the first name pattern wasn't fatal.

 

The code snippet I provided was meant as an alternative to using regular expressions all over the place. Basically, if you got rid of my $info assignments at the top and replaced them with

$info = $_POST['details'];

the rest of the code could replace the approach you provided. It's just an alternative way of extracting information. The way I presented allows the data to be extracted more contextually, so you know whether a particular name belongs to a service address or a billing address, for example.

 

A couple of points about your code:

1. Your if statements following your preg_match don't do anything. What is the intention of that?

2. There seems to be a lot of unnecessary reassignment of variables - $lnamedata, $lname and $_POST['lastname'] are all assigned to $matches[0] (which may not exist, as it is accessed even if nothing matched).
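
Both points can be addressed with one small wrapper. This is only a sketch, assuming PHP 7.1 or later; `firstMatch` is a hypothetical name, and the `Fax:` pattern is there purely to show the no-match case.

```php
<?php
// Hypothetical helper: return the first full match of $pattern, or null
// when nothing matches, so $matches[0] is never read unchecked.
function firstMatch(string $pattern, string $text): ?string
{
    return preg_match($pattern, $text, $matches) === 1 ? $matches[0] : null;
}

// Stand-in for $_POST['details'] so the sketch runs on its own.
$details = "Last Name: SCHMO\nFirst Name: DARRELL\nBTN: (386)257-2026\n";

$phone = firstMatch('/\([0-9]{3}\)[0-9]{3}-[0-9]{4}/', $details);
$fax   = firstMatch('/Fax: [0-9-]+/', $details); // label absent from the text

echo $phone ?? 'no phone found', "\n";
echo $fax ?? 'no fax found', "\n";
```

Each value is assigned once, and the caller decides what to do with a null.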

 

Regards,

Darren.

The if statements on the First and last name portions of the code were simply an error on my part as were the triple reassignments.

 

Here is what I have now.  I have tried searching the web for a way to have this code find the "First Name:" and "Last Name:" patterns but then strip them out at the end of the query so they are not populated into fname and lname in the database (although I don't want them stripped out of $details).  Is there a way to do this? 

 

 

<?PHP
//BEGIN FIRST NAME EXTRACTION
$details = $_POST['details'];
$fname = preg_match("/\First Name: [a-zA-Z]{1,30}/", $details, $matches);
$_POST['fname'] = $matches[0];
//END FIRST NAME EXTRACTION

//BEGIN LAST NAME EXTRACTION
$details = $_POST['details'];
$lname = preg_match("/Last Name: [a-zA-Z]{1,30}/", $details, $matches);
$_POST['lastname'] = $matches[0];
///END LAST NAME EXTRACTION

//BEGIN PHONE NUMBER EXTRACTION
$phone = preg_match("/\([0-9]{3}\)[0-9]{3}\-[0-9]{4}/", $details, $matches);
if ($phone == 1)
{
   $matches[0];
}
$_POST['phone'] = $matches[0];
//END PHONE NUMBER EXTRACTION
?>
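
One way to keep the label out of the stored value is a capture group: the pattern still matches "First Name:", but only the part inside the parentheses is returned. A minimal sketch, assuming PHP 7.1 or later; `valueAfterLabel` is a hypothetical helper, not something from this thread:

```php
<?php
// Stand-in for $_POST['details'].
$details = "Service Address\nLast Name: SCHMO\nFirst Name: DARRELL\n";

// Hypothetical helper: the parentheses capture only the value, so the
// label is matched but not returned. $details itself is left untouched.
function valueAfterLabel(string $label, string $text): ?string
{
    $pattern = '/' . preg_quote($label, '/') . ':[ \t]*([A-Za-z]{1,30})/';
    return preg_match($pattern, $text, $m) === 1 ? $m[1] : null;
}

$fname = valueAfterLabel('First Name', $details);
$lname = valueAfterLabel('Last Name', $details);
echo $fname, ' ', $lname, "\n";
```

`$m[0]` would still contain the label; `$m[1]` is the first captured group.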

Still working on stripping out the extra text, however, I was able to pull both phone numbers.

 

Getting closer!

 

//BEGIN PHONE NUMBER EXTRACTION
$phone = preg_match_all("/\([0-9]{3}\)[0-9]{3}\-[0-9]{4}/", $details, $matches);
foreach ($matches as $val) {
  }
$_POST['phone'] = $val[0];
$_POST['phone2'] = $val[1];
//END PHONE NUMBER EXTRACTION

Things are moving along, but a little slower now.  For some reason preg_match_all returns the same values for both first and last names as well as both address line 1s, address line 2s and address line 3s.

 

Also, I am in a little deep on the preg_match_all.  How do I modify my code so that it pulls both numbers and letters for the address lines?

 

Here is what I have.

 

 

//BEGIN FIRST NAME EXTRACTION
$details = $_POST['details'];
$fname = preg_match_all("/\First Name: [a-zA-Z]{1,75}/", $details, $matches);
foreach ($matches as $val) {
  }
$_POST['fname'] = $val[0];
$_POST['fname2'] = $val[1];
//END FIRST NAME EXTRACTION

//BEGIN LAST NAME EXTRACTION
$details = $_POST['details'];
$lname = preg_match_all("/\Last Name: [a-zA-Z]{1,75}/", $details, $matches);
foreach ($matches as $val) {
  }
$_POST['lname'] = $val[0];
$_POST['lname2'] = $val[1];
///END FIRST LAST NAME EXTRACTION

//BEGIN ADDRESS LINE 1
$details = $_POST['details'];
$add1 = preg_match_all("/\Address Line 1:{1,76}/", $details, $matches);
foreach ($matches as $val) {
  }
$_POST['bill_add1'] = $val[0];
$_POST['svc_add1'] = $val[1];
//END ADDRESS LINE 1

//BEGIN ADDRESS LINE 2
$details = $_POST['details'];
$add2 = preg_match_all("/\Address Line 2:{1,76}/", $details, $matches);
foreach ($matches as $val) {
  }
$_POST['bill_add2'] = $val[0];
$_POST['svc_add2'] = $val[1];
//END ADDRESS LINE 2

//BEGIN ADDRESS LINE 3
$details = $_POST['details'];
$add3 = preg_match("/\City, State, ZIP:{1,76}/", $details, $matches);
foreach ($matches as $val) {
  }
$_POST['bill_add3'] = $val[0];
$_POST['svc_add3'] = $val[1];
//END ADDRESS LINE 3

//BEGIN PHONE NUMBER EXTRACTION
$phone = preg_match_all("/\([0-9]{3}\)[0-9]{3}\-[0-9]{4}/", $details, $matches);
foreach ($matches as $val) {
  }
$_POST['phone'] = $val[0];
$_POST['phone2'] = $val[1];
//END PHONE NUMBER EXTRACTION
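
A few things in the code above likely explain the repeated values. The leading backslashes change the patterns: `\L` is not supported by PCRE and `\A` anchors at the start of the text, so those patterns do not match what was intended. `:{1,76}` repeats the colon rather than matching the text after it. And an empty `foreach` only leaves `$val` holding whatever it was last set to, possibly by an earlier block. The sketch below is one way around this, assuming PHP 7.1 or later; `allValues` is a hypothetical helper and the second address is invented so the two results can be told apart.

```php
<?php
// Stand-in for $_POST['details']; "PO BOX 12" is made-up sample data.
$details = "Service Address\nLast Name: SCHMO\nFirst Name: DARRELL\n"
         . "Address Line 1: 5162 ANYWHERE STREET\nAddress Line 2: \n"
         . "Billing Address\nLast Name: SCHMO\nFirst Name: DARRELL\n"
         . "Address Line 1: PO BOX 12\nAddress Line 2: \n";

// Hypothetical helper: every value following "$label:" up to end of line.
// [^\r\n]* accepts letters, digits and spaces, and may match nothing.
function allValues(string $label, string $text): array
{
    $pattern = '/' . preg_quote($label, '/') . ':[ \t]*([^\r\n]*)/';
    preg_match_all($pattern, $text, $m);
    return array_map('trim', $m[1]); // $m[1] holds the captured values
}

$add1 = allValues('Address Line 1', $details);
$svc_add1  = $add1[0] ?? null; // service address appears first in the paste
$bill_add1 = $add1[1] ?? null; // billing address appears second
echo $svc_add1, " | ", $bill_add1, "\n";
```

In the sample paste the service address comes before the billing address, so index 0 is the service value.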

Using strpos() might be easier

 

$text = $_POST['text'];
$find = array(
    'Reached Number:',
    'Account Number:',
    'Last Name:',
    'First Name:',
    'Address Line 1:',
    'Address Line 2:',
    'City, State, ZIP:',
    'Last Name:',
    'First Name:',
    'Address Line 1:',
    'Address Line 2:',
    'City, State, ZIP:');
$offset = 0;

foreach($find as $val) {
    $pos1 = strpos($text, $val, $offset) + strlen($val);
    $pos2 = strpos($text, "\n", $pos1);
    echo '<br />'.$val.' '.substr($text, $pos1 , ($pos2-$pos1) );
    $offset = $pos2;
}

 

This gets the "can be reached" phone number, the account number, and the service and billing addresses.
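
One caveat: `strpos()` returns false when a label is missing, and the loop above would then carry on from the wrong position. A guarded variant might look like this sketch (PHP 7.1 or later assumed; `valuesInOrder` is a hypothetical name):

```php
<?php
// Hypothetical helper: walk through $text once, taking the rest of the line
// after each label in order. A label that is not found yields null and does
// not move the search position.
function valuesInOrder(string $text, array $labels): array
{
    $values = [];
    $offset = 0;
    foreach ($labels as $label) {
        $start = strpos($text, $label, $offset);
        if ($start === false) {       // strpos returns false, not -1
            $values[] = null;
            continue;
        }
        $start += strlen($label);
        $end = strpos($text, "\n", $start);
        if ($end === false) {
            $end = strlen($text);     // last line without trailing newline
        }
        $values[] = trim(substr($text, $start, $end - $start));
        $offset = $end;
    }
    return $values;
}

$text = "Can be Reached Number: (360)555-5555\nLast Name: SCHMO\nFirst Name: DARRELL";
print_r(valuesInOrder($text, ['Reached Number:', 'Account Number:', 'Last Name:', 'First Name:']));
```

The values come back in the same order as the labels, so repeated labels such as the two "Last Name:" entries map to the service and billing sections in turn.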
