Extracting Data


adamjblakey

Hi,

 

How would I go about doing the following?

 

I have a URL like this, e.g. www.website.co.uk/orgs-details.asp?OrgsID=

 

On this page there are Company Name, Contact:, Tel:, E-mail: and Web site: fields.

 

I want to extract these details and add them into a table.

 

How would I do this?

 

This is how the data is shown, if this helps:

 

<table width="385" border="0" cellspacing="0" cellpadding="0"> 
              <tr> 
                <td valign="top"><h1>Company Name</h1> 
                  <table width="100%" border="0" cellspacing="0" cellpadding="3"> 
                    <tr> 
                      <th width="30%"><strong>Contact:</strong></th> 
                      <td width="70%"><strong>Persons Name 
                        </strong> </td> 
                    </tr> 
                     
                    <tr> 
                      <th><strong>Tel:</strong></th> 
                      <td><strong>000 000 000</strong></td> 
                    </tr> 
                    <tr> 
                      <td> </td> 
                      <td><span class="small">Information Here</span>.</td>

                    </tr>
                     
                    <tr> 
                      <th><strong>E-mail:</strong></th> 
                      <td><strong><a href="[email protected]">[email protected]</a></strong></td> 
                    </tr> 
                     
                    <tr> 
                      <th><strong>Web site:</strong></th> 
                      <td><strong><a href="http://www.website.com" target="_blank" id="451" onClick="return trackclick(this.id);" title="Visit Site">www.website.com</a></strong></td> 
                    </tr> 
                     
                  </table>

 

Could something like this, which I have used in the past to extract email addresses, be adapted to work?

 

for ($i = 1; $i < $max_val; $i++) {
    $content = file_get_contents('http://www.website.com/slist.php?item=' . $i);
    preg_match_all($email_match_regex, $content, $matches);
    if (count($matches[0])) {
        foreach ($matches[1] as $index => $value) {
            $insert_id = mysql_query('INSERT INTO.....');
        }
    }
}

 

Cheers,

Adam

Link to comment
https://forums.phpfreaks.com/topic/94568-extracting-data/
Share on other sites

Try:

<?php
$a = '<table width="385" border="0" cellspacing="0" cellpadding="0"> 
              <tr> 
                <td valign="top"><h1>Company Name</h1> 
                  <table width="100%" border="0" cellspacing="0" cellpadding="3"> 
                    <tr> 
                      <th width="30%"><strong>Contact:</strong></th> 
                      <td width="70%"><strong>Persons Name 
                        </strong> </td> 
                    </tr> 
                     
                    <tr> 
                      <th><strong>Tel:</strong></th> 
                      <td><strong>000 000 000</strong></td> 
                    </tr> 
                    <tr> 
                      <td> </td> 
                      <td><span class="small">Information Here</span>.</td>

                    </tr>
                     
                    <tr> 
                      <th><strong>E-mail:</strong></th> 
                      <td><strong><a href="[email protected]">[email protected]</a></strong></td> 
                    </tr> 
                     
                    <tr> 
                      <th><strong>Web site:</strong></th> 
                      <td><strong><a href="http://www.website.com" target="_blank" id="451" onClick="return trackclick(this.id);" title="Visit Site">www.website.com</a></strong></td> 
                    </tr> 
                     
                  </table>';
preg_match_all('/E-mail[^"]+"([^@]+@[^"]+)"/',$a,$b);
print_r($b);
?>
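
The regex above captures only the e-mail address. As an alternative, here is a minimal sketch that reads every labelled row with PHP's DOM extension, so it can be pointed at the whole page rather than a cut-down table. It assumes the page keeps the layout shown in the question (an h1 for the company name, then rows with a th label and a td value). The function name extract_details is made up for illustration.

```php
<?php
// Hypothetical helper (not from the thread): returns the company name plus
// label => value pairs for every row that has a <th> label cell.
function extract_details(string $html): array
{
    if (trim($html) === '') {
        return [];
    }

    $doc = new DOMDocument();
    // Collect parser warnings internally; old table markup is rarely valid.
    libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();

    $xpath = new DOMXPath($doc);
    $details = [];

    // Assumption: the first <h1> on the page holds the company name.
    $h1 = $xpath->query('//h1')->item(0);
    if ($h1 !== null) {
        $details['Company Name'] = trim($h1->textContent);
    }

    // Rows without a <th> (such as the "Information Here" row) are skipped.
    foreach ($xpath->query('//tr[th]') as $row) {
        $label = trim($xpath->evaluate('string(th)', $row));
        $value = trim($xpath->evaluate('string(td)', $row));
        // Drop the trailing colon from the label and collapse whitespace runs.
        $details[rtrim($label, ':')] = preg_replace('/\s+/', ' ', $value);
    }

    return $details;
}
```

Calling extract_details(file_get_contents($url)) on a page shaped like the sample should give an array keyed by Company Name, Contact, Tel, E-mail and Web site. If the real pages differ from the sample, the XPath expressions would need adjusting.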


Thanks for that, but the only problem is that this is just one table from the page, so I would have to strip out everything else to leave just that data.

 

I have come up with the following to do what I need. Can anyone see anything I need to change?

 

<?php

for ($i = 1; $i < 3000; $i++) {

    $data = file_get_contents('http://www.website.com?id=' . $i);
    $data = strip_tags($data, "<tr>");
    $data = explode("<tr>", $data);

    foreach ($data as $index => $value) {

        mysql_query("INSERT INTO `table` (entry, entry2, entry3, entry4, entry5) VALUES ('" . $data[12] . "', '" . $data[13] . "', '" . $data[14] . "', '" . $data[15] . "', '" . $data[16] . "')");

    }

}

?>
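
A few things stand out in the loop above. The foreach never uses $value, so it runs the same INSERT once for every chunk of the page instead of once per page. The scraped text goes into the query unescaped, so a name containing a quote would break it. The mysql_* functions were also removed in PHP 7. Below is a minimal sketch of one insert per page using a PDO prepared statement. The helper name save_entry, the DSN, and the credentials are placeholders, and the offsets 12 to 16 are simply carried over from the post.

```php
<?php
// Hypothetical helper (not from the thread): stores one page's five fields.
// Bound parameters mean quotes in the scraped text cannot break the query.
function save_entry(PDO $db, array $fields): void
{
    // Force exactly five values so the placeholders always match.
    $values = array_pad(array_slice(array_values($fields), 0, 5), 5, '');

    $stmt = $db->prepare(
        'INSERT INTO `table` (entry, entry2, entry3, entry4, entry5)
         VALUES (?, ?, ?, ?, ?)'
    );
    $stmt->execute($values);
}

// Usage sketch; connection details and URL are assumptions:
//
// $db = new PDO('mysql:host=localhost;dbname=mydb', 'user', 'pass');
// $db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
// for ($i = 1; $i < 3000; $i++) {
//     $html = @file_get_contents('http://www.website.com?id=' . $i);
//     if ($html === false) {
//         continue; // skip ids that fail to load
//     }
//     $rows = explode('<tr>', strip_tags($html, '<tr>'));
//     // strip_tags() with an allowed <tr> also keeps </tr>, so clean each chunk.
//     $fields = array_map('trim', array_map('strip_tags', array_slice($rows, 12, 5)));
//     save_entry($db, $fields); // one insert per page
// }
```

Fixed offsets like 12 to 16 will shift if any page has an extra row, so checking a handful of pages by hand before running all 3000 seems worthwhile.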


Archived

This topic is now archived and is closed to further replies.
