I had a simple regex working that I somehow broke this morning - call it exhaustion.

 

Here's the data I'm scraping:

	<tr> 
		<td><font size="1"><b>Description:</b></font></td>
		<td  width="550">The rich contemporary style of the "Theo" Counter Height Table combines faux marble and a warm finish to create dining room furniture that adds an exciting style to the decor of any home. The thick polyurethane coated faux marble table top perfectly accentuates the warm brown finish flowing over the straight-lined contemporary design of the apron and legs to help create an exceptional dining experience. With the beautiful stitching and button tufting details of the faux leather upholstered bar stools, the "Theo" Counter Height Table is a refreshing addition to any home.</td>
	</tr>

  
	<tr bgcolor="#F8F7E4"> 
		<td><font size="1"><b>Series Features:</b></font></td>
		<td  width="550">Table top made with polyurethane coated print marble. Aprons and legs made from select veneer and solids with a warm brown finish. Chair is upholstered in a brown PVC with accent top stitching. D158-233 bar stool dimension: 18"W x 21"D x 40"H.</td>
	</tr>

       
           
	<tr> 
		<td><font size="1"><b>Printable Page:</b></font></td>
		<td  width="550">
		    <a href="javascript:LoadBrochure('D158')"><b>Click here</b> </a>to download full color page for the<b> Theo </b>series.
	    </td>

	</tr>

	<tr bgcolor="#F8F7E4"> 
	    <td><font size="1"><b>Image Downloads:</b></font></td>
		<td  width="550"><a href="../Downloads/download_results.asp?varSeriesNumber=D158&NAV=fromSeriesDetail"><b>Click here</b></a> for complete image download listing for series <b>D158</b>.</td>					    
	</tr>

 

And here is the code I'm using:

 

$url = "http://xxxxxxxxxxx.html";
$data = file_get_contents($url);

preg_match_all('/<td>.*?width="550">([^"]*)".*?<\/td>.*?width="550">([^"]*)<\/td>/is',$data,$out2);
		if ((isset($out2[1]) && isset($out2[2])) == FALSE) {	 // Let's do some error checking to see if there is data to insert into the database.  If not let's end the script
			die();
		}
$d = array_combine($out2[1], $out2[2]);
foreach($d as $k=>$v){
	echo $k . "<br>" . $v . "<br>";
}// 

 

I had it broken up where it would show:

Description: (description output)

Series Features: (series features output)

Printable Page: (printable page output)

Image Downloads: (image downloads output)

 

I had it working and accidentally overwrote my backup file.  Any help the community could give would be great.

 

It's resulting in no output at all.

 


Here are the errors it's generating:

 

Warning: array_combine() [function.array-combine]: Both parameters should have at least 1 element in /home/xxxxxxxxxxxxxxx/index.php on line 49

 

Warning: Invalid argument supplied for foreach() in /home/xxxxxxxxxxxxxxx/index.php on line 50
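
For anyone landing here with the same warnings: they are consistent with the pattern matching nothing at all. With its default flags, preg_match_all() still fills the output array with one (empty) sub-array per capture group, so an isset() check passes and die() is never reached. Below is a minimal sketch of a stricter guard. The $data string and the pattern are made-up stand-ins chosen so that nothing matches; they are not the real page or the original regex.

```php
<?php
// Hypothetical input that the pattern below cannot match
// (the pattern wants a bare <td>, the input has <td width=...>).
$data = '<td width="550">18"W x 21"D</td>';

// preg_match_all() returns the number of full matches, or false on error.
$count = preg_match_all('/<td>([^"]*)<\/td>/is', $data, $out);

// The group arrays exist even when nothing matched, so isset() is still true.
var_dump(isset($out[1])); // bool(true)
var_dump($count);         // int(0)

if ($count === false || $count === 0) {
    // Stop (or log) here, before array_combine() is handed empty arrays.
    echo "No matches found\n";
} else {
    print_r($out[1]);
}
```

Checking the return value rather than isset() tells "the regex found nothing" apart from "the regex found rows".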

OK - it looks like I made it more complicated than it needs to be.

 

Here is the data I'm trying to scrape. I only need whatever sits between <td  width="550"> and the closing </td>.

 

Here's a sample of the data:

		<tr> 
		<td><font size="1"><b>Description:</b></font></td>
		<td  width="550">The rich contemporary style of the "Theo" Counter Height Table combines faux marble and a warm finish to create dining room furniture that adds an exciting style to the decor of any home. The thick polyurethane coated faux marble table top perfectly accentuates the warm brown finish flowing over the straight-lined contemporary design of the apron and legs to help create an exceptional dining experience. With the beautiful stitching and button tufting details of the faux leather upholstered bar stools, the "Theo" Counter Height Table is a refreshing addition to any home.</td>
	</tr>

  
	<tr bgcolor="#F8F7E4"> 
		<td><font size="1"><b>Series Features:</b></font></td>
		<td  width="550">Table top made with polyurethane coated print marble. Aprons and legs made from select veneer and solids with a warm brown finish. Chair is upholstered in a brown PVC with accent top stitching. D158-233 bar stool dimension: 18"W x 21"D x 40"H.</td>
	</tr>

       
           
	<tr> 
		<td><font size="1"><b>Printable Page:</b></font></td>
		<td  width="550">
		    <a href="javascript:LoadBrochure('D158')"><b>Click here</b> </a>to download full color page for the<b> Theo </b>series.
	    </td>

	</tr>

	<tr bgcolor="#F8F7E4"> 
	    <td><font size="1"><b>Image Downloads:</b></font></td>
		<td  width="550"><a href="../Downloads/download_results.asp?varSeriesNumber=D158&NAV=fromSeriesDetail"><b>Click here</b></a> for complete image download listing for series <b>D158</b>.</td>					    
	</tr>

<?php
$test = '<tr>
		<td><font size="1"><b>Description:</b></font></td>
		<td  width="550">The rich contemporary style of the "Theo" Counter Height Table combines faux marble and a warm finish to create dining room furniture that adds an exciting style to the decor of any home. The thick polyurethane coated faux marble table top perfectly accentuates the warm brown finish flowing over the straight-lined contemporary design of the apron and legs to help create an exceptional dining experience. With the beautiful stitching and button tufting details of the faux leather upholstered bar stools, the "Theo" Counter Height Table is a refreshing addition to any home.</td>
	</tr>


	<tr bgcolor="#F8F7E4">
		<td><font size="1"><b>Series Features:</b></font></td>
		<td  width="550">Table top made with polyurethane coated print marble. Aprons and legs made from select veneer and solids with a warm brown finish. Chair is upholstered in a brown PVC with accent top stitching. D158-233 bar stool dimension: 18"W x 21"D x 40"H.</td>
	</tr>



	<tr>
		<td><font size="1"><b>Printable Page:</b></font></td>
		<td  width="550">
		    <a href="javascript:LoadBrochure(\'D158\')"><b>Click here</b> </a>to download full color page for the<b> Theo </b>series.
	    </td>

	</tr>

	<tr bgcolor="#F8F7E4">
	    <td><font size="1"><b>Image Downloads:</b></font></td>
		<td  width="550"><a href="../Downloads/download_results.asp?varSeriesNumber=D158&NAV=fromSeriesDetail"><b>Click here</b></a> for complete image download listing for series <b>D158</b>.</td>
	</tr>';
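// The "s" modifier lets "." match newlines, so the multi-line "Printable Page" cell is captured too.
preg_match_all('~<td\s+width="550">(.*?)</td>~s', $test, $out);
// $out[1] holds the contents of each width="550" cell, without the surrounding <td> tags.
print_r($out[1]);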
?>
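
To get back the "label: value" layout described in the first post, one option is to capture the bold label and the wide cell in the same pattern. This is a rough sketch, assuming every row keeps the <b>Label:</b> cell followed by a <td  width="550"> cell as in the sample. $html is a shortened stand-in for the page, and the variable names are my own.

```php
<?php
// Shortened stand-in for the scraped page (assumed row layout).
$html = '<tr>
    <td><font size="1"><b>Description:</b></font></td>
    <td  width="550">The "Theo" Counter Height Table.</td>
</tr>
<tr bgcolor="#F8F7E4">
    <td><font size="1"><b>Printable Page:</b></font></td>
    <td  width="550">
        <a href="#"><b>Click here</b> </a>to download.
    </td>
</tr>';

// Group 1: label text inside <b>...</b>; group 2: contents of the wide cell.
// "s" lets "." cross newlines; PREG_SET_ORDER gives one array per matched row.
$pattern = '~<td><font size="1"><b>(.*?)</b></font></td>\s*<td\s+width="550">(.*?)</td>~s';
preg_match_all($pattern, $html, $rows, PREG_SET_ORDER);

// Build a label => value map, trimming stray whitespace around each part.
$result = [];
foreach ($rows as $row) {
    $result[trim($row[1])] = trim($row[2]);
}

foreach ($result as $label => $value) {
    echo $label . "<br>" . $value . "<br>\n";
}
```

Capturing both parts in one pass avoids array_combine() altogether, so labels and values cannot drift out of step if one cell fails to match.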

Many thanks!  Here is the final code I used to loop over the matches and print each one.

 

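			// The "s" modifier lets "." span line breaks inside a cell.
			preg_match_all('~<td\s+width="550">(.*?)</td>~s', $data, $series_description); // Series descriptions
			// Index 1 holds the cell contents; index 0 would include the <td> tags themselves.
			foreach ($series_description[1] as $v) {
				echo "<p align='left'>" . trim($v) . "</p>";
			}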
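
If plain text is wanted rather than the cell's HTML, something along these lines could be run on each captured value. It is a sketch only: clean_cell is a hypothetical helper name, and it assumes the links and bold tags inside a cell can simply be dropped.

```php
<?php
// Hypothetical helper: reduce a captured cell to a single line of plain text.
function clean_cell($cellHtml)
{
    $text = strip_tags($cellHtml);                          // drop <a>, <b>, ...
    $text = html_entity_decode($text, ENT_QUOTES, 'UTF-8'); // e.g. &amp; becomes &
    $text = preg_replace('/\s+/', ' ', $text);              // collapse tabs and newlines
    return trim($text);
}

// Sample cell modeled on the "Printable Page" row.
$cell = "\n\t<a href=\"#\"><b>Click here</b> </a>to download full color page for the<b> Theo </b>series.\n";
echo clean_cell($cell), "\n";
```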
