
Data capture from website via PHP


dslider


I have a page (input.php) that will allow a user to upload a CSV file. This file has 5 columns (SKU, Product, Quantity, Retail Price, and Total Retail Price). The CSV upload will only have the SKU number and Quantity filled in.

When the user hits Upload, the import.php page is supposed to go to the site, look up each product by its SKU number, and pull the price and product name (brand and title).

I paid a freelancer to write this code and watched it work on his machine. I can't get it to work on mine (it won't pull the price or product), and he is now unresponsive. Any help would be greatly appreciated!

I added some notes in the code while I was troubleshooting.

 

<?php
ini_set('max_execution_time', 0);
error_reporting(0);
move_uploaded_file($_FILES["file"]["tmp_name"], "upload/". $_FILES["file"]["name"]);
$handle = fopen("upload/". $_FILES["file"]["name"], "r");
$file = '';
$line .= "SKU,Product,Quantity,Retail Price,Total Retail Price";
$file .=  $line . PHP_EOL;
for ($i = 0; $row = fgetcsv($handle ); ++$i) {
    // Do something will $row array
	if($row[0]!="" AND $i>0)
	{
		$line="";
		#echo "<pre>";
		#print_r($row);
		$SKU=$row[0];
		$quantity=$row[2];
		$loop=1;
		do{
			$url = "https://www.homedepot.com/s/".$SKU;
			$ch = curl_init();
			curl_setopt($ch, CURLOPT_URL, $url);
			curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
			curl_setopt($ch, CURLOPT_HEADER, 1);
			$response = curl_exec($ch);
			
			$header_size = curl_getinfo($ch, CURLINFO_HEADER_SIZE);
			$headers = substr($response, 0, $header_size);
			$body = substr($response, $header_size);
			curl_close($ch);
			
			header("Content-Type:text/plain; charset=UTF-8");
			$headers_arr = explode("\r\n", $headers);
			$str=$headers_arr[5];
			$arr=explode(":",$str);
			$check=trim($arr[0]);
			#echo $check; ### remove troubleshooting
			if($check=="location")  # made lowercase so it would get inside the If statement
			{
				#echo "Dustin"; ## remove troubleshooting
				$productPageLink=$headers_arr[5];
				$productPageLink=str_replace("Location:","",$productPageLink);
				#echo $productPageLink; ## troubleshooting -- seems to be getting the links
				$productPageLink=trim($productPageLink);
				$productPageLink=str_replace("http:","https:",$productPageLink);
				#echo $productPageLink; ## troubleshooting -- still seems to have links
				
				$ch = curl_init();	
				#echo $ch; ##troubleshooting -- prints out "resouce id"
				curl_setopt($ch, CURLOPT_URL, $productPageLink);
				#echo $ch; ##troubleshooting -- prints out "resouce id"
				curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
				curl_setopt($ch, CURLOPT_CUSTOMREQUEST, 'GET');		
				curl_setopt($ch, CURLOPT_ENCODING, 'gzip, deflate');		
				$headers = array(); 
				$headers[] = 'User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:70.0) Gecko/20100101 Firefox/70.0';
				$headers[] = 'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8';
				$headers[] = 'Accept-Language: en-US,en;q=0.5';
				$headers[] = 'Upgrade-Insecure-Requests: 1';
				$headers[] = 'Connection: keep-alive';
				$headers[] = 'Te: Trailers';
				#echo $headers; ##troubleshooting -- ## Troubleshooting -- prints out "Array"
				curl_setopt($ch, CURLOPT_HTTPHEADER, $headers);
				
				$result = curl_exec($ch);
				#echo $result; ### troubleshooting - doesnt have any data
				if (curl_errno($ch)) {
					echo 'Error:' . curl_error($ch);  # it end inside this if statement, however no error is printed
				}
				curl_close($ch);
				preg_match_all('/<h2 class="product-title__brand" itemprop="brand" data-component="clickable brand link">(.*?)<\/h2>/s', $result, $output_array_brand);
				#echo "<pre>"; ###
				#print_r($output_array_brand);####
				
				$brand=trim(strip_tags($output_array_brand[1][0]));
				
				preg_match_all('/<h1 class="product-title__title">(.*)<\/h1>/', $result, $output_array);
				$productTitle=$output_array[1][0];
				$productTitle=$brand." ".$productTitle;
				
				preg_match_all('/<span class="price__dollars">(.*?)<\/span>/s', $result, $output_array_price);
				preg_match_all('/<span class="price__cents">(.*)<\/span>/', $result, $output_array_cent);
				#echo "<pre>";
				#print_r($output_array_price);
				$price=trim(strip_tags($output_array_price[1][0]));
				$cent=trim(strip_tags($output_array_cent[1][0]));
				if($cent!="" OR $cent!=0)
				{
					$price=$price.".".$cent;
				}
				$line.=$row[0].",";
				$line.='"'.$productTitle.'",';
				$line.=$row[2].",";
				$line.=$price.",";
				$totalPrice=$row[2]*$price;
				$line.=$totalPrice;
				$file .=  $line . PHP_EOL;   
			}
				
			# echo "<br>";
		 $loop=$loop+1;
		 #echo "<br>";
		 if($loop>4)
		 {
			 if($check!="Location")
			 {
				$line.=$row[0].",";
				$line.=',';
				$line.=$row[2].",";
				$line.=",";
				$line.="";
				$file .=  $line . PHP_EOL;   
				break;
			 }
		 }
		}
		while($check!="Location");
		
				
	}
}
fclose($handle);
header('Content-Type: application/csv'); 
$output=$_REQUEST['output'];
		header('Content-disposition: attachment; filename='.$output.'.csv');
		
		echo $file;
		#header('Content-disposition: attachment; filename='.$output.'1.csv');
		
		#echo $file1;
		exit;
?>
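
The script above reads the redirect target from a fixed position (`$headers_arr[5]`) and compares the header name with exact case. HTTP header names are case-insensitive and their position in the response can vary, which may be why the check had to be changed to lowercase to match. A minimal sketch of a more tolerant lookup, assuming the raw header block is already available as a string; `findLocationHeader` is a hypothetical helper name, not part of the original code:

```php
<?php
/**
 * Return the value of the first Location header in a raw header block,
 * or null when none is present. Hypothetical helper for illustration.
 */
function findLocationHeader(string $rawHeaders): ?string
{
    // Split on CRLF or bare LF so either line ending is handled.
    foreach (preg_split('/\r\n|\n/', $rawHeaders) as $headerLine) {
        // stripos() compares case-insensitively; "=== 0" means "starts with".
        if (stripos($headerLine, 'location:') === 0) {
            return trim(substr($headerLine, strlen('location:')));
        }
    }
    return null;
}

// Sample header block, made up for this example.
$sample = "HTTP/1.1 301 Moved Permanently\r\n"
        . "Server: example\r\n"
        . "location: https://www.example.com/p/123\r\n\r\n";
var_dump(findLocationHeader($sample));
```

With a helper like this, the `if` test and the `while` condition could both check for `null` instead of comparing against "location" in one place and "Location" in another.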

 


I, and apparently everyone else, am totally confused by your introduction. Can you reword it so it makes sense? First you say you have an "input.php" script that allows a user to upload a file. Then you say the user hits a button and something called "import.php" processes a website. Well, what is it you want help with? If you re-read your paragraph, it might give you an idea of how confusing your post really is.

Waiting for some clarity.


You might try adding this to get your script to show possible problems:

error_reporting(E_ALL);
ini_set('display_errors', '1');

And if setting max execution time to 0 is meant to disable any limit on how long your script runs, I would change it. Do you really want to tie up your server with a problem script or a very long run time? Set it to 5 seconds or even 10, but certainly not unlimited (0).
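
A minimal sketch of how those settings might look together at the top of import.php, replacing the `error_reporting(0)` and unlimited time limit in the first post. The value 10 follows the upper figure suggested above; it is an assumption for illustration, and a run that fetches one page per CSV row may need a larger budget:

```php
<?php
// Report every notice, warning and error while debugging,
// instead of silencing them with error_reporting(0).
error_reporting(E_ALL);
ini_set('display_errors', '1');

// Cap the run time instead of disabling the limit with 0.
// 10 seconds is an assumed value; adjust it to suit the server and the file size.
ini_set('max_execution_time', '10');

echo 'error_reporting = ', error_reporting(), PHP_EOL;
echo 'max_execution_time = ', ini_get('max_execution_time'), PHP_EOL;
```

Once the script works, `display_errors` would normally be turned off again so visitors do not see raw messages.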


Hopefully this makes a little more sense...

The "input.php" file code is below. This is a form that allows the user to import a CSV file (screenshot attached showing how the CSV is set up).

 

<form action="import.php" method="post" enctype="multipart/form-data">
  <table cellpadding="5" cellspacing="5" align="center" width="50%">
    <tr>
      <td>File</td>
      <td>:</td>
      <td><input type="file" name="file"></td>
    </tr>
    <tr>
      <td>Output File Name</td>
      <td>:</td>
      <td><input type="text" name="output"></td>
    </tr>
    <tr>
      <td></td>
      <td></td>
      <td><input type="submit" value="Upload"></td>
    </tr>
  </table>
</form>

 

 

This upload gets parsed by the "import.php" code above in the first post.

 

It should read in the SKU number from the CSV and search the HD site for the product to pull the Price and Product data. This data for each product gets added back to the CSV as the export/output. 
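
The reading half of that flow can be sketched on its own, without any network access. This is an illustration under assumptions, not the freelancer's code: `readSkuRows` is a hypothetical helper, and the sample rows are made up. The column positions follow the layout described in the first post.

```php
<?php
/**
 * Read SKU and quantity from an open CSV stream, skipping the header row.
 * Columns: 0 = SKU, 1 = Product, 2 = Quantity, 3 = Retail Price, 4 = Total Retail Price.
 */
function readSkuRows($handle): array
{
    $rows = [];
    fgetcsv($handle, 0, ',', '"', '\\'); // discard the header row
    while (($row = fgetcsv($handle, 0, ',', '"', '\\')) !== false) {
        $sku = trim((string) ($row[0] ?? ''));
        if ($sku === '') {
            continue; // blank line or missing SKU
        }
        $rows[] = ['sku' => $sku, 'quantity' => (int) ($row[2] ?? 0)];
    }
    return $rows;
}

// Sample data made up for this example; the real script would open the uploaded file.
$handle = fopen('php://memory', 'w+');
fwrite($handle, "SKU,Product,Quantity,Retail Price,Total Retail Price\n100123,,2,,\n100456,,5,,\n");
rewind($handle);
print_r(readSkuRows($handle));
fclose($handle);
```

Printing the parsed rows before any lookup happens makes it easy to confirm that the upload and the CSV parsing are working, which narrows the problem down to the fetching step.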

 

Currently, it is not getting the product or price data for any of the items. 

Capture.PNG


I did add a bunch of echo statements and noted what the output was while I was troubleshooting (this is still in the code I provided in the first post). I am not familiar enough with PHP to tell whether some of the output is an error or normal behavior.


Did you add the debugging before and during the loop process and not just the output step?  How many rows are returned by the query? Echo out the key of each row processed in the loop.  Echo out a completion message at the very end of the loop (assuming that the loop is where the output occurs).  Any output that you can't tell if it is an error or not should be posted here.  "Normal" behavior certainly doesn't look like an error!

PS - did you add the error reporting lines I gave you?
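
One way to apply this kind of debugging to the fetch itself, where a comment in the first post notes that the error branch is reached but no message is printed. This is a sketch under assumptions: `fetchWithDiagnostics` is a hypothetical helper and the timeout values are illustrative. It collects the curl error number, its description, and the HTTP status so they can be echoed for each row:

```php
<?php
/**
 * Fetch a URL and return everything needed to see why a request failed.
 * Hypothetical helper for illustration; the timeout values are assumptions.
 */
function fetchWithDiagnostics(string $url): array
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 5);  // seconds allowed to connect
    curl_setopt($ch, CURLOPT_TIMEOUT, 15);        // seconds allowed for the whole transfer

    $body   = curl_exec($ch);                     // false on a transport failure
    $errno  = curl_errno($ch);                    // 0 means no transport error
    $error  = curl_error($ch);                    // may be empty even when $errno is not 0
    $status = (int) curl_getinfo($ch, CURLINFO_HTTP_CODE);
    // The handle is released when $ch goes out of scope at the end of the function.

    return [
        'ok'     => $errno === 0 && $status >= 200 && $status < 300,
        'errno'  => $errno,
        // curl_strerror() gives a generic description when curl_error() is empty.
        'error'  => $error !== '' ? $error : ($errno !== 0 ? curl_strerror($errno) : ''),
        'status' => $status,
        'body'   => $body === false ? '' : $body,
    ];
}
```

Echoing `errno`, `error`, `status`, and `strlen($result['body'])` for each SKU would show whether the request fails in transport, is answered with a non-200 status, or returns a page that simply lacks the expected markup.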


I did. This is an example of the output:

 

<br />
<b>Notice</b>:  Undefined variable: line in <b>C:\xampp\htdocs\HD\import.php</b> on line <b>9</b><br />
Error:<br />
<b>Notice</b>:  Undefined offset: 0 in <b>C:\xampp\htdocs\HD\import.php</b> on line <b>77</b><br />
<br />
<b>Notice</b>:  Undefined offset: 0 in <b>C:\xampp\htdocs\HD\import.php</b> on line <b>80</b><br />
<br />
<b>Notice</b>:  Undefined offset: 0 in <b>C:\xampp\htdocs\HD\import.php</b> on line <b>87</b><br />
<br />
<b>Notice</b>:  Undefined offset: 0 in <b>C:\xampp\htdocs\HD\import.php</b> on line <b>88</b><br />
<br />
<b>Warning</b>:  A non-numeric value encountered in <b>C:\xampp\htdocs\HD\import.php</b> on line <b>97</b><br />
Error:<br />
<b>Notice</b>:  Undefined offset: 0 in <b>C:\xampp\htdocs\HD\import.php</b> on line <b>77</b><br />
<br />
<b>Notice</b>:  Undefined offset: 0 in <b>C:\xampp\htdocs\HD\import.php</b> on line <b>80</b><br />
<br />
<b>Notice</b>:  Undefined offset: 0 in <b>C:\xampp\htdocs\HD\import.php</b> on line <b>87</b><br />
<br />
<b>Notice</b>:  Undefined offset: 0 in <b>C:\xampp\htdocs\HD\import.php</b> on line <b>88</b><br />
<br />
<b>Warning</b>:  A non-numeric value encountered in <b>C:\xampp\htdocs\HD\import.php</b> on line <b>97</b><br />
Error:<br />
<b>Notice</b>:  Undefined offset: 0 in <b>C:\xampp\htdocs\HD\import.php</b> on line <b>77</b><br />
<br />
<b>Notice</b>:  Undefined offset: 0 in <b>C:\xampp\htdocs\HD\import.php</b> on line <b>80</b><br />
<br />
<b>Notice</b>:  Undefined offset: 0 in <b>C:\xampp\htdocs\HD\import.php</b> on line <b>87</b><br />
<br />
<b>Notice</b>:  Undefined offset: 0 in <b>C:\xampp\htdocs\HD\import.php</b> on line <b>88</b><br />
<br />
<b>Warning</b>:  A non-numeric value encountered in <b>C:\xampp\htdocs\HD\import.php</b> on line <b>97</b><br />
Error:<br />

 

Line 9 is this:

$line .= "SKU,Product,Quantity,Retail Price,Total Retail Price";

I added $line = ' '; before line 9 to fix the first notice.


Those are 'notices'. They are telling you that something is not correct. Show us one of the lines that has a notice.

You have an undefined variable $line. Figure out why.

Line 97 has a problem. You are using a non-numeric value in some operation. Show us that line and echo out the variable just before that line so you can see why you are getting that message.

Those are quite obviously error messages.

The line 9 error is because you are concatenating that string to a variable that has not been defined yet. Either declare it before that point or don't concatenate there.
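
A minimal sketch of that fix, using the variable names from the first post. The row values are made up for this example:

```php
<?php
// Start both buffers as empty strings so every later ".=" has something to append to.
$file = '';
$line = '';

// Header row. A plain "=" would work equally well here, since nothing precedes it.
$line .= 'SKU,Product,Quantity,Retail Price,Total Retail Price';
$file .= $line . PHP_EOL;

// For each data row, reset $line with "=" before appending the fields.
$line  = '';
$line .= '100123' . ',';
$line .= '"Example Brand Example Title"' . ',';
$line .= '2' . ',';
$line .= '12.98' . ',';
$line .= 2 * 12.98;
$file .= $line . PHP_EOL;

echo $file;
```

Initialising with an empty string rather than a space avoids a stray leading space in front of the header row.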


Line 78 code:

print_r($output_array_brand);####
$brand = "";
$brand=trim(strip_tags($output_array_brand[1][0]));  ## this was row 78
print_r($output_array_brand);####
echo "brand = " . $brand; ###

 

Here is a sample of the output:

Error:Array
(
    [0] => Array
        (
        )

    [1] => Array
        (
        )

)
<br />
<b>Notice</b>:  Undefined offset: 0 in <b>C:\xampp\htdocs\HD\import.php</b> on line <b>78</b><br />
Array
(
    [0] => Array
        (
        )

    [1] => Array
        (
        )

)
brand = <br />
<b>Notice</b>:  Undefined offset: 0 in <b>C:\xampp\htdocs\HD\import.php</b> on line <b>83</b><br />
<br />
<b>Notice</b>:  Undefined offset: 0 in <b>C:\xampp\htdocs\HD\import.php</b> on line <b>90</b><br />
<br />
<b>Notice</b>:  Undefined offset: 0 in <b>C:\xampp\htdocs\HD\import.php</b> on line <b>91</b><br />
<br />
<b>Warning</b>:  A non-numeric value encountered in <b>C:\xampp\htdocs\HD\import.php</b> on line <b>100</b><br />
Error:Array

 


Hmm... I added an echo before and after. I thought that was what you wanted me to do.

$brand=trim(strip_tags($output_array_brand[1][0]));  ## this was row 78

The above is line 78.

From the output it seems nothing is in the array, which is why I am getting this notice, correct?
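
That reading matches the printed output: both inner arrays are empty, so index `[1][0]` does not exist. A sketch of how the extraction could guard against an empty match and against multiplying by a non-numeric price. `firstMatch` is a hypothetical helper and the sample markup is made up; the class names are the ones used in the patterns in the first post:

```php
<?php
/**
 * Return the first capture group of $pattern in $html, or '' when there is no match.
 * Hypothetical helper for illustration.
 */
function firstMatch(string $pattern, string $html): string
{
    if (preg_match($pattern, $html, $m) === 1) {
        return trim(strip_tags($m[1]));
    }
    return '';
}

// An empty string stands in for a fetch that returned no data.
$pages = [
    'found' => '<span class="price__dollars">12</span><span class="price__cents">98</span>',
    'empty' => '',
];

$quantity = 3;
foreach ($pages as $label => $html) {
    $dollars = firstMatch('/<span class="price__dollars">(.*?)<\/span>/s', $html);
    $cents   = firstMatch('/<span class="price__cents">(.*?)<\/span>/s', $html);
    $price   = $dollars === '' ? '' : ($cents === '' ? $dollars : $dollars . '.' . $cents);

    // Only multiply when the price is numeric; this avoids the "non-numeric value" warning.
    $total = is_numeric($price) ? $quantity * (float) $price : '';
    echo $label, ': price=', $price, ' total=', $total, PHP_EOL;
}
```

The guards only silence the notices. If every row comes back empty, the cause is upstream: the second request is returning no page content for the patterns to match.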

