
Hey all, I'm currently trying to figure this out on my own, but just in case anyone can help speed things up... ;)

 

I'm trying to grab a page in HTML, take out a few numbers, add these numbers up and get an average, then store this somewhere (not sure yet, probably just an individual text file for now).

 

The numbers are like so (with some extra stuff before + after)....

 

<span class="sold">Sold</span></td><td class="prices bidsold g-b">£123.74</td><td class="ship ship fee">+£6.00</td><td class="time time rt">12-Apr 20:27</td>

 

(yup, eBay listings, trying to automate a price checker...). There are numerous occurrences of these tags throughout the page, with differing numbers between them, but each starts with a "prices bidsold g-b" class followed directly by the "ship ship fee" class.

 

I need to grab every occurrence of every price and shipping fee for the "prices bidsold g-b" class in a page, and presumably turn them into numbers so I can perform calculations with them. I can do the first bit, but I may be doing it in a way that makes the second bit harder.

 

Once I have the page (after messing with some cURL auto login stuff) I:

 

preg_match_all("/\"prices bidsold g-b\">(.*?)<\/td><td class=\"time/smi",$result,$s);

foreach ($s[1] as $value) {
   echo "$value<br />\n";
}

 

This prints out the price and +shipping for each occurrence of a sold item:

 

£11.50+£0.50
£0.99+£1.00

 

Now the only thing I can think of is parse that array again with something like \d to find the decimals and store them, then add them. But a) I don't know how to parse arrays, and b) seems like a complicated way of doing things, should I (and could I) just do that kinda thing with the first preg_match_all?

 

Thanks for any help :)

EDIT: getting closer... I've managed to get all prices + shipping into arrays of their own now, with no other characters....

 

preg_match_all("/\"prices bidsold g-b\">..(.*?)<\/td><td class=\"ship ship fee\">\+..(.*?)<\/td><td class=\"time/smi",$result,$s);

print_r($s); 

 

 

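Array ( [0] => Array ( [0] => "prices bidsold g-b">£11.50</td><td class="ship ship fee">+£0.50</td><td class="time [1] => "prices bidsold g-b">£0.99</td><td class="ship ship fee">+£1.00</td><td class="time ) [1] => Array ( [0] => 11.50 [1] => 0.99 ) [2] => Array ( [0] => 0.50 [1] => 1.00 ) )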

 

not sure where i go now though.... loop through the array adding the numbers? eg.

 

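// PHP has no "int $foo;" declaration; assigning 0 is enough.
$foo = 0;
// $s[0] holds the full matched strings; the captured bids are in $s[1].
foreach ($s[1] as $value) {
    $foo = $foo + $value;
}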

 

?
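For what it's worth, the adding-up loop can probably be replaced by PHP's built-in array_sum(). A minimal sketch, assuming $s has the shape shown in the print_r output above (the sample values here are hard-coded for illustration, not fetched from a page):

```php
<?php
// Hypothetical sample shaped like the print_r output above:
// [1] holds the bids, [2] holds the shipping fees (as numeric strings).
$s = [
    1 => ['11.50', '0.99'],
    2 => ['0.50', '1.00'],
];

// array_sum() converts numeric strings to numbers as it adds them up.
$bids     = array_sum($s[1]);
$shipping = array_sum($s[2]);
$total    = $bids + $shipping;

// Format to two decimals for display; floats are not exact for currency.
echo number_format($total, 2), "\n";
```

Summing $s[1] and $s[2] separately also sidesteps the full-match strings sitting in $s[0].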

 

 

EDIT: meh, that array's horribly messy isn't it? :-\

May not be pretty but this seems to work. Only tested on a single page, and you'll have to find the cURL ebay cookie login for yourself if you wanna try it as I'm not sure how much they approve of that kinda thing...

 

 

// Match all occurrences of sold items, store bid and shipping separately.
preg_match_all("/\"prices bidsold g-b\">..(.*?)<\/td><td class=\"ship ship fee\">\+..(.*?)<\/td><td class=\"time/smi",$result,$s);


// Get rid of primary matched string in array, leaving only individual price numbers.
unset($s[0][0]);
unset($s[0][1]);

// DEBUG output array.
print_r($s); 


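// Function to add numbers (ints or numeric strings) in a multi-dimensional array. By R.Martina.

function cw_array_count($s) {
  // A non-array value is a leaf: hand it back to be added by the caller.
  if(!is_array($s)) return $s;
  // Start from 0 so += never runs on an undefined variable.
  $totale = 0;
  foreach($s as $key=>$value)
     $totale += cw_array_count($value);
  return $totale; 
} 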

// DEBUG output sum of array prices.

echo '<br /><br /> Sum: ';
echo cw_array_count($s); 

 

 

Outputs:

 

Array ( [0] => Array ( ) [1] => Array ( [0] => 11.50 [1] => 0.99 ) [2] => Array ( [0] => 0.50 [1] => 1.00 ) )

Sum: 13.99 

Someone's killed the editing ability after a few minutes haven't they? Ok then...

 

Change:

 

// Get rid of primary matched string in array, leaving only individual price numbers.
unset($s[0][0]);
unset($s[0][1]);

 

 

To:

 

// Get rid of primary matched string in array, leaving only individual price numbers.
unset($s[0]);

 

Now I gotta figure out how to create a variable int outside a function, increment it within a loop within said function, and reference it again outside the function. Obviously not as easy as you'd think...

 

yeah, figured out the "global" thing eventually :) though a simple "count($array)" was just as useful ;)
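An alternative to "global" is to have the function return everything it worked out. A rough sketch, where sum_and_count() is a made-up helper name (not part of the code above) that walks the nested array and returns both the sum and the number of values:

```php
<?php
// Hypothetical recursive helper: returns [sum, count] instead of using globals.
function sum_and_count(array $values): array
{
    $sum = 0.0;
    $count = 0;
    foreach ($values as $value) {
        if (is_array($value)) {
            // Recurse into sub-arrays and fold their results in.
            [$subSum, $subCount] = sum_and_count($value);
            $sum += $subSum;
            $count += $subCount;
        } elseif (is_numeric($value)) {
            $sum += (float) $value;
            $count++;
        }
    }
    return [$sum, $count];
}

// Sample data shaped like the earlier print_r output.
[$sum, $count] = sum_and_count([1 => ['11.50', '0.99'], 2 => ['0.50', '1.00']]);
echo number_format($sum, 2), ' over ', $count, " values\n";
```

Note that the count here is the number of individual values (bids plus shipping fees), not the number of sales; count($array[1]) is still the simpler route to that.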

 

now I'm stuck trying to add the contents of two subarrays together.

 

Array
(
   [1] => Array
       (
           [0] => 0.75
           [1] => 0.99
           [2] => 0.99
           [3] => 0.99
           [4] => 0.99
           [5] => 2.99
           [6] => 3.99
           [7] => 4.99
           [8] => 0.99
       )

   [2] => Array
       (
           [0] => 0.75
           [1] => 0.89
           [2] => 0.89
           [3] => 1.00
           [4] => 0.80
           [5] => 1.00
           [6] => 1.00
           [7] => 1.00
           [8] => 1.00
       )
)

 

 

I need [1][0] and [2][0] adding, [1][1] + [2][1], etc etc, stored in a new array. [1] and [2] are constant, the number of sub entries in each varies, but count($array[1]) will give me the correct number... loop for the count, $x = [1][$count] + [2][$count]... something like that... add it into a new array... that should be easy, shouldn't it?
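One way to do that pairwise addition without index bookkeeping is array_map() with two input arrays, which passes the n-th element of each to the callback. A small sketch with made-up sample values, assuming both subarrays always have the same length:

```php
<?php
// Hypothetical sample: [1] = bids, [2] = shipping, same length.
$prices_split = [
    1 => ['0.75', '0.99', '2.99'],
    2 => ['0.75', '0.89', '1.00'],
];

// The callback receives matching elements from both arrays.
$prices_full = array_map(
    function ($bid, $shipping) {
        // Round to pence to hide float noise.
        return round((float) $bid + (float) $shipping, 2);
    },
    $prices_split[1],
    $prices_split[2]
);

print_r($prices_full);
```

The result is re-indexed from 0, so $prices_full[0] is the full price of the first sale.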

 

And all because the main parsing splits the final bids into one array and postage into a second, and I wanna be able to find minimum and maximum totals, and I've no idea how to get it right in the first parse. And I so should've taken CS + Internet coding instead of all that java nonsense....
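For the "right first parse" part, preg_match_all() accepts a PREG_SET_ORDER flag that groups the captures per match instead of per capture group, so each row arrives as [full match, bid, shipping]. A sketch under some assumptions: the HTML fragment is a made-up sample modelled on the rows quoted above, the page is UTF-8 (hence the literal £ and the /u modifier), and every row has a numeric shipping fee:

```php
<?php
// Hypothetical HTML fragment modelled on the rows quoted in this thread.
$result = '<td class="prices bidsold g-b">£0.75</td><td class="ship ship fee">+£0.75</td><td class="time time rt">13-Apr 11:50</td>'
        . '<td class="prices bidsold g-b">£2.99</td><td class="ship ship fee">+£1.00</td><td class="time time rt">13-Apr 12:10</td>';

$pattern = '/"prices bidsold g-b">£([0-9]+\.[0-9]{2})<\/td><td class="ship ship fee">\+£([0-9]+\.[0-9]{2})<\/td>/u';

// PREG_SET_ORDER returns one entry per row: [full match, bid, shipping].
preg_match_all($pattern, $result, $rows, PREG_SET_ORDER);

$totals = [];
foreach ($rows as $row) {
    $totals[] = round((float) $row[1] + (float) $row[2], 2);
}

// min()/max() complain about empty arrays, so guard against no matches.
if ($totals !== []) {
    echo 'Min: ', number_format(min($totals), 2), "\n";
    echo 'Max: ', number_format(max($totals), 2), "\n";
}
```

With the totals in one flat array, min(), max() and array_sum() all work directly on it.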

Ok I think I pretty much have it. There's probably a quicker way to do it but...

 

Parseable Data

Among other contents in the page are multiple occurrences of this:

 

<div>1 Bid</div><span class="sold">Sold</span></td><td class="prices bidsold g-b">£0.75</td><td class="ship ship fee">+£0.75</td><td class="time time rt">13-Apr 11:50</td></tr>

 

Bid data can differ ("1 Bid", "X Bids", with/without the sold span, differing prices), as can the times, but any sold auction will be as above with either "Bid" or "Bids". The code below doesn't take this into account, but if you just wanted to parse auctions and not BINs, just add something like "Bid(s)</div>..." to the start of the parse statement.
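A rough sketch of what that auction-only pattern might look like. The "Buy It Now" row below is purely a guess at what a BIN row could contain (the thread doesn't show one), and the pattern assumes a UTF-8 page with a literal £:

```php
<?php
// Hypothetical rows: one auction ("1 Bid") and one made-up BIN row without a bid count.
$result = '<div>1 Bid</div><span class="sold">Sold</span></td><td class="prices bidsold g-b">£0.75</td><td class="ship ship fee">+£0.75</td>'
        . '<div>Buy It Now</div><span class="sold">Sold</span></td><td class="prices bidsold g-b">£9.99</td><td class="ship ship fee">+£2.00</td>';

// "Bids?" accepts both "Bid" and "Bids"; the sold span is optional.
$pattern = '/[0-9]+ Bids?<\/div>(?:<span class="sold">Sold<\/span>)?<\/td>'
         . '<td class="prices bidsold g-b">£([0-9]+\.[0-9]{2})<\/td>'
         . '<td class="ship ship fee">\+£([0-9]+\.[0-9]{2})<\/td>/u';

preg_match_all($pattern, $result, $auctions);

// $auctions[1] = bids, $auctions[2] = shipping, for auction rows only.
print_r($auctions[1]);
print_r($auctions[2]);
```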

 

 

Test Data

 

As above, with prices (final winning bid + shipping)

£0.75 +£0.75

£0.99 +£0.89

£0.99 +£1.00

£0.99 +£0.80

£2.99 +£1.00

£3.99 +£1.00

£4.99 +£1.00

£0.99 +£1.00

 

 

Code

With $result being the HTML of a completed listing page.

 

 

// Match all occurrences of sold items, store bid and shipping separately.
preg_match_all("/\"prices bidsold g-b\">..(.*?)<\/td><td class=\"ship ship fee\">\+..(.*?)<\/td><td class=\"time/smi",$result,$prices_split);


// Get rid of primary matched string in array, leaving only individual price numbers.
unset($prices_split[0]);

// DEBUG output array.
print_r($prices_split); 


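// Function to add up numbers (ints or numeric strings) in a multi-dimensional array. By R.Martina.
function cw_array_count($prices_split) {
 // A non-array value is a leaf: hand it back to be added by the caller.
 if(!is_array($prices_split)) return $prices_split;
 // Start from 0 so += never runs on an undefined variable.
 $totale = 0;
 foreach($prices_split as $key=>$value) {
    $totale += cw_array_count($value);
 }
 return $totale; 
} 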

// DEBUG output sum of array prices.
echo '<br /><br /> Sum: ';
echo cw_array_count($prices_split); 

//DEBUG output number of subarray entries (= number of sales).
echo '<br /><br /> Count: ';
$sales = count($prices_split[1]);  
echo $sales;   


echo '<br /><br />';

// Sub array addition loop from wellho.net
// Loop through total subarray entries.
for ($k=0; $k<count($prices_split[1]); $k++) {
    $sm = 0;
    // Double up loop for both subarrays.
    // NOTE: this bound is wrong (it should only visit [1] and [2]); see the fix later in the thread.
    for ($j=0; $j<count($prices_split[1]); $j++) {
        // Add [1][x] and [2][x] together.
        $sm += $prices_split[$j][$k];
    }
    // Add sum to new full price array.
    $prices_full[$k] = $sm;
}

// DEBUG print new full price array.
print_r($prices_full); 

 

 

Output

 

Array
(
   [1] => Array
       (
           [0] => 0.75
           [1] => 0.99
           [2] => 0.99
           [3] => 0.99
           [4] => 2.99
           [5] => 3.99
           [6] => 4.99
           [7] => 0.99
       )

   [2] => Array
       (
           [0] => 0.75
           [1] => 0.89
           [2] => 1.00
           [3] => 0.80
           [4] => 1.00
           [5] => 1.00
           [6] => 1.00
           [7] => 1.00
       )

)

Sum: 24.12

Count: 8

Array
(
   [0] => 1.5
   [1] => 1.88
   [2] => 1.99
   [3] => 1.79
   [4] => 3.99
   [5] => 4.99
   [6] => 5.99
   [7] => 1.99
)
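With the per-sale totals in one array, the average and the "store it in a text file" step from the first post are only a few lines. A sketch, where the file name and the line format are placeholders rather than anything decided in this thread:

```php
<?php
// Full prices as printed above (bid + shipping per sale).
$prices_full = [1.5, 1.88, 1.99, 1.79, 3.99, 4.99, 5.99, 1.99];

$sales = count($prices_full);
// Guard against division by zero when nothing matched.
$average = $sales > 0 ? array_sum($prices_full) / $sales : 0.0;

// One line per run: date, number of sales, average full price.
$line = sprintf("%s sales=%d average=%.2f\n", date('Y-m-d'), $sales, $average);

// Hypothetical location; FILE_APPEND keeps earlier runs, LOCK_EX avoids interleaved writes.
$file = sys_get_temp_dir() . '/price_check.txt';
file_put_contents($file, $line, FILE_APPEND | LOCK_EX);

echo $line;
```

Appending one line per run keeps a simple history that can be read back later with file().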

 

Oops, had a small (or big...) bug in the sub array addition loop. It was looping through far too many indexes, and not the right ones, and in short... an original array with only 1 entry for price and postage wasn't being added.

 

New code with debug output showing which two sub arrays are being added: 1X, 2X being [1][X] and [2][X]:

 

// Sub array addition loop from wellho.net
// Loop through total subarray entries.
for ($k=0; $k<count($prices_split[1]); $k++) {
	$sm = 0;
	// Double up loop for both subarrays.
	for ($j=1; $j<3; $j++) {
		//DEBUG
		echo $j;
		echo $k;
		echo ', ';

		// Add [1][x] and [2][x] together.
		$sm += $prices_split[$j][$k];
	}

	//DEBUG
	echo '<br />';

	// Add sum to new full price array.
	$prices_full[$k] = $sm;
}

And a change to the regular expression in the original parse to include "Free" postage (and stop it mucking up the rest of the array). This should now pick up everything sold but no international sellers. And luckily the sub array addition just treats "Free" as 0, otherwise I'd be in a big mess ;D

 

// Match all occurrences of sold items, store bid and shipping separately.
preg_match_all("/\"prices bidsold g-b\">..([0-9]+\.[0-9]{2})<\/td><td class=\"ship ship [a-z]+\">\+?.?.?([0-9]+\.[0-9]{2}|Free)<\/td><td class=\"time/smi",$result,$prices_split);
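One caveat on relying on "Free" being treated as 0: as far as I know, newer PHP versions warn about non-numeric strings in arithmetic, and PHP 8 throws a TypeError for them, so it may be safer to convert explicitly. A sketch, where shipping_to_float() is a made-up helper name and the sample values are invented:

```php
<?php
// Hypothetical helper: turn a captured shipping value into a float.
function shipping_to_float(string $raw): float
{
    // "Free" (any case) means no postage.
    if (strcasecmp($raw, 'Free') === 0) {
        return 0.0;
    }
    // Anything else should be numeric; fall back to 0.0 if it is not.
    return is_numeric($raw) ? (float) $raw : 0.0;
}

// Sample captures like those in $prices_split[2].
$shipping = ['0.75', 'Free', '1.00'];
$total = array_sum(array_map('shipping_to_float', $shipping));

echo number_format($total, 2), "\n";
```

Running the shipping subarray through a helper like this before the addition loop keeps the rest of the code unchanged.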

 

PS. mods, this is why we allow constant editing of threads, there's always some idiot who wants to keep revising to the death ;)
