Jump to content

Comparing 2 Arrays - Is there a better way to do this?


discomatt

Recommended Posts

First I'll try to explain the problem.

 

I have 2 arrays of identical size, similar to the example below

$a = array( 2, 29, 54, 77 );
$b = array( 66, 72, 102, 107 );

 

I want to go through each value of $a in reverse, and find the lowest value of $b that is greater than the current value of $a. To add to this mess, I don't want to reuse any values of $b when pairing them with values from $a

 

Here's what I've got so far

 

<pre><?php

$a = array( 2, 29, 54, 77 );
$b = array( 66, 72, 102, 107 );

// Placeholder for found pairs
$pairs = array();


// Loop through $a backwards
foreach( array_reverse($a) as $aVal ) {

/***** THIS IS THE CHUNK I THINK CAN BE IMPROVED ON *****/

// Loop through $b until we find a value less than the current $a
foreach( $b as $bKey => $bVal )
	// Check to see if the current value of b is less than the current value of a
	if( $aVal < $bVal ) {
		// If so, add it to the pairs
		$pairs[] = array( $aVal, $bVal );
		// Unset used value of b - prevents duplication and should speed things up
		unset( $b[$bKey] );
		// Break out of the loop
		break;
	}

/***** END OF CHUNK *****/

}

print_r( $pairs );

?></pre>

 

The output should look like this

 

Array
(
    [0] => Array
        (
            [0] => 77
            [1] => 102
        )

    [1] => Array
        (
            [0] => 54
            [1] => 66
        )

    [2] => Array
        (
            [0] => 29
            [1] => 72
        )

    [3] => Array
        (
            [0] => 2
            [1] => 107
        )

)

 

Any help is greatly appreciated. I'm hoping there's some cool array trick I just don't know about.

 

Thanks for lookin'

i don't know of any trick, but until you find one, i would improve the foreach($b) to make sure the $b found is indeed the smallest match, because unlike your example arrays, you may have to deal with arrays that aren't set up in pairs  ;):

// Loop through $a backwards
foreach( array_reverse($a) as $aVal ) {

/***** THIS IS THE CHUNK I THINK CAN BE IMPROVED ON *****/

        $lowestB = "init"; //initialize a variable to something that isn't a number

// Loop through $b until we find a value less than the current $a
foreach( $b as $bKey => $bVal )
	// Check to see if the current value of b is less than the current value of a
                if ($lowestB == "init") $lowestB = $bVal; //obviously, the first $b checked is the first "lowest b".
	if( ($aVal < $bVal) && ($bval <= $lowestB) ) {
                        $lowestB = $bval;
                        $lowestKey = $bkey; //for unset purposes;
	}
        $pairs[] = array( $aVal, $lowestB );
	// Unset used value of b - prevents duplication and should speed things up
	unset( $b[$lowestKey] );

/***** END OF CHUNK *****/

}

 

i hope it's not buggy - i didn't test it.

Okay, let me expand on this.

 

It's a function to match up nested pairs of strings. Array $a would be the offsets for opening tags, while array $b would be the offsets of closing tags.

 

bobbinsbro : The arrays will/should always be in ascending order... and if it wasn't, I can always use sort() before looping.

 

sasa: Yes, they should both be in ascending order at all times.

 

corbin: It's part of the algorithm. In order to find the correct matching pairs, it takes the least amount of work to start from the opening end ( deepest element ) and work your way up/out.

 

Just to clear things up, here's what I'm doing, in a wider scope. This is a crude example... the actual code I'm using is nested in a class with a ton of erroneous garbage so I shortened it up

 

[edit] oops, gotta clean a few things up before i can provide an example [/edit]

 

Thanks for your replies thus far :D

try this..


<?

$a = array( 2, 29, 54, 77);
$b = array( 66, 72, 102, 107 );

sort($a);
for ($i=0; $i <count($a); $i++)
{ //prints the array elements
  $aa[]= $a[$i];
}
sort($a);
for ($i=0; $i <count($b); $i++)
{ //prints the array elements
  $bb[]= $b[$i];
}




$j=0;

while($j<=count($aa)+2)
{

$take=array_pop($aa);
$new=array();
array_push($new,$take);

//print_r($aa);
//print_r($bb);

if( $take < current($bb))
{
  $val=current($bb);
  $keys=key($bb);
unset($bb[$keys]);
  reset($bb);
  
}else
{


  for ($i=0; $i <count($bb); $i++)
{ //prints the array elements

if(next($bb)>$take)
{

$val=current($bb);
$keys=key($bb);
unset($bb[$keys]);
reset($bb);
break;
}
}

}


array_push($new,$val);
$result[]=$new;
$j++;

}



?>
<pre><? echo print_r($result);?></pre>




Zegan, I'm not exactly sure what you're doing... but your solution seems, well, overly complex?

 

Also, you should never use

for ($i=0; $i <count($a); $i++)

Unless you expect count($a) to change.

 

You should instead use

for ($i=0, $c=count($a); $i < $c; ++$i)

Here's a working example of what I'm doing on a broader scope

 

<pre><?php

/*

1		67
20		107
43		116
77		153
125		161



*/

$data = <<<DATA
{loop}
Some data
{loop}
	Nested data
	{loop}
		More nests!
	{/loop}
	{loop}
		Another deep nest
	{/loop}
{/loop}
{loop}
	Not quite so deep
{/loop}
{/loop}
{loop}
More data
{loop}
	Foobar
{/loop}
{loop}
	Warmer
	{loop}
		Waaaarmer
		{loop}
			Too hot!
		{/loop}
	{/loop}
{/loop}
{loop}
	More depth
	{loop}
		this is getting stupid
	{/loop}
{/loop}
{/loop}
DATA;

$pairs = getPairs($data, '{loop}', '{/loop}');

foreach( $pairs as $pair )
echo substr( $data, $pair[0], $pair[1]-$pair[0] ) . "\n\n\n--------\n\n\n";

function getPairs( $data, $openTag, $closeTag ) {

// Grab opening and closing offsets
$opening = getOffsets( $data, $openTag );
$closing = getOffsets( $data, $closeTag, FALSE );

// Verify proper formatting
if( ($size = count($opening)) !== count($closing) )
			die( "Invalid syntax detected" );

// Placeholder
$offsets = array();

// Separate nests
for( $i = 1; $i < $size; $i++ ) {

	// Set initial nest key
	static $last = 0;
	// Check to see if a new nest has begun
	if( isset($opening[$i+1]) === FALSE || $opening[$i+1] > $closing[$i] ) {
		// Send nest to parser and gather results
		$offsets = array_merge( $offsets, parseNest(array_slice($opening,$last,$i-$last+1), array_slice($closing,$last,$i-$last+1)) );
		// Update nest key to the start of the next nest
		$last = $i+1;

	}

}

return $offsets;

}

function getOffsets( $data, $tag, $open = TRUE ) {

// Placeholder
$return = array();

// Start at beginning of string
$offset = 0;

// Loop through string grabbing offsets
while( ($c = strpos($data, $tag, $offset)) !== FALSE ) {
	// Update search offset
	$offset = strpos( $data, $tag, $c ) + 1;
	// Remove tag offset and add to placeholder
	$return[] = $offset - 1 + ( $open == TRUE ? strlen($tag) : -1 );
}

return $return;

}

function parseNest( $opening, $closing ) {

// Placeholder
$pairs = array();

// Loop through $opening backwards
foreach( array_reverse($opening) as $oVal ) {

	/***** THIS IS THE CHUNK I THINK CAN BE IMPROVED ON *****/

	// Loop through $closing until we find a value less than the current $opening
	foreach( $closing as $cKey => $cVal )
		// Check to see if the current opening is less than the current closing
		if( $oVal < $cVal ) {
			// If so, add it to the pairs
			$pairs[] = array( $oVal, $cVal );
			// Unset used value of b - prevents duplication and should speed things up
			unset( $closing[$cKey] );
			// Break out of the loop
			break;
		}

	/***** END OF CHUNK *****/

}

// Return them in ascending order
return array_reverse( $pairs );

}

?></pre>

try

<?php
$a = array( 2, 29, 54, 77 );
$b = array( 66, 72, 102, 107 );
$edge = array();
foreach ($a as $v) $edge[$v] = 1;
foreach ($b as $v) $edge[$v] = -1;
ksort($edge);
$count = 0;
$starts = array();
foreach ($edge as $k => $v){
$count += $v;
if ($count<0)die('Data error');
if($v < 0){
	$pair[] = array(array_pop($starts),$k);
} else array_push($starts, $k);
}
if ($count != 0) die('Data error');
print_r($pair);
?>

That works great! Neat way of doing it.

 

Speed wise, the functions are near equal in real-life situations... I did trials using random data with an average depth ~5 and my function edged out yours by a negligible time. When doing unrealistic trials ( average depth ~100 ) your function ran about 3x faster than mine.

 

Thanks for the help sasa!

try this

<pre>
<?php
$data = <<<DATA
{loop}
Some data
{loop}
	Nested data
	{loop}
		More nests!
	{/loop}
	{loop}
		Another deep nest
	{/loop}
{/loop}
{loop}
	Not quite so deep
{/loop}
{/loop}
{loop}
More data
{loop}
	Foobar
{/loop}
{loop}
	Warmer
	{loop}
		Waaaarmer
		{loop}
			Too hot!
		{/loop}
	{/loop}
{/loop}
{loop}
	More depth
	{loop}
		this is getting stupid
	{/loop}
{/loop}
{/loop}
DATA;

$pairs = my_getPairs($data, '{loop}', '{/loop}');

foreach( $pairs as $pair )
echo substr( $data, $pair[0], $pair[1]-$pair[0] ) . "\n\n\n--------\n\n\n";

function my_getPairs($data, $openTag, $closeTag){
$out = array();
$stack = array();
$offset = 0;
$l = strlen($openTag);
$a = stripos($data, $openTag, $offset);
$b = stripos($data, $closeTag, $offset);
$count = 0;
while(1){		
	if ($a===false and $b===false) break; 
	if ($a!==false and $a<$b){
		array_push($stack, $a + $l);
		$count++;
		$offset = $a + 1;
		$a = stripos($data, $openTag, $offset);
	} else {
		if ($count < 1) die('Invalid syntax detected.');
		$out[] = array(array_pop($stack), $b);
		$count--;
		$offset = $b + 1;
		$b = stripos($data, $closeTag, $offset);
	}
}
if ($count!=0) die('Invalid syntax detected');
sort($out);
return $out;
}
?>
</pre>

 

That's another neat concept, but that one is a hog, running about 10x slower than our previous functions using realistic depth, and nearly 15x slower than mine on unrealistic depths.

 

I think I'm going to stick with either your or my original function, pending a little more speed testing.

 

Thanks again for the time and thought, and helping me look outside the box :D

i speedup my code (change stripos to strpos) and do some check

<pre><?php

/*

1		67
20		107
43		116
77		153
125		161



*/

$data = <<<DATA
{loop}
Some data
{loop}
	Nested data
	{loop}
		More nests!
	{/loop}
	{loop}
		Another deep nest
	{/loop}
{/loop}
{loop}
	Not quite so deep
{/loop}
{/loop}
{loop}
More data
{loop}
	Foobar
{/loop}
{loop}
	Warmer
	{loop}
		Waaaarmer
		{loop}
			Too hot!
		{/loop}
	{/loop}
{/loop}
{loop}
	More depth
	{loop}
		this is getting stupid
	{/loop}
{/loop}
{/loop}
DATA;
$time = microtime(1);
for ($i=0; $i <100;$i++){
$pairs = getPairs($data, '{loop}', '{/loop}');
}
echo 'getPairs -     ', microtime(1) - $time,"\n";

$time = microtime(1);
for ($i=0; $i <100;$i++){
$pairs = my_getPairs($data, '{loop}', '{/loop}');
}
echo 'my_getPairs -  ', microtime(1) - $time, "\n";
$time = microtime(1);
for ($i=0; $i <100;$i++){
$pairs = my_getPairs1($data, '{loop}', '{/loop}');
}
echo 'my_getPairs1 - ',microtime(1) - $time;

//foreach( $pairs as $pair )
//	echo substr( $data, $pair[0], $pair[1]-$pair[0] ) . "\n\n\n--------\n\n\n";

function getPairs( $data, $openTag, $closeTag ) {
$opening = getOffsets( $data, $openTag );
$closing = getOffsets( $data, $closeTag, FALSE );
if( ($size = count($opening)) !== count($closing) )
			die( "Invalid syntax detected" );
$offsets = array();
for( $i = 1; $i < $size; $i++ ) {
	static $last = 0;
	if( isset($opening[$i+1]) === FALSE || $opening[$i+1] > $closing[$i] ) {
		$offsets = array_merge( $offsets, parseNest(array_slice($opening,$last,$i-$last+1), array_slice($closing,$last,$i-$last+1)) );
		$last = $i+1;
	}
}
return $offsets;
}
function getOffsets( $data, $tag, $open = TRUE ) {
$return = array();
$offset = 0;
while( ($c = strpos($data, $tag, $offset)) !== FALSE ) {
	$offset = strpos( $data, $tag, $c ) + 1;
	$return[] = $offset - 1 + ( $open == TRUE ? strlen($tag) : -1 );
}
return $return;
}
function parseNest( $opening, $closing ) {
$pairs = array();
foreach( array_reverse($opening) as $oVal ) {
	foreach( $closing as $cKey => $cVal )
		if( $oVal < $cVal ) {
			$pairs[] = array( $oVal, $cVal );
			unset( $closing[$cKey] );
			break;
		}
}
return array_reverse( $pairs );
}
function my_getPairs1($data, $openTag, $closeTag){
$out = array();
$stack = array();
$offset = 0;
$l = strlen($openTag);
$a = strpos($data, $openTag, $offset);
$b = strpos($data, $closeTag, $offset);
$count = 0;
while($a !== false or $b !== false){
	if ($a!==false and $a<$b){
		$offset = $a + $l;
		$stack[++$count] = $offset;
		$a = strpos($data, $openTag, $offset);
	} else {
		if ($count < 1) die('Invalid syntax detected.');
		$out[] = array($stack[--$count] , $b);
		$offset = $b + 1;
		$b = strpos($data, $closeTag, $offset);
	}
}
if ($count!=0) die('Invalid syntax detected');
sort($out);
return $out;
}
function my_getPairs($data, $openTag, $closeTag){
$out = array();
$stack = array();
$offset = 0;
$l = strlen($openTag);
$a = stripos($data, $openTag, $offset);
$b = stripos($data, $closeTag, $offset);
$count = 0;
while(1){		
	if ($a===false and $b===false) break; 
	if ($a!==false and $a<$b){
		array_push($stack, $a + $l);
		$count++;
		$offset = $a + 1;
		$a = stripos($data, $openTag, $offset);
	} else {
		if ($count < 1) die('Invalid syntax detected.');
		$out[] = array(array_pop($stack), $b);
		$count--;
		$offset = $b + 1;
		$b = stripos($data, $closeTag, $offset);
	}
}
if ($count!=0) die('Invalid syntax detected');
sort($out);
return $out;
}
?></pre>

output in my local wamp is

getPairs -     0.020164012908936
my_getPairs -  0.080213069915771
my_getPairs1 - 0.010028839111328

btw. way you use variable $last as static?

ups i make kistake

code must be

<pre><?php

/*

1		67
20		107
43		116
77		153
125		161



*/

$data = <<<DATA
{loop}
Some data
{loop}
	Nested data
	{loop}
		More nests!
	{/loop}
	{loop}
		Another deep nest
	{/loop}
{/loop}
{loop}
	Not quite so deep
{/loop}
{/loop}
{loop}
More data
{loop}
	Foobar
{/loop}
{loop}
	Warmer
	{loop}
		Waaaarmer
		{loop}
			Too hot!
		{/loop}
	{/loop}
{/loop}
{loop}
	More depth
	{loop}
		this is getting stupid
	{/loop}
{/loop}
{/loop}
DATA;
$time = microtime(1);
for ($i=0; $i <100;$i++){
$pairs = getPairs($data, '{loop}', '{/loop}');
}
echo 'getPairs -     ', microtime(1) - $time,"\n";

$time = microtime(1);
for ($i=0; $i <100;$i++){
$pairs = my_getPairs($data, '{loop}', '{/loop}');
}
echo 'my_getPairs -  ', microtime(1) - $time, "\n";
$time = microtime(1);
for ($i=0; $i <100;$i++){
$pairs = my_getPairs1($data, '{loop}', '{/loop}');
}
echo 'my_getPairs1 - ',microtime(1) - $time;

//foreach( $pairs as $pair )
//	echo substr( $data, $pair[0], $pair[1]-$pair[0] ) . "\n\n\n--------\n\n\n";

function getPairs( $data, $openTag, $closeTag ) {
$opening = getOffsets( $data, $openTag );
$closing = getOffsets( $data, $closeTag, FALSE );
if( ($size = count($opening)) !== count($closing) )
			die( "Invalid syntax detected" );
$offsets = array();
for( $i = 1; $i < $size; $i++ ) {
	static $last = 0;
	if( isset($opening[$i+1]) === FALSE || $opening[$i+1] > $closing[$i] ) {
		$offsets = array_merge( $offsets, parseNest(array_slice($opening,$last,$i-$last+1), array_slice($closing,$last,$i-$last+1)) );
		$last = $i+1;
	}
}
return $offsets;
}
function getOffsets( $data, $tag, $open = TRUE ) {
$return = array();
$offset = 0;
while( ($c = strpos($data, $tag, $offset)) !== FALSE ) {
	$offset = strpos( $data, $tag, $c ) + 1;
	$return[] = $offset - 1 + ( $open == TRUE ? strlen($tag) : -1 );
}
return $return;
}
function parseNest( $opening, $closing ) {
$pairs = array();
foreach( array_reverse($opening) as $oVal ) {
	foreach( $closing as $cKey => $cVal )
		if( $oVal < $cVal ) {
			$pairs[] = array( $oVal, $cVal );
			unset( $closing[$cKey] );
			break;
		}
}
return array_reverse( $pairs );
}
function my_getPairs1($data, $openTag, $closeTag){
$out = array();
$stack = array();
$offset = 0;
$l = strlen($openTag);
$a = strpos($data, $openTag, $offset);
$b = strpos($data, $closeTag, $offset);
$count = 0;
while($a !== false or $b !== false){
	if ($a!==false and $a<$b){
		$offset = $a + $l;
		$stack[++$count] = $offset;
		$a = strpos($data, $openTag, $offset);
	} else {
		if ($count < 1) die('Invalid syntax detected.');
		$out[] = array($stack[$count--] , $b); // i change this line
		$offset = $b + 1;
		$b = strpos($data, $closeTag, $offset);
	}
}
if ($count!=0) die('Invalid syntax detected');
sort($out);
return $out;
}
function my_getPairs($data, $openTag, $closeTag){
$out = array();
$stack = array();
$offset = 0;
$l = strlen($openTag);
$a = stripos($data, $openTag, $offset);
$b = stripos($data, $closeTag, $offset);
$count = 0;
while(1){		
	if ($a===false and $b===false) break; 
	if ($a!==false and $a<$b){
		array_push($stack, $a + $l);
		$count++;
		$offset = $a + 1;
		$a = stripos($data, $openTag, $offset);
	} else {
		if ($count < 1) die('Invalid syntax detected.');
		$out[] = array(array_pop($stack), $b);
		$count--;
		$offset = $b + 1;
		$b = stripos($data, $closeTag, $offset);
	}
}
if ($count!=0) die('Invalid syntax detected');
sort($out);
return $out;
}
?></pre>

Archived

This topic is now archived and is closed to further replies.

×
×
  • Create New...

Important Information

We have placed cookies on your device to help make this website better. You can adjust your cookie settings, otherwise we'll assume you're okay to continue.