Random number to divide visitors into two groups -- is my code wrong?


Mike521

We recently launched a new version of our internal search results page. To test the difference, I'm dividing visitors into two groups at the time they do a search. They get a cookie recording which group they're in, so they stay in it for later searches.

For some reason I'm getting about 70% in group A and 30% in group B. I can't figure out what, if anything, is wrong with my code. Can someone review it and tell me if I'm crazy?

if ( isset( $_COOKIE["SearchTestGroup"] ) ) {
	// they're already in a group
	$group = $_COOKIE["SearchTestGroup"];
	echo "you're already in group $group";
	if ( $group == "B" ) {
		echo "<br>doing group B search";
		//doGroupBSearch( $theSearch );
	} else {
		echo "<br>doing group A search";
		//doGroupASearch( $theSearch );
	}
} else {
	// they're not in a group yet, so pick one at random
	$group = rand() & 1; // 0 = A, 1 = B
	// note: setcookie() sends a header, so it only works if no output
	// has been sent yet (or output buffering is on); the debug echo
	// below runs before it
	echo "<br><br>your random number is $group.";
	echo " A value of 0 puts you in group A. 1 puts you in group B.";
	// $cookieExpiration is assumed to be set earlier in the script
	if ( $group ) { // group B
		setcookie( "SearchTestGroup", "B", $cookieExpiration, "/", ".oursite.com" );
		// now perform the search
		echo "<br><br>cookie set as B";
		//doGroupBSearch( $theSearch );
	} else { // group A
		setcookie( "SearchTestGroup", "A", $cookieExpiration, "/", ".oursite.com" );
		// now perform the search
		echo "<br><br>cookie set as A";
		//doGroupASearch( $theSearch );
	}
}

How many test cases have you run? Is your sample size large enough to draw that conclusion? Keep in mind that a 50% chance only means the split approaches 50/50 in the long run; a small sample can easily come out lopsided.

I ran a modified version of the script that simply looped 1 million times and tallied up A vs. B (using the method I posted earlier). It came out almost exactly 50/50, so I'm confident the random split itself is accurate.

Regardless, so far we have 2,296 searches: 1,539 were in group A (67%) and 757 were in group B (33%).

Everything seems correct to me, but I can't figure out why there's such a big discrepancy.
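One way to put a number on the sample-size question is a quick z-score against a fair 50/50 split. This is only a minimal sketch, assuming the normal approximation to the binomial is good enough at this sample size; `splitZScore` is a hypothetical helper name, not something from the thread.

```php
<?php
// Hypothetical helper: how many standard deviations the observed
// group A count lies from what a fair 50/50 split would give.
function splitZScore(int $countA, int $countB): float
{
    $n = $countA + $countB;
    if ($n === 0) {
        throw new InvalidArgumentException('No observations to test.');
    }
    $expected = $n / 2;          // expected group A count if the split is fair
    $stdDev   = sqrt($n * 0.25); // binomial standard deviation with p = 0.5
    return ($countA - $expected) / $stdDev;
}

// Counts reported in the post above: 1,539 in group A, 757 in group B.
$z = splitZScore(1539, 757);
printf("z = %.2f\n", $z);
```

With the counts above, the z-score comes out at roughly 16. Chance alone rarely produces anything beyond about 3, so the skew does not look like a small-sample fluke. That suggests looking at how groups get assigned or counted in production, since the generator itself tests as fair.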

I was also going to suggest using the range parameters. mt_rand() is faster than rand(), correct?

Hmm, I'm not sure which is faster myself. I'll switch to it, but I don't think the method is the cause of the problem. Does everyone agree? If not, try this yourself:

$groupA = 0;
$groupB = 0;
for ( $i = 0; $i < 1000000; $i++ ) {
	$random = rand() & 1; // always 0 or 1
	if ( $random == 1 ) {
		//echo "<br> it's a 1 (group B)";
		$groupB++;
	} else {
		//echo "<br> it's a 0 (group A)";
		$groupA++;
	}
}
echo "<br>group A: $groupA<br>group B: $groupB";

I ran your function with rand() and mt_rand(), with and without range arguments. It came out at about 50% every time. I also included how long each run took, in seconds.

mt_rand()&1
A: 499430
B: 500570
Time: 2 seconds

mt_rand(0,1)
A: 500009
B: 499991
Time: 1 second

rand()&1
A: 500060
B: 499940
Time: 2 seconds

rand(0,1)
A: 500779
B: 499221
Time: 2 seconds
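The harness behind those numbers was not posted, so the following is only a sketch of how such a comparison might be run. `tallyGroups` and the `$methods` array are hypothetical names, and timings will vary by machine and PHP version.

```php
<?php
// Hypothetical harness: call $pick() $iterations times, count how often
// it returns 0 (group A) or 1 (group B), and time the whole loop.
function tallyGroups(callable $pick, int $iterations): array
{
    $counts = ['A' => 0, 'B' => 0];
    $start  = microtime(true);
    for ($i = 0; $i < $iterations; $i++) {
        // 0 maps to group A, 1 maps to group B, as in the thread
        $counts[$pick() === 1 ? 'B' : 'A']++;
    }
    return [$counts['A'], $counts['B'], microtime(true) - $start];
}

// The four variants compared in the post above.
$methods = [
    'mt_rand()&1'  => function () { return mt_rand() & 1; },
    'mt_rand(0,1)' => function () { return mt_rand(0, 1); },
    'rand()&1'     => function () { return rand() & 1; },
    'rand(0,1)'    => function () { return rand(0, 1); },
];

foreach ($methods as $label => $pick) {
    [$a, $b, $seconds] = tallyGroups($pick, 1000000);
    printf("%s  A: %d  B: %d  %.2f s\n", $label, $a, $b, $seconds);
}
```

Passing each variant in as a callable keeps the counting and timing code identical across all four, so any difference in the output comes from the generator call itself.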

From the manual:

Many random number generators of older libcs have dubious or unknown characteristics and are slow. By default, PHP uses the libc random number generator with the rand() function. The mt_rand() function is a drop-in replacement for this. It uses a random number generator with known characteristics using the Mersenne Twister, which will produce random numbers four times faster than what the average libc rand() provides.
