
Weighted probability


Porl123


<?php

$items = array(1 => 'item 1', 2 => 'item 2', 3 => 'item 3', 4 => 'item 4', 5 => 'item 5');

?>

 

I've got an array like that, and the script needs to pick an item from it at random. I could use array_rand() for that, but I was wondering whether there's a way to weight the probability by a specified number, so that items lower in the array are chosen more often.

If anyone gets what I mean and can help I'd really appreciate some guidance. Thanks!


Okay, I wrote this code for a bit of fun.

It's just an idea, and there are probably better ways, but the result will only be 5 if all X (currently 2) random numbers hit 5. Otherwise it will always be the lowest of all the random numbers generated (so the more numbers generated, the less chance of a high number).

 

<?php
$items = array(
1 => 'item 1', 
2 => 'item 2', 
3 => 'item 3', 
4 => 'item 4', 
5 => 'item 5'
);
$X = array();
$ID = count($items);

//The more of these the higher the chance of a lower number
$X[] = rand(1,$ID);
$X[] = rand(1,$ID);

foreach($X as $Y) if($Y<$ID) $ID= $Y;
echo $items[$ID];
?> 

 

Another option is to create a larger array in which the rare items appear once and the common items appear many times,

i.e.

<?php
$items = array(
array(1 => 'item 1'), 
array(1 => 'item 1'), 
array(1 => 'item 1'), 
array(1 => 'item 1'), 
array(1 => 'item 1'), 
array(1 => 'item 1'), 
array(1 => 'item 1'), 
array(2 => 'item 2'), 
array(2 => 'item 2'), 
array(3 => 'item 3'), 
array(4 => 'item 4'), 
array(4 => 'item 4'), 
array(5 => 'item 5'), 
);
$Item = $items[array_rand($items)];
$Key = current(array_keys($Item));
$Value = current($Item);
echo "$Key -> $Value";
?> 


Just in case you're curious, I wrote a for loop to do this 100 times and take the average of the id, and it ranged from about 1.7 to 2.3.

 

The code to take the average:

$items = array(
1 => 'item 1', 
2 => 'item 2', 
3 => 'item 3', 
4 => 'item 4', 
5 => 'item 5'
);
$count = 0;
for ($i = 0; $i < 100; $i++){
$X = array();
$ID = count($items);

//The more of these the higher the chance of a lower number
$X[] = rand(1,$ID);
$X[] = rand(1,$ID);

foreach($X as $Y) if($Y<$ID){ $ID= $Y;}
//echo $items[$ID];
$count += $ID;

}

echo ($count / $i);
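For anyone who wants exact numbers rather than a 100-run average, here is a small sketch that works out the distribution of "lowest of several rolls" directly. The helper name minOfDrawsDistribution() is made up for illustration, and it assumes every roll is uniform on 1 to $n and independent of the others.

```php
<?php
// Exact distribution of the lowest of $draws independent rolls on 1..$n.
// minOfDrawsDistribution() is an illustrative helper, not code from the thread.
function minOfDrawsDistribution($n, $draws)
{
    $dist = array();
    for ($k = 1; $k <= $n; $k++) {
        // P(min >= k) = ((n - k + 1) / n)^draws, so
        // P(min == k) = P(min >= k) - P(min >= k + 1)
        $atLeastK = pow(($n - $k + 1) / $n, $draws);
        $atLeastNext = pow(($n - $k) / $n, $draws);
        $dist[$k] = $atLeastK - $atLeastNext;
    }
    return $dist;
}

// 5 items, 2 rolls, as in the snippet above
$dist = minOfDrawsDistribution(5, 2);
$expected = 0;
foreach ($dist as $id => $p) {
    printf("item %d: %.2f%%\n", $id, $p * 100);
    $expected += $id * $p;
}
printf("expected id: %.2f\n", $expected);
```

With 5 items and 2 rolls this comes out at 36%, 28%, 20%, 12% and 4% for items 1 to 5, with an expected id of 2.2, which sits inside the 1.7 to 2.3 range seen in the test runs.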

 


Here's some output with items 1 to 10, using 2 random numbers:

item 6, item 3, item 5, item 5, item 2, item 1, item 10, item 4, item 1, item 6, item 5, item 6, item 3, item 2, item 7, item 1, item 3, item 5, item 5, item 2, item 3, item 6, item 9, item 1, item 6, item 3, item 2, item 1, item 1, item 5, item 4, item 2, item 4, item 5, item 1, item 3, item 2, item 8, item 2, item 6, item 8, item 8, item 3, item 4, item 1, item 2, item 2, item 1, item 2, item 8, item 9, item 4, item 6, item 2, item 3, item 2, item 3, item 3, item 8, item 6, item 8, item 8, item 7, item 3, item 4, item 5, item 1, item 6, item 6, item 4, item 8, item 2, item 4, item 2, item 5, item 7, item 4, item 5, item 3, item 1, item 1, item 5, item 6, item 5, item 2, item 1, item 2, item 8, item 4, item 4, item 8, item 1, item 9, item 1, item 5, item 7, item 4, item 2, item 2, item 5

 

And here with 3 random numbers:

item 5, item 1, item 1, item 5, item 5, item 2, item 2, item 2, item 1, item 4, item 3, item 2, item 7, item 3, item 7, item 1, item 2, item 4, item 1, item 1, item 2, item 3, item 1, item 4, item 5, item 5, item 2, item 1, item 6, item 2, item 2, item 4, item 4, item 6, item 1, item 1, item 7, item 1, item 2, item 3, item 2, item 3, item 4, item 1, item 2, item 5, item 3, item 6, item 1, item 5, item 2, item 2, item 2, item 3, item 2, item 4, item 2, item 2, item 3, item 1, item 1, item 7, item 3, item 4, item 2, item 5, item 3, item 9, item 2, item 1, item 6, item 5, item 4, item 4, item 4, item 3, item 1, item 6, item 1, item 2, item 1, item 2, item 2, item 8, item 3, item 1, item 7, item 1, item 1, item 1, item 1, item 1, item 8, item 1, item 1, item 1, item 6, item 7, item 7, item 2

 

Feel free to play with it, it was written for a little fun. :)

 

EDIT: Oh, the reason I used

$X[] = rand(1,$ID);

twice is so you could play with ideas, for example:

$X[] = rand(1,$ID);
//if $something is true, set this draw to the highest number, so in effect only 1 random number is used instead of 2
$X[] = ($something == true)?$ID:rand(1,$ID);

//if $somethingelse is true, this draw is at least 5
$X[] = ($somethingelse == true)?rand(5,$ID):rand(1,$ID);

 

I hope you get the idea

 

I assumed this was for a game or something..  :shrug:


Weighted probability isn't that simple.

With weighted probability, you assign each item a weight that reflects how common it is.

 

 

<?php
  $items = array(
    array('lint',100),
    array('copper coin',75),
    array('silver coin',50),
    array('gold coin',10)
  );
  
  $item_count=count($items);
  
  // Find the total weight of all items
  $total_weight=0;
  foreach($items as $val)
  {
      $total_weight+=$val[1];
  }
  // add in percentage markers
  foreach($items as $key=>$val)
  {
      // we want to keep percentage markers as int
      // so multiplying the marker by 10,000, gives us 2 extra digits, a bit finer resolution
      $weight=(int)(($val[1]/$total_weight)*10000);
      $item_perc[]=$weight;
      $item_perc_index[$weight]=$key;
  }
  // sort our percentage markers
  sort($item_perc);
  
  // initialize our stat counter
  foreach($item_perc_index as $val)
    $stat[$val]=0;
  for($loop=0;$loop<500;$loop++)
  {
      $chance=rand(0,10000);
      // Check Against Item Percent Markers
      $marker=0;
      while($chance>$item_perc[$marker] && ($marker+1)<count($item_perc)) $marker++;
      // Get our item index from the marker
      $found=$item_perc_index[$item_perc[$marker]];
      // Statistics
      $stat[$found]++;
  }
  
  // Display Found stats for each item
  header('Content-Type: text/plain');
  echo "Loop of {$loop}\n";
  foreach($stat as $key=>$val)
  {
      echo "{$items[$key][0]} found {$val} times (%". (($val/$loop)*100) .")\n";
  }
  
  
/* Output
Loop of 500
lint found 335 times (%67)
copper coin found 60 times (%12)
silver coin found 83 times (%16.6)
gold coin found 22 times (%4.4)

*/
?>

 


If you're using a percentage then the code is simple:

<?php
$items = array(
array('lint',100),
array('copper coin',75),
array('silver coin',50),
array('gold coin',10)
);
function getItem(){
global $items;
$rand = rand(0,100);
$new_items = array();
foreach($items as $i => $item){
	if($item[1] >= $rand) $new_items[] = $i;
}
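// array_rand() returns a key of $new_items, so look up the item index stored there
$ID = $new_items[array_rand($new_items)];
return $ID;
}
$array = array(0,0,0,0);
// run exactly 500 picks so the /500 below matches
for($n=0;$n<500;$n++){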
$ID = getItem();
$array[$ID]++;
//echo $items[$ID][0]." - ".$items[$ID][1]."\n";
}
foreach($array as $K => $arr){
echo $items[$K][0]."->".round($arr/500*100)."%\n";
}
?> 


You're still not accounting for the weight system.

Try looking at the stats for each item.

You will see that yours is pretty much random, while mine consistently matches up, with a slight variation.

In a weighted system, the percentages match the weights.

 

$total_weight=sum(items_weight)

So the percentage of each item depends on all the items together.

 

 


Here's one way you can do it with percentages:

 

$items = array(
array('lint', 40),
array('copper coin', 30),
array('silver coin', 20),
array('gold coin', 10)
);

function pickItem($items)
{
$hat = array();
foreach($items as $item)
{
	$hat = array_merge($hat, array_fill(0, $item[1], $item[0]));
}
return $hat[array_rand($hat)];
}

 

Example:

 

$picks = 500;

for($i = 0;$i < $picks;$i++)
{
$picked[] = pickItem($items);
}

$counts = array_count_values($picked);
arsort($counts);

foreach($counts as $item => $freq)
{
echo "$item was picked $freq times (" . ($freq / $picks * 100) . "%)<br />\n";
}

 

Output:

 

lint was picked 195 times (39%)
copper coin was picked 155 times (31%)
silver coin was picked 96 times (19.2%)
gold coin was picked 54 times (10.8%)


Exactly, AlexWD.

 

The only difference between your code and mine is that the weights in your item list must always add up to 100, so you have to adjust the values whenever you add new items. But it's a good example of a weighted system: as your output shows, the results are approximately what you assigned as weights.

That is why I use the system shown in my example: total_weight = sum(item_weight), and item_percentage = item_weight / total_weight. That way you can assign weights that carry some meaning:

define('RARITY_FREQUENT',1000);
define('RARITY_COMMON',600);
define('RARITY_UNCOMMON',500);
define('RARITY_UNIQUE',250);
define('RARITY_RARE',100);
define('RARITY_LEGENDARY',10);

and the code will balance out the system, without readjusting all the other values.
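To make that concrete, here is a rough sketch of how named rarity weights turn into percentages, and how adding an item shifts everything without touching the existing weights. The item names and the weightsToPercentages() helper are made up for illustration, and only the constants the example uses are defined.

```php
<?php
// Rarity weights taken from the list above (only the ones used here).
define('RARITY_FREQUENT', 1000);
define('RARITY_COMMON', 600);
define('RARITY_RARE', 100);
define('RARITY_LEGENDARY', 10);

// Hypothetical loot table: item name => weight
$loot = array(
    'lint'        => RARITY_FREQUENT,
    'copper coin' => RARITY_COMMON,
    'gold coin'   => RARITY_RARE,
);

// Illustrative helper: each item's share is its weight over the total weight.
function weightsToPercentages(array $weights)
{
    $total = array_sum($weights);
    $percentages = array();
    foreach ($weights as $name => $weight) {
        $percentages[$name] = $weight / $total * 100;
    }
    return $percentages;
}

$before = weightsToPercentages($loot);

// Add a new item; none of the existing weights need to change.
$loot['dragon scale'] = RARITY_LEGENDARY;
$after = weightsToPercentages($loot);

foreach ($after as $name => $percent) {
    printf("%s: %.2f%%\n", $name, $percent);
}
```

The percentages always add up to 100 however many items are in the list, which is the point of dividing by the total rather than hard-coding percentages.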

 

But it's some good code nonetheless.

 

 


Actually, I rechecked your code. Your system can support the kind of scheme I describe above, but I would avoid using those values as they are. I would reduce the assigned weights as much as I can, since the code creates an array entry for every unit of weight. But it's some nice code nonetheless.

 


<?php
$items = array(
'lint' => 40,
'copper coin' => 30,
'silver coin' => 20,
'gold coin' => 10
);
echo $c = rand(1, array_sum($items));
foreach ($items as $item => $v){
if ($c > $v) $c -=$v; else break;
}
echo $item;
?>


That won't work very well when you have two items of the same weight.

That's the reason for the conversion to percentages.

$items = array(
   'lint' => 40,
   'copper coin' => 30,
   'silver coin' => 15,
   'gold coin' => 5,
   'bank note' => 5
);

 

It will never display bank note, as it has the same value as gold coin, and you have to keep the items in highest-to-lowest order.

 


I do have a question, laffin. I tried the code you posted and noticed that silver always displays a higher percentage than the copper coin. I do not think that is correct, but I could not see why it was doing it. Just curious, as I do find this subject interesting.


That won't work very well when you have two items of the same weight.

That's the reason for the conversion to percentages.

$items = array(
   'lint' => 40,
   'copper coin' => 30,
   'silver coin' => 15,
   'gold coin' => 5,
   'bank note' => 5
);

 

It will never display bank note, as it has the same value as gold coin, and you have to keep the items in highest-to-lowest order.

 

I tried:
<?php
$items = array(
   'lint' => 40,
   'copper coin' => 30,
   'silver coin' => 15,
   'gold coin' => 5,
   'bank note' => 5
);
$total = array(
   'lint' => 0,
   'copper coin' => 0,
   'silver coin' => 0,
   'gold coin' => 0,
   'bank note' => 0
);
for ($i=0;$i<2000;$i++) {
$c = rand(1, array_sum($items));
foreach ($items as $item => $v){
if ($c > $v) $c -=$v; else break;
}
$total[$item]++;
//echo $item, "\n";
}
print_r($total);
?>

and it outputs:

X-Powered-By: PHP/5.2.0
Content-type: text/html

Array
(
    [lint] => 835
    [copper coin] => 615
    [silver coin] => 342
    [gold coin] => 103
    [bank note] => 105
)
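Another way to check this, without relying on a random run, is to feed the subtract-and-break loop every possible roll once and count which item each roll lands on. This is only a sketch: pickForRoll() is an illustrative wrapper around the same loop, not a function from the thread.

```php
<?php
$items = array(
    'lint'        => 40,
    'copper coin' => 30,
    'silver coin' => 15,
    'gold coin'   => 5,
    'bank note'   => 5,
);

// Illustrative wrapper around the subtract-and-break loop:
// walk the items, subtracting each weight until the roll fits.
function pickForRoll(array $items, $roll)
{
    $item = null;
    foreach ($items as $item => $weight) {
        if ($roll > $weight) {
            $roll -= $weight;
        } else {
            break;
        }
    }
    return $item;
}

// Try every roll from 1 to the total weight exactly once.
$total = array_sum($items);
$hits = array_fill_keys(array_keys($items), 0);
for ($roll = 1; $roll <= $total; $roll++) {
    $hits[pickForRoll($items, $roll)]++;
}

print_r($hits);
```

Each item is hit exactly as many times as its weight, so gold coin and bank note both get 5 of the 95 rolls. Equal weights are fine, and the order of the items does not change the counts.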


I do have a question, laffin. I tried the code you posted and noticed that silver always displays a higher percentage than the copper coin. I do not think that is correct, but I could not see why it was doing it. Just curious, as I do find this subject interesting.

The script picks a number from 0 to 10000, and:

for 0 to 425 it chooses the 4th option (426 numbers), about 4.26%

for 426 to 2127 it chooses the 3rd option (1702 numbers), about 17.02%

for 2128 to 3191 it chooses the 2nd option (1064 numbers), about 10.64%

for 3192 to 10000 it chooses the 1st option (6809 numbers), about 68.08%
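Those ranges can be checked by running the marker logic from the earlier script over every possible value of $chance instead of 500 random ones. The sketch below reuses that script's variable names; the exhaustive loop and the $hits counter are additions for illustration.

```php
<?php
$items = array(
    array('lint', 100),
    array('copper coin', 75),
    array('silver coin', 50),
    array('gold coin', 10),
);

$total_weight = 0;
foreach ($items as $val) {
    $total_weight += $val[1];
}

// Same (non-cumulative) percentage markers as the earlier script
$item_perc = array();
$item_perc_index = array();
foreach ($items as $key => $val) {
    $weight = (int)(($val[1] / $total_weight) * 10000);
    $item_perc[] = $weight;
    $item_perc_index[$weight] = $key;
}
sort($item_perc);

// Instead of rolling at random, try every value rand(0,10000) can return
$hits = array_fill(0, count($items), 0);
for ($chance = 0; $chance <= 10000; $chance++) {
    $marker = 0;
    while ($chance > $item_perc[$marker] && ($marker + 1) < count($item_perc)) {
        $marker++;
    }
    $hits[$item_perc_index[$item_perc[$marker]]]++;
}

foreach ($hits as $key => $count) {
    echo $items[$key][0], ': ', $count, " of 10001\n";
}
```

Because the markers are each item's own percentage rather than a running total, an item only receives the gap between its marker and the next smaller one. Copper gets 3191 - 2127 = 1064 values while silver gets 2127 - 425 = 1702, which is why silver keeps coming out ahead.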


Maybe I was not making myself clear or what not, but what I was getting at is that this:

 

Loop of 500
lint found 348 times (%69.6)
copper coin found 46 times (%9.2)
silver coin found 82 times (%16.4)
gold coin found 24 times (%4.8)

 

Silver is always retrieved more often than copper, yet copper is supposed to be weighted more heavily (given the 75). Is it simply because 50 is the median and will pull in more hits as opposed to 75, since that is closer to one side as opposed to being in the exact middle?


Well, I guess I meant that 50 is dead in between the lowest number of 0 and the highest of 100. I am just trying to figure out why laffin's script does not properly weight the items: given that copper has a weight of 75, should it not have a higher percentage than silver? My guess was that 50 would be chosen more often since it is dead center, whereas copper is more towards the upper end.

 

Just wondering :) (Sorry for the confusion, I never was "great" at math problems.)


Actually, I had some buggy code. I should have used the percentage range rather than the percentage.

But some of this talk has given rise to a similar system (with the original percentage system also included):

 

<?php
  $items = array(
    array('lint',40),
    array('copper coin',35),
    array('silver coin',15),
    array('gold coin',5),
    array('bank note',5)
  );
  
  // How many loops should we do
  $loops=500;
  
  // Find the total weight of all items
  $total_weight=0;
  foreach($items as $key=>$val)
  {
        $items[$key][2]=$total_weight;
        $total_weight=$items[$key][3]=$val[1]+$total_weight;
        
  }
  
  // initialize our stat counter
  foreach(array_keys($items) as $val)
    $stat[$val]=0;
  for($loop=0;$loop<$loops;$loop++)
  {
      // roll from 1 so each item owns exactly as many values as its weight
      $chance=rand(1,$total_weight);
      // Check against each item's cumulative upper bound
      foreach($items as $key=>$val)
        if($chance<=$val[3]) break;
      // Statistics
      $stat[$key]++;
  }
  
  // Display Found stats for each item
  header('Content-Type: text/plain');
  
  echo "Loop of {$loops}\n";
  echo "Just by weights\n";
  foreach($items as $key=>$val)
  {
      echo "{$val[0]} found {$stat[$key]} times (%". (($stat[$key]/$loop)*100) .") base weight ($val[1]) base percentage(%". number_format((($val[1]/$total_weight)*100),2) .")\n";
  }
  
  // original code with percentile system
  $item_count=count($items);
  
  // Find the total weight of all items
  $total_weight=0;
  foreach($items as $val)
  {
      $total_weight+=$val[1];
  }
  // add in percentage markers
  $weight=0;
  foreach($items as $key=>$val)
  {
      // we want to keep percentage markers as int
      // so multiplying the marker by 10,000, gives us 2 extra digits, a bit finer resolution
      $weight+=(int)(($val[1]/$total_weight)*10000);
      $item_perc[$val[0]]=$weight;
  }
  // sort our percentage markers
  sort($item_perc);
  
  // initialize our stat counter
  foreach(array_keys($item_perc) as $val)
    $stat[$val]=0;
  for($loop=0;$loop<$loops;$loop++)
  {
      // roll 0..9999 so the 10,000 possible values split exactly along the markers
      $chance=rand(0,9999);
      // Check Against Item Percent Markers
      foreach($item_perc as $key=>$val)
        if($chance<$val) break;
      // Statistics
      $stat[$key]++;
  }
  
  // Display Found stats for each item  
  echo "\npercentage system\n";
  foreach($items as $key=>$val)
  {
      echo "{$val[0]} found {$stat[$key]} times (%". (($stat[$key]/$loop)*100) .") base weight ($val[1]) base percentage(%". number_format((($val[1]/$total_weight)*100),2) .")\n";
  }
  
?>

 

Output:

Loop of 500
Just by weights
lint found 201 times (%40.2) base weight (40) base percentage(%40.00)
copper coin found 180 times (%36) base weight (35) base percentage(%35.00)
silver coin found 72 times (%14.4) base weight (15) base percentage(%15.00)
gold coin found 19 times (%3.8) base weight (5) base percentage(%5.00)
bank note found 28 times (%5.6) base weight (5) base percentage(%5.00)

percentage system
lint found 185 times (%37) base weight (40) base percentage(%40.00)
copper coin found 171 times (%34.2) base weight (35) base percentage(%35.00)
silver coin found 85 times (%17) base weight (15) base percentage(%15.00)
gold coin found 28 times (%5.6) base weight (5) base percentage(%5.00)
bank note found 31 times (%6.2) base weight (5) base percentage(%5.00)
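For reuse, the "just by weights" approach can be folded into one small function. This is a minimal sketch, not laffin's code: pickWeighted() and its optional $roll parameter are hypothetical names, and it assumes positive integer weights. Passing $roll explicitly makes the boundaries easy to check by hand.

```php
<?php
// Hypothetical helper: pick a key from name => weight using cumulative weights.
// $roll must be between 1 and the sum of the weights; if omitted, one is rolled.
function pickWeighted(array $weights, $roll = null)
{
    $total = array_sum($weights);
    if ($roll === null) {
        $roll = mt_rand(1, $total);
    }
    $upper = 0;
    foreach ($weights as $name => $weight) {
        // $upper is the highest roll that still belongs to this item
        $upper += $weight;
        if ($roll <= $upper) {
            return $name;
        }
    }
    return null; // only reached for an empty list or an out-of-range roll
}

$weights = array(
    'lint'        => 40,
    'copper coin' => 35,
    'silver coin' => 15,
    'gold coin'   => 5,
    'bank note'   => 5,
);

echo pickWeighted($weights), "\n";      // random pick
echo pickWeighted($weights, 40), "\n";  // lint (last roll in lint's range)
echo pickWeighted($weights, 41), "\n";  // copper coin (first roll in its range)
```

Rolling from 1 to the total means each item owns exactly as many rolls as its weight, so the weights do not have to add up to 100 and no sorting is needed.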
