Jump to content

Generating random strings to all column


afaaro
Go to solution Solved by afaaro,

Recommended Posts

Hello  everyone,

 

Is there any better than this code: generating 11 characters

dbquery("UPDATE " . POST . "
      SET post_url = CONCAT(
          SUBSTR('abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789+-_', FLOOR(1+RAND()*62), 1),
          SUBSTR('abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789+-_', FLOOR(1+RAND()*62), 1),
          SUBSTR('abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789+-_', FLOOR(1+RAND()*62), 1),
          SUBSTR('abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789+-_', FLOOR(1+RAND()*62), 1),
          SUBSTR('abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789+-_', FLOOR(1+RAND()*62), 1),
          SUBSTR('abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789+-_', FLOOR(1+RAND()*62), 1),
          SUBSTR('abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789+-_', FLOOR(1+RAND()*62), 1),
          SUBSTR('abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789+-_', FLOOR(1+RAND()*62), 1),
          SUBSTR('abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789+-_', FLOOR(1+RAND()*62), 1),
          SUBSTR('abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789+-_', FLOOR(1+RAND()*62), 1),	              
          SUBSTR('abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789+-_', FLOOR(1+RAND()*62), 1)
      )");
Link to comment
Share on other sites

I just want to point out that neither your solution nor Barand's is quite the same as your original code, in that your original allows for duplicate characters to show up, whereas the solutions presented will always have 11 unique (case-sensitive) characters. If you want to stick with how the original works, you can do this:

 

$c='abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789+-_'; 
$randstr='';
for ($x=0;$x<10;$x++)
  $randstr.=$c{rand(0,strlen($c)-1)}; 
Link to comment
Share on other sites

okay well, if you don't care about whether or not the random strings can have duplicate characters in them, then what you have now is fine.. but that does bring up another point to consider.. do you care about the possibility that the generated strings between posts could potentially be duplicate? Because the way you have it now, it is possible that row#1 and row#2 could both generate the same random string. My suggestion doesn't solve for that either, btw, but just mentioning it, in case you didn't consider that..

Link to comment
Share on other sites

If I use this way it generates same charactors to all posts

$c='abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789+-_'; 
$randstr='';
for ($x=0;$x<10;$x++)
  $randstr.=$c{rand(0,strlen($c)-1)}; 
dbquery("UPDATE ".POST." SET post_url = '$randstr'");
Edited by afaaro
Link to comment
Share on other sites

If you're willing to change the format/length of the unique string up a bit, I would suggest using UUID(). If you want to keep with the original available chars and string length, etc.. then the solution will basically involve using php to basically put my code in a while loop that doesn't end until count == [number of rows] and then updating each row 1 at a time.

Link to comment
Share on other sites

If I use this way it generates same charactors to all posts

$c='abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789+-_'; 
$randstr='';
for ($x=0;$x<10;$x++)
  $randstr.=$c{rand(0,strlen($c)-1)}; 
dbquery("UPDATE ".POST." SET post_url = '$randstr'");

 

Right, I know. I was just giving code example to generate a unique string based on your original code. With my code (and Barand's) you'd have to use a loop and update 1 row at a time.

 

But your original code logic shows you generating each character of the random string from the same pool of characters, but your new code (and also Barand's) effectively reduces the character pool for each character. IOW your original code could conceivably give you "11111111111" whereas yours and Barand's would never result in any of the characters being duplicate. If you are fine with that, then you're set - except for the next issue I brought up.

 

The next issue I brought up is in essence your concern that I just quoted : duplicate strings in rows. As it stands now, all of the solutions presented (including all of yours) could conceivably yield duplicate strings in rows. So if each row must be guaranteed to have a unique string, you're going to have to come up with a different method of generating the string, which is where my last post comes in:

 

If you're willing to change the format/length of the unique string up a bit, I would suggest using UUID(). If you want to keep with the original available chars and string length, etc.. then the solution will basically involve using php to basically put my code in a while loop that doesn't end until count == [number of rows] and then updating each row 1 at a time.

Link to comment
Share on other sites

Using this one updates all rows and generates different charactors

 

dbquery("UPDATE ".POST." SET post_url = CONCAT(SUBSTR('abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789+-_', FLOOR(1+RAND()*62), 11))");

Right, it does update all the rows, and it seems like they are different strings if you just glance at it, and it's possible that they may in fact be unique strings. However, it's possible that this code will generate duplicate strings. If you don't believe me, run a query that selects for duplicate strings. Then run that query above again and then select query again. Wash rinse repeat and you will eventually get dupes. A more practical test would be to reduce the number of available characters and watch what happens.

Link to comment
Share on other sites

This post right here, the one I've posted and quoted twice now:

 

If you're willing to change the format/length of the unique string up a bit, I would suggest using UUID(). If you want to keep with the original available chars and string length, etc.. then the solution will basically involve using php to basically put my code in a while loop that doesn't end until count == [number of rows] and then updating each row 1 at a time.

Link to comment
Share on other sites

I have just written a 1000 records to a table using your "random" value generator, ie using

$sql = "INSERT INTO afaaro (item) VALUES 
    (CONCAT(SUBSTR('abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789+-_', FLOOR(1+RAND()*62), 11)))";

There were many duplicate value. Here are the duplicated values and the number of times they were duplicated

mysql> SELECT item, COUNT(*) as dupes
    -> FROM afaaro
    -> GROUP BY item
    -> HAVING dupes > 1;
+-------------+-------+
| item        | dupes |
+-------------+-------+
| 0123456789+ |    16 |
| 123456789+- |    15 |
| 23456789+-_ |    18 |
| 3456789+-_  |    22 |
| 456789+-_   |    17 |
| 56789+-_    |    16 |
| 6789+-_     |    20 |
| 789+-_      |    18 |
| 89+-_       |    12 |
| 9+-_        |    17 |
| ABCDEFGHIJK |    31 |
| BCDEFGHIJKL |    39 |
| CDEFGHIJKLM |    35 |
| defghijklmn |    31 |
| EFGHIJKLMNO |    43 |
| fghijklmnop |    29 |
| ghijklmnopq |    34 |
| hijklmnopqr |    32 |
| ijklmnopqrs |    34 |
| JKLMNOPQRST |    32 |
| KLMNOPQRSTU |    27 |
| lmnopqrstuv |    35 |
| mnopqrstuvw |    31 |
| nopqrstuvwx |    13 |
| opqrstuvwxy |    37 |
| PQRSTUVWXYZ |    30 |
| QRSTUVWXYZ0 |    17 |
| qrstuvwxyzA |    11 |
| RSTUVWXYZ01 |    17 |
| rstuvwxyzAB |    15 |
| STUVWXYZ012 |    14 |
| stuvwxyzABC |    15 |
| TUVWXYZ0123 |    13 |
| tuvwxyzABCD |    24 |
| UVWXYZ01234 |    16 |
| uvwxyzABCDE |    19 |
| VWXYZ012345 |    19 |
| vwxyzABCDEF |    11 |
| WXYZ0123456 |    20 |
| wxyzABCDEFG |    13 |
| XYZ01234567 |    11 |
| xyzABCDEFGH |     6 |
| YZ012345678 |    16 |
| yzABCDEFGHI |    19 |
| Z0123456789 |    21 |
| zABCDEFGHIJ |    19 |
+-------------+-------+

Also, 122 had values shorter than 11 chars

mysql> SELECT COUNT(*) as shortvals
    -> FROM afaaro
    -> WHERE LENGTH(item) < 11;
+-----------+
| shortvals |
+-----------+
|       122 |
+-----------+

FYI

Link to comment
Share on other sites

This code will generate unique, 11 char keys for each record assuming each one currently has a unique id (or other unique value) at the moment. If it hasn't, ALTER the table and add an auto_incrementing id first. You can remove it later if not required.

 

(Substitute your table and column names)

$db = new mysqli(HOST,USERNAME,PASSWORD,DATABASE); // use your credentials
error_reporting(-1);

function randomkey()
{
    $str = str_shuffle('abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789+-_');
    return substr($str, 0, 11);
}

/**************************************
* create an array
* using table ids as the keys
***************************************/
$sql = "SELECT id FROM afaaro";
$res = $db->query($sql);
while ($row = $res->fetch_row()) {
    $keys[$row[0]] = '';
}

/*************************************
* for each id, generate a unique key
**************************************/
foreach ($keys as &$v) {
    do {
        $k = randomkey();
    } while (in_array($k, $keys));
    $v = $k;
}

/*****************************************
* build case statements for table update
******************************************/
foreach ($keys as $id => $key) {
    $case[] = " WHEN id = $id THEN '$key'\n";
}

$sql = "UPDATE afaaro SET item = CASE \n" . join('', $case) . " END";
$db->query($sql) or die($db->error);

The generated update query looks like this

UPDATE afaaro 
SET item = CASE 
 WHEN id = 1 THEN 'Rbr1EMqHhU9'
 WHEN id = 2 THEN 'iaR69Cxs3n5'
 WHEN id = 3 THEN 'uOqRShXgZeb'
 WHEN id = 4 THEN '3VjtS1TUQKa'
 WHEN id = 5 THEN 'Eu+LqNpOkar'
.
.
.
 WHEN id = 995 THEN 'f7jJMVnBrLe'
 WHEN id = 996 THEN 'LIWhxlA-EQb'
 WHEN id = 997 THEN 'Niq8eEwplZG'
 WHEN id = 998 THEN 'DzkRevCmL1U'
 WHEN id = 999 THEN '+MKNiDCOum3'
 WHEN id = 1000 THEN '5mDcSgGrdU_'
 END
Link to comment
Share on other sites

Okay, let's summarize (again): Based on your (implied) requirements, your solution won't work.

 

1) Your original post implied that within a given string, there could be duplication of characters. But your current solution doesn't do this either. This was the original problem I had with your solution.

 

2) You have implied that each row must have a unique string, and your solution does not guarantee doing that.

 

3) You've also implied in your original post that you want the strings to be 11 chars long with a certain set of characters. Your current solution doesn't guarantee doing this either.

 

 

Your code fails on all 3 of these implied requirements, because all it does is grab (up to) an 11 char substring within your base string, based on a random offset. So,

 

1) Your original post had code that picks a random position in your base string 11 times. This gave you an 11 char long random string every time. However, since you had it picking from the same pool every time, it was possible that within that 11 char string, the same character could come up more than once. Overall, I'm getting the impression that you don't actually care about this, as long as the string as a whole is unique for all rows. But on that note...

 

2) At most, there will only be 65 different string combinations generated from your code, because there are only 65 positions in your base string, for your substring position to start at. So it is impossible to generate 1000 strings and not get duplicates. Check your table. You will see duplicates.

 

3) You pick a random number and use that as the offset for substring. Your base string is 65 chars long. If the random number generated for the offset is say, 60, then you're going to end up with a 5 char long string. This is why you see strings shorter than 11 chars in Barand's example, and if you look in your own table, you will likely see this as well. If you don't, then that's sheer luck, since there's only 54 combinations that will have a full 11 chars.

 

In short, your "combinations" aren't really combinations at all, nor are they unique by any means. They are simply substrings within your base string.

 

 

So you have implied 2 requirements (I'm only counting #2 and #3 above, since I don't think you really care about #1):

 

1) that the string be unique for each row

2) that the string have 11 chars and use the following characters: 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789+-_'

 

So the question I have asked you (many times now), is whether or not you really care about requirement #2. If all you really care about is requirement #1 being fulfilled, then all you need to do is use UUID() as the generated random string. You don't need a base string that lists all the letters and numbers out; nothing. UUID() will generate a string that looks something like this: 6ccd780c-baba-1026-9564-0040f4311e29. And it will be a unique string every time it's called. And you can continue to use this for future rows and not worry about or check for collision, because it is a time-based id. It's that simple.

 

But the caveats of using UUID() is that a) it doesn't fulfill your implied 11 char string requirement, and b) it doesn't incorporate some of the characters you list in your base string. If you are fine with this, then use UUID() and you are done..

 

If, however, you really want to stick with using your base characters and have 11 char long strings, things will be significantly more complex. You will have to have code that looks like this (pseudocode)

 

$c='abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789+-_'; 

/* 
  TODO: write some code to select all row ids from database.. 
  your table does have an id row acting as primary key / index, right?
  put them into an array called $rowIDs
*/

// generate a random string for each one, using logic to make sure there are no duplicates generated
$numRows = count($rowIDs);
$randomStrings = array();
while ( count($randomStrings) < $numRows ) {
  $randstr='';
  for ($x=0;$x<10;$x++)
    $randstr.=$c{rand(0,strlen($c)-1)}; 
  if ( !in_array($randstr,$randomStrings)
    $randomStrings[] = $randstr;
}

/* 
  TODO: Now we have an array $randomStrings that has as many unique 
  ids as there are rows. They are all 11 chars long, using your specified char pool
  Now you need to update each row individually
*/
foreach ($rowIDs as $key => $id) {
  dbquery("UPDATE ".POST." SET post_url = '{$randomStrings[$key]}' WHERE id = '{$id}'" );
}
So you will have to fill in the blanks on some of that, but that's the principle. But this only solves for updating what's in there NOW. What happens if you have more columns in the future? You will basically need to have code that generates an id string like above, but put it in a loop that keeps generating until it generates one that isn't a duplicate of what's already in there. Therefore, clearly the best, simplest route forward is to use UUID().

 

Use UUID(). Do it. You know what would be great? If there was some sort of built-in sql function that will generate a unique id. Oh wait, there is: UUID(). You cannot resist the power of the UUIDarkside. DO IT.

Link to comment
Share on other sites

I have checked this one

	$result = dbquery("SELECT category_url, COUNT(*) as dupes FROM ".CATEGORY." GROUP BY category_url HAVING dupes > 1 ");
	echo "<table><tr>";
	while($data = dbarray($result)){
		echo "<td>".$data['category_url']."</td><td>".$data['dupes']."</td>";
	}
	echo "</tr></table>";

but nothing is returning, i don't know why

Edited by afaaro
Link to comment
Share on other sites

Good. That's result you want - none with a count > 1 . That means there are no duplicates.

 

If you remove "HAVING dupes > 1" you should get them all listed with counts of 1 each

mysql> SELECT item, COUNT(*) as dupes
    -> FROM afaaro
    -> GROUP BY item
    -> HAVING dupes > 1;

Empty set (0.07 sec)
Edited by Barand
Link to comment
Share on other sites

This thread is more than a year old. Please don't revive it unless you have something important to add.

Join the conversation

You can post now and register later. If you have an account, sign in now to post with your account.

Guest
Reply to this topic...

×   Pasted as rich text.   Restore formatting

  Only 75 emoji are allowed.

×   Your link has been automatically embedded.   Display as a link instead

×   Your previous content has been restored.   Clear editor

×   You cannot paste images directly. Upload or insert images from URL.

×
×
  • Create New...

Important Information

We have placed cookies on your device to help make this website better. You can adjust your cookie settings, otherwise we'll assume you're okay to continue.