webdev1 Posted September 2, 2013 Share Posted September 2, 2013 (edited) Hi, I am using php 5.4.7 & MySQL 5.5.21 I am working on a academic research project for analysing random sequences. I am using data from a lottery as source data as these are random sequences of numbers up to 50. I am using SQL query to do a table search on 3 random numbers (php): SELECT id, , GROUP_CONCAT(ball ORDER BY ball) balls FROM ( SELECT id Ball1 ball FROM random UNION SELECT id,Ball2 ball FROM random UNION SELECT id, Ball3 ball FROM random UNION SELECT id, Ball4 ball FROM random UNION SELECT id, Ball5 ball FROM random UNION SELECT id, Ball6 ball FROM random ) x WHERE ball IN ( $ball1, $ball2, $ball3, $ball4, $ball5, $ball6) GROUP BY id HAVING COUNT(*) = 3 which works for 3 chosen numbers however how do I create an SQL statement to find 3 common sequences (without selecting the numbers) that occur more than once? I assume it maybe possible with mysql regex statement but do not know how to create this statement. I hope someone can help a newbie. Thanks in advance. Edited September 4, 2013 by Zane Quote Link to comment Share on other sites More sharing options...
vinny42 Posted September 2, 2013 Share Posted September 2, 2013 Thaat query is weird, you select one Ball1 through 6 from the same record, but as separate rows, that smells like either a bad design, or you are trying to do something that I don't understand. As for your question, I think it would help if you gave an example of what you are looking for. Quote Link to comment Share on other sites More sharing options...
Barand Posted September 2, 2013 Share Posted September 2, 2013 Thaat query is weird, you select one Ball1 through 6 from the same record, but as separate rows, that smells like either a bad design, or you are trying to do something that I don't understand. Instead of normalizing the data he is using the six unioned queries to simulate the normalized table. Go figure! Quote Link to comment Share on other sites More sharing options...
webdev1 Posted September 3, 2013 Author Share Posted September 3, 2013 Hi thanks for replying. I have created a table populated with id & 6 columns for the random ball numbers. The query above does a search for 3 chosen numbers to see if they exist in the database. I want to analyse the data to find common sets of 3 number sequences and count how many times they exist without selecting the ball numbers to count. So a sequence of say 2 4 6 may exist more than once but I would like to know how many times. I can then use the count number for further analysis. Is this possible in SQL? Quote Link to comment Share on other sites More sharing options...
vinny42 Posted September 3, 2013 Share Posted September 3, 2013 Are you looking for sequences or combinations? If you are looking for 2 4 6, does 1,2,3,4,5,6 also count? Your current model doesn't make this easier, what you'd want is information about the draw, the number and the position of the number in the draw. so, 5,8,2,4,1,9 would be easier to process if it was stored as: draw_id, number, number_position 14, 5, 1 14, 8, 2 14, 2, 3 14, 4, 4 14 ,1, 5 14, 9, 6 because in ordr to know that the numbers are "in a sequence" you need to know their relative positions. I'm trying to think of an intelligent way to solve this problem, I'm getting flashes of "einsteins puzzle" which also has particular sequences and a *huge* number of permutations to check. it is solved by creatively self-joining. I'm not sure if that would work here though... Quote Link to comment Share on other sites More sharing options...
Barand Posted September 3, 2013 Share Posted September 3, 2013 Is this what you are after? SELECT balls, COUNT(*) as occurence FROM ( SELECT id, , GROUP_CONCAT(ball ORDER BY ball) balls FROM ( SELECT id Ball1 ball FROM random UNION SELECT id,Ball2 ball FROM random UNION SELECT id, Ball3 ball FROM random UNION SELECT id, Ball4 ball FROM random UNION SELECT id, Ball5 ball FROM random UNION SELECT id, Ball6 ball FROM random ) x WHERE ball IN ( $ball1, $ball2, $ball3, $ball4, $ball5, $ball6) GROUP BY id HAVING COUNT(*) = 3 ) y GROUP BY balls HAVING occurence > 1 Quote Link to comment Share on other sites More sharing options...
Barand Posted September 3, 2013 Share Posted September 3, 2013 or, if the order is significant SELECT balls, COUNT(*) as occurence FROM ( SELECT CONCAT_WS(',', Ball1, Ball2, Ball3) as balls, Ball1 as `A`, Ball2 as `B`, Ball3 as `C` FROM random UNION SELECT CONCAT_WS(',', Ball2, Ball3, Ball4) as balls, Ball2 as `A`, Ball3 as `B`, Ball4 as `C` FROM random UNION SELECT CONCAT_WS(',', Ball3, Ball4, Ball5) as balls, Ball3 as `A`, Ball4 as `B`, Ball5 as `C` FROM random UNION SELECT CONCAT_WS(',', Ball4, Ball5, Ball6) as balls, Ball4 as `A`, Ball5 as `B`, Ball6 as `C` FROM random ) x WHERE `A` IN ( $ball1, $ball2, $ball3, $ball4, $ball5, $ball6) AND `B` IN ( $ball1, $ball2, $ball3, $ball4, $ball5, $ball6) AND `C` IN ( $ball1, $ball2, $ball3, $ball4, $ball5, $ball6) GROUP BY balls HAVING occurence > 1 Quote Link to comment Share on other sites More sharing options...
webdev1 Posted September 3, 2013 Author Share Posted September 3, 2013 Thanks I will try this query. FYI the data is in numerical order for each draw. Table columns are id, ball1, ball2, ball3, ball4, ball5, ball6. Quote Link to comment Share on other sites More sharing options...
vinny42 Posted September 3, 2013 Share Posted September 3, 2013 the data is in numerical order for each draw Ok, so you are looking for how many times any random combination of three numbers occurs in all draws? That would give you results for 111 through 999, correct? Quote Link to comment Share on other sites More sharing options...
webdev1 Posted September 3, 2013 Author Share Posted September 3, 2013 This SQL query appears to be dependent upon input by selecting $ball1-6 in the WHERE section. It will not work if I use this in mysql and $ball* variable needs a number. I want an SQL query that does the search without any input. i.e. it finds common ball number matches with more than one instance. Is this possible? Quote Link to comment Share on other sites More sharing options...
Barand Posted September 3, 2013 Share Posted September 3, 2013 Then remove the WHERE clause Quote Link to comment Share on other sites More sharing options...
webdev1 Posted September 3, 2013 Author Share Posted September 3, 2013 Thank you however I get the error: #1248 - Every derived table must have its own alias when running: SELECT balls, COUNT(*) as occurenceFROM(SELECT CONCAT_WS(',', Ball1, Ball2, Ball3) as balls,Ball1 as `A`, Ball2 as `B`, Ball3 as `C`FROM randomUNIONSELECT CONCAT_WS(',', Ball2, Ball3, Ball4) as balls,Ball2 as `A`, Ball3 as `B`, Ball4 as `C`FROM randomUNIONSELECT CONCAT_WS(',', Ball3, Ball4, Ball5) as balls,Ball3 as `A`, Ball4 as `B`, Ball5 as `C`FROM randomUNIONSELECT CONCAT_WS(',', Ball4, Ball5, Ball6) as balls,Ball4 as `A`, Ball5 as `B`, Ball6 as `C`FROM random) Quote Link to comment Share on other sites More sharing options...
vinny42 Posted September 3, 2013 Share Posted September 3, 2013 You left out the 'x' which was the subquery's alias in the original query. Quote Link to comment Share on other sites More sharing options...
webdev1 Posted September 3, 2013 Author Share Posted September 3, 2013 (edited) Thank you I appreciate your help. With this SQL query I get the following result: balls occurence 3,5,14 5069 I wanted a list of more than one result and the occurrence appears to be incorrect as there are only 1845 rows in the table. The table structure is as follows and Ball data is numeric 1-50: idBall1Ball2Ball3Ball4Ball5Ball6 Can the SQL statement be changed to give a list like this: balls occurence 3,5,14 99 23,34,44 55 etc etc to find the 3 common combination occurrences and how many occurence times in sequential order by highest first Edited September 3, 2013 by webdev1 Quote Link to comment Share on other sites More sharing options...
vinny42 Posted September 3, 2013 Share Posted September 3, 2013 (edited) In order to get the count per combination you must add the group-by clause, and you should change the UNION to "UNION ALL" because just "UNION" will actually filter out duplicates. SELECT balls, COUNT(*) as occurenceFROM(SELECT CONCAT_WS(',', Ball1, Ball2, Ball3) as balls,Ball1 as `A`, Ball2 as `B`, Ball3 as `C`FROM randomUNION ALLSELECT CONCAT_WS(',', Ball2, Ball3, Ball4) as balls,Ball2 as `A`, Ball3 as `B`, Ball4 as `C`FROM randomUNION ALLSELECT CONCAT_WS(',', Ball3, Ball4, Ball5) as balls,Ball3 as `A`, Ball4 as `B`, Ball5 as `C`FROM randomUNION ALLSELECT CONCAT_WS(',', Ball4, Ball5, Ball6) as balls,Ball4 as `A`, Ball5 as `B`, Ball6 as `C`FROM random) AS t GROUP BY balls Edited September 3, 2013 by vinny42 Quote Link to comment Share on other sites More sharing options...
Barand Posted September 4, 2013 Share Posted September 4, 2013 (edited) You might want to add HAVING occurence > 1 ORDER BY occurence DESC at the end of the query Edited September 4, 2013 by Barand Quote Link to comment Share on other sites More sharing options...
webdev1 Posted September 4, 2013 Author Share Posted September 4, 2013 Thank you for your help. Your expertise is appreciated. The results are good but not accurate, there appears to be some combinations missing which I found by random search. A group of 3 numbers occurred 10 times and the highest result I am getting with this SQL query is 7. Is there anything missing? Quote Link to comment Share on other sites More sharing options...
vinny42 Posted September 4, 2013 Share Posted September 4, 2013 The query should not exclude anything. The only way that a combination might not bne counted correctly is if the numbers are stored as a varchar and the content is not a clean number, but has spaces, newlines etc in it. Then a "1" is different from a "1 " and a " 1". Can you replicate the problem using a small set of testdata that you can post here? Quote Link to comment Share on other sites More sharing options...
webdev1 Posted September 4, 2013 Author Share Posted September 4, 2013 The table random contains id (int 5) & ball1, ball2, ball3, ball4, ball5, ball6 all of int(2) types. There are no varchar types. I am using phpmyadmin (latest version) to do the SQL searches. Here is some test data (30 rows): 3 5 14 22 30 44 10 6 12 15 16 31 44 37 11 17 21 29 30 40 31 26 35 38 43 47 49 28 3 5 9 13 14 38 30 2 3 27 29 39 44 6 9 17 32 36 42 44 16 2 5 21 22 25 32 46 7 17 23 32 38 42 48 6 16 20 30 31 47 4 4 16 25 26 31 43 21 1 7 37 38 42 46 20 15 18 29 35 38 48 5 16 19 21 29 36 45 43 5 8 10 18 31 33 28 11 12 17 26 36 42 13 2 13 22 27 29 46 36 9 18 19 24 31 41 21 4 17 41 42 44 49 24 22 25 30 32 41 43 29 14 17 22 24 42 47 34 1 4 6 23 26 49 8 8 18 20 33 36 38 46 9 15 22 31 34 48 23 5 14 17 35 43 48 22 7 16 25 26 28 41 19 15 16 17 28 32 46 22 12 13 25 37 44 45 9 1 21 29 31 32 40 27 12 15 26 44 46 49 14 Without conducting test I think this should have more than 1 common occurrence of 3 Quote Link to comment Share on other sites More sharing options...
vinny42 Posted September 4, 2013 Share Posted September 4, 2013 Just to confirm, this data has seven numbers per row, the queries we've used so far have had 6.... Quote Link to comment Share on other sites More sharing options...
Barand Posted September 4, 2013 Share Posted September 4, 2013 I'm assuming the first on each row is the ID Quote Link to comment Share on other sites More sharing options...
vinny42 Posted September 4, 2013 Share Posted September 4, 2013 Then there are duplicate id's :-) Quote Link to comment Share on other sites More sharing options...
Barand Posted September 4, 2013 Share Posted September 4, 2013 So there are. Guess we need clarification from Webdev. Could do with some regarding what he's actually trying to do too. I confess to being confused about his requirements. So far I am assuming he is looking for repetitions of 3 balls drawn in sequence Quote Link to comment Share on other sites More sharing options...
webdev1 Posted September 4, 2013 Author Share Posted September 4, 2013 (edited) Thank you for your replies. The data sent earlier was from the wrong table. Here is the correct data (55 rows) comma delimited rows terminated with colon : 1, 3, 5, 14, 22, 30, 44: 2, 6, 12, 15, 16, 31, 44: 3, 11, 17, 21, 29, 30, 40: 4, 26, 35, 38, 43, 47, 49: 5, 3, 5, 9, 13, 14, 38: 6, 2, 3, 27, 29, 39, 44: 7, 9, 17, 32, 36, 42, 44: 8, 2, 5, 21, 22, 25, 32: 9, 7, 17, 23, 32, 38, 42: 10, 6, 16, 20, 30, 31, 47: 11, 4, 16, 25, 26, 31, 43: 12, 1, 7, 37, 38, 42, 46: 13, 15, 18, 29, 35, 38, 48: 14, 16, 19, 21, 29, 36, 45: 15, 5, 8, 10, 18, 31, 33: 16, 11, 12, 17, 26, 36, 42: 17, 2, 13, 22, 27, 29, 46: 18, 9, 18, 19, 24, 31, 41: 19, 4, 17, 41, 42, 44, 49: 20, 22, 25, 30, 32, 41, 43: 21, 14, 17, 22, 24, 42, 47: 22, 1, 4, 6, 23, 26, 49: 23, 8, 18, 20, 33, 36, 38: 24, 9, 15, 22, 31, 34, 48: 25, 5, 14, 17, 35, 43, 48: 26, 7, 16, 25, 26, 28, 41: 27, 15, 16, 17, 28, 32, 46: 28, 12, 13, 25, 37, 44, 45: 29, 1, 21, 29, 31, 32, 40: 30, 12, 15, 26, 44, 46, 49: 31, 2, 3, 17, 23, 40, 43: 32, 2, 3, 4, 19, 23, 40: 33, 2, 3, 17, 22, 40, 42: 34, 10, 18, 34, 35, 39, 48: 35, 8, 10, 26, 34, 48, 49: 36, 10, 12, 22, 34, 46, 48: 37, 14, 18, 19, 35, 44, 45: 38, 15, 19, 25, 36, 44, 45: 39, 11, 17, 19, 40, 44, 45: 40, 19, 20, 23, 42, 44, 45: 41, 24, 25, 26, 38, 39, 40: 42, 2, 3, 21, 22, 23, 40: 43, 4, 14, 15, 17, 21, 31: 44, 14, 15, 17, 26, 35, 46: 45, 14, 15, 17, 29, 30, 33: 46, 9, 10, 16, 18, 27, 42: 47, 1, 5, 9, 27, 42, 45: 48, 9, 16, 23, 26, 27, 42: 49, 5, 9, 27, 35, 42, 43: 50, 9, 27, 28, 31, 41, 42: 51, 9, 10, 11, 27, 42, 47: 52, 9, 21, 27, 36, 42, 48: 53, 9, 25, 27, 31, 42, 45: 54, 9, 27, 29, 37, 42, 44: 55, 2, 17, 23, 38, 40, 49: The first number is the id. The other 6 columns is the random data columns. This data is a test data. There are lines with common triple results maximum should be 9. Edited September 4, 2013 by webdev1 Quote Link to comment Share on other sites More sharing options...
webdev1 Posted September 4, 2013 Author Share Posted September 4, 2013 Using the SQL statement discussed earllier: SELECT balls, COUNT(*) as occurence FROM ( SELECT CONCAT_WS(',', Ball1, Ball2, Ball3) as balls, Ball1 as `A`, Ball2 as `B`, Ball3 as `C` FROM random UNION ALL SELECT CONCAT_WS(',', Ball2, Ball3, Ball4) as balls, Ball2 as `A`, Ball3 as `B`, Ball4 as `C` FROM random UNION ALL SELECT CONCAT_WS(',', Ball3, Ball4, Ball5) as balls, Ball3 as `A`, Ball4 as `B`, Ball5 as `C` FROM random UNION ALL SELECT CONCAT_WS(',', Ball4, Ball5, Ball6) as balls, Ball4 as `A`, Ball5 as `B`, Ball6 as `C` FROM random UNION ALL SELECT CONCAT_WS(',', Ball5, Ball6, Ball1) as balls, Ball5 as `A`, Ball6 as `B`, Ball1 as `C` FROM random ) AS t GROUP BY balls ORDER BY occurence desc LIMIT 10 The highest I get is 3 but if you do a direct search for numbers 9,27,42 you will get 9 occurrences This proves the logic in the above SQL statement is at fault. Quote Link to comment Share on other sites More sharing options...
Recommended Posts
Join the conversation
You can post now and register later. If you have an account, sign in now to post with your account.