Jump to content

SQL help for extracting common details


webdev1
Go to solution Solved by Barand,

Recommended Posts

Hi, 

I am using php 5.4.7 & MySQL 5.5.21

 

I am working on a academic research project for analysing random sequences.

I am using data from a lottery as source data as these are random sequences of numbers up to 50.

 

I am using SQL query to do a table search on 3 random numbers (php):

 

SELECT id, 
     , GROUP_CONCAT(ball ORDER BY ball) balls
  FROM 
     ( SELECT id Ball1 ball FROM random
       UNION
       SELECT id,Ball2 ball FROM random
       UNION
       SELECT id, Ball3 ball FROM random
       UNION
       SELECT id, Ball4 ball FROM random
       UNION
      SELECT id, Ball5 ball FROM random
       UNION 
      SELECT id, Ball6 ball FROM random


     ) x
 WHERE ball IN ( $ball1, $ball2, $ball3, $ball4, $ball5, $ball6)
 GROUP 
    BY id 
HAVING COUNT(*) = 3

 

which works for 3 chosen numbers however how do I create an SQL statement to find 3 common sequences (without selecting the numbers) that occur more than once? I assume it maybe possible with mysql regex statement but do not know how to create this statement.

 

I hope someone can help a newbie.

 

Thanks in advance.

Edited by Zane
Link to comment
Share on other sites

Thaat query is weird, you select one Ball1 through 6 from the same record, but as separate rows, that smells like either a bad design, or you are trying to do something that I don't understand.

 

As for your question, I think it would help if you gave an example of what you are looking for.

Link to comment
Share on other sites

Thaat query is weird, you select one Ball1 through 6 from the same record, but as separate rows, that smells like either a bad design, or you are trying to do something that I don't understand.

 

Instead of normalizing the data he is using the six unioned queries to simulate the normalized table. Go figure!

Link to comment
Share on other sites

Hi thanks for replying.

I have created a table populated with id & 6 columns for the random ball numbers.

The query above does a search for 3 chosen numbers to see if they exist in the database.

I want to analyse the data to find common sets of 3 number sequences and count how many times they exist without selecting the ball numbers to count.

So a sequence of say 2 4 6 may exist more than once but I would like to know how many times.

I can then use the count number for further analysis.

Is this possible in SQL?

Link to comment
Share on other sites

Are you looking for sequences or combinations? If you are looking for 2 4 6, does 1,2,3,4,5,6 also count?

 

Your current model doesn't make this easier, what you'd want is information about the draw, the number and the position of the number in the draw.

so, 5,8,2,4,1,9 would be easier to process if it was stored as:

draw_id, number, number_position

14, 5, 1

14, 8, 2

14, 2, 3

14, 4, 4

14 ,1, 5

14, 9, 6

 

because in ordr to know that the numbers are "in a sequence" you need to know their relative positions.

I'm trying to think of an intelligent way to solve this problem, I'm getting flashes of "einsteins puzzle" which also has particular sequences and a *huge* number of permutations to check. it is solved by creatively self-joining. I'm not sure if that would work here though...

Link to comment
Share on other sites

Is this what you are after?

SELECT balls, COUNT(*) as occurence
FROM
(
    SELECT id, 
         , GROUP_CONCAT(ball ORDER BY ball) balls
      FROM 
         ( SELECT id Ball1 ball FROM random
           UNION
           SELECT id,Ball2 ball FROM random
           UNION
           SELECT id, Ball3 ball FROM random
           UNION
           SELECT id, Ball4 ball FROM random
           UNION
          SELECT id, Ball5 ball FROM random
           UNION 
          SELECT id, Ball6 ball FROM random
     
         ) x
     WHERE ball IN ( $ball1, $ball2, $ball3, $ball4, $ball5, $ball6)
     GROUP 
        BY id 
    HAVING COUNT(*) = 3
) y
GROUP BY balls
HAVING occurence > 1
Link to comment
Share on other sites

or, if the order is significant

SELECT balls, COUNT(*) as occurence
FROM
(
    SELECT  CONCAT_WS(',', Ball1, Ball2, Ball3) as balls,
    Ball1 as `A`, Ball2 as `B`, Ball3 as `C`
    FROM random
    UNION
    SELECT  CONCAT_WS(',', Ball2, Ball3, Ball4) as balls,
    Ball2 as `A`, Ball3 as `B`, Ball4 as `C`
    FROM random
    UNION
    SELECT  CONCAT_WS(',', Ball3, Ball4, Ball5) as balls,
    Ball3 as `A`, Ball4 as `B`, Ball5 as `C`
    FROM random
    UNION
    SELECT  CONCAT_WS(',', Ball4, Ball5, Ball6) as balls,
    Ball4 as `A`, Ball5 as `B`, Ball6 as `C`
    FROM random
) x
WHERE `A` IN ( $ball1, $ball2, $ball3, $ball4, $ball5, $ball6)
    AND `B` IN ( $ball1, $ball2, $ball3, $ball4, $ball5, $ball6)
    AND `C` IN ( $ball1, $ball2, $ball3, $ball4, $ball5, $ball6)
GROUP BY balls
HAVING occurence > 1
Link to comment
Share on other sites

This SQL query appears to be dependent upon input by selecting $ball1-6 in the WHERE section.

It will not work if I use this in mysql and $ball* variable needs a number.

I want an SQL query that does the search without any input. i.e. it finds common ball number matches with more than one instance.

Is this possible?

Link to comment
Share on other sites

Thank you however I get the error:

 

#1248 - Every derived table must have its own alias

 

when running:

 

SELECT balls, COUNT(*) as occurence
FROM
(
SELECT CONCAT_WS(',', Ball1, Ball2, Ball3) as balls,
Ball1 as `A`, Ball2 as `B`, Ball3 as `C`
FROM random
UNION
SELECT CONCAT_WS(',', Ball2, Ball3, Ball4) as balls,
Ball2 as `A`, Ball3 as `B`, Ball4 as `C`
FROM random
UNION
SELECT CONCAT_WS(',', Ball3, Ball4, Ball5) as balls,
Ball3 as `A`, Ball4 as `B`, Ball5 as `C`
FROM random
UNION
SELECT CONCAT_WS(',', Ball4, Ball5, Ball6) as balls,
Ball4 as `A`, Ball5 as `B`, Ball6 as `C`
FROM random
)

Link to comment
Share on other sites

Thank you I appreciate your help.

 

With this SQL query I get the following result:

balls     occurence

3,5,14   5069

 

I wanted a list  of more than one result and the occurrence appears to be incorrect as there are only 1845 rows in the table.

The table structure is as follows and Ball data is numeric 1-50:

id
Ball1
Ball2
Ball3
Ball4
Ball5
Ball6

Can the SQL statement be changed to give a list like this:

balls            occurence

3,5,14                99

23,34,44            55

etc

 

etc to find the 3 common combination occurrences and how many occurence times in sequential order by highest first

Edited by webdev1
Link to comment
Share on other sites

In order to get the count per combination you must add the group-by clause, and you should change the UNION to "UNION ALL" because just "UNION" will actually filter out duplicates.

 

 

SELECT balls, COUNT(*) as occurence
FROM
(
SELECT CONCAT_WS(',', Ball1, Ball2, Ball3) as balls,
Ball1 as `A`, Ball2 as `B`, Ball3 as `C`
FROM random
UNION ALL
SELECT CONCAT_WS(',', Ball2, Ball3, Ball4) as balls,
Ball2 as `A`, Ball3 as `B`, Ball4 as `C`
FROM random
UNION ALL
SELECT CONCAT_WS(',', Ball3, Ball4, Ball5) as balls,
Ball3 as `A`, Ball4 as `B`, Ball5 as `C`
FROM random
UNION ALL
SELECT CONCAT_WS(',', Ball4, Ball5, Ball6) as balls,
Ball4 as `A`, Ball5 as `B`, Ball6 as `C`
FROM random
) AS t

GROUP BY balls

Edited by vinny42
Link to comment
Share on other sites

Thank you for your help.

Your expertise is appreciated.

The results are good but not accurate, there appears to be some combinations missing which I found by random search.

A group of 3 numbers occurred 10 times and the highest result I am getting with this SQL query is 7.

Is there anything missing?

Link to comment
Share on other sites

The query should not exclude anything.

 

The only way that a combination might not bne counted correctly is if the numbers are stored as a varchar and the content is not a clean number, but has spaces, newlines etc in it. Then a "1" is different from a "1 " and a " 1".

Can you replicate the problem using a small set of testdata that you can post here?

Link to comment
Share on other sites

The table random contains id (int 5) & ball1, ball2, ball3, ball4, ball5, ball6 all of int(2) types. There are no varchar types.

I am using phpmyadmin (latest version) to do the SQL searches. 

 

Here is some test data (30 rows):

 

3 5 14 22 30 44 10
6 12 15 16 31 44 37
11 17 21 29 30 40 31
26 35 38 43 47 49 28
3 5 9 13 14 38 30
2 3 27 29 39 44 6
9 17 32 36 42 44 16
2 5 21 22 25 32 46
7 17 23 32 38 42 48
6 16 20 30 31 47 4
4 16 25 26 31 43 21
1 7 37 38 42 46 20
15 18 29 35 38 48 5
16 19 21 29 36 45 43
5 8 10 18 31 33 28
11 12 17 26 36 42 13
2 13 22 27 29 46 36
9 18 19 24 31 41 21
4 17 41 42 44 49 24
22 25 30 32 41 43 29
14 17 22 24 42 47 34
1 4 6 23 26 49 8
8 18 20 33 36 38 46
9 15 22 31 34 48 23
5 14 17 35 43 48 22
7 16 25 26 28 41 19
15 16 17 28 32 46 22
12 13 25 37 44 45 9
1 21 29 31 32 40 27
12 15 26 44 46 49 14
 
Without conducting test I think this should have more than 1 common occurrence of 3
Link to comment
Share on other sites

Thank you for your replies.

The data sent earlier was from the wrong table.

Here is the correct data (55 rows) comma delimited rows terminated with colon :

1, 3, 5, 14, 22, 30, 44:
2, 6, 12, 15, 16, 31, 44:
3, 11, 17, 21, 29, 30, 40:
4, 26, 35, 38, 43, 47, 49:
5, 3, 5, 9, 13, 14, 38:
6, 2, 3, 27, 29, 39, 44: 
7, 9, 17, 32, 36, 42, 44: 
8, 2, 5, 21, 22, 25, 32: 
9, 7, 17, 23, 32, 38, 42: 
10, 6, 16, 20, 30, 31, 47: 
11, 4, 16, 25, 26, 31, 43: 
12, 1, 7, 37, 38, 42, 46: 
13, 15, 18, 29, 35, 38, 48: 
14, 16, 19, 21, 29, 36, 45: 
15, 5, 8, 10, 18, 31, 33: 
16, 11, 12, 17, 26, 36, 42: 
17, 2, 13, 22, 27, 29, 46: 
18, 9, 18, 19, 24, 31, 41: 
19, 4, 17, 41, 42, 44, 49: 
20, 22, 25, 30, 32, 41, 43:  
21, 14, 17, 22, 24, 42, 47: 
22, 1, 4, 6, 23, 26, 49: 
23, 8, 18, 20, 33, 36, 38: 
24, 9, 15, 22, 31, 34, 48: 
25, 5, 14, 17, 35, 43, 48: 
26, 7, 16, 25, 26, 28, 41: 
27, 15, 16, 17, 28, 32, 46: 
28, 12, 13, 25, 37, 44, 45: 
29, 1, 21, 29, 31, 32, 40: 
30, 12, 15, 26, 44, 46, 49: 
31, 2, 3, 17, 23, 40, 43:
32, 2, 3, 4, 19, 23, 40: 
33, 2, 3, 17, 22, 40, 42: 
34, 10, 18, 34, 35, 39, 48: 
35, 8, 10, 26, 34, 48, 49: 
36, 10, 12, 22, 34, 46, 48: 
37, 14, 18, 19, 35, 44, 45: 
38, 15, 19, 25, 36, 44, 45: 
39, 11, 17, 19, 40, 44, 45: 
40, 19, 20, 23, 42, 44, 45: 
41, 24, 25, 26, 38, 39, 40: 
42, 2, 3, 21, 22, 23, 40:  
43, 4, 14, 15, 17, 21, 31: 
44, 14, 15, 17, 26, 35, 46: 
45, 14, 15, 17, 29, 30, 33: 
46, 9, 10, 16, 18, 27, 42: 
47, 1, 5, 9, 27, 42, 45: 
48, 9, 16, 23, 26, 27, 42: 
49, 5, 9, 27, 35, 42, 43: 
50, 9, 27, 28, 31, 41, 42: 
51, 9, 10, 11, 27, 42, 47: 
52, 9, 21, 27, 36, 42, 48: 
53, 9, 25, 27, 31, 42, 45: 
54, 9, 27, 29, 37, 42, 44: 
55, 2, 17, 23, 38, 40, 49: 
 
The first number is the id. The other 6 columns is the random data columns.
This data is a test data. There are lines with common triple results maximum should be 9.
Edited by webdev1
Link to comment
Share on other sites

Using the SQL statement discussed earllier:

SELECT balls, COUNT(*) as occurence
FROM
(
SELECT CONCAT_WS(',', Ball1, Ball2, Ball3) as balls,
Ball1 as `A`, Ball2 as `B`, Ball3 as `C`
FROM random
UNION ALL
SELECT CONCAT_WS(',', Ball2, Ball3, Ball4) as balls,
Ball2 as `A`, Ball3 as `B`, Ball4 as `C`
FROM random
UNION ALL
SELECT CONCAT_WS(',', Ball3, Ball4, Ball5) as balls,
Ball3 as `A`, Ball4 as `B`, Ball5 as `C`
FROM random
UNION ALL
SELECT CONCAT_WS(',', Ball4, Ball5, Ball6) as balls,
Ball4 as `A`, Ball5 as `B`, Ball6 as `C`
FROM random
UNION ALL
SELECT CONCAT_WS(',', Ball5, Ball6, Ball1) as balls,
Ball5 as `A`, Ball6 as `B`, Ball1 as `C`
FROM random
) AS t
 
GROUP BY balls ORDER BY occurence desc LIMIT 10
 
The highest  I get is 3 but if you do a direct search for numbers 9,27,42 you will get 9 occurrences
This proves the logic in the above SQL statement is at fault. 
Link to comment
Share on other sites

This thread is more than a year old. Please don't revive it unless you have something important to add.

Join the conversation

You can post now and register later. If you have an account, sign in now to post with your account.

Guest
Reply to this topic...

×   Pasted as rich text.   Restore formatting

  Only 75 emoji are allowed.

×   Your link has been automatically embedded.   Display as a link instead

×   Your previous content has been restored.   Clear editor

×   You cannot paste images directly. Upload or insert images from URL.

×
×
  • Create New...

Important Information

We have placed cookies on your device to help make this website better. You can adjust your cookie settings, otherwise we'll assume you're okay to continue.