Jump to content

Copy all records with 2 equals data from table1 to one record in table2 (second part)


Go to solution Solved by Barand,

Recommended Posts

  • Replies 54
  • Created
  • Last Reply

Top Posters In This Topic

you are right, but i received this table from the hospital analysis lab, 

each record is an analysis and probably they want immediatly the age at the moment of the analysis...

i should talk with their dba...

 

 

 

next friday they will give me other data like therapy .... it means other queries :-))

 

thanks!

... just a question to check if i understood your queries, in particular GROUP_CONCAT statement :

i made this exercise:

i made a query in order to select patients with same surname (cognome) and counting for each surname how many they are, i differentiate them using name and codicepz.

Is it correct?

 

SELECT `Cognome` ,
COUNT(distinct `Nome`, `CodicePz`) as `Count` ,
GROUP_CONCAT(distinct `Nome`, `CodicePz` ORDER BY `Nome` SEPARATOR ' , ')
as Dates
FROM mastertot_head10000
GROUP BY `Cognome`

 

 

Thanks

"codicepz" is unique on its own, you don't need the the combination of Nome/codicepz.to count them

SELECT `Cognome` ,
  COUNT(distinct `CodicePz`) as `Count` ,
  GROUP_CONCAT(distinct `Nome`, `CodicePz` ORDER BY `Nome` SEPARATOR ' , ')
    as Names
FROM mastertot_head10000
GROUP BY `Cognome`

You don't need all those backticks. They are needed only when the identifier conflicts with a reserved word (unlikely in your tables) or if it contains a space or other special characters.

ok, thanks!

 

 

just an other curiosity,

how can i count all value != NULL in a group of columns with a single query (if it is possible)?

I refer to table result (see previous post), i would like count all values != NULL in columns a1, a2, a3

this number should be equal to number of rows of source table

Null values are ignored in aggregation functions (COUNT, SUM etc) so

SELECT 
  COUNT(a1) as count1
, COUNT(a2) as count2
, COUNT(a3) as count3
FROM tablename

mhmmm

i lost some data during import from mastertot -> mastertot2

 

mastertot: number of records: 2550171

 

mastertot2: 

SELECT
COUNT(`Acido Urico`) as count1
, COUNT(`Albumina`) as count2
, COUNT(`Azotemia`) as count3
, COUNT(`Creatinina`) as count4
, COUNT(`Glicemia`) as count5
, COUNT(`HGB Emoglobina`) as count6
, COUNT(`INR`) as count7
, COUNT(`Litio`) as count8
, COUNT(`Peptide Natriuretico Tipo B`) as count9
, COUNT(`Peptide Natriuretico Tipo B N-Terminale`) as count10
, COUNT(`Potassio`) as count11
, COUNT(`Proteina C Reattiva`) as count12
, COUNT(`Proteine`) as count13
, COUNT(`Proteine Totali`) as count14
, COUNT(`Sodio`) as count15
FROM mastertot2

 

sum of counts

2489937

 

Aggregate functions ignore only NULL values?

if you want i can share the mastertot.csv file (without patients name surname) with google drive in order to check the query which copies data from mastertot to mastertot2 ...

 

INSERT INTO mastertot2 (CodicePz,DataRichiesta,

`Acido Urico`,

`Albumina`,

`Azotemia`,

`Creatinina`,

`Glicemia`,

`HGB Emoglobina`,

`INR`,

`Litio`,

`Peptide Natriuretico Tipo B`,

`Peptide Natriuretico Tipo B N-Terminale`,

`Potassio`,

`Proteina C Reattiva`,

`Proteine`,

`Proteine Totali`,

`Sodio`)

SELECT

CodicePz

, DataRichiesta

, MIN(IF(DescrPrestaz='Acido Urico',risultato,NULL)) as 'Acido Urico'

, MIN(IF(DescrPrestaz='Albumina',risultato,NULL)) as 'Albumina'

, MIN(IF(DescrPrestaz='Azotemia',risultato,NULL)) as 'Azotemia'

, MIN(IF(DescrPrestaz='Creatinina',risultato,NULL)) as 'Creatinina'

, MIN(IF(DescrPrestaz='Glicemia',risultato,NULL)) as 'Glicemia'

, MIN(IF(DescrPrestaz='HGB Emoglobina',risultato,NULL)) as 'HGB Emoglobina'

, MIN(IF(DescrPrestaz='INR',risultato,NULL)) as 'INR'

, MIN(IF(DescrPrestaz='Litio',risultato,NULL)) as 'Litio'

, MIN(IF(DescrPrestaz='Peptide Natriuretico Tipo B',risultato,NULL)) as 'Peptide Natriuretico Tipo B'

, MIN(IF(DescrPrestaz='Peptide Natriuretico Tipo B N-Terminale',risultato,NULL)) as 'Peptide Natriuretico Tipo B N-Terminale'

, MIN(IF(DescrPrestaz='Potassio',risultato,NULL)) as 'Potassio'

, MIN(IF(DescrPrestaz='Proteina C Reattiva',risultato,NULL)) as 'Proteina C Reattiva'

, MIN(IF(DescrPrestaz='Proteine',risultato,NULL)) as 'Proteine'

, MIN(IF(DescrPrestaz='Proteine Totali',risultato,NULL)) as 'Proteine Totali'

, MIN(IF(DescrPrestaz='Sodio',risultato,NULL)) as 'Sodio'

FROM

mastertot

GROUP BY CodicePz , DataRIchiesta

 

structure della table mastertot_withoutname

 

Colonna

 

Tipo

 

Null

 

Predefinito

 

Commenti

 

MIME

 

DataRichiesta

 

date

 

Sì 

 

NULL

 

 

 

 

 

CodicePz

 

varchar(15)

 

Sì 

 

NULL

 

 

 

 

 

Genere

 

varchar(1)

 

Sì 

 

NULL

 

 

 

 

 

DataNascita

 

varchar(15)

 

Sì 

 

NULL

 

 

 

 

 

eta

 

int(11)

 

Sì 

 

NULL

 

 

 

 

 

codprestaz

 

varchar(8)

 

Sì 

 

NULL

 

 

 

 

 

DescrPrestaz

 

varchar(50)

 

Sì 

 

NULL

 

 

 

 

 

risultato

 

decimal(8,3)

 

Sì 

 

NULL

 

 

 

 

 

udm

 

varchar(20)

 

Sì 

 

NULL

 

 

 

 

 

CodImpegnativa

 

varchar(32)

 

Sì 

 

NULL

 

 

 

 

 

DataEmissione

 

varchar(9)

 

Sì 

 

NULL

 

 

 

 

 

 

 

structure table

 

mastertot2 Colonna Tipo Null Predefinito Commenti MIME CodicePz varchar(15) No        DataRichiesta date No        Acido Urico decimal(10,3) Sì  NULL     Albumina decimal(10,3) Sì  NULL     Azotemia decimal(10,3) Sì  NULL     Creatinina decimal(11,3) Sì  NULL     Glicemia decimal(11,3) Sì  NULL     HGB Emoglobina decimal(11,3) Sì  NULL     INR decimal(11,3) Sì  NULL     Litio decimal(11,3) Sì  NULL     Peptide Natriuretico Tipo B decimal(8,3) Sì  NULL     Peptide Natriuretico Tipo B N-Terminale decimal(10,3) Sì  NULL     Potassio decimal(11,3) Sì  NULL     Proteina C Reattiva decimal(11,3) Sì  NULL     Proteine decimal(10,3) Sì  NULL     Proteine Totali decimal(11,3) Sì  NULL     Sodio decimal(11,3) Sì  NULL      

One of the problems with storing descriptions instead of foreign keys in tables is they are prone to misspelling. One check you could do is

SELECT DescrPrestaz
 , COUNT(*) as sampleTotal
FROM mastertot
GROUP BY DescrPrestaz;

This will give the counts for each and will show if there are descriptions that are wrong

you are right :-)

probably there are some misspelling.

Infact your query 

 

SELECT DescrPrestaz

 , COUNT(*) as sampleTotal
FROM mastertot
GROUP BY
DescrPrestaz;

 

returns

 

Acido Urico 58330 Albumina 39201 Azotemia 205605 Creatinina 327128 Glicemia 275360 HGB Emoglobina 376942 INR 406428 Litio 1292 Peptide Natriuretico Tipo B 2426 Peptide Natriuretico Tipo B N-Terminale 8639 Potassio 268849 Proteina C Reattiva 122901 Proteine 136830 Proteine Totali 64891 Sodio 255349 SUM 2550171

 

now i have to find the misspellings....

I think I know why there are missing records.

 

Yesterday we had an instance where a patient had six tests on the same day. When we create mastertot2 we are grouping by CodicePz , DataRIchiesta so those six records are condensed to one.

Your  original post required records for each person/date

 

Table2

AccessDate___UserCode___a1__a2__a3

2012-01-01_____1________10__15_12

so you have decide what to do about those occasions where a patient has several tests for the same condition in a single day. At the  moment it takes the MIN value but you could also have MAX or AVG. Whichever one you decide, you still will have a different count between the two tables because of this date repetition.

i performed this query:

SELECT `CodicePz`, count(*) FROM `mastertot` group by `DataRichiesta`, `CodicePz`, `DescrPrestaz` ORDER BY count(*) DESC

 

now i sum the counts

Edited by paolo123

It would be usefull defining a minimum Time interval along which the 5 or more values over threashold have to stay.

I mean for example over 1 month under 6 month.

Because the case in which all 5 values over threashold are taken in same day is not clinically significative...

i am looking about adding the statement, but i think that it is different from the six month statement, because i should have a buffer to contain the records and if the last record is before 1 month i drop all the records, while if the last record is over 1 month i keep all the records, but i reach the last record only when the run ends ,,, i mean i do not know how getting previous records...

i launched the following query and the counts are ok ! :-)

 

The number of analysis executed in the same day: 116199

 

SELECT SUM(contatore)

FROM (

SELECT `CodicePz`, count(*) as contatore

FROM `mastertot`

group by `DataRichiesta`, `CodicePz`, `DescrPrestaz`

HAVING contatore>1

)as t

 

 

The number of patients with same analysis executed more then 1 time in same day: 55965

 

SELECT count(*) as NumberOfPatientsWithSameAnalysusMoreThen1TimeInSameDay

FROM (

SELECT `CodicePz`, count(*) as contatore

FROM `mastertot`

group by `DataRichiesta`, `CodicePz`, `DescrPrestaz`

HAVING contatore>1

)as t

 

116199 - 55965 = 60243 

60234 is the difference between (mastertot records) and (mastertot2 values != NULL)

2550171 - 2489937 = 60234

 

 

Thanks!!

So have I got this right?

 

Currently if a test result date is more than 6 months from the start of the run then it is not included in the run and a new run starts.

 

You now want that if a test date is less than month since the previous test then, again, it is not part of a run.

 

This will mean that a run can be no longer than 6 tests and then only if they are exactly one month apart. Longer than a month apart the run will shorter. If too long then you have no results at all as there will be less than 5 in every run.

yes you were right !  :happy-04:

 

sorry i did not explain me very well...

between the first analysis and the last analysis of the run (composed by 5+ analysis) there must be an interval > 1month.

For example:

this sequence is not considered;

1-1-2014  a1 > 10

3-1-2014   a1> 10

7-1-2014   a1> 10

19-1-2014 a1 >10

20-1-2014  a1>10

24-1-2014  a1>10

4-2-2014   a1< 10

 

This sequence is considered;

1-1-2014  a1 > 10

3-1-2014   a1> 10

7-1-2014   a1> 10

19-1-2014 a1 >10

20-1-2014  a1>10

6-2-2014  a1>10

4-2-2014   a1< 10

 

So i consider the patients with an overthreshold analysis value for a long time (>1 month in this example).

This thread is more than a year old. Please don't revive it unless you have something important to add.

Join the conversation

You can post now and register later. If you have an account, sign in now to post with your account.

Guest
Reply to this topic...

×   Pasted as rich text.   Restore formatting

  Only 75 emoji are allowed.

×   Your link has been automatically embedded.   Display as a link instead

×   Your previous content has been restored.   Clear editor

×   You cannot paste images directly. Upload or insert images from URL.


×
×
  • Create New...

Important Information

We have placed cookies on your device to help make this website better. You can adjust your cookie settings, otherwise we'll assume you're okay to continue.