Jump to content


Photo

Using upsert with multiple unique index/columns

sql mysql upsert unique

Best Answer kicken, 09 August 2017 - 07:44 PM

You'd write a query that gives you the results you want and use grouping to get it aggregated by day or whatever you need. Then to make it easier to manage/use you can create a view using the query.

It sounds like your current query gives you what you want so you'd just turn that into a view, probably without the date filter in the WHERE clause though. Eg:

CREATE VIEW daily_report
SELECT 
    c.extension as Extension
    , RESPONSIBLEUSEREXTENSIONID as ExtID
    , sum(Duration) as Total_Talk_Time_seconds
    , round(sum(Duration) / 60,2) as Total_Talk_Time_minutes
    , sum(if(LEGTYPE1 = 1,1,0)) as Total_Outbound   
    , sum(if(LEGTYPE1 = 2,1,0)) as Total_Inboundambition_test
    , sum(if(Answered = 1,0,1)) as Missed_Calls
    , count(DISTINCT b.NOTABLECALLID) as Total_Calls
    , DATE(b.ts) as reportDate
FROM cdrdb.session a
LEFT JOIN cdrdb.callsummary b
        ON a.NOTABLECALLID = b.NOTABLECALLID
LEFT join cdrdb.mxuser c
      ON a.RESPONSIBLEUSEREXTENSIONID = c.EXTENSIONID
WHERE 
    c.extension IN (7295,7306,7218,7247,7330,7000,7358)
GROUP BY c.extension
Then you use that to get your details for whatever day using something like
SELECT *
FROM daily_report
WHERE
    reportDate >= curdate()
Go to the full post


  • Please log in to reply
13 replies to this topic

#1 thenorman138

thenorman138
  • Members
  • PipPipPip
  • Advanced Member
  • 48 posts

Posted 09 August 2017 - 01:37 PM

Insert into test.ambition_test(Extension, ExtID, Total_Talk_Time_seconds,
    Total_Talk_Time_minutes,Total_Outbound, Total_Inbound, 
    Missed_Calls, Total_Calls, Date_of_report, Time_of_report)  
SELECT 
     c.extension as Extension
    ,RESPONSIBLEUSEREXTENSIONID as ExtID
    , sum(Duration) as Total_Talk_Time_seconds
    , round(sum(Duration) / 60,2) as Total_Talk_Time_minutes
    , sum(if(LEGTYPE1 = 1,1,0)) as Total_Outbound   
    , sum(if(LEGTYPE1 = 2,1,0)) as Total_Inboundambition_test
    , sum(if(Answered = 1,0,1)) as Missed_Calls
    , count(DISTINCT b.NOTABLECALLID) as Total_Calls
    , DATE(NOW()) 
    , NOW()
    FROM cdrdb.session a
    LEFT JOIN cdrdb.callsummary b
            ON a.NOTABLECALLID = b.NOTABLECALLID
   LEFT join cdrdb.mxuser c
          ON a.RESPONSIBLEUSEREXTENSIONID = c.EXTENSIONID
   WHERE b.ts >= curdate()
   AND c.extension IN (7295,7306,7218,7247,7330,7000,7358)
    group by c.extension
ON duplicate key update Total_Talk_Time_seconds =values(Total_Talk_Time_seconds), 
    Total_Talk_Time_minutes =values(Total_Talk_Time_minutes), 
    Total_Outbound = values(Total_Outbound), Total_Inbound = values(Total_Inbound), 
    Missed_calls = values(Missed_Calls), 
    Total_Calls = values(Total_Calls), 
    Date_of_report = values(Date_of_report),
    Time_of_report = values(Time_of_report);  

The above query is what I have running in MySQL workbench currently. It's pulling and joining several tables to get call data and form metrics for 7 CSR agents on our phone system. IT pulls the data and forms the metrics correctly and it also updates throughout the day properly, but I have a huge problem: It will not insert new records on each new day, it still overwrites per day whereas I want to keep a history. 

 

I created a unique index on the Primary key Extension as well as the Date_of_report column as well.

 

To specify, this query runs every 15 minutes to add/aggregate the metric totals based on the extension, however, the next day we should be inserting 7 new records and then have those aggregate through the day. Basically by the end of 3 days, we would have 21 records. However, this just continues to overwrite every day.

 

Any ideas or tips here?



#2 requinix

requinix
  • Administrators
  • Maddening Administrator
  • 9,561 posts
  • LocationWA

Posted 09 August 2017 - 02:05 PM

You don't need more constraints - you have too many already, because MySQL is choosing the wrong duplicate row to update.

Make sure the query is only possibly creating records that would violate the date_of_report constraint. Look at what you have that is enforced unique and decide how the data fits into it.
The Reimann Zeta Function Trolley Problem | "Summer is when I, the great ice fairy, can show my true power!"

#3 thenorman138

thenorman138
  • Members
  • PipPipPip
  • Advanced Member
  • 48 posts

Posted 09 August 2017 - 02:18 PM

That's where I'm slightly confused. In my limited time with SQL, this is the first time I've had to use an upsert with multiple index columns. I have to treat extension as the PK and unique for duplication, but I also need to use the date so that each day will start a new batch of records and the previous day will be left with the End of day totals. What do you suggest I do differently with the constraints?



#4 requinix

requinix
  • Administrators
  • Maddening Administrator
  • 9,561 posts
  • LocationWA

Posted 09 August 2017 - 02:34 PM

Either
a) The extension is unique, in which case the query should be updating existing rows with a given value regardless of the date_of_report, or
b) The extension is not unique, in which case it shouldn't be the primary key (at least not by itself)

What are all the indexes on the table?
The Reimann Zeta Function Trolley Problem | "Summer is when I, the great ice fairy, can show my true power!"

#5 thenorman138

thenorman138
  • Members
  • PipPipPip
  • Advanced Member
  • 48 posts

Posted 09 August 2017 - 04:22 PM

So the only indexes on the table are PRIMARY for 'Extension' and UNIQUE for 'Extension' and 'Date_of_report'. I've been using only extension as the primary key for the agents, but I need date to dictate when the new records are inserted (by date).



#6 Jacques1

Jacques1
  • Members
  • PipPipPip
  • Turtles all the way down
  • 4,224 posts

Posted 09 August 2017 - 04:38 PM

I don't think you understand what a primary key is.

 

The primary key is the unique identifier for the rows. If an extension can appear multiple times (on different dates), then it cannot be the primary key by definition. It may be important for various lookups, but it's definitely not a row identifier.

 

The whole table doesn't make a lot of sense. When this is all just aggregated data from existing tables, why even have this extra table? Is it supposed to be some kind of cache to solve performance problems?



#7 thenorman138

thenorman138
  • Members
  • PipPipPip
  • Advanced Member
  • 48 posts

Posted 09 August 2017 - 04:55 PM

We are using a phone system for our CSR agents and the phone system database is all over the place so we found the data we needed from it by joining certain tables and using data from each table to put in the formulas, that way we get 6 columns of our most important metrics. This runs every 15 minutes to give us an updated log of those metrics that we give to another team for incentive based score-keeping, so that's why we need it in a small, stand-alone table.

 

So, I see what you're saying about making an arbitrary auto incremented ID for the primary key, but then would I just make extension and date_of_report together a unique index?



#8 Jacques1

Jacques1
  • Members
  • PipPipPip
  • Turtles all the way down
  • 4,224 posts

Posted 09 August 2017 - 05:08 PM

You don't need a table to aggregate data. There are views.

 

I see what you're saying about making an arbitrary auto incremented ID for the primary key

 

I didn't say that at all. I said that the “primary key” you've chosen isn't a primary key, which means you need a different one. Like the combination of the extension and the date (assuming this does in fact identify the rows).



#9 thenorman138

thenorman138
  • Members
  • PipPipPip
  • Advanced Member
  • 48 posts

Posted 09 August 2017 - 05:12 PM

Sorry, didn't mean to put it that way. I know you didn't say that, but I meant I could do that and that would go along with the idea you proposed. However, My other issue is that I've also tried using both extension and date as primary keys and it was still overwriting data.

 

Also, even if we were to use views, I still need to solve the issue with the query so that it updates through the day but then inserts new files at the beginning of every day. I was just instructed to create a table in order to do so.



#10 Jacques1

Jacques1
  • Members
  • PipPipPip
  • Turtles all the way down
  • 4,224 posts

Posted 09 August 2017 - 07:08 PM

However, My other issue is that I've also tried using both extension and date as primary keys and it was still overwriting data.

 

Then you still have other UNIQUE keys which shouldn't be there.

 

MySQL doesn't just randomly overwrite data. If a record is overwritten, that's because one of your keys says it's a duplicate.

 

 

 

Also, even if we were to use views, I still need to solve the issue with the query so that it updates through the day but then inserts new files at the beginning of every day.

 

No, because a view doesn't require any updates. It simply aggregates the data. The result can be cached by the query cache as long as the data doesn't change, so the calculation doesn't have to be repeated all the time.

 

Frankly, it doesn't look like anybody in your team has actually though this through. A table is the worst option. It's the absolutely last resort when everything else has failed.


Edited by Jacques1, 09 August 2017 - 07:08 PM.


#11 thenorman138

thenorman138
  • Members
  • PipPipPip
  • Advanced Member
  • 48 posts

Posted 09 August 2017 - 07:23 PM

So I can essentially just take the query I have and change it enough to make a view from the select/join statements and then set it to cache by the query cache? So every day it would update and cache and then start over the next day. I'm just trying to clarify because once this is solved I'll have to have this write out to JSON to upload to an API.



#12 kicken

kicken
  • Gurus
  • Wiser? Not exactly.
  • 3,345 posts
  • LocationBonita, FL

Posted 09 August 2017 - 07:44 PM   Best Answer

You'd write a query that gives you the results you want and use grouping to get it aggregated by day or whatever you need. Then to make it easier to manage/use you can create a view using the query.

It sounds like your current query gives you what you want so you'd just turn that into a view, probably without the date filter in the WHERE clause though. Eg:
CREATE VIEW daily_report
SELECT 
    c.extension as Extension
    , RESPONSIBLEUSEREXTENSIONID as ExtID
    , sum(Duration) as Total_Talk_Time_seconds
    , round(sum(Duration) / 60,2) as Total_Talk_Time_minutes
    , sum(if(LEGTYPE1 = 1,1,0)) as Total_Outbound   
    , sum(if(LEGTYPE1 = 2,1,0)) as Total_Inboundambition_test
    , sum(if(Answered = 1,0,1)) as Missed_Calls
    , count(DISTINCT b.NOTABLECALLID) as Total_Calls
    , DATE(b.ts) as reportDate
FROM cdrdb.session a
LEFT JOIN cdrdb.callsummary b
        ON a.NOTABLECALLID = b.NOTABLECALLID
LEFT join cdrdb.mxuser c
      ON a.RESPONSIBLEUSEREXTENSIONID = c.EXTENSIONID
WHERE 
    c.extension IN (7295,7306,7218,7247,7330,7000,7358)
GROUP BY c.extension
Then you use that to get your details for whatever day using something like
SELECT *
FROM daily_report
WHERE
    reportDate >= curdate()

Did I help you out? Feeling generous? I accept tips via Bitcoin @ 14mDxaob8Jgdg52scDbvf3uaeR61tB2yC7
Kicken's World⦄ ⦃Recycle old CD's

#13 thenorman138

thenorman138
  • Members
  • PipPipPip
  • Advanced Member
  • 48 posts

Posted 09 August 2017 - 07:47 PM

Thanks so much, this is a good marriage of everything I needed while still using the view as a better option over the table, like Jacques suggested. However, I should have explained, but I need to keep the where clause with curdate because that is using the timestamp from one of the tables to make sure it's only pulling call summaries for the present day. SO this ensures that my formulas are only using data from the call)summary table that reflects the current date.


Edited by thenorman138, 09 August 2017 - 07:49 PM.


#14 Barand

Barand
  • Moderators
  • Sen . ( ile || sei )
  • 17,974 posts

Posted 10 August 2017 - 05:18 PM

The àbove query has a WHERE condition on c table which has a left join.

Either
- this should either be an inner join or
- the condition should be in the join
If you are still using mysql_ functions, STOP! Use mysqli_ or PDO. The longer you leave it the more you will have to rewrite.

Donations gratefully received






moon.png

|baaGrid| easy data tables - and more
|baaChart| easy line, column and pie charts




0 user(s) are reading this topic

0 members, 0 guests, 0 anonymous users