Jump to content
thenorman138

Using upsert with multiple unique index/columns

Recommended Posts

Insert into test.ambition_test(Extension, ExtID, Total_Talk_Time_seconds,
    Total_Talk_Time_minutes,Total_Outbound, Total_Inbound, 
    Missed_Calls, Total_Calls, Date_of_report, Time_of_report)  
SELECT 
     c.extension as Extension
    ,RESPONSIBLEUSEREXTENSIONID as ExtID
    , sum(Duration) as Total_Talk_Time_seconds
    , round(sum(Duration) / 60,2) as Total_Talk_Time_minutes
    , sum(if(LEGTYPE1 = 1,1,0)) as Total_Outbound   
    , sum(if(LEGTYPE1 = 2,1,0)) as Total_Inboundambition_test
    , sum(if(Answered = 1,0,1)) as Missed_Calls
    , count(DISTINCT b.NOTABLECALLID) as Total_Calls
    , DATE(NOW()) 
    , NOW()
    FROM cdrdb.session a
    LEFT JOIN cdrdb.callsummary b
            ON a.NOTABLECALLID = b.NOTABLECALLID
   LEFT join cdrdb.mxuser c
          ON a.RESPONSIBLEUSEREXTENSIONID = c.EXTENSIONID
   WHERE b.ts >= curdate()
   AND c.extension IN (7295,7306,7218,7247,7330,7000,7358)
    group by c.extension
ON duplicate key update Total_Talk_Time_seconds =values(Total_Talk_Time_seconds), 
    Total_Talk_Time_minutes =values(Total_Talk_Time_minutes), 
    Total_Outbound = values(Total_Outbound), Total_Inbound = values(Total_Inbound), 
    Missed_calls = values(Missed_Calls), 
    Total_Calls = values(Total_Calls), 
    Date_of_report = values(Date_of_report),
    Time_of_report = values(Time_of_report);  

The above query is what I have running in MySQL workbench currently. It's pulling and joining several tables to get call data and form metrics for 7 CSR agents on our phone system. IT pulls the data and forms the metrics correctly and it also updates throughout the day properly, but I have a huge problem: It will not insert new records on each new day, it still overwrites per day whereas I want to keep a history. 

 

I created a unique index on the Primary key Extension as well as the Date_of_report column as well.

 

To specify, this query runs every 15 minutes to add/aggregate the metric totals based on the extension, however, the next day we should be inserting 7 new records and then have those aggregate through the day. Basically by the end of 3 days, we would have 21 records. However, this just continues to overwrite every day.

 

Any ideas or tips here?

Share this post


Link to post
Share on other sites

You don't need more constraints - you have too many already, because MySQL is choosing the wrong duplicate row to update.

 

Make sure the query is only possibly creating records that would violate the date_of_report constraint. Look at what you have that is enforced unique and decide how the data fits into it.

Share this post


Link to post
Share on other sites

That's where I'm slightly confused. In my limited time with SQL, this is the first time I've had to use an upsert with multiple index columns. I have to treat extension as the PK and unique for duplication, but I also need to use the date so that each day will start a new batch of records and the previous day will be left with the End of day totals. What do you suggest I do differently with the constraints?

Share this post


Link to post
Share on other sites

Either

a) The extension is unique, in which case the query should be updating existing rows with a given value regardless of the date_of_report, or

b) The extension is not unique, in which case it shouldn't be the primary key (at least not by itself)

 

What are all the indexes on the table?

Share this post


Link to post
Share on other sites

So the only indexes on the table are PRIMARY for 'Extension' and UNIQUE for 'Extension' and 'Date_of_report'. I've been using only extension as the primary key for the agents, but I need date to dictate when the new records are inserted (by date).

Share this post


Link to post
Share on other sites

I don't think you understand what a primary key is.

 

The primary key is the unique identifier for the rows. If an extension can appear multiple times (on different dates), then it cannot be the primary key by definition. It may be important for various lookups, but it's definitely not a row identifier.

 

The whole table doesn't make a lot of sense. When this is all just aggregated data from existing tables, why even have this extra table? Is it supposed to be some kind of cache to solve performance problems?

Share this post


Link to post
Share on other sites

We are using a phone system for our CSR agents and the phone system database is all over the place so we found the data we needed from it by joining certain tables and using data from each table to put in the formulas, that way we get 6 columns of our most important metrics. This runs every 15 minutes to give us an updated log of those metrics that we give to another team for incentive based score-keeping, so that's why we need it in a small, stand-alone table.

 

So, I see what you're saying about making an arbitrary auto incremented ID for the primary key, but then would I just make extension and date_of_report together a unique index?

Share this post


Link to post
Share on other sites

You don't need a table to aggregate data. There are views.

 

I see what you're saying about making an arbitrary auto incremented ID for the primary key

 

I didn't say that at all. I said that the “primary key” you've chosen isn't a primary key, which means you need a different one. Like the combination of the extension and the date (assuming this does in fact identify the rows).

Share this post


Link to post
Share on other sites

Sorry, didn't mean to put it that way. I know you didn't say that, but I meant I could do that and that would go along with the idea you proposed. However, My other issue is that I've also tried using both extension and date as primary keys and it was still overwriting data.

 

Also, even if we were to use views, I still need to solve the issue with the query so that it updates through the day but then inserts new files at the beginning of every day. I was just instructed to create a table in order to do so.

Share this post


Link to post
Share on other sites

However, My other issue is that I've also tried using both extension and date as primary keys and it was still overwriting data.

 

Then you still have other UNIQUE keys which shouldn't be there.

 

MySQL doesn't just randomly overwrite data. If a record is overwritten, that's because one of your keys says it's a duplicate.

 

 

 

Also, even if we were to use views, I still need to solve the issue with the query so that it updates through the day but then inserts new files at the beginning of every day.

 

No, because a view doesn't require any updates. It simply aggregates the data. The result can be cached by the query cache as long as the data doesn't change, so the calculation doesn't have to be repeated all the time.

 

Frankly, it doesn't look like anybody in your team has actually though this through. A table is the worst option. It's the absolutely last resort when everything else has failed.

Edited by Jacques1

Share this post


Link to post
Share on other sites

So I can essentially just take the query I have and change it enough to make a view from the select/join statements and then set it to cache by the query cache? So every day it would update and cache and then start over the next day. I'm just trying to clarify because once this is solved I'll have to have this write out to JSON to upload to an API.

Share this post


Link to post
Share on other sites

You'd write a query that gives you the results you want and use grouping to get it aggregated by day or whatever you need. Then to make it easier to manage/use you can create a view using the query.

 

It sounds like your current query gives you what you want so you'd just turn that into a view, probably without the date filter in the WHERE clause though. Eg:

CREATE VIEW daily_report
SELECT 
    c.extension as Extension
    , RESPONSIBLEUSEREXTENSIONID as ExtID
    , sum(Duration) as Total_Talk_Time_seconds
    , round(sum(Duration) / 60,2) as Total_Talk_Time_minutes
    , sum(if(LEGTYPE1 = 1,1,0)) as Total_Outbound   
    , sum(if(LEGTYPE1 = 2,1,0)) as Total_Inboundambition_test
    , sum(if(Answered = 1,0,1)) as Missed_Calls
    , count(DISTINCT b.NOTABLECALLID) as Total_Calls
    , DATE(b.ts) as reportDate
FROM cdrdb.session a
LEFT JOIN cdrdb.callsummary b
        ON a.NOTABLECALLID = b.NOTABLECALLID
LEFT join cdrdb.mxuser c
      ON a.RESPONSIBLEUSEREXTENSIONID = c.EXTENSIONID
WHERE 
    c.extension IN (7295,7306,7218,7247,7330,7000,7358)
GROUP BY c.extension
Then you use that to get your details for whatever day using something like

SELECT *
FROM daily_report
WHERE
    reportDate >= curdate()

Share this post


Link to post
Share on other sites

Thanks so much, this is a good marriage of everything I needed while still using the view as a better option over the table, like Jacques suggested. However, I should have explained, but I need to keep the where clause with curdate because that is using the timestamp from one of the tables to make sure it's only pulling call summaries for the present day. SO this ensures that my formulas are only using data from the call)summary table that reflects the current date.

Edited by thenorman138

Share this post


Link to post
Share on other sites

The àbove query has a WHERE condition on c table which has a left join.

 

Either

- this should either be an inner join or

- the condition should be in the join

Share this post


Link to post
Share on other sites

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now

×

Important Information

We have placed cookies on your device to help make this website better. You can adjust your cookie settings, otherwise we'll assume you're okay to continue.