Jump to content

optimizing script with heavy select/update


thenorman138

Recommended Posts

I'm running the below script using PDO, and I've finally gotten all of the logic working the way I need it to, I believe.

However, it seems to be taking forever, mainly with updates. The logic is a bit redundant by nature, due to my constraints with the data structure but I feel like it could still be optimized a bit.

The idea is to look at invoices and gather a list of items compared to store fronts for each customer. The logic I have says "If this order for customer A has item X and there already exists a record for Item x with this customer, and the record is 'expired', insert this record. However, if the record exists and we are still within the expiration date, just update. IF no record exists, insert as well". It also does the inserts based on quantity or storefront count, depending on which is greater.

This works perfectly well, but now I have the issue of speed and extra redundancy.

If Customer A has 10 stores orders  10 of Item X today and it's their first order, I'll insert 10 records, one for each item per storefront. But if they order 10 more in 2 days, and the day after and the day after......etc. The script updates as it should but the way I have it, it's selecting so many records at that point to check for unexpired records to update. Part of the issue is that for every record,I have to run the select query to check for existing records. Otherwise, if I do an initial select and the records in question are back to back, it will just insert thinking that it doesn't exist.

There has to be a much cleaner, better and more efficient way to do this but I feel like I'm in over my head
 

    $detailStatCheck = "
            SELECT
                 invnoc as INVOICE,
                 fstatc as STATUS,
                 cstnoc AS DEALER,
                 framec AS FRAME,
                 covr1c AS COVER,
                 colr1c AS COLOR ,
                 extd2d AS SHIPDATE,
                 orqtyc AS QUANTITY,
                 case when p.stores is null then 1 else p.stores end as STORES
            FROM table1 g
            left join table2 p on g.cstnoc = p.cstno
            WHERE fstatc in ('S','O')
            AND date(substr(extd1d,1,4)||'-'||substr(EXTD1d,5,2)||'-'||substr(EXTD1d,7,2) ) between current_Date - 451 DAY AND current_Date
            order by date(substr(extd1d,1,4)||'-'||substr(EXTD1d,5,2)||'-'||substr(EXTD1d,7,2) ) asc, invnoc asc
        ";

        try {
            $detailCheck = $DB2conn->prepare($detailStatCheck);
            $detailRslt = $detailCheck->execute();
            $count2 = $detailCheck->fetch();
            print_r($count2);
        }catch(PDOException $ex) {
            echo "QUERY ONE FAILED!: " .$ex->getMessage();
        }

     
        //Create prepared INSERT statement
        $insertPlacement = "
            INSERT ignore INTO dealerplacetest (sku_id, group_id, dealer_id, start_date, expire_date, locations, /*order_num,*/ quantity)
            SELECT
                id,
                sku_group_id,
                :DEALER,
                date_add(convert(:SHIPDATE,date), interval 7 day) as start_date,
                date_add(convert(:SHIPDATE,date), interval 127 day) as expire_date,
                :STORES,
                -- :INVOICE,
                :QUANTITY  
            FROM skus s  
            WHERE  s.frame=:FRAME AND s.cover1=:COVER AND s.color1=:COLOR
        ";


        //perpare query to check for existing records that are expired
        $validCheck = "
            SELECT
                sku_id,
                dealer_id,
                expire_date
            FROM dealerplacetest p
                INNER JOIN skus s
                    ON p.sku_id = s.id
            WHERE p.dealer_id = :DEALER
                AND   s.frame = :FRAME
                AND   s.cover1 = :COVER
                AND   s.color1 = :COLOR
                AND   p.expire_date > date_add(convert(:SHIPDATE,date), interval 7 day)

        ";

        $updatePlacement = "
            UPDATE dealerplacetest d
            inner join skus s
            ON d.sku_id = s.id
            SET expire_date = DATE_ADD(DATE_FORMAT(CONVERT(:SHIPDATE, CHAR(20)), '%Y-%m-%d'),INTERVAL 127 DAY),
                quantity = :QUANTITY
            WHERE d.dealer_id = :DEALER
            AND s.frame = :FRAME
            AND s.cover1 = :COVER
            AND s.color1 = :COLOR

        ";

    $checkExistingValid = $MysqlConn->prepare($validCheck);
    $insert = $MysqlConn->prepare($insertPlacement);
    $update = $MysqlConn->prepare($updatePlacement);

        while ($row2 = $detailCheck->fetch(PDO::FETCH_ASSOC)) {

             $values = [
                ":DEALER" => $row2["DEALER"],
                ":SHIPDATE" => $row2["SHIPDATE"],
                ":STORES" => $row2["STORES"],
                ":QUANTITY" => $row2["QUANTITY"],
                ":INVOICE" => $row2["INVOICE"],
                ":FRAME" => $row2["FRAME"],
                ":COVER" => $row2["COVER"],
                ":COLOR" => $row2["COLOR"],
            ];

            $values2 = [
                ":DEALER" => $row2["DEALER"],
                ":FRAME" => $row2["FRAME"],
                ":COVER" => $row2["COVER"],
                ":COLOR" => $row2["COLOR"],
                ":SHIPDATE" => $row2["SHIPDATE"],
            ];

            $values3 = [
                ":SHIPDATE" => $row2["SHIPDATE"],
                ":QUANTITY" => $row2["QUANTITY"],
                ":DEALER" => $row2["DEALER"],
                ":FRAME" => $row2["FRAME"],
                ":COVER" => $row2["COVER"],
                ":COLOR" => $row2["COLOR"],
            ];

            try{

                $existingVldRslt = $checkExistingValid->execute($values2);
                $count4 = $checkExistingValid->fetch(PDO::FETCH_ASSOC);
                    
            }catch(PDOException $ex){
                echo "QUERY TWO FAILED!!!: " . $ex->getMessage();
            }
                
            print_r($count4);

            if(!empty($count4) /*&& empty($count3)*/){
                print_r("updating");
                
                    for($i=0; $i<$row2['QUANTITY']; $i++){
                        try{
                            $updateRslt = $update->execute($values3);
                        }catch(PDOException $ex){
                            echo "UPDATE_FAILED!!!: " . $ex->getMessage();
                        }
                    }
            }else{
                print_r("inserting");
                if($row2["QUANTITY"] >= $row2["STORES"]){
                    for($i=0; $i<$row2["STORES"]; $i++){    
                        try{
                            $insertRslt = $insert->execute($values);
                        }catch(PDOException $ex){
                            echo "INSERT_FAILED!!!: " . $ex->getMessage();
                        }
                    }
                }elseif($row2["QUANTITY"] < $row2["STORES"]){
                    for($i=0; $i<$row2["QUANTITY"]; $i++){    
                        try{
                            $insertRslt = $insert->execute($values);
                        }catch(PDOException $ex){
                            echo "INSERT_FAILED!!!: " . $ex->getMessage();
                        }
                    }
                }
            }
        }  
Link to comment
Share on other sites

Never run queries in loops whenever possible!

 

The logic is a little difficult to follow (USE COMMENTS!) but here is what I am seeing:

 

You run a SELECT query for 'detailCheck' and loop over those records to run a second SELECT query for 'checkExistingValid'.

 

Then, based on the results of that 2nd SELECT query you run a series of INSERT queries or UPDATE queries.  There is some logic within the INSERT block to do different inserts.

 

That is an awful lot of loops and queries. Plus, there are queries that are doing way more than they need to. For example, you ahve a query for 'checkExistingValid' that is a select statement that returns multiple fields for all records meeting specific criteria. But, then, you only use the results to see if anything was returned or not. You should just be doing a simple COUNT(*) query and verify if the result is 0 or more.

 

I think you can remove the vast majority of the loops, but I would have to understand the data structure a lot more than I am willing to invest time.

Link to comment
Share on other sites

That makes sense actually, thank you. I was returning the values in the query for the sake of testing and update values, but I actually don't need them anymore so that will help. As for the loops, that's where the very odd and obtuse logic comes in and that has to stay the way it is until we refactor everything, but the queries alone should save me some time.

Link to comment
Share on other sites

from a 'data keeping' standpoint, what you state you are doing doesn't make sense to me (and perhaps others?)

 

each order that gets placed is a separate occurrence. it should result in a row being inserted into an 'orders' table, with the unique information about the order, each item that is part of that order should cause a row to be inserted into an 'order_items' table, related back to the order it corresponds to through an order_id, from an auto-increment column in the 'orders' table, and if i recall correctly, you are inserting a row for each count of an item, to allow tracking of each individual piece that's part of an order.

 

no data should be UPDATEd unless you are correcting information, such as someone calling in after placing an order, but before it has been acted upon, and needs to change a quantity, or add or delete an item.

 

the above is the input data that you have available, for each order. you should not be trying to insert/update records based on this and dates. all reporting, tracking, ... should be based on the original input data.

 

so, the question i have is what exact reporting are you trying to accomplish?

Link to comment
Share on other sites

As for the loops, that's where the very odd and obtuse logic comes in and that has to stay the way it is until we refactor everything . . 

 

No, it doesn't have to stay that way. There are ways to accomplish the same results without all those loops. As I said, for me to provide a solution would take me more time than I am willing to give up for free.

Link to comment
Share on other sites

As with the others, your logic makes no sense to me and your data is too undefined to provide you with any actual code help. You'll have to do a much better job at defining what your data is and what your process is if you need code help. For example, your first query just has generic table names and many of your column names are not prefixed so I have no idea what data is in what table or why. Define all of that.

 

As for the loops, that's where the very odd and obtuse logic comes in and that has to stay the way it is until we refactor everything, but the queries alone should save me some time.

Looping isn't inherently bad, but running queries within each loop iteration is terrible. What you generally want to do is find a way to eliminate the queries, even if you still have to keep the loops.

 

For example, pull all your current unexpired records and store them into an array. Then as you loop through your new data rather than hit the database every iteration to find if a record exists you can simple reference that array. Checking if something exists in a local array is many times faster than hitting the database, and you only use a single query to load the data rather than one query per loop iteration.

 

Making use of temporary tables may provide you some benefit. For example, take your data from your DB2 instance and just load it into a mysql temporary table. You should then be able to join with that table in order to grab all your applicable unexpired rows in a single query. You could also join with that table to do your insert/update queries and reduce the amount of data you have to shuffle between php and mysql.

 

I'd imagine a process that ends up something like (in pseduo-code):

bulkTransferToMysqlTempTable();
$existingRecords = queryAllExistingRecords();

$detailResults = queryOrdersToCheck();
foreach ($detailResults as $order){
    if (orderExists($order, $existingRecords)){
        updateOrder($order);
    } else {
        insertOrder($order);
    }
}

function bulkTransferToMysqlTempTable(){
    //CREATE TEMPORARY TABLE ...

    //Query data from $DB2Conn

    //Multi-row INSERT INTO temporary table
}

function queryOrdersToCheck(){
    //Query temporary table for data to be checked and inserted/updated
}

function queryAllExistingRecords(){
    //Get a count of total unexpired records grouped by the fields you were comparing.
    //Join with the temporary table to check find relevant rows and check expiration status
    $sql = 'SELECT p.dealer_id, s.frame, s.cover1, s.color1 FROM ... GROUP BY p.dealer_id, s.frame, s.cover1, s.color1';
    $results = query($sql);
    $existing = [];
    foreach ($results as $row){
        $key = implode('-', [
            $row['dealer_id']
            , $row['s.frame']
            , $row['s.cover1']
            , $row['s.color1']
        ]);

        $existing[$key] = true;
    }

    return $existing;
}

function orderExists($order, &$existingRecords){
    $key = implode('-', [
        $order['DEALER']
        , $order['FRAME']
        , $order['COVER']
        , $order['COLOR']
    ]);

    return isset($existingRecords[$key]);
}

function updateOrder($order){
    //...
}

function insertOrder($order){
    //...
}

There may be further ways to optimize things, but without a better understanding of your data and process I can't say for sure. For example, maybe you can eliminate the 'does it exist?' query entirely by using the temporary table? Perhaps you could delay your inserts until you've processed everything and use a single multi-row insert query?
Link to comment
Share on other sites

Archived

This topic is now archived and is closed to further replies.

×
×
  • Create New...

Important Information

We have placed cookies on your device to help make this website better. You can adjust your cookie settings, otherwise we'll assume you're okay to continue.