vpetkovic

PHP Application Performance (Smart Caching) Advice


Good Day all,

 

I am developing a personal fitness app using PHP, and it came to my attention that I might need to integrate some sort of smart caching to improve application performance and the load time of one specific page.
 

Let me brief you on the page I need your opinion and advice on:

 

Once logged in, a full-page calendar is displayed showing that user's workouts.

The calendar itself is created by JS and loads extremely fast on its own. It displays items from an array in a helper JS file (data.js).
At first each user had their own data.js file, but I have switched to one dynamic JS/PHP file (data.php) that retrieves items from the MySQL db and dynamically populates the data for that specific user. It works and it loads fast.

 

Personally I use the app every day for my own purposes (before I try to get more users on it), and what I noticed is that page load got slower as the calendar gained more items (obviously), and with more users using the app at the same time server load will increase a lot. So before anything else I would like to have data.php optimized as much as possible, to improve application performance in the long run and to prepare it for higher load.

What I've tried so far and my ideas on how to achieve what I want:

 

I tried integrating the following cache (files attached), which serves a cache file that expires every x seconds/minutes/hours. That decreased load time by a lot, but I ran into one simple issue. I found a solution for it, but I'm not sure if it's ideal or if there are better and smarter ways to do it.

Problem:
When a user adds an item to the calendar (AJAX is responsible for that), the change is not displayed until the cache expires, but I want them to see changes right away and not wait for the cache to expire (they won't even know about it).
Too short a cache expiration time: what's the point of the cache in that case, right?!

Too long a cache expiration time: users have to wait, or think it doesn't work when they add workouts to the calendar.

 

Possibly the best solution:

I thought of giving each user their own cache folder, where cached copies of data.php are stored (in the future I can also cache other pages there). Then I would modify the script that adds things to the calendar so that it deletes the db record holding the cache file name and also deletes the cache file from the cache folder. That forces a new cache file to be created (because it doesn't exist any more) and a record of the new cache file name to be inserted into the database.

Simply put, a new cache file is created each time a workout is added to the calendar, so the user sees changes right away; if nothing is added, the cache file is served, and server load and the number of queries sent to the db decrease drastically.

Question for you:

What do you think about my problem and the approach I came up with for solving the app's performance issue? Can you think of a better, smarter and more elegant solution? Do you have any other advice or suggestions? Any help will be appreciated! Thanks in advance!
 

 

cache_footer.php

cache_header.php

what I noticed is that page load got slower as the calendar gained more items (obviously), and with more users using the app at the same time server load will increase a lot. So before anything else I would like to have data.php optimized as much as possible, to improve application performance in the long run and to prepare it for higher load.

 

 

in order to fix a problem, you must first find the cause of the problem. if you had a noticeable change in the page generation/loading time as the number of data items increased, your code is doing something extremely inefficient and it may be that caching cannot be used due to the nature of the data or won't have any effect on the problem. if you want someone here to help find the cause of the problem and make suggestions that would help, you will need to post enough of your code and some sample data that demonstrates and reproduces the problem.


You are absolutely right. What I didn't mention is that my account has, let's say, approximately 300 records since January 1st, and when I log in to it, it queries the db and dynamically echoes the data into data.php (which is an array), so it goes like this:

var codropsEvents = {
'date': 'link',

'date': 'link',

'date': 'link',

'date': 'link',

}

So ultimately I have 300 records to echo, and I am sure that in a year or two I'll have 2-3 times more to display (since I want to be able to go back and see what I've done last year or so).
The load time for a file 20 lines long and one 300 lines long is going to be different; I'm aware of that (the difference is not huge, maybe 300-400 ms). The test I performed, comparing querying the db each time I load the page against loading the cached file, shows a big difference. Example: querying the db and loading the page completely takes 2.5s, and loading the cached file takes 0.65s.

Apart from recoding the PHP file (data.php, attached) that pulls records from the db (I know it's terrible), what other suggestions do you have on load time and performance for PHP apps and situations like mine?

 

data.php


the four biggest problems with the code are -

 

1) you have stored same meaning data in two separate tables, one for members and one for clients, which you are querying for but not using.

 

2) you are retrieving all the data, not just the data being displayed. you are outputting all the data to the client for it to handle.

 

3) you are running multiple queries inside of a loop.

 

4) some of your data is not stored correctly, requiring you to do a substr_replace() on it every time you use it.

 

items #2 and #3 are the cause of your performance problem. you should only retrieve the data you are currently displaying and i'm betting that all those queries can be replaced with a single query.


So ultimately if I have 300 records to echo ... I am sure that in a year or two i'll have 2-3 times more to display (since I want to be able to go back and see what i've done last year or so...)

So, that's your first and biggest problem right there. You shouldn't be loading every single record from the database; you need to use pagination. You load X many initially, X being whatever is a good number for user experience, and then load more as needed.

 

Secondly, when caching dynamic data the common thing to do is to delete the cache when changes are made. If there are no changes, the cache can exist for a very long time since you're just reading it. When a change is made you just delete the cache, and re-create the cache the next time it is read. If changes are very frequent then you might have to tune this a little more.
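A minimal sketch of that delete-on-change idea, assuming one cache file per user in a directory of your choosing. The function names (`cachePath`, `getCalendarData`, `invalidateCalendar`) and the `$loader` callback, which stands in for the real database query, are hypothetical and not taken from the posted code.

```php
<?php
// Hypothetical helper: one cache file per user inside $dir.
function cachePath(string $dir, string $username): string
{
    // Hash the username so it is always a safe file name.
    return $dir . '/calendar_' . md5($username) . '.json';
}

// Serve the cached copy if it exists; otherwise run $loader (the real
// database query would go there), store the result, and return it.
function getCalendarData(string $dir, string $username, callable $loader): array
{
    $file = cachePath($dir, $username);
    if (is_file($file)) {
        $cached = json_decode((string) file_get_contents($file), true);
        if (is_array($cached)) {
            return $cached;
        }
    }
    $data = $loader($username);
    // LOCK_EX avoids two requests writing the file at the same time.
    file_put_contents($file, json_encode($data), LOCK_EX);
    return $data;
}

// Call this from the add/edit/delete workout script after the database write.
function invalidateCalendar(string $dir, string $username): void
{
    $file = cachePath($dir, $username);
    if (is_file($file)) {
        unlink($file);
    }
}
```

With this layout the file's existence is the only bookkeeping needed, so a separate database record of the cache file name is probably unnecessary.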

 

In your data.php, you have the same query running twice. Why?

$query_data_members = "SELECT DISTINCT dt_id, dt_date, dt_data, dt_bodypart, dt_wktype, dt_wkurl, COUNT(*) as Total FROM data WHERE dt_username_member = '$username' GROUP BY dt_date ORDER BY dt_date";
$query_data_clients = "SELECT DISTINCT dt_id, dt_date, dt_data, dt_bodypart, dt_wktype, dt_wkurl, COUNT(*) as Total FROM data WHERE dt_username_member = '$username' GROUP BY dt_date ORDER BY dt_date";
You also have SELECT queries within loops. BAD! Use proper JOINs instead and keep queries out of loops. This is your second biggest problem.


First, thanks to both of you for taking the time to look at my case and give me some very useful suggestions; I can't wait to implement them. I created this query at the very beginning of the app's development, without much knowledge, and I just wanted to make it work, but I have to rethink my app and redo my code throughout. Also, would you be able to review my code once I modify it to eliminate all the problems you listed?

 


 

Regarding data not stored correctly: I would say the data is stored correctly. In my case it applies only to links, which look like domain.com/workout.php?id=1, and all I am doing is rewriting them to look like domain.com/workout-1 so they are more user friendly. I hope this is not a bad thing to do? Do you know a better way of doing it?

 

 


 

 

Well, I see what you are saying. It's not a classic blog site where pagination is easy to implement, but I understand why you suggested it. The calendar is just reading the data from an array in data.php, and if you have a suggestion on how to do it I am happy to hear it. I will work towards finding a better solution for my calendar, but this was by far the easiest for me to implement when I first started.

Also, those two queries are there because I have both members and clients tables; the two tables are different and store different data, but I am already working on merging them, as I realize more and more that I don't need them separated like I needed (or at least thought I did) before.

Regarding queries in a loop: can you write me a quick example of how I would achieve the same result with the query outside of the loop? I did it that way because I had no idea how else to do it.

Thanks again.


Sorry, I answered the wrong question. There is no need for the duplicate queries; my bad, I didn't even realize it. THANKS!


Well, I see what you are saying. It's not a classic blog site where pagination is easy to implement, but I understand why you suggested it. The calendar is just reading the data from an array in data.php

But if you're reading 300+ records from the database, you have a huge bottleneck. Pagination is not just limited to actual "pages", like in a blog; it is also very relevant in APIs, which is basically what you have here. You need to limit how much data you are pulling from the database and sending through the server. People are never going to need to see their entire calendar's worth of data in one go, so pulling all of that data is just wasting resources.

 

How it works is pretty much the same as how it works in any other case. You'd pull, say, 20 records from the database, and then some meta data on total number of records, which page you're on, total pages, etc. Then your AJAX would just pass in a page number to get the next set of data.
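A rough sketch of the paging arithmetic described above. The function name `paginate` and the array keys are hypothetical; the total record count would come from a separate `COUNT(*)` query.

```php
<?php
// Hypothetical helper: turn a page number into LIMIT/OFFSET values plus
// the metadata the AJAX caller needs to request the next page.
function paginate(int $totalRecords, int $page, int $perPage = 20): array
{
    $perPage    = max(1, $perPage);
    $totalPages = max(1, (int) ceil($totalRecords / $perPage));
    // Clamp the requested page into the valid range.
    $page       = min(max(1, $page), $totalPages);

    return [
        'page'        => $page,
        'per_page'    => $perPage,
        'total'       => $totalRecords,
        'total_pages' => $totalPages,
        // Use as: ... LIMIT :per_page OFFSET :offset
        'offset'      => ($page - 1) * $perPage,
    ];
}

// The metadata can be sent back alongside the rows as JSON.
echo json_encode(paginate(300, 3)), PHP_EOL;
```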

 

Also, those two queries are there because I have both members and clients tables; the two tables are different and store different data, but I am already working on merging them, as I realize more and more that I don't need them separated like I needed (or at least thought I did) before.

But, those two queries are literally identical in every way; unless I'm just really blind.

 

Regarding queries in a loop: can you write me a quick example of how I would achieve the same result with the query outside of the loop? I did it that way because I had no idea how else to do it.

It would help if you posted your database schema. If you have to do a SELECT inside of a loop, then your data is not modeled correctly.


I'm just going to throw out that there are two common solutions for caching relational data: Memcached and Redis. These are both daemons that you run, and they require some memory allocation. They are close to RAM speed for fetching data in most cases, and serve to buffer the database from SELECT/READ activity, so they are widely used for performance and scalability concerns.

 

I'm not clear from your description what the nature of the queries is, but let's assume that they are individualized to each user.

 

When you create your cache buckets, you'll name them in a way that they'll be locatable again.

 

I know this is very confusing for people, but in most cases, you have to roll your own solution for cache integration.

 

The basic prescription is this:

 

-You query

-Code checks for the existence of an existing cache entry

--If it exists (return data from cache)

--If not, query, store data to cache with TTL (similar to what you have now)

----return data from cache

 

 

A simple bucket name might be: '/calendar/user/{member_id}';

 

 

For your insert/update/delete code for the calendar, you simply need to add a cache delete call that uses the member_id variable to locate the cache entry and delete it, should it exist.
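The prescription above can be sketched roughly as follows. `ArrayCache` is a hypothetical in-memory stand-in so the example runs without a daemon; with Memcached or Redis, the client's own get/set/delete calls would take its place. `calendarKey`, `fetchCalendar` and the `$query` callback are likewise illustrative names.

```php
<?php
// In-memory stand-in for Memcached/Redis so the sketch runs on its own.
final class ArrayCache
{
    private $items = [];

    public function get(string $key)
    {
        if (!isset($this->items[$key])) {
            return null;
        }
        [$value, $expiresAt] = $this->items[$key];
        if ($expiresAt < time()) {
            unset($this->items[$key]); // entry has outlived its TTL
            return null;
        }
        return $value;
    }

    public function set(string $key, $value, int $ttl): void
    {
        $this->items[$key] = [$value, time() + $ttl];
    }

    public function delete(string $key): void
    {
        unset($this->items[$key]);
    }
}

// Bucket name built from the member ID so it can be located again.
function calendarKey(int $memberId): string
{
    return '/calendar/user/' . $memberId;
}

// Read path: return the cached entry, or run the query and cache it.
function fetchCalendar(ArrayCache $cache, int $memberId, callable $query, int $ttl = 3600): array
{
    $key  = calendarKey($memberId);
    $data = $cache->get($key);
    if ($data === null) {
        $data = $query($memberId);
        $cache->set($key, $data, $ttl);
    }
    return $data;
}
```

The write path is then a single call after the insert, update or delete succeeds: `$cache->delete(calendarKey($memberId));`.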


Since you have a calendar-type app...

I suggest you make this a REST API or something similar:

- output the data as JSON
- use user sessions
- paginate/limit: fetch data by date and per user. The date selected in the calendar would set the GET request and return the data; the default could be today's date
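As an illustration of fetching by date, here is a small sketch that turns a `?date=YYYY-MM-DD` GET parameter into a month range and a parameterized query. The parameter name `date` and the helper `monthRange` are assumptions; the column names follow the `data` table used in this thread's queries.

```php
<?php
// Hypothetical helper: given the ?date=YYYY-MM-DD value from the calendar,
// return the first and last day of that month. Falls back to today.
function monthRange(?string $date): array
{
    $day = DateTimeImmutable::createFromFormat('!Y-m-d', (string) $date);
    if ($day === false) {
        $day = new DateTimeImmutable('today');
    }
    return [
        $day->modify('first day of this month')->format('Y-m-d'),
        $day->modify('last day of this month')->format('Y-m-d'),
    ];
}

// Placeholders keep user input out of the SQL text.
$sql = 'SELECT dt_id, dt_date, dt_data, dt_bodypart, dt_wktype
          FROM data
         WHERE dt_username_member = :username
           AND dt_date BETWEEN :start AND :end
         ORDER BY dt_date, dt_id';

[$start, $end] = monthRange($_GET['date'] ?? null);
// ':username' would be added from the logged-in session before executing.
$params = [':start' => $start, ':end' => $end];
```

Only one month of rows then travels to the browser per request, however many years of history the account holds.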

 

Optionally, cache the JSON responses:

- the new method would be faster and use fewer resources
- I do not see caching data for individual dates and single users being a performance gain
- your app may work better using the live data

To sum it up: a JSON response that uses their session ID and a date, and returns limited data.

If you have any type of security such as a password, you can use a public key provided to them instead.

 

Another post, using dates as GET parameters, that I showed to another member:

http://forums.phpfreaks.com/topic/296327-same-file-for-multiple-pages/?view=findpost&p=1511965


Thanks @QuickOldCar, @gizmola, @scootstah for your replies. You really opened my eyes about things I asked and didn't ask. I will be recoding the page, and once done I'll show you what I got. The important part is that I understand the points you made and what I need to do, and I'm pretty confident I can do it right! Thanks a lot again. :)



 

I have sent you an email, as you said you would be able to help me out. Thanks!


Let's keep the discussion in here, for the benefit of the community.

 

As I said before, please post your database schema, so that we can tell you if it is properly modeled to allow for JOINs and such, and also for reference. Run this query for each of your tables and post back with results:

SHOW CREATE TABLE `data`;
Replace data with each of your table names.


Sounds good!
 

Only one table is involved:

CREATE TABLE `data` (
  `dt_id` int(11) NOT NULL AUTO_INCREMENT,
  `dt_username_member` varchar(30) DEFAULT NULL,
  `dt_date` date DEFAULT NULL,
  `dt_data` varchar(1000) NOT NULL,
  `dt_bodypart` varchar(100) DEFAULT NULL,
  `dt_wktype` varchar(100) DEFAULT NULL,
  `dt_custom_event` varchar(10000) NOT NULL,
  `dt_wkurl` varchar(1000) DEFAULT NULL,
  `dt_nwpid` varchar(100) NOT NULL DEFAULT '',
  `dt_year` year(4) NOT NULL,
  PRIMARY KEY (`dt_id`),
  KEY `members_clients` (`dt_username_member`)
) ENGINE=InnoDB AUTO_INCREMENT=11060 DEFAULT CHARSET=utf8;

Query I sent you in email:

(SELECT dt_id, DATE_FORMAT(dt_date, '%m-%d-%Y') as dt_date, dt_data, dt_bodypart, dt_wktype, dt_wkurl, dt_year, COUNT(*) as Total FROM data WHERE dt_year = '2015' and dt_username_member = 'admin' GROUP BY dt_date ORDER BY dt_date DESC LIMIT 0,100) ORDER BY dt_date ASC

The calendar cannot display more than one result for a specific date if the date appears more than once in the array, like this:

array {
'08-26-2015' : 'something',
'08-26-2015' : 'something else',
'08-27-2015' : 'else',
}

That's why I am grouping them and running a query in the while loop to obtain all rows for the dates that are grouped into one, and I would like to achieve the following:

array {
'08-26-2015' : 'something, something else',
'08-27-2015' : 'else',
}

Let me know if you need any other info. Thanks!

Edited by vpetkovic


Okay, so you're definitely going to want to break that into multiple tables. You're using a relational database, not a spreadsheet.

 

I'm not sure that I have enough information on your business rules or data needs at this point to really make an appropriate model. The process of database normalization is to remove duplicated or repetitive data from your table and move it to another table, and then reference it with a foreign key constraint. For example the column "dt_bodypart" - I imagine that your rows will probably have duplicate data here... "arm", "arm", "arm", "leg", "leg", yeah? If that's the case, you could have a "bodypart" table where a row is inserted for each body part, and then you reference it via ID in your other table, and you then JOIN it to get the result.

 

I've made an example here: http://sqlfiddle.com/#!9/8e10e/1

 

Some other columns that could probably be normalized:

- dt_username_member

- dt_wktype

- dt_custom_event

 

What is dt_data? What kind of "data" does that hold?

 

It could also help if you provided a small dump with like ~20 rows of real data, so that we can see what you're actually storing for values.


OK, I think I should have already included a data dump to give you an idea of what data you can find in the 'data' table. Here it is:

 

dqru2Lx.png

Edited by vpetkovic


Okay, so sounds like "dt_data" is really the exercise, so that can be its own table as well. What's up with the dt_wkurl, what is the point of that?


dt_data represents the workout done for that day (it's the user's input and points to the actual workout)... dt_wkurl is the URL of the workout that the user did that day. Take a look at the screenshot below.

 

tirVzwW.png

 

So take a look at the 26th or the 24th... I have 2 records in the database with the same date, but dt_data, dt_wktype, etc. are different. So in order to display them in the calendar as in the screenshot, I have to generate data.php as the array I showed you in previous posts. Does this make more sense now?

Based on all this info, how would you solve the problem of running a second query inside a while loop to grab all the rows for a day that has more than one workout?

Edited by vpetkovic


dt_data represents the workout done for that day (it's the user's input and points to the actual workout)...

Okay, so why would the user want to type in their workout every day? Wouldn't it be better to select from a list? They could also create their own, which would be saved to its own table unique to that user, where they could then select from a list the next time.

 

dt_wkurl is the URL of the workout that the user did that day.

So where does that ID come from? Why not just store the ID and build the URL when you display the link? If you ever change your URL you're going to have to do a huge search & replace operation in the database.
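For example, a small helper along these lines could build the link at display time. The name `workoutUrl` is hypothetical, and the `/workout-` pattern mirrors the domain.com/workout-1 form mentioned earlier in the thread.

```php
<?php
// Hypothetical helper: build the friendly URL from the stored workout ID.
// If the URL scheme ever changes, only this function needs editing,
// not every row in the database.
function workoutUrl(int $workoutId, string $base = ''): string
{
    return rtrim($base, '/') . '/workout-' . $workoutId;
}

echo workoutUrl(1, 'https://domain.com'), PHP_EOL;
```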

 

 

I think this might be a good start:

member
---------
id | name
---------
1  | bob


bodypart
-------------
id | bodypart
-------------
1  | arm
2  | leg
3  | chest


workout_type
-----------------
id | workout_type
-----------------
1  | strength
2  | volume
3  | dayoff


activity
-------------
id | activity
-------------
1  | Lower Body
2  | Push
3  | Pull


workout
-------------------------------------------------------------------------
id | date       | workout_type_id | member_id | bodypart_id | activity_id
-------------------------------------------------------------------------
1  | 09-01-2015 | 1               | 1         | 1           | 1
To get multiple results for the same day, I would probably fetch all of the data and then format it based on the day. So that you end up with a structure like this:

array(
    '09-01-2015' => array(
        array('id' => 1, 'workout_type_id' => 1, 'bodypart_id' => 1, 'activity_id' => 1),
        array('id' => 2, 'workout_type_id' => 1, 'bodypart_id' => 2, 'activity_id' => 2),
    ),
    '09-02-2015' => array(
        array('id' => 3, 'workout_type_id' => 1, 'bodypart_id' => 1, 'activity_id' => 1),
    ),
)
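One way to build a structure like the one above from flat rows, as a minimal sketch: `$rows` stands in for what a single query (with no queries inside a loop) would return, and the function name `groupByDate` and the `date` key are assumptions.

```php
<?php
// Group flat result rows by date so each date maps to a list of workouts.
function groupByDate(array $rows): array
{
    $byDate = [];
    foreach ($rows as $row) {
        $date = $row['date'];
        unset($row['date']);      // the date becomes the array key
        $byDate[$date][] = $row;  // append; creates the list on first use
    }
    return $byDate;
}

// Sample rows in the shape a fetch-all call would return.
$rows = [
    ['date' => '09-01-2015', 'id' => 1, 'workout_type_id' => 1, 'bodypart_id' => 1, 'activity_id' => 1],
    ['date' => '09-01-2015', 'id' => 2, 'workout_type_id' => 1, 'bodypart_id' => 2, 'activity_id' => 2],
    ['date' => '09-02-2015', 'id' => 3, 'workout_type_id' => 1, 'bodypart_id' => 1, 'activity_id' => 1],
];
$workouts = groupByDate($rows);
```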
Now you can easily access this information when building your calendar.

Edited by scootstah


Okay, so why would the user want to type in their workout every day? Wouldn't it be better to select from a list? They could also create their own, which would be saved to its own table unique to that user, where they could then select from a list the next time.

 

Well, I gave the user the freedom to choose what is displayed as the workout name. I have a list of workouts saved by the user or by a trainer, so he can choose any from the list and the workout name field gets filled in automatically, but if a user, like myself, wants something different to be displayed, then why not. Personally, when I create workouts I name them differently, not just Back or Arms or Arms 1. I name them something like 'blitzgreg' or 'Armagetsome' to easily remind myself of the workout without opening it up, because if I have Arms1 to Arms10 I get confused about which one is actually Arms6 and what it consists of. So when I note that I did the blitzgreg workout on a certain day, I like to see what body part I worked out that day instead of the workout name, just like you saw on the screenshot.

 

This topic is my personal decision and doesn't have much to do with my issue here :)

 

So where does that ID come from? Why not just store the ID and build the URL when you display the link? If you ever change your URL you're going to have to do a huge search & replace operation in the database.

 

 

This is an awesome point. I don't know why I didn't think of this before! I'll jump right on it and correct that!

 

 

I understand the db schema that you suggest and it makes sense, but code-wise, how would you output those arrays? I can think of a way to grab all the data, but how would you output all the dates and, if there are more workouts for a day, output those as well? Code-wise, how do I get something like this:

 

array(
    '09-01-2015' => array(
        array('id' => 1, 'workout_type_id' => 1, 'bodypart_id' => 1, 'activity_id' => 1),
        array('id' => 2, 'workout_type_id' => 1, 'bodypart_id' => 2, 'activity_id' => 2),
    ),
    '09-02-2015' => array(
        array('id' => 3, 'workout_type_id' => 1, 'bodypart_id' => 1, 'activity_id' => 1),
    ),
)

I am not very skilled with PHP and it seems you are, so I would like to learn. Thanks a lot for all the time you put into finding the best solution for my app to boost performance and have better code.


This topic is my personal decision and doesn't have much to do with my issue here :)

Absolutely, but it does matter from a database design perspective.

 

 

I understand the db schema that you suggest and it makes sense, but code-wise, how would you output those arrays? I can think of a way to grab all the data, but how would you output all the dates and, if there are more workouts for a day, output those as well? Code-wise, how do I get something like this:

Well, when you build your calendar you would just see if any array items existed for that date. In pseudo code it'd look something like:

 

workouts = array(
    '09-01-2015' => array(
        array('id' => 1, 'workout_type_id' => 1, 'bodypart_id' => 1, 'activity_id' => 1),
        array('id' => 2, 'workout_type_id' => 1, 'bodypart_id' => 2, 'activity_id' => 2),
    ),
    '09-02-2015' => array(
        array('id' => 3, 'workout_type_id' => 1, 'bodypart_id' => 1, 'activity_id' => 1),
    ),
)

for each day
    month = 09
    day = 01
    year = 2015

    if exists workouts[month-day-year]
        for each workouts[month-day-year]
            print workout

Does that make sense? If not I'll give better example.
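For reference, the pseudo code could translate into PHP roughly like this. It is only a sketch: `renderMonth` is a hypothetical name, and it returns plain text lines where the real calendar would emit its own markup.

```php
<?php
// Walk every day of a month and collect what the calendar should print.
function renderMonth(array $workouts, int $year, int $month): array
{
    $lines = [];
    $first = new DateTimeImmutable(sprintf('%04d-%02d-01', $year, $month));
    $daysInMonth = (int) $first->format('t'); // 't' = number of days in the month

    for ($day = 1; $day <= $daysInMonth; $day++) {
        // Same month-day-year key format as the $workouts array.
        $key = sprintf('%02d-%02d-%04d', $month, $day, $year);
        // Days without workouts fall back to an empty list and print nothing.
        foreach ($workouts[$key] ?? [] as $workout) {
            $lines[] = $key . ': workout #' . $workout['id'];
        }
    }
    return $lines;
}

$workouts = [
    '09-01-2015' => [['id' => 1], ['id' => 2]],
    '09-02-2015' => [['id' => 3]],
];
echo implode(PHP_EOL, renderMonth($workouts, 2015, 9)), PHP_EOL;
```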


Absolutely! Thanks a lot! I will take some time to adjust the db for the solution you suggested, and I will be implementing it in a second application that I have in mind for my personal use, which is also based on a calendar just like this one!

In the meantime, while I was waiting for your reply, I came up with another query that grabs all the data in one go and can be used with my current setup. For the sake of learning more about MySQL and PHP, I'll build a third test environment to test all 3 solutions and see which one takes the least time to complete the task. I read all over the web that there's no single approach that fits every situation, and that one can be better than another depending on the nature of the app. So I want to get a better understanding of that.

Maybe you have experience in testing load time and can give me some advice on that as well.

1. The query itself takes 2.5ms to retrieve all records (~300) for one user (without any limits) and 0.4ms to retrieve only 30 records.

    a. But that query DIDN'T retrieve all the data, and the query inside the loop surely increases load time, which I'll measure with PHP later.

2. The new query (which I came up with while waiting for your latest reply, see it below) takes 3.2ms to retrieve all the records for a particular user.

    a. But this query DID retrieve all the data, so no query is required inside the while loop. Despite the longer run time of the main query, it could be faster overall since it doesn't need to run any other query, whereas the current setup has to.

(SELECT DATE_FORMAT(dt_date, '%m-%d-%Y') dt_date, 
count(dt_date) as date_count,
GROUP_CONCAT(dt_data ORDER by dt_date DESC) dt_data, 
GROUP_CONCAT(dt_bodypart ORDER by dt_date DESC) dt_bodypart,
GROUP_CONCAT(dt_wktype ORDER by dt_date DESC) dt_wktype,
GROUP_CONCAT(dt_wkurl ORDER by dt_date DESC) dt_wkurl
FROM data 
WHERE dt_username_member = 'admin' AND dt_year = '2015' 
GROUP BY dt_date ORDER BY dt_date DESC) ORDER BY dt_date ASC

If you have any opinion on this I would gladly hear it :)
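A few things worth checking with the GROUP_CONCAT approach: the default separator is a comma, so workout names containing commas cannot be split apart again; NULL values are skipped, so columns that allow NULL (dt_bodypart here) can fall out of step with dt_data; and results are truncated at group_concat_max_len (1024 bytes unless changed). A sketch of splitting one grouped row back into entries, assuming a custom separator set in the query with `SEPARATOR`; the separator string and the function name `splitGroupedRow` are hypothetical.

```php
<?php
// Hypothetical separator, assumed never to appear in the stored text.
// In the query this would be: GROUP_CONCAT(dt_data SEPARATOR '|~|')
const SEP = '|~|';

// Turn one grouped row (one row per date) back into per-workout entries.
function splitGroupedRow(array $row): array
{
    $names     = explode(SEP, $row['dt_data']);
    $bodyparts = explode(SEP, $row['dt_bodypart']);
    $entries   = [];
    foreach ($names as $i => $name) {
        $entries[] = [
            'data'     => $name,
            // Missing when MySQL skipped a NULL body part.
            'bodypart' => $bodyparts[$i] ?? null,
        ];
    }
    return $entries;
}

$row = ['dt_date' => '08-26-2015', 'dt_data' => 'Push, heavy|~|Pull', 'dt_bodypart' => 'chest|~|back'];
$entries = splitGroupedRow($row);
```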
 

Edited by vpetkovic


Thanks for those useful links!

 

dt_username_member and dt_year are indexed now

 

I am working on adjusting the app to use pagination. One thing I noticed with the above query is that the load time is very similar with or without the limit: 3.0ms (30 records) vs 3.2ms (all records).

