arthaeus Posted January 24, 2008 Share Posted January 24, 2008 just a little recap: i have a table who's structure is as follows: CREATE TABLE `elgg_profile_data` ( `ident` int(10) unsigned NOT NULL auto_increment, `owner` int(10) unsigned NOT NULL default '0' COMMENT '-> users.ident', `access` varchar(20) collate utf8_unicode_ci NOT NULL default 'PUBLIC', `name` varchar(255) collate utf8_unicode_ci NOT NULL default '', `value` text collate utf8_unicode_ci NOT NULL, PRIMARY KEY (`ident`), KEY `owner` (`owner`,`access`,`name`) ) ENGINE=MyISAM DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci AUTO_INCREMENT=19001 ; as you can probably tell, each attribute about a user is stored on a separate row ie if a user has a firstname, lastname, and middle name properties, they will have 3 corresponding rows in the database with the row.name = firstname, lastname and middle name. and it's corresponding row.value = mike jones john. here's the query i developed to search for a user. select * from elgg_users eu where ident in ( SELECT owner FROM elgg_profile_data WHERE name='firstname' and value = 'mike' ) and eu.ident in ( SELECT owner FROM elgg_profile_data WHERE name='lastname' and value = 'jones' ) and eu.ident in ( SELECT owner FROM elgg_profile_data WHERE name='middlename' and value = 'john' ) now i pumped this database with over 1 million rows of dummy data, and this query took 9 seconds. is this unacceptable? not looking for coding help, just a "it's a go", or "its a no go" (and why!) thanks so much for everyone who has already helped Quote Link to comment https://forums.phpfreaks.com/topic/87579-solved-the-problem-with-my-zip-codeslook-at-my-solution-is-it-a-go-or-no-go/ Share on other sites More sharing options...
arthaeus Posted January 24, 2008 Author Share Posted January 24, 2008 One more thing.... here is the query, right: select * from elgg_users eu where ident in ( SELECT owner FROM elgg_profile_data WHERE name='firstname' and value = 'mike' ) and eu.ident in ( SELECT owner FROM elgg_profile_data WHERE name='lastname' and value = 'jones' ) and eu.ident in ( SELECT owner FROM elgg_profile_data WHERE name='middlename' and value = 'john' ) now this is only for a 3 attribute search. i can build a bigger query by just adding more: and eu.ident in ( SELECT owner FROM elgg_profile_data WHERE name='middlename' and value = 'john' ) clauses. will this be efficient for a query that could be as specific as 10 elements? Quote Link to comment https://forums.phpfreaks.com/topic/87579-solved-the-problem-with-my-zip-codeslook-at-my-solution-is-it-a-go-or-no-go/#findComment-447938 Share on other sites More sharing options...
dbo Posted January 24, 2008 Share Posted January 24, 2008 It depends on what you're doing... if you're doing some reporting 9 seconds probably isn't too big of a deal. If you're doing real time transaction processing for a mainstream application... then no 9 seconds won't work. All that being said, given your data model I think the reality is this is about all you have to work with. It's going to be more efficient to process this stuff on the database server than to pull back a high level query and then sort through it in code, but subqueries are definitely expensive. Quote Link to comment https://forums.phpfreaks.com/topic/87579-solved-the-problem-with-my-zip-codeslook-at-my-solution-is-it-a-go-or-no-go/#findComment-447967 Share on other sites More sharing options...
arthaeus Posted January 24, 2008 Author Share Posted January 24, 2008 It depends on what you're doing... if you're doing some reporting 9 seconds probably isn't too big of a deal. If you're doing real time transaction processing for a mainstream application... then no 9 seconds won't work. All that being said, given your data model I think the reality is this is about all you have to work with. It's going to be more efficient to process this stuff on the database server than to pull back a high level query and then sort through it in code, but subqueries are definitely expensive. hey dbo. What is a good response time? when you talk about doing the work on the server are you talking about using stored procedures? just one more quick question for you too, would it be quicker if i created another table based off my profile_data table so that one row would hold all of that user's attributes? for example firstname middlename lastname age foo bar key --------------- --------------------- ----------------- --------- -------- -------- -------- mike john jones 30 bar foo 001 then i wouldnt need subqueries. it would be select key where foo='bar' and age='30' and middlename='john'...... etc. what are your thoughts on making a seperate table for queries Quote Link to comment https://forums.phpfreaks.com/topic/87579-solved-the-problem-with-my-zip-codeslook-at-my-solution-is-it-a-go-or-no-go/#findComment-447994 Share on other sites More sharing options...
dbo Posted January 24, 2008 Share Posted January 24, 2008 When I said doing work on the server I really meant the database server... not necessarily using stored procedures so much as just running the query entirely on the database. The alternative is obviously to run some query that gets the data that you want... then having to write PHP code to further refine your search. It's just more efficient to let the DB server do the work if you can. Stored procedures don't buy you a ton... they really just save you the overhead of sending the query across the wire... since were dealing with a small amount of text it doesn't buy you much. Your proposed model is obviously easy to query, the risk you run however is if you want to add more fields to the table. In this model your table grows wider and wider and can get messy in a hurry. The current model allows for more flexibility in this department, but is harder to query. You've got to make the judgment call depending on what you're doing now, and what you might want to do in the future. You've got to be careful not to create redundancy in data, that's a big thing that databases try to avoid, so if you decide to change the model, I'd truly make it a change and not add it in addition to - UNLESS - you create a whole new database that runs a batch process and nightly updates a reporting database. By doing this you can leave the application and DB as they are today, and move that data to an environment that is easier to write reports against, but if it has to be real time this probably won't work either. I wish I could give a better answer, hopefully I've at least given you some food for thought. Quote Link to comment https://forums.phpfreaks.com/topic/87579-solved-the-problem-with-my-zip-codeslook-at-my-solution-is-it-a-go-or-no-go/#findComment-448083 Share on other sites More sharing options...
arthaeus Posted January 25, 2008 Author Share Posted January 25, 2008 When I said doing work on the server I really meant the database server... not necessarily using stored procedures so much as just running the query entirely on the database. The alternative is obviously to run some query that gets the data that you want... then having to write PHP code to further refine your search. It's just more efficient to let the DB server do the work if you can. Stored procedures don't buy you a ton... they really just save you the overhead of sending the query across the wire... since were dealing with a small amount of text it doesn't buy you much. Your proposed model is obviously easy to query, the risk you run however is if you want to add more fields to the table. In this model your table grows wider and wider and can get messy in a hurry. The current model allows for more flexibility in this department, but is harder to query. You've got to make the judgment call depending on what you're doing now, and what you might want to do in the future. You've got to be careful not to create redundancy in data, that's a big thing that databases try to avoid, so if you decide to change the model, I'd truly make it a change and not add it in addition to - UNLESS - you create a whole new database that runs a batch process and nightly updates a reporting database. By doing this you can leave the application and DB as they are today, and move that data to an environment that is easier to write reports against, but if it has to be real time this probably won't work either. I wish I could give a better answer, hopefully I've at least given you some food for thought. exactly the kind of answer I was looking for. many good points that i will certainly look into. Thank You Quote Link to comment https://forums.phpfreaks.com/topic/87579-solved-the-problem-with-my-zip-codeslook-at-my-solution-is-it-a-go-or-no-go/#findComment-448681 Share on other sites More sharing options...
dbo Posted January 25, 2008 Share Posted January 25, 2008 You're welcome. Let me know where you end up. I'm interested to see your solution. Quote Link to comment https://forums.phpfreaks.com/topic/87579-solved-the-problem-with-my-zip-codeslook-at-my-solution-is-it-a-go-or-no-go/#findComment-448939 Share on other sites More sharing options...
arthaeus Posted January 27, 2008 Author Share Posted January 27, 2008 You're welcome. Let me know where you end up. I'm interested to see your solution. Hey dbo what I ended up doing was creating a different table with ONLY the searchable fields included. I figured that since I will ONLY EVER use 9 fields to search, I wont need to make this table any wider. I will still be able to create additional fields for the profiles, but these will be displayed on the profile as tags. These tags are searchable via my tags algorithm, which is completely independent of my new table. This new table has DRASTICALLY reduced my query time from 9 seconds to .10 seconds when I search all 9 fields on a 100,000 member dummy db. OK, now can I get some input on these changes, and I have a couple questions about this approach. First, is .1 seconds acceptable for a query on a live site? If not, what range should I aim to get it within. (now that i think about it, i havent added an index to that table, so that should speed it up even more) Second, knowing that I will not ever expand my search table, was adding this table a smart thing to do to speed up my app? Thanks!! Quote Link to comment https://forums.phpfreaks.com/topic/87579-solved-the-problem-with-my-zip-codeslook-at-my-solution-is-it-a-go-or-no-go/#findComment-450703 Share on other sites More sharing options...
dbo Posted January 27, 2008 Share Posted January 27, 2008 #1 - This is definitely acceptable. #2 - It's fine to have taken this approach if you eliminated the other table. If you've added this table in addition and have redundant data then not so much... because you've now introduced a dependency that these two tables be in sync or you could have problems. Quote Link to comment https://forums.phpfreaks.com/topic/87579-solved-the-problem-with-my-zip-codeslook-at-my-solution-is-it-a-go-or-no-go/#findComment-450710 Share on other sites More sharing options...
arthaeus Posted January 28, 2008 Author Share Posted January 28, 2008 #1 - This is definitely acceptable. #2 - It's fine to have taken this approach if you eliminated the other table. If you've added this table in addition and have redundant data then not so much... because you've now introduced a dependency that these two tables be in sync or you could have problems. OK, now I can understand what you were talking about with the redundancy. If i change one of the tables, the other must change as well, introducing more work. How about this: I ONLY insert the search data into the search table, then extended profile data goes in the profile_data table. I believe this would remove the dependency and redundancy, while allowing for flexible profiles and quick search. You have really helped. Thank you! Quote Link to comment https://forums.phpfreaks.com/topic/87579-solved-the-problem-with-my-zip-codeslook-at-my-solution-is-it-a-go-or-no-go/#findComment-450853 Share on other sites More sharing options...
Recommended Posts
Join the conversation
You can post now and register later. If you have an account, sign in now to post with your account.