amedhussaini Posted April 21, 2008 Share Posted April 21, 2008 although i'm in the beginning stage of web development with php/mysql, i'm always looking ahead trying to forecast potential problems ahead of time. my question pertains to social networking sites and database design. its my understanding that to replicate a circle of friends, etc, it only requires two tables. if these tables have hundreds of thousands of rows, how do these sites spit out queries so fast? from what i know, an index speeds up query returns, but every time a table changes, an index must be created again from the new data table. my question then: 1. 100,000 rows of data in a table. indexing is on. we insert a new row of data.. now there are 100,001 rows of data. will the re-index take a lot of cpu cycles or time? or is it negligible since its only a small modification to the 100,000 rows of data already indexed? thanks in advance! cheers, amed Link to comment https://forums.phpfreaks.com/topic/102053-question-re-indexing-highly-dynamic-database-tables/ Share on other sites More sharing options...
ohdang888 Posted April 21, 2008 Share Posted April 21, 2008 they use unique ids when they set up the table... so row 1 has an id of 1, row 2 has #2, etc. So they just do "SELECT * FROM table WHERE id='$id'" and when they insert, they don't bother to do 'id' VALUES ('id'), its automatically done. Link to comment https://forums.phpfreaks.com/topic/102053-question-re-indexing-highly-dynamic-database-tables/#findComment-522340 Share on other sites More sharing options...
amedhussaini Posted April 21, 2008 Author Share Posted April 21, 2008 i think that goes without saying.. let me restate the problem again, just in case i stated it incorrectly.. Table 1 [id] [username] [blah] [blah] Table 1 is a basic "members" table. Table 2 - add a friend [id] [id -from table 1] [id of new friend, also from table1] each time you add a friend, an entry is put in table 2. to pull all of Sarah's friends, for example, we'd query table 2 for all entries that have her id. so.. my question was, if table 2 has hundreds of thousands of entries, perhaps even millions, how do these social networks spit out returns so quickly? cacheing? indexing? if it is indexing, by indexing the second column in table 2, each time a person adds a friend, a new row is inserted into table 2 and it has to be indexed again.. is this re-index insanely cpu intensive? or marginal? I hope that is a bit more clear.. I apologize if it just mucks things up lol. Link to comment https://forums.phpfreaks.com/topic/102053-question-re-indexing-highly-dynamic-database-tables/#findComment-522375 Share on other sites More sharing options...
jaymc Posted April 21, 2008 Share Posted April 21, 2008 yeh I wonder how too Its marginal though mate, I had a table with 7.5 million rows in that was the most frequently accessed and updated table Still, queries including inserts updates and selects where instant The genious of the database I guess It does get a bit dirty as you go on though so always better to split mysql load or cluster I guess thats what the big dogs do Does anyone else have any input on this Link to comment https://forums.phpfreaks.com/topic/102053-question-re-indexing-highly-dynamic-database-tables/#findComment-522654 Share on other sites More sharing options...
Gamic Posted April 21, 2008 Share Posted April 21, 2008 They could (and probably do) decide to have the data in the friends table stored in such a way that all friends of person one are stored in the same block(s) on the hard disk (rather than spread around different blocks). This would reduce the number of read operations required to actually get the information from the hard disk to the memory. This might mean that multiple instances of the same friend relationship occur to help speed up these queries. id | user_id | friend_id | storage_block 1 | 1 | 2 | 1 2 | 1 | 3 | 1 3 | 2 | 1 | 2 4 | 3 | 1 | 3 5 | 1 | 4 | 1 6 | 4 | 1 | 4 Then all the index would need to store is that for user_id=1 the storage_block=1 Which means that we could get all friends of user_id = 1 in one block read (rather than 4 block reads). Of course, this is a simplified example . When there are millions of rows you can see how this approach would help save a lot of time. Link to comment https://forums.phpfreaks.com/topic/102053-question-re-indexing-highly-dynamic-database-tables/#findComment-522685 Share on other sites More sharing options...
jaymc Posted April 21, 2008 Share Posted April 21, 2008 How do you think tables that have 200 million rows are dealt with Link to comment https://forums.phpfreaks.com/topic/102053-question-re-indexing-highly-dynamic-database-tables/#findComment-522710 Share on other sites More sharing options...
Recommended Posts
Archived
This topic is now archived and is closed to further replies.