Kieran Menor Posted March 3, 2011 Share Posted March 3, 2011 I have a table with two columns: postal code and age (in years). I would like to have a query that lists the amount of rows for every age for every postal code. As in, something like this: Postal CodeAgeCount 10007050 100071100 10007230 20007025 2000710 200072200 ... and so forth. At first, it seemed easy enough, but then I realized that if no rows with a certain age exists for a certain postal code, it would not show in the result set. In the example above, the row with postal code 2000 and age 71 would be missing. Such gaps can be quite annoying when you work with the data later on. So, my question is, how do I fill the gaps? Quote Link to comment https://forums.phpfreaks.com/topic/229469-filling-the-gaps/ Share on other sites More sharing options...
Muddy_Funster Posted March 3, 2011 Share Posted March 3, 2011 could you show your actual current query? Quote Link to comment https://forums.phpfreaks.com/topic/229469-filling-the-gaps/#findComment-1182284 Share on other sites More sharing options...
Kieran Menor Posted March 3, 2011 Author Share Posted March 3, 2011 I don't think it has any relevance, but here it is: SELECT `zipcode` , ( YEAR( NOW( ) ) - CAST( SUBSTRING( `birthday` , 1, 4 ) AS SIGNED ) ) AS `age` , COUNT( * ) AS `count` FROM `p_member` GROUP BY `zipcode` , `age` ORDER BY `zipcode` , `age` It returns a result similar to the example in my first post, but as mentioned (and expected), it doesn't list rows where the count is 0. As I said, what I want is a list of all ages for each postal code, and the amount of rows that match, even if it is 0. Quote Link to comment https://forums.phpfreaks.com/topic/229469-filling-the-gaps/#findComment-1182286 Share on other sites More sharing options...
kickstart Posted March 3, 2011 Share Posted March 3, 2011 Hi Think to do this you need a list of all the zip codes and all the ages. Which can be done by selecting distinct zip codes and ages from the existing table and doing a cross join between them Something like this would work SELECT a.zipcode, b.age, COUNT(c.id) FROM (SELECT DISTINCT zipcode FROM p_member) a CROSS JOIN (SELECT DISTINCT (YEAR( NOW( ) ) - CAST( SUBSTRING( `birthday` , 1, 4 ) AS SIGNED )) AS `age` FROM p_member) b LEFT OUTER JOIN p_member c ON a.zipcode = c.zipcode AND b.age = (YEAR( NOW( ) ) - CAST( SUBSTRING( c.birthday , 1, 4 ) AS SIGNED )) GROUP BY a.zipcode, b.age ORDER BY a.zipcode, b.age This is assuming you have a unique id field on the table p_member (change the field name in COUNT(c.id) to the actual unique field name). All the best Keith Quote Link to comment https://forums.phpfreaks.com/topic/229469-filling-the-gaps/#findComment-1182296 Share on other sites More sharing options...
Kieran Menor Posted March 3, 2011 Author Share Posted March 3, 2011 Awesome, it seems to work. Thanks a lot. Although, it takes about a minute to execute. This doesn't matter much now as I only need to pull out the data every once in a while. The table only has about 6000 rows now, though, so I imagine it would get much worse later. Any ideas for optimization would be much appreciated. Quote Link to comment https://forums.phpfreaks.com/topic/229469-filling-the-gaps/#findComment-1182323 Share on other sites More sharing options...
kickstart Posted March 3, 2011 Share Posted March 3, 2011 Hi First thing I would do is to extract the zip code into a separate table, so you have a table of unique zip codes with p_member just storing the id field of the zip code. I would also put the date of birth into a date field so you can use datediff to get the age. Possible to also just generate a list of years for the 2nd subselect rather than trying the extract the range of years (this would also remove the situation where you will get a gap at the moment if nobody on your list was born one year). If you let me have your table definition I will have a play around. All the best Keith Quote Link to comment https://forums.phpfreaks.com/topic/229469-filling-the-gaps/#findComment-1182333 Share on other sites More sharing options...
Kieran Menor Posted March 3, 2011 Author Share Posted March 3, 2011 I imagine that you don't need all the irrelevant data fields. I didn't design this table myself, and now that I look at it, "birthday" is actually a date field. So, okay, the table looks something like this: CREATE TABLE `members` ( `id` int(12) NOT NULL AUTO_INCREMENT, `zipcode` varchar(55) NOT NULL DEFAULT '', `birthday` date NOT NULL DEFAULT '0000-00-00', PRIMARY KEY (`id`) ) ENGINE=MyISAM ; "zipcode" is always a 4 digit number, so you can definitely use another data type to increase efficiency. Also, I see that the table currently has no indices (other than PRIMARY). Quote Link to comment https://forums.phpfreaks.com/topic/229469-filling-the-gaps/#findComment-1182371 Share on other sites More sharing options...
kickstart Posted March 3, 2011 Share Posted March 3, 2011 Hi First set up a table of zip codes CREATE TABLE IF NOT EXISTS `zipcodes` ( `id` int(11) NOT NULL AUTO_INCREMENT, `zipcode` varchar(25) NOT NULL, PRIMARY KEY (`id`) ) ENGINE=MyISAM DEFAULT CHARSET=latin1 AUTO_INCREMENT=11 ; Populate that with all the zip codes. Add a column to the members table called zipcodeid. Populate this with the id from the zip code table for the members zip code. Put an index on this column. Create a table of years. CREATE TABLE IF NOT EXISTS `years` ( `id` int(11) NOT NULL AUTO_INCREMENT, `year` int(11) NOT NULL, PRIMARY KEY (`id`), UNIQUE KEY `year` (`year`) ) ENGINE=MyISAM DEFAULT CHARSET=latin1 AUTO_INCREMENT=21 ; Populate that with the years you care about. Then have your SQL as SELECT a.zipcode, YEAR(now()) - b.year, COUNT( c.id ) FROM zipcodes a CROSS JOIN years b LEFT OUTER JOIN members c ON a.id = c.zipcodeid AND b.year = YEAR( c.birthday ) GROUP BY a.zipcode, YEAR(now()) - b.year ORDER BY a.zipcode, YEAR(now()) - b.year Note this is a bit of a cheat as it is basing age on just the year. All the best Keith Quote Link to comment https://forums.phpfreaks.com/topic/229469-filling-the-gaps/#findComment-1182390 Share on other sites More sharing options...
Recommended Posts
Join the conversation
You can post now and register later. If you have an account, sign in now to post with your account.