illcrx Posted September 24, 2014 Share Posted September 24, 2014 (edited) I am writing a stock program and I am going to be getting several averages on volume over different periods, 10, 21, 50,100,120,150,200,240 are my averages. I would like to make this fairly efficient, seeing as how its going to be a ton of data, I have over 7700 stocks in the database and they will each have all of those averages. My question is how to do this that is the most efficient, I thought about taking the 10 day then adding 11 to it, then adding 29 to get the 50 day and so on and so forth, or is that too much work and I should just do each average individually? select avg(volume) from _AAPL where date between ‘2014-02-03’ and ‘2014-05-05’; The above is my current very generic code that does work, now I just need to make a php script, which I can do, I am just curious about how different people would approach this. Thanks for the help. Edited September 24, 2014 by illcrx Quote Link to comment Share on other sites More sharing options...
requinix Posted September 24, 2014 Share Posted September 24, 2014 (edited) Off the top of my head, SELECT AVG(IF(/* date in range 10 */, volume, NULL)) AS range10, AVG(IF(/* date in range 21 */, volume, NULL)) AS range21, AVG(IF(/* date in range 50 */, volume, NULL)) AS range50, (etc) FROM _AAPL WHERE /* date within the widest range you're counting */Because AVG() will completely ignore NULL values. But... are you using a table for each stock quote? That's not good. Edited September 24, 2014 by requinix Quote Link to comment Share on other sites More sharing options...
illcrx Posted September 24, 2014 Author Share Posted September 24, 2014 (edited) I need to hold _AAPL data by date, every day so I have the dates as the columns, there are many many variables that are going to go in each row so thats what I came up with. I already have a script to go over every stock so that its not labor intensive getting all the data from all of the stocks. Am I missing something is there a better way to do that? Edited September 24, 2014 by illcrx Quote Link to comment Share on other sites More sharing options...
requinix Posted September 25, 2014 Share Posted September 25, 2014 I'm positive there is. Can you post the CREATE TABLE statement for the table? You can use a SHOW CREATE TABLE _AAPL to get it. Maybe with a couple rows of sample data to see how it's being used? It'd also help to know some of the kinds of queries you're running against it. Quote Link to comment Share on other sites More sharing options...
illcrx Posted September 26, 2014 Author Share Posted September 26, 2014 (edited) -> +-------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ -> | _AAPL | CREATE TABLE `_AAPL` ( -> `date` date DEFAULT NULL, -> `open` decimal(16,4) DEFAULT NULL, -> `high` decimal(16,4) DEFAULT NULL, -> `low` decimal(16,4) DEFAULT NULL, -> `close` decimal(16,4) DEFAULT NULL, -> `volume` double(15,0) DEFAULT NULL -> ) ENGINE=MyISAM DEFAULT CHARSET=latin1 | -> +-------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- I have 27 more rows to put in here, the dates are the columns, I will be querying them to do averages, comparing to those averages and using them to output signals. So I will be doing a lot with it. I am new to databases and its a fun experience, but I have a lot to learn! Obviously which is why there is no primary key or any other more definitive characteristics about the data. I am still in very early stage. | date | open | high | low | close | volume | +------------+----------+----------+----------+----------+-----------+ | 2014-01-01 | 80.1457 | 80.1457 | 80.1457 | 80.1457 | 0 | | 2014-01-02 | 79.3829 | 79.5757 | 78.8600 | 79.0186 | 8398800 | | 2014-01-03 | 78.9800 | 79.1000 | 77.2043 | 77.2829 | 14036100 | | 2014-01-06 | 76.7786 | 78.1143 | 76.2286 | 77.7043 | 14765500 | | 2014-01-07 | 77.7600 | 77.9943 | 76.8457 | 77.1486 | 11347500 | | 2014-01-08 | 76.9729 | 77.9371 | 76.9557 | 77.6371 | 9240900 | | 2014-01-09 | 78.1143 | 78.1229 | 76.4786 | 76.6457 | 9986400 | So next to volume I have ALTER TABLE stocks ADD (day tinyint(1), 10day int, //these are averages of prices over x days 21day int, 50day int, 100day int, 120day int, 150day int, 200day int, 240day int, 20day_vol int, 50day_vol int, pp_limit int, PP are volume signatures pp_day tinyint(1), volPercentAvg double (5,2), quiet_volume tinyint(1), signal derrived in php up_day tinyint(1), down_day tinyint(1), percent_close double (5,2), percent_in_day double (5,2), will tell where the close was in relation to the high/low in percentage support_day tinyint(1), derrived from above variable 8dayPP_limit int, 8dayPP_day tinyint(1), distribution_day tinyint(1), 50day_test tinyint(1), test of the 50ma 50day_break tinyint(1), continuationPP tinyint(1)); PP based on things. So as you can see I have a lot going on, I thought about this from the bottom up considering there are so many stocks and I need to do so much with each of them. So my architecture goes like this DB= Stocks_Daily Table=_AAPL or appropriate stock symbol column= dates rows = data for each date Every day all of this will be derrived and updated, I dont even have architecture yet for the real time, this is all historical. But it should be pretty similar I would guess. Thanks for checking it out! Edited September 26, 2014 by illcrx Quote Link to comment Share on other sites More sharing options...
Barand Posted September 28, 2014 Share Posted September 28, 2014 (edited) That's 7700 tables isn't it. What you should have is -> CREATE TABLE `stock_price` ( -> `stock` VARCHAR(6) DEFAULT NULL, -> `date` date DEFAULT NULL, -> `open` decimal(16,4) DEFAULT NULL, -> `high` decimal(16,4) DEFAULT NULL, -> `low` decimal(16,4) DEFAULT NULL, -> `close` decimal(16,4) DEFAULT NULL, -> `volume` double(15,0) DEFAULT NULL, -> Primary key(stock, date) -> ) ENGINE=MyISAM DEFAULT CHARSET=latin1 where the added stock column contains the "AAPL" etc. One of the rules when constructing a db is "no derived data" so all those averages, percentages etc are not required. You calculate those in your queries when required. edit: if you have a separate "stock" table with details of the quoted stock (exchange, industry type etc) then the stock column above should be the id of the "stock" record. Edited September 28, 2014 by Barand Quote Link to comment Share on other sites More sharing options...
illcrx Posted September 29, 2014 Author Share Posted September 29, 2014 That's 7700 tables isn't it. What you should have is -> CREATE TABLE `stock_price` ( -> `stock` VARCHAR(6) DEFAULT NULL, -> `date` date DEFAULT NULL, -> `open` decimal(16,4) DEFAULT NULL, -> `high` decimal(16,4) DEFAULT NULL, -> `low` decimal(16,4) DEFAULT NULL, -> `close` decimal(16,4) DEFAULT NULL, -> `volume` double(15,0) DEFAULT NULL, -> Primary key(stock, date) -> ) ENGINE=MyISAM DEFAULT CHARSET=latin1 where the added stock column contains the "AAPL" etc. One of the rules when constructing a db is "no derived data" so all those averages, percentages etc are not required. You calculate those in your queries when required. edit: if you have a separate "stock" table with details of the quoted stock (exchange, industry type etc) then the stock column above should be the id of the "stock" record. So under this solution instead of 7700 tables with thousands of columns, one for each date, I have one supermassive table? Wouldnt the symbols be duplicated for every single day? Is that more efficient or would that just take up a bunch of room duplicating data? Quote Link to comment Share on other sites More sharing options...
Barand Posted September 29, 2014 Share Posted September 29, 2014 With a "stock" table, only the id of the stock record would be duplicated. That's how relational databases work - the records are related by the keys and foreign keys. What is the highest priced stock? Which stock prices moved most in a particular time period? How would you query those with 7700 tables? Quote Link to comment Share on other sites More sharing options...
illcrx Posted September 29, 2014 Author Share Posted September 29, 2014 (edited) Right now I have the columns as the dates and would just query by the date and the table as the $symbol mysql_query ("update _$symbol set volPercentAVG=$volPercentAVG WHERE date=$todaysdate;"); mysql_query ("update _$symbol set continuationPP=$continuationPP WHERE date=$todaysdate;"); That is what I have now, the _$symbol is the table with $condinuationPP or whatever column as the place Im entering data. Edited September 29, 2014 by illcrx Quote Link to comment Share on other sites More sharing options...
Recommended Posts
Join the conversation
You can post now and register later. If you have an account, sign in now to post with your account.