I am afraid I am going to go full developer on this feature request, so please bear with me, as I believe this would be a powerful feature for Phocas.
Currently Phocas rebuilds all data from the streams it has been provided. Some of our databases hold data going back several years; those periods are closed, but the data is still required for trend analysis. This means that rather than rebuilding only the open-period (volatile) data, Phocas rebuilds all data equally. The end result is a long build time, even though some of the stored aggregations will never change. One approach is to split the database into several smaller ones. However, in the land of aggregated data, I consider the split should be done within the SAME database, not separate ones. The advantages are:
- You have a single reference point and can perform trending on a larger scale
- Allow the users to view all time aspects of the data within the same database.
- Reduce development time on dashboards. If we split the database into several, it means multiple dashboards for multiple time-frames and/or geographic splits (country by country).
MS SSAS has the concept of partitions, which allows the developer to mark sections of the data as closed so that they no longer require rebuilding. In basic terms, don’t build parts of the data that we know are never going to change. With this partitioning, lengthy builds would reduce to a fraction of their original time, as you would not need to constantly build all the data.
How would this work?
As the Phocas GUI is very user friendly, I suggest we leverage the streams in the designer to handle this. For example (using two feeds split by fiscal year):
You have data from 2010 to today and you know your closed periods are 2010 - 2017, so you would have two streams:
Stream 1 - Open Periods (2018 - today)
Stream 2 - Closed Periods (2010 - 2017)
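To make the split concrete, here is a rough sketch in Python of how rows could be routed into the two streams by fiscal year. This is purely illustrative: the `fiscal_year` and `amount` field names, the `LAST_CLOSED_YEAR` cut-off and the `split_by_period` helper are my own assumptions and are not part of Phocas.

```python
# Hypothetical cut-off: the last fiscal year that is considered closed.
LAST_CLOSED_YEAR = 2017


def split_by_period(rows, last_closed_year=LAST_CLOSED_YEAR):
    """Split rows into (open_rows, closed_rows) based on their fiscal year."""
    open_rows, closed_rows = [], []
    for row in rows:
        # Anything up to and including the cut-off year goes to the closed stream.
        if row["fiscal_year"] <= last_closed_year:
            closed_rows.append(row)
        else:
            open_rows.append(row)
    return open_rows, closed_rows


# Made-up sample rows, only to show the routing.
rows = [
    {"fiscal_year": 2010, "amount": 100.0},
    {"fiscal_year": 2017, "amount": 250.0},
    {"fiscal_year": 2018, "amount": 75.0},
]
open_rows, closed_rows = split_by_period(rows)
```

In practice the same split would more likely be done in the query feeding each stream (a filter on the fiscal year), but the idea is the same.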
On the tab name for the stream we would have an option to stop it from being rebuilt (sorry for the crude image!):
The above states that “PandL” will be built nightly, whereas “PandL2017” (which contains data up to 2017) does NOT get rebuilt: once it has first been loaded, it keeps its saved aggregations.
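As a minimal sketch of the behaviour I have in mind (not how Phocas actually builds; the `Stream` class, the `rebuild` flag, and the `build` and `aggregate` functions are all hypothetical names of mine), the nightly build would only re-aggregate streams that are flagged for rebuild or have never been built, and would reuse the saved aggregations for everything else:

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Stream:
    name: str                            # e.g. "PandL" or "PandL2017"
    rebuild: bool                        # hypothetical per-stream "rebuild" checkbox
    load_rows: Callable[[], List[dict]]  # stand-in for reading the stream's source data


def aggregate(rows: List[dict]) -> Dict[int, float]:
    """Toy aggregation: total amount per fiscal year."""
    totals: Dict[int, float] = {}
    for row in rows:
        year = row["fiscal_year"]
        totals[year] = totals.get(year, 0.0) + row["amount"]
    return totals


def build(streams: List[Stream], saved: Dict[str, Dict[int, float]]) -> List[str]:
    """Re-aggregate only flagged or never-built streams; return the names rebuilt."""
    rebuilt = []
    for stream in streams:
        if stream.rebuild or stream.name not in saved:
            saved[stream.name] = aggregate(stream.load_rows())
            rebuilt.append(stream.name)
    return rebuilt


streams = [
    Stream("PandL", True, lambda: [{"fiscal_year": 2018, "amount": 75.0}]),
    Stream("PandL2017", False, lambda: [{"fiscal_year": 2010, "amount": 100.0},
                                        {"fiscal_year": 2017, "amount": 250.0}]),
]
saved: Dict[str, Dict[int, float]] = {}
first_build = build(streams, saved)   # both streams: the closed one has no saved data yet
second_build = build(streams, saved)  # only the open stream is rebuilt
```

Here the first build loads both streams, while every later build touches only “PandL”. Setting `rebuild` to `True` on the closed stream for a single build would force it to be re-aggregated.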
Partitions in a cube (MS SSAS) are configured differently; however, if we were to use streams in this manner, the end user would be in control of what gets processed. If the closed data stream DOES need to change (for example, a new dimension is added), we would check the box, rebuild and un-check it afterwards. Again, this is in the control of the developer managing the cube.
I appreciate the above is a feature which would require effort to develop, but it is a worthy one. Please let me know what you think of this idea.