During a discussion about OLAP cubes for Business Intelligence data visualization and analysis, I was asked if regression lines could be added to cubes as calculations within Analysis Services. I had previously browsed the MSDN pages for the MDX functions LinRegPoint and LinRegSlope, but since I had never used them I replied “You should be able to create a regression line as an SSAS calculation, but I’ve never actually tried to do so myself.” I don’t like qualifying a statement with “should be able to,” so I did some research which has now evolved into this article.
It should be noted that Excel has a regression line function available in the ribbon that works with pivot tables, but those calculations are happening after the data has reached Excel. Also, the regression line function in Excel works best with traditional pivot tables and can behave clumsily when used with slicers and filters for an OLAP data connection. PerformancePoint dashboards for SharePoint do not have a function for adding regression lines to charts, so a regression calculation can definitely add value for PPS dashboards.
In researching the topic of regression calculations using MDX in OLAP cubes, several blog posts and discussion boards pointed to an older article by Mosha Pasumansky. The article is very thorough, and can be found at this link. I referenced it frequently to help determine the methodology below.
First, I built a simple OLAP cube in Analysis Services using an old AdventureWorks data set. Notice that I included a Measure on the Date table named [Measures].[Date Key]. The measure is based upon an integer key for year-months (such that July 2013 is “201307″) with an Aggregate Function of “Max.” [Measures].[Date Key] will be used in a subsequent calculation that will quantify the chronological order of months:
The regression line will be on the most recent 12 Year-Months with data (January-2007, February 2007, etc.), and will be calculated for [Measures].[Sales Amount] (a core measure that sums up total $). After creating a static set for [Rolling 12 Months], [Measures].[Month Key] is used within a RANK function. The function will calculate the first month in the set to be “1,” the second to be “2,” etc. in order to provide a standard increment for the regression to use as the x-axis:
CREATE MEMBER CURRENTCUBE.[Measures].[Rank 12 Months] AS
RANK([Dim Date].[Month Year].CurrentMember
,ORDER([Rolling 12 Months]
VISIBLE = 0;
Next, the MDX function for LineRegPoint can be used to determine the y-axis value for each month, evaluated as a regression for the most recent 12 months:
CREATE MEMBER CURRENTCUBE.[Measures].[12 Month Trend] AS
LinRegPoint([Measures].[Rank 12 Months]
,[Rolling 12 Months]
,[Measures].[Rank 12 Months]
FORMAT_STRING = “Currency”, VISIBLE = 1 , ASSOCIATED_MEASURE_GROUP = ‘Fact Internet Sales’;
Once [Measures].[12 Month Trend] has been deployed to the cube, it is available to pull in as a regression line for [Measures].[Sales Amount] with [Rolling 12 Months] on the Axis. Any dimension attribute connected to the ‘Fact Internet Sales’ measure group can then be used on the series in Excel and Excel Services, PerformancePoint Dashboard line charts, or for any other visualization tool that creates line charts using an OLAP data source:
UPDATE: A video detailing the building of a regression line is now available from GNet Group at this link.