Regression functions calculate the "best fit" straight line through paired observations of two variables. The statistics produced by the regression help determine whether there is a meaningful underlying relationship, or forecasting ability, between the independent variable (x) and the dependent variable (y). To use regression, first call one of the LinReg functions; you can then use factors to access its statistics. Multiple regressions are allowed in your formulas and ranking systems, provided each one is placed precisely. Please see the examples section, where you will find some code to get started as well as a link to a spreadsheet with all the calculations.
Note: Regression functions are only available for Ultimate Subscriptions.
Run one of these regression functions first:
LinReg("Formula(CTR)", iterations[, start, increment])
LinRegXY("X-Formula(CTR)", "Y-Formula(CTR)", iterations[, start, increment])
LinRegVals(y0, y1, …, y50)
LinRegXYVals(x0, y0, …, x50, y50)
The functions above return true if the regression completes, and false otherwise. The first two use a loop formula to generate the samples. For the last two, you supply the values directly. The "XY" regressions are general regressions where X is explicitly supplied. The regressions that only let you specify the Y parameter are time series regressions where X represents time.
With the "Formula(CTR)" functions, the special CTR variable must be used inside the formula. By default, CTR starts at 0 and is incremented by 1 for each iteration. (This can be overridden by specifying start and increment respectively.) The formula is executed once per iteration, for the number of times given by the "iterations" parameter.
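The loop described above can be sketched in Python. This is only an illustration of how CTR advances, not the platform's implementation: the helper name `ctr_samples`, the sample data, and the convention that index 0 is the most recent period are all assumptions made for the example.

```python
def ctr_samples(formula, iterations, start=0, increment=1):
    """Evaluate formula(CTR) once per iteration and collect (CTR, value) pairs."""
    samples = []
    for i in range(iterations):
        ctr = start + i * increment  # CTR for this iteration
        samples.append((ctr, formula(ctr)))
    return samples


# Hypothetical data standing in for Sales(CTR, ANN); index 0 is assumed
# to be the most recent year.
annual_sales = [130.0, 121.0, 110.0, 100.0]

# Defaults: CTR takes the values 0, 1, 2, 3.
default_loop = ctr_samples(lambda ctr: annual_sales[ctr], 4)

# start=0, increment=2: CTR takes the values 0, 2, 4, ..., 18.
stepped_loop = ctr_samples(lambda ctr: ctr, 10, 0, 2)
```

With the default start and increment, four iterations visit CTR values 0 through 3. With an increment of 2, ten iterations visit every second offset up to 18.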
The factors and functions below access the statistics for the nearest regression placed to the left or above in the formulas.
R2 Coefficient of determination. Ranges from 0 to 1. It is an indicator of how well the model explains the movement in the data. For instance, an R² of 0.8 means that the regression model explains 80% of the variability in the data.
R Correlation coefficient between the observed and predicted values. It ranges from -1 to +1. R-values of -1 and +1 indicate, respectively, a perfect negative and a perfect positive linear relationship between the independent and dependent variables. An R-value of 0 indicates there is no linear relationship between these variables.
Slope Returns the slope of the regression.
RegGr%(period=1) Regression growth is a compounded growth rate based on the regression line. The period can be used to normalize the growth rate to a different time frame than the samples. For example, to annualize quarterly values set the period to 4. This factor is only available for time series regression. The formula used is: 100*(Power((Est(0)-Est(N-1))/Abs(Est(N-1))+1,period/N)-1)
SurpriseY(offset) % difference of the estimated Y vs actual
EstimateY(offset) Estimated Y
For LinReg() regressions, offset corresponds to CTR in the formula
For LinRegVals()
See link to spreadsheet in the Further Reading section for examples.
SlopeSE The Standard Error of the regression slope.
SlopeTStat T-statistic for the regression slope
SlopePVal P-value for the regression slope using SlopeTStat
SlopeConf Confidence in the regression slope, calculated as (1 - SlopePVal) * 100
SE Standard Error, also called Sy.x
Intercept The Y-intercept
InterceptSE The Y-intercept Standard Error
Samples Number of observations in the regression
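For readers who want to check the statistics by hand, here is a minimal Python sketch using the textbook ordinary least squares formulas. It assumes the factors above follow those standard definitions (for example, SE as Sy.x with N-2 degrees of freedom, and R as the Pearson correlation of x and y); the function name `ols_stats` and the sample data are hypothetical. SlopePVal and SlopeConf are left out because they need a Student's t distribution, which the Python standard library does not provide.

```python
import math


def ols_stats(xs, ys):
    """Simple linear regression of ys on xs using textbook OLS formulas."""
    n = len(xs)
    if n != len(ys) or n < 3:
        raise ValueError("need at least 3 paired samples")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0:
        raise ValueError("all x values are identical")

    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Sum of squared residuals around the fitted line.
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    r = sxy / math.sqrt(sxx * syy) if syy > 0 else float("nan")
    se = math.sqrt(sse / (n - 2))  # standard error of the estimate (Sy.x)
    slope_se = se / math.sqrt(sxx)
    intercept_se = se * math.sqrt(1 / n + mean_x ** 2 / sxx)
    return {
        "Samples": n,
        "Slope": slope,
        "Intercept": intercept,
        "R": r,
        "R2": r * r,
        "SE": se,
        "SlopeSE": slope_se,
        "InterceptSE": intercept_se,
        "SlopeTStat": slope / slope_se if slope_se > 0 else math.inf,
    }


# Hypothetical sample data.
stats = ols_stats([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
```

The spreadsheet linked in the Further Reading section remains the authoritative reference for the exact calculations.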
Time Series Regression
To find stocks where the 10Y sales regression has: 1) a positive slope 2) a good R² of at least 0.8 and 3) latest sales above the trend, you could type the following
LinReg("Sales(CTR, ANN)", 10) = TRUE
R2 > 0.8 and Slope > 0 and SurpriseY(0) > 0
XY Regression
To find companies where the latest EPS is above the expected EPS for a given revenue, you could do the following:
LinRegXY("Sales(CTR, ANN)", "EPSExclXor(CTR, ANN)", 10)
SurpriseY(0) > 0
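The idea behind this screen can be illustrated outside the platform with a short Python sketch: fit EPS against sales, estimate EPS at the latest sales figure, and check whether the actual value sits above that estimate. The helper names and the data are hypothetical, and the sketch checks only the sign of the surprise, since the exact percentage formula behind SurpriseY is not given here.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least squares line of ys on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def latest_above_estimate(xs, ys):
    """True when the actual y at offset 0 exceeds the regression estimate."""
    slope, intercept = fit_line(xs, ys)
    estimated = intercept + slope * xs[0]
    return ys[0] > estimated


# Hypothetical annual figures; index 0 is assumed to be the latest year.
sales = [500.0, 450.0, 400.0, 350.0, 300.0]
eps_strong = [6.0, 4.4, 4.0, 3.6, 3.0]  # latest EPS above the fitted line
eps_weak = [4.6, 4.4, 4.0, 3.6, 3.0]    # latest EPS below the fitted line

passes = latest_above_estimate(sales, eps_strong)
fails = latest_above_estimate(sales, eps_weak)
```

In the first series the latest EPS of 6.0 is above the roughly 5.56 that the line predicts for sales of 500, so it would pass; in the second, 4.6 falls below the predicted 4.72.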
Ranking System Example
In a ranking formula, you can use this one-liner to rank by the slope of the past 60 prices. Eval works as follows: if the regression is successful, Slope is used for ranking; otherwise NA is used.
Eval(LinReg("Close(CTR)", 60), Slope, NA)
Click here to open a spreadsheet example of all formulas
For every fundamental line item you can find predefined Annual and Trailing Twelve Months smoothed factors for a) most recent estimate and b) the annualized regression growth. We predefine them for two periods:
You will find these smooth factors in the reference of each line item.
For example, these are the smooth factors for "Sales":
Factor | Equivalent to |
SalesRegEstTTM | Eval(LinReg("Sales(CTR, TTM)", 10, 0, 2), EstimateY(0), NA) |
SalesRegGr%TTM | Eval(LinReg("Sales(CTR, TTM)", 10, 0, 2), RegGr%(2), NA) |
SalesRegEstANN | Eval(LinReg("Sales(CTR, ANN)", 10), EstimateY(0), NA) |
SalesRegGr%ANN | Eval(LinReg("Sales(CTR, ANN)", 10), RegGr%(1), NA) |
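The growth factors in this table rely on the RegGr% formula given earlier. As a rough check, that formula can be transcribed into Python as shown below. The function name and the inputs are hypothetical, Est(0) and Est(N-1) are taken to be the regression estimates at the first and last offsets, and returning None for an undefined result is an assumption of this sketch rather than documented behavior.

```python
def reg_growth_pct(est_first, est_last, n, period=1):
    """Transcription of 100*(Power((Est(0)-Est(N-1))/Abs(Est(N-1))+1, period/N)-1).

    est_first is Est(0), est_last is Est(N-1), n is the number of samples.
    """
    if est_last == 0:
        return None  # assumption: growth is undefined for a zero base
    base = (est_first - est_last) / abs(est_last) + 1
    if base < 0:
        return None  # assumption: a fractional power of a negative base is undefined
    return 100 * (base ** (period / n) - 1)


# Hypothetical estimates: 121 at offset 0 and 100 at offset N-1, with N = 2.
per_sample_growth = reg_growth_pct(121.0, 100.0, 2)            # about 10
normalized_growth = reg_growth_pct(121.0, 100.0, 2, period=2)  # about 21
```

Raising period scales the exponent, which is how RegGr%(2) in the TTM row converts samples taken every two quarters into an annual rate.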