Advanced Functions / Loop Regression
LinRegXY("X-Formula(CTR)", "Y-Formula(CTR)", iterations[, start, increment])
Full Description

Regression functions calculate the "best fit" straight line through a set of two variables. The statistics produced by the regression help determine whether there is an underlying meaningful relationship, or forecasting ability, between the independent variable (x) and the dependent variable (y). To use regression, first call one of the LinReg functions, then use factors to access the statistics. Multiple regressions are allowed in your formulas and ranking systems, provided they are placed precisely: each statistic refers to the nearest regression to the left or above. Please see the Examples section, where you will find some code to get started as well as a link to a spreadsheet with all the calculations.

Note: Regression functions are only available for Ultimate Subscriptions.

Functions that evaluate a regression:

Do one of these first.

LinReg("Formula(CTR)", iterations[, start, increment])
LinRegXY("X-Formula(CTR)", "Y-Formula(CTR)", iterations[, start, increment])
LinRegVals(y0, y1, …, y50)
LinRegXYVals(x0, y0, …, x50, y50)

The functions above return true if the regression is completed, and false otherwise. The first two use a loop formula to generate the samples; for the next two, you supply the values directly. The "XY" regressions are general regressions where the X is explicitly supplied. The regressions that only let you specify the Y parameter are time series regressions where the X represents time.

With "formula(CTR)" functions, the special CTR variable must be used inside "formula". By default, CTR starts at 0 and is incremented by 1 for each iteration. (This can be overridden by specifying start and increment respectively.) The formula is executed the number of times specified by the "iterations" parameter.
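The loop behaviour described above can be illustrated with a minimal Python sketch. This is not the platform's implementation; the helper names `ctr_values` and `collect_samples` and the `sales` data are hypothetical and exist only to show which CTR values a given combination of iterations, start and increment would produce.

```python
def ctr_values(iterations, start=0, increment=1):
    """Return the CTR values a loop formula would be evaluated with."""
    return [start + i * increment for i in range(iterations)]


def collect_samples(formula, iterations, start=0, increment=1):
    """Evaluate formula(ctr) once per iteration and collect the results."""
    return [formula(ctr) for ctr in ctr_values(iterations, start, increment)]


# Hypothetical data: index 0 is the most recent period.
sales = [120.0, 110.0, 105.0, 95.0, 90.0, 80.0]

print(ctr_values(10))        # defaults: start=0, increment=1
print(ctr_values(10, 0, 2))  # every second period
print(collect_samples(lambda ctr: sales[ctr], 3, 0, 2))
```

With start 0 and increment 2, ten iterations cover CTR values 0 through 18, which is how a loop can sample every second period.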

Main Regression Statistics

The factors and functions below access the statistics for the nearest regression placed to the left or above in the formulas.

R2 Coefficient of determination. Ranges from 0 to 1. It is an indicator of how well the model explains the movement in the data. For instance, an R² of 0.8 means that the regression model explains 80% of the variability in the data.

R Correlation coefficient between the observed and predicted values. It ranges from -1 to +1. R-values of -1 and +1 indicate, respectively, a perfect negative and a perfect positive relationship between the independent and dependent variables. An R-value of 0 indicates there is no linear relationship between these variables.

Slope Returns the slope of the regression.

RegGr%(period=1) Regression growth is a compounded growth rate based on the regression line. The period can be used to normalize the growth rate to a different time frame than the samples. For example, to annualize quarterly values set the period to 4. This factor is only available for time series regression. The formula used is: 100*(Power((Est(0)-Est(N-1))/Abs(Est(N-1))+1,period/N)-1)
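As a rough illustration, the RegGr% formula above can be transcribed into Python. The function name `reg_growth_pct` and its arguments are hypothetical: `est_newest` stands for Est(0), `est_oldest` for Est(N-1), and `n_samples` for N. Returning None when the base of the power is not positive is an assumption of this sketch; the source does not say how that case is handled.

```python
def reg_growth_pct(est_newest, est_oldest, n_samples, period=1):
    """Transcription of 100*(Power((Est(0)-Est(N-1))/Abs(Est(N-1))+1, period/N)-1)."""
    if est_oldest == 0:
        return None  # assumption: growth from a zero base is undefined
    base = (est_newest - est_oldest) / abs(est_oldest) + 1
    if base <= 0:
        return None  # assumption: a fractional power of a non-positive base is undefined
    return 100 * (base ** (period / n_samples) - 1)


# Hypothetical estimates: the regression line doubles across 10 samples.
print(reg_growth_pct(200.0, 100.0, 10))             # growth per sample
print(reg_growth_pct(200.0, 100.0, 10, period=10))  # growth over all 10 samples
```

Setting period equal to the number of samples per year is what normalizes the per-sample rate to an annual rate, as in the quarterly example above.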

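SurpriseY(offset) Percentage difference between the actual Y and the estimated Y. It is positive when the actual value is above the estimate (see the Time Series Regression example).
EstimateY(offset) Estimated Y
For LinReg() regressions, offset corresponds to CTR in the formula: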

  • Offset for most recent = 0
  • Offset for future = -1
  • Offset for oldest = (Samples - 1)

For LinRegVals():

  • Rightmost value offset = (Samples - 1)
  • Leftmost value offset = 0
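The main statistics can be sketched in Python with an ordinary least squares fit. Several things here are assumptions and not taken from the source: the helper names, the sample data, placing offset k at x = -k so that a positive slope means growth over time, computing R as the Pearson correlation of x and y, and defining the surprise as 100 * (actual - estimate) / |estimate|.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    if n < 2:
        return None
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        return None  # no variation in x or y, so R is undefined
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / (sxx * syy) ** 0.5
    return {"slope": slope, "intercept": intercept, "r": r, "r2": r * r}


# Hypothetical samples indexed by offset: offset 0 is the most recent value.
y_by_offset = [13.0, 10.0, 8.0, 5.0]
# Assumption: time runs forward, so offset k is placed at x = -k.
xs = [-k for k in range(len(y_by_offset))]
fit = fit_line(xs, y_by_offset)


def estimate_y(offset):
    """Point on the fitted line for an offset (-1 is one step into the future)."""
    return fit["intercept"] + fit["slope"] * (-offset)


def surprise_y(offset):
    """Assumed definition: percent gap between actual and estimated Y."""
    if not 0 <= offset < len(y_by_offset):
        raise ValueError("no actual value exists for this offset")
    est = estimate_y(offset)
    return 100 * (y_by_offset[offset] - est) / abs(est)


print(fit["slope"], fit["r2"])
print(estimate_y(0), estimate_y(-1))
print(surprise_y(0))
```

In this sketch the latest value sits slightly above the fitted line, so the surprise at offset 0 is positive.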

Other Regression Statistics

See link to spreadsheet in the Further Reading section for examples.

SlopeSE The Standard Error of the regression slope.

SlopeTStat T-statistic for the regression slope

SlopePVal P-value for the regression slope using SlopeTStat

SlopeConf Confidence in the regression slope, calculated as (1 - SlopePVal) * 100

SE Standard Error, also called Sy.x

Intercept The Y-intercept

InterceptSE The Y-intercept Standard Error

Samples Number of observations in the regression
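For readers who want to check these statistics by hand, the sketch below uses the standard textbook formulas for simple linear regression. It is an assumption that the platform follows the same conventions (for example, n - 2 degrees of freedom); the function names and data are hypothetical. The p-value is taken as an input because computing it requires a t-distribution, which the Python standard library does not provide.

```python
import math


def regression_errors(xs, ys):
    """Textbook standard errors for a simple linear regression."""
    n = len(xs)
    if n < 3:
        return None  # n - 2 degrees of freedom must be positive
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        return None  # all x values identical, slope is undefined
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    intercept = mean_y - slope * mean_x
    # Sum of squared residuals around the fitted line.
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    se = math.sqrt(sse / (n - 2))  # standard error of the estimate (Sy.x)
    slope_se = se / math.sqrt(sxx)
    intercept_se = se * math.sqrt(1 / n + mean_x ** 2 / sxx)
    return {
        "samples": n,
        "intercept": intercept,
        "se": se,
        "slope_se": slope_se,
        "slope_tstat": slope / slope_se if slope_se > 0 else None,
        "intercept_se": intercept_se,
    }


def slope_conf(slope_pval):
    """SlopeConf as defined above: (1 - SlopePVal) * 100."""
    return (1 - slope_pval) * 100


# Hypothetical samples in chronological order.
stats = regression_errors([0, 1, 2, 3], [5.0, 8.0, 10.0, 13.0])
print(stats)
print(slope_conf(0.03))
```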

Examples

Time Series Regression

To find stocks where the 10Y sales regression has 1) a positive slope, 2) a good R² of at least 0.8, and 3) latest sales above the trend, you could type the following:

LinReg("Sales(CTR, ANN)", 10) = TRUE
R2 > 0.8 and Slope > 0 and SurpriseY(0) > 0

XY Regression

To find companies where the latest EPS is above the expected EPS for a given revenue, you could do the following:

LinRegXY("Sales(CTR, ANN)", "EPSExclXor(CTR, ANN)", 10)
SurpriseY(0) > 0

Ranking System Example

In a ranking formula, you can use this one-liner to rank using the slope of the past 60 prices. Eval is used as follows: if the regression is successful, the Slope is used for ranking; otherwise NA is returned.

Eval(LinReg("Close(CTR)", 60), Slope, NA)

Further Reading

Click here to open a spreadsheet example of all formulas

Original Release Announcement

Regression Formulas

Line Item Smoothed Factors

For every fundamental line item you can find predefined Annual and Trailing Twelve Months smoothed factors for a) the most recent estimate and b) the annualized regression growth. We predefine them for two periods:

  • 5Y of TTM values sampled twice a year for a total of 10 samples
  • 10Y of ANN values for a total of 10 samples

You will find these smoothed factors in the reference of each line item.

For example, these are the smoothed factors for "Sales":

Factor          Equivalent to
SalesRegEstTTM  Eval(LinReg("Sales(CTR, TTM)", 10, 0, 2), EstimateY(0), NA)
SalesRegGr%TTM  Eval(LinReg("Sales(CTR, TTM)", 10, 0, 2), RegGr%(2), NA)
SalesRegEstANN  Eval(LinReg("Sales(CTR, ANN)", 10), EstimateY(0), NA)
SalesRegGr%ANN  Eval(LinReg("Sales(CTR, ANN)", 10), RegGr%(1), NA)