Ranking and market timing in combination for stock forecasting models

yupolv · October 17, 2015, 1:41am

I started this post in another thread but decided to open a separate topic for discussion.

So, let's discuss stock return forecasting models.

Stock movements can be decomposed into two pieces:

Market movements
Stock specific movements

First, let's start with the market part.
We have already looked at the approach to market timing models in Hull's paper. I think it is the most logical and powerful approach.
We take a universe of macro factors (let's say 20 different factors) and construct a linear function on it. Then we optimize its coefficients (the factor weights)
to get the maximum correlation between our linear objective function and the future 3M, 6M, 12M, or 2-year average market return (SPY, for example), based on rolling time windows (10 years, for example).
We update it every week. Say it shows a correlation of 0.5 to the 6M return and 0.7 to the 12M return. The shorter the forecasting period, the lower the correlation and forecasting power, but the higher the potential annual return (because we can buy and sell 4 times a year with a 3M model and perhaps make a bigger profit than with the more reliable 12M model, which trades only once a year). Based on that objective function and its current correlation you can choose how to use it in your trading in practice (I wrote about this earlier).
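
A minimal sketch of the measurement this paragraph relies on: the correlation between today's value of a signal and the return realized over the next N periods. It is not the optimization from Hull's paper; the function names and the toy numbers are made up for illustration, and the signal is assumed to be already computed.

```python
import math


def forward_returns(prices, horizon):
    """Return from t to t + horizon; None where the future price is not yet known."""
    last = len(prices) - horizon
    return [
        prices[t + horizon] / prices[t] - 1.0 if t < last else None
        for t in range(len(prices))
    ]


def pearson(xs, ys):
    """Plain Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


def signal_forward_correlation(signal, prices, horizon):
    """Correlate today's signal with the return realized over the next `horizon` steps."""
    fwd = forward_returns(prices, horizon)
    pairs = [(s, f) for s, f in zip(signal, fwd) if f is not None]
    return pearson([s for s, _ in pairs], [f for _, f in pairs])


# Toy data (made up): the signal is high before up-moves and low before down-moves.
prices = [100.0, 110.0, 99.0, 108.9]
signal = [1.0, -1.0, 1.0, 0.0]
corr_1_step = signal_forward_correlation(signal, prices, horizon=1)
```

Running the same function with several horizons (for example 13, 26 and 52 weekly steps) would give the 3M/6M/12M comparison discussed above. Note that the last `horizon` observations never enter the correlation, because their future return is not known yet.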

Second, stock-specific movements.
We use the same approach. We have a bigger stock factor universe (let's say 100 factors in P123) and 4,000 tradable stocks.
Then we construct the same kind of linear objective function from those factors and choose the forecasting periods (say 2 weeks, 1M, 3M, 6M).
We optimize it to get the maximum correlation to future stock returns, based not on 16 years (as is popular in P123) but on shorter periods, let's say a market cycle of 5-7 years. Once we have an objective function for each stock, we average its weights by sector or industry (taking the average weights within a market sector in order to avoid over-fitting).
With the averaged function, we screen our stock universe for the highest current correlation. Then we rank the screened stocks by objective function value: highest for long positions, lowest for short. Then we buy/short some portion of them (the top and bottom parts) based on our liquidity requirements. We exit a position when the stock drops out of the correlation screen and the objective function rank.
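
The screen-then-rank step can be sketched roughly as follows. The threshold `min_corr`, the bucket size `fraction`, and the tickers are hypothetical; the post does not give concrete values, and the scores and correlations are assumed to have been computed elsewhere.

```python
def pick_long_short(scores, correlations, min_corr=0.3, fraction=0.2):
    """Screen by current correlation, then rank by objective function value.

    scores:       ticker -> current value of the objective function
    correlations: ticker -> current correlation of that function to future returns
    Returns (longs, shorts): the top and bottom `fraction` of the screened list.
    """
    eligible = [t for t in scores if correlations.get(t, 0.0) >= min_corr]
    ordered = sorted(eligible, key=lambda t: scores[t], reverse=True)
    if not ordered:
        return [], []
    k = max(1, int(len(ordered) * fraction))
    return ordered[:k], ordered[-k:]
```

For example, with scores `{"A": 2.0, "B": 1.0, "C": -1.5, "D": 0.2, "E": 3.0}` and stock E failing the correlation screen, `fraction=0.25` picks A for the long side and C for the short side, even though E has the highest score. With very few eligible stocks the two buckets can overlap, so a real implementation would need a guard for that.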

Further, we can combine our MT and stock-specific models into one objective function that tells us what to do at any time and in any market conditions.

Right now we build our stock-specific objective function manually (the ranking system) and can only observe visually how rank correlates with stock returns over the average period during which stocks stay in a given rank (the rank performance graph shows the rank-return distribution over a time frame set by the user, but only for a forecasting period equal to the average time stocks remain in a given rank). We can't calculate an exact correlation value between our rank and individual stocks or combinations of them. And that's the main problem.
We choose the highest rank (the ranked objective function) to buy, but we don't know how reliable it is. More often than not a stock has a high rank but low predictive power because of low correlation in current conditions (and maybe in any conditions, due to extreme fitting when the ranking was developed on 5-stock port models). That's why most models don't work OOS.

yupolv · October 18, 2015, 10:53pm

So, let's start with the general stock market forecast model, in other words a market timing model based on the S&P 500 index.

MT model construction.

Testing period: Feb 1999 - Sep 2015.
Objective function: a linear function with equally weighted factors (to avoid curve fitting).
Factor universe:
A) Getseries section from P123 for stock-specific factors (data available from 1999, which is why I show the system from that year), on a weekly basis
B) FRED data (available from 1947), on a monthly basis
C) Recalert.com data from 1968, on a monthly basis

I took 9 predictor factors that showed the strongest correlation to future realized 1m, 3m, 6m, and 12m S&P 500 returns and that have a robust, logical, independent meaning. No factor screening procedures were applied (as in Hull's paper, to avoid curve fitting and for simplicity); the factors were invented before 1999 (to avoid look-ahead bias).
The factors have been transformed and normalized to zero mean. The composite MT9 index is a linear function with equally weighted factors: the simple average of the 9 factors.
Correlation to the future realized 12m SPY return is 0.63; to 6m, 0.51; to 3m, 0.41.
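
A minimal sketch of an equally weighted composite like MT9. The post only says the factors are "transformed and normalized" to zero mean; dividing by the standard deviation (a z-score) is an assumption made here so that factors with different units get comparable weight, and the transformation step itself is left out.

```python
import statistics


def zscore(series):
    """Shift to zero mean and scale to unit variance (the scaling is an assumption)."""
    mean = statistics.fmean(series)
    sd = statistics.pstdev(series)
    return [(x - mean) / sd for x in series]


def equal_weight_composite(factors):
    """Simple average of the normalized factors, one value per date.

    factors: list of equal-length series, one per factor, aligned by date.
    """
    normalized = [zscore(f) for f in factors]
    return [sum(column) / len(normalized) for column in zip(*normalized)]
```

Normalizing over the full history uses information from the future, so a stricter version would compute the mean and standard deviation only from data available at each date.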

Implementation in practice.

Based on the MT9 index I adjust my position in SPY. Lower limit: 70% short with 170% in cash; upper limit: 120% long with -20% in cash. The position is the mean market position of 60% plus the 2x levered MT9 index. Slippage: 0.5%. Average system turnover: 85%. That's a critical point. When you use an on/off hedge instead of a variable one (on/off is the only kind available now in P123), you have to change your position by a hundred percent, say from 100% in the market to 100% in cash or other assets, which leads to higher turnover and higher transaction costs. That is especially relevant for assets less liquid than SPY, Russell 2000 stocks for example (which is why I set high transaction costs of 0.5%). Cash interest: 1.5% (let's say the average interest on Treasury instruments).
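
The position rule above can be written down in a few lines. This is a sketch under one assumption: the MT9 index is expressed in units where adding twice its value to the 60% base gives a portfolio fraction. The function and parameter names are made up for illustration.

```python
def spy_allocation(mt_index, base=0.60, leverage=2.0, floor=-0.70, cap=1.20):
    """Fraction of the portfolio in SPY and in cash for a given MT index value.

    position = base + leverage * mt_index, clipped to [floor, cap];
    cash is whatever is left, so it exceeds 100% when the position is short.
    """
    position = max(floor, min(cap, base + leverage * mt_index))
    return position, 1.0 - position
```

With the index at its mean of zero this gives 60% SPY and 40% cash. A large positive reading is capped at 120% SPY with -20% cash, and a large negative reading is floored at 70% short with 170% cash, matching the limits stated in the post.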

The result of this strategy is shown in the picture below. Return is two times higher, st dev 1.5 times lower, and median return/risk (Sharpe) three times higher, at 0.82.
It is also interesting to note that the MT index started to drop in June 2015 and, as of the end of Aug or beginning of Sep, showed a negative value (the mean is zero). The trading system directed me to stay only 50% in the market as of July, two months before the recent drawdown happened.

P.S. Market moving average factors (which are very popular) don't show significant correlation to future returns, and neither does the stock short interest factor. When somebody builds an MT and trading strategy on them, it means they are just trying to fit the market curve. That's all. It won't work in the future. A more or less reliable MT model can only be built on factors that have an economic and independent meaning. One technical factor that has some influence is mean reversion, in other words negative auto-correlation.

In the next posts I will discuss the stock-specific models. Regards, Yury.

yupolv · October 24, 2015, 12:02pm

1. There is one addition to the MT post.
Some predictor factors show greater correlation to long forecast periods like 1 year or more, while others correlate better to 3 months.
Also, short-term factors are available on a weekly basis, while the long-term data may not be updated yet. Therefore it is good to use both short-term and long-term MT models simultaneously.

2. Stock specific models. Ranking vs Screening vs Regression.
We use ranking as the best available stock picking strategy. There is a good explanation for this approach.
While linear regression models work well (or at least acceptably) on the macro scale, they don't show the same results for individual stock forecasts.
The correlation of factors to future stock-specific returns not explained by market movements (alpha) is very unstable and depends on many other conditions; on average the correlation is zero.
That is because alpha concentrates within limited periods of time and usually at extreme factor values. If you imagine a multidimensional factor-alpha surface, alpha is floating on it. Ranking allows us to capture the bumps on that surface. Ranking is also robust to outliers compared to linear models (this is very important). There is a good paper by two Russian guys about all this stuff, which I attached to the post.
Ranking vs screening. Ranking has a flexibility advantage over screening because of the cyclicity of factor performance. When you filter on factors using screens, you punish stocks too much compared to the ranking method. Also, screens grab stocks without differentiating between them, so you can't pick extreme values as in ranking.

3. Rank construction.
Initially we build a set of independent themes based on well-known anomalies and predictor factors that have economic meaning and common-sense logic. We combine closely correlated factors into one theme on an equally weighted basis. Choose factors that are rarely missing in the database (nothing exotic). I counted 10 independent themes, each with 5 to 20 equally weighted factors. Combining uncorrelated themes gives a diversification benefit. Also, applying a monotonic function as a factor transformation won't affect the rank assigned to the factor (for example, ranking by log cap or by plain cap gives the same result).
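
The point about monotonic transformations is easy to check directly. A small sketch with made-up market caps (ties are ignored for simplicity):

```python
import math


def ranks(values):
    """Rank 1 = smallest value. Ties are broken by position (fine for a sketch)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for position, index in enumerate(order, start=1):
        result[index] = position
    return result


caps = [250.0, 12000.0, 800.0, 45000.0]  # hypothetical market caps
log_caps = [math.log(c) for c in caps]
same_ranking = ranks(caps) == ranks(log_caps)  # log is increasing, so order is preserved
```

This holds for any strictly increasing transformation. A decreasing one (for example 1/cap) reverses the order instead, which only matters for the "higher is better / lower is better" setting of the factor.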
Theme weighting in the total ranking. It depends on the purpose of the system. I use a simple separation: defensive, balanced, aggressive. Depending on the style, I set the theme weights based on simulated performance, within a range such that the maximum weight is no more than 3 times the minimum weight.

A good ranking should work well not only over all 16 years of the backtest, but mainly during its assigned period (bull, bear, flat, high or low volatility, etc.). A good ranking should show a more or less stable result in the rank vs realized alpha distribution histogram during the assigned period (we see rank vs return only, not adjusted for risk and market impact, and that's very bad). This means the following: a low number of outliers on the graph (but not very low), a higher slope, and some convexity (because we use ranked data rather than actual values, outliers in the 1st and 100th rank percentiles, for example, make a big contribution to return), but the convexity shouldn't be very big, because that can again be a sign of over-optimization (if it looks ideal, it can't be true). Altogether it means a higher correlation of rank percentiles to future realized alpha (or Spearman correlation of the actual, unranked values) and higher potential returns.
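
For reference, a Spearman correlation between factor values and future returns can be computed in a few lines. This sketch uses the textbook no-ties formula, rho = 1 - 6 * sum(d^2) / (n * (n^2 - 1)), where d is the difference between the two ranks of each observation; the sample numbers are made up.

```python
def ranks(values):
    """Rank 1 = smallest value; assumes no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for position, index in enumerate(order, start=1):
        result[index] = position
    return result


def spearman(xs, ys):
    """Spearman rank correlation for data without ties."""
    n = len(xs)
    d_squared = sum((rx - ry) ** 2 for rx, ry in zip(ranks(xs), ranks(ys)))
    return 1.0 - 6.0 * d_squared / (n * (n * n - 1))


# Hypothetical example: average forward return per rank bucket (bucket 1 = lowest rank).
bucket = [1, 2, 3, 4, 5]
avg_return = [-0.02, 0.00, 0.01, 0.015, 0.06]
rho = spearman(bucket, avg_return)
```

Here the returns rise monotonically with the bucket, so rho comes out as 1 even though the relationship is convex rather than linear. That is the sense in which a rank-based measure tolerates the convexity and outliers mentioned above.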

4. Portfolio construction
In the attached pictures you can see that even a good ranking shows unstable results over periods shorter than 3 years. The shorter the period, the more noise in it. But you don't want to wait 15 years to get the required result; you invest over a 1-3 year period. Therefore, to get less noise you have to use more stocks in your port (in my view 20 stocks is the absolute minimum). That way you compensate for the noise and smooth the result (this is especially clear on a risk-adjusted basis).
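
A rough illustration of why more stocks means less noise, under a strong simplifying assumption: each stock's stock-specific noise is independent and has the same standard deviation. Real stocks are correlated, so the actual benefit is smaller. The 30% figure is made up.

```python
import math
import random
import statistics


def idiosyncratic_noise(sigma, n_stocks):
    """Std dev of the average of n independent noise terms, each with std dev sigma."""
    return sigma / math.sqrt(n_stocks)


def simulated_noise(sigma, n_stocks, trials=2000, seed=0):
    """Monte Carlo check of the formula above using independent Gaussian noise."""
    rng = random.Random(seed)
    means = [
        statistics.fmean([rng.gauss(0.0, sigma) for _ in range(n_stocks)])
        for _ in range(trials)
    ]
    return statistics.pstdev(means)


noise_5 = idiosyncratic_noise(0.30, 5)    # 5-stock port
noise_20 = idiosyncratic_noise(0.30, 20)  # 20-stock port
```

Going from 5 to 20 equally weighted stocks halves the stock-specific noise under this assumption, since the noise scales with 1/sqrt(N).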

Buy and sell rules. The fewer the rules, the better for OOS. I prefer to filter the universe by a balanced ranking in the buy rules to exclude shitty stocks, then apply a more concentrated ranking for a specific purpose or theme. In practice the following rules may also be useful: sell before earnings and quarterly report release dates (you rely on your system, not on luck; the rank can change dramatically after the release date, but by then it will be too late for you to respond), take-profit and stop-loss orders, especially in short systems, and rank change over recent weeks (current rank relative to the average previous rank, especially in stocks with high rank-alpha correlation). In other words, logical rules that are independent of your ranking are welcome.

In the next post we will discuss the critical features to add to P123.
Regards, Yury.

Rytchkov_RankingStocks_10A.pdf (259 KB)

yupolv · October 26, 2015, 1:39pm

Multidimensional Alpha - Factors surface.

I already mentioned this concept in my previous post, but let's examine it more deeply.

For simplicity, let's imagine the 3-dimensional space we are accustomed to. The x and y axes are predictor variables (factors) and z is alpha.
Then the alpha surface would look like a sea, a discrete sea made up of the whole universe of public assets. Chaos at first glance. But crests and troughs of alpha waves exist in that space. It is moving all the time. The average level of that sea is zero alpha (only beta: the whole world market).
Larger, longer-term waves contain smaller and shorter waves. If you set up your factor ranking system properly (set up the x-y area) you will catch positive alpha crests.
The more money you need to invest, the lower you cut your wave (a wider x-y area), and finally you cut at sea level, achieving zero alpha and only beta (your x-y area is the whole plane): you are completely in the market (a cap-weighted index).
By fitting your ranking and overall models to historical data you are just trying to catch the previous wave crests, but it is more likely that only troughs are left in the area you are trying to catch (you fit your system to trace the old maximum-alpha path, which will not be the same again; at least, the chance that it will happen again is close to zero). Another problem is that the surface can form not only waves but any other geometric shape, because of the non-monotonic nature of factors and stochastic deviation (some of which could be explained by other factors that our space doesn't include). By trying to catch only extreme values (a very low number of stocks, like 5 or 10) you are betting on luck, because the top of those waves is very unstable and moves fast in an unpredictable way; at some point it can even become not the top but the bottom (imagine a volcano shape, for example: alpha will be negative at the center but positive on the rim). In addition, if you don't get positive alpha, you are more likely to get negative alpha than zero alpha, because the average sea level is zero: if somebody gets positive alpha, it is more likely that yours is negative.

Regards, Yury.

tkp · October 26, 2015, 2:14pm

Yury, even I understand the last post. At least I think so.

Yura, you're on fire! Keep writing!

Jrinne · October 26, 2015, 3:07pm

Yury,

I like the idea of using linear regressions. Regarding general market trends perhaps a little heteroskedasticity–as is evident in your scatter plot–could be forgiven. You work with what you got and find the best fit possible. Do you feel comfortable with your statistical data regarding how good the fit is given the obvious heteroskedasticity?

How did you deal with the non-stationary problem with your time-series? That can make things that have no correlation whatsoever look highly correlated as you know.

Just want to know how you tested for non-stationarity and how you dealt with it.

yupolv · October 26, 2015, 8:23pm

Konstantin, I edited the post; I was in a hurry, as usual, when I first wrote it. So it reads more clearly now. You will understand the fitting problem clearly, if that is what your question was about (actually I didn't understand your question :)

yupolv · October 26, 2015, 8:26pm

Jrinne, are you discussing MT models or stock specific models?
There is some distinction between them.

yupolv · October 26, 2015, 8:38pm

About heteroskedasticity in MT. To check for it you can use the Spearman test, for example. But it won't help you. The answer, and the problem at the same time, lies in optimization. To get lower variance (decrease the stochastic deviation) you can use a factor screening procedure over time, and maybe ranks. You'll get a non-linear dependency that changes over time, with a very high R2. But more likely it won't work in the future (the fitting problem again).
So the main check is macro- and microeconomic logic and common sense. And it should come first, before any optimization (at least it sets the limits of your optimization).

Jrinne · October 26, 2015, 8:40pm

MT, as far as the non-stationarity concern goes, because of the time series. My understanding is that the problems with non-stationarity occur with time series.

Your linear regression seemed to have some heteroskedasticity: the realized 12M return regression. You can comment for sure as to the type of data and your statistical conclusions from this data.

Thanks.

Jim

yupolv · October 26, 2015, 8:47pm

I wasn't chasing the goal of maximizing R2 in this regression; it was shown just as an example of a more or less reliable MT system in comparison to existing P123 systems.

tkp · October 26, 2015, 9:00pm

Yury, now it was my turn to edit my post.

Just keep going with your thoughts, as I am curious about the finale.
Thank you.

Jrinne · October 26, 2015, 9:08pm

You really just can't use non-stationary data in a time series without adjusting. The examples in the textbooks show high Rs for data that in truth has no correlation whatsoever. A fairly recent Nobel Prize was awarded for techniques to deal with this problem.

It has nothing to do with Maximizing R2.

I personally would not try any linear regressions for market timing or time series. I certainly would put no money into it or make any public claims. But that is just me at my level.

I like linear regressions that are similar to our rank performance test. I think this would be cross-sectional data. But I am becoming more aware that any statistical claims are questionable–including R values. This is due to the fact that the data is probably not a normal curve or i.i.d. (thanks Peter and SUpirate1081).

Best,

Jim

yupolv · October 26, 2015, 9:13pm

Konstantin, are you Russian? I thought there was nobody here.

yupolv · October 26, 2015, 9:14pm

deleted

yupolv · October 26, 2015, 9:23pm

The market as a whole is a more or less stationary process compared to individual stock performance, and that's the main difference.
As I remember, Markov processes deal with non-stationary time series.

Jrinne · October 26, 2015, 9:27pm

Then you know whether a random walk is stationary? Do you think the stock market is or isn’t a random walk?

You mean after you have corrected for any trend? You have to have done that: by definition of stationary.
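
The textbook effect referred to in this exchange can be reproduced with a short simulation. This is a sketch, not anything either poster ran: it generates pairs of independent Gaussian random walks and compares the average absolute correlation of their levels with that of their period-to-period changes. The trial count, length, and seed are arbitrary.

```python
import math
import random


def pearson(xs, ys):
    """Plain Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


def random_walk(rng, length):
    """Cumulative sum of independent standard normal steps (non-stationary)."""
    level, path = 0.0, []
    for _ in range(length):
        level += rng.gauss(0.0, 1.0)
        path.append(level)
    return path


def diffs(path):
    """First differences: the stationary steps behind the walk."""
    return [b - a for a, b in zip(path, path[1:])]


def mean_abs_corr(trials=200, length=300, seed=1):
    """Average |correlation| of independent walks, in levels and in changes."""
    rng = random.Random(seed)
    levels_total = changes_total = 0.0
    for _ in range(trials):
        a, b = random_walk(rng, length), random_walk(rng, length)
        levels_total += abs(pearson(a, b))
        changes_total += abs(pearson(diffs(a), diffs(b)))
    return levels_total / trials, changes_total / trials


levels_corr, changes_corr = mean_abs_corr()
```

The two series in each pair are unrelated by construction, yet their levels typically show a sizeable correlation, while the differenced series show almost none. That is why a correlation measured on trending or random-walk-like series needs differencing or detrending before it says anything.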

yupolv · October 26, 2015, 9:38pm

As I remember, stationary means stable first and second moments of the distribution: mean and variance. It was a long time ago that I studied this at university.

yupolv · October 26, 2015, 9:42pm

So your question is what model to apply for MT? Because my regression model is based on cross-sectional rather than time-series data.

Jrinne · October 26, 2015, 9:42pm

Yury,

I actually like what you are doing. Personally, I would refresh my memory before going much further with any time series data.

I will be doing some of this myself but not for market timing. But again that is just me at my level. The overall market is almost certainly non-stationary even if it is not a random walk: any trending (at a minimum) must be corrected for: personally I cannot do that.

Good luck.

Jim