Screen Momentum Testing

I am finally at a point where I can share some preliminary findings with the rest of the P123 community. I have referenced this project a few times on other threads, but basically what I am trying to do is figure out whether a screen that is doing well (has momentum) will continue to do well. We could potentially use this to predict the best screen to use for our next investment. I have attached a few files that summarize the data and explain the project in more detail.

First I developed, or really implemented, 30 screens from various academic and other sources and used the first 5 years of P123 data to optimize and test. I then evaluated those screens going forward. The results on that side were not surprising. My optimized (curve-fitted?) screens killed it (94% and 70% annualized on average, with zero costs or slippage) in the first 5 years and dropped way off after (65% and 31%). This is totally apples and oranges in terms of length of time, market conditions, economy, etc., but I had to start somewhere. I think we have all seen those simulations that look amazing and then just fall off a cliff. This wasn’t really the point of my experiment, but I thought it would be interesting.

Next, I ran the 30 one-week and 30 four-week screens outside the initial 5-year period and looked at various periods of trailing performance (2 weeks to 5 years) to see how predictive they would be for the next week or 4 weeks. The results indicate that there are certain periods that would give you a statistically significant chance to beat the average screen by a good amount. The correlation was very low, at 6-8% or so, but it’s better than picking one of the screens at random. For the 1-week testing, the annualized returns were 30-40% better (fantasy numbers to be sure, given the inability to really trade that frequently and the lack of slippage or transaction costs, etc.). For the 4-week testing, the annualized returns were about 10-15% better.
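
For anyone who wants to try the idea on their own screens, here is a rough sketch of the test in Python. The file name and column layout are made up for illustration; the real inputs would be your own weekly screen returns exported from P123.

```python
# Rough sketch of the screen-momentum test (illustrative only).
# Assumes a CSV of weekly returns, one column per screen, one row per week.
import pandas as pd

returns = pd.read_csv("screen_weekly_returns.csv", index_col=0, parse_dates=True)

LOOKBACK = 26   # trailing weeks used to rank the screens (vary this)
HORIZON = 4     # forward weeks we are trying to predict

trailing = returns.rolling(LOOKBACK).sum()                  # trailing performance per screen
forward = returns.rolling(HORIZON).sum().shift(-HORIZON)    # next-period performance

# Cross-sectional rank correlation each week: does trailing rank predict forward rank?
ic = trailing.corrwith(forward, axis=1, method="spearman")
print("average rank correlation:", round(ic.mean(), 3))

# Simple strategy check: always hold the screen with the best trailing return,
# using only information available before the holding period starts.
best = trailing.shift(1).dropna(how="all").idxmax(axis=1)
picked = pd.Series({d: returns.at[d, s] for d, s in best.items()})
print("picked-screen avg weekly return:", round(picked.mean(), 4))
print("equal-weight avg weekly return:", round(returns.mean(axis=1).mean(), 4))
```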

I don’t know whether any of this can be incorporated directly into Smart Alpha, but it seems to have some real-world potential.

As I said, the details are attached if you want to dig in. Please do reply with feedback and I would be happy to answer questions.


1 Week Screens Dynamic as of 3-28-16.pdf (369 KB)


4 Week Screens Dynamic as of 4-4-16.pdf (327 KB)


Backtesting Detail Summary.pdf (74.7 KB)

Thank you for sharing your findings. I did some work on this a while back and, although it was more anecdotal, I came up with similar results. Given the turnover and associated slippage/transaction costs, I did not pursue it further. I felt (rightly or wrongly?) that I could achieve better results with individual models. You have given me food for thought and I may revisit it again.

Cheers,

Brian

Mike,

Thanks for sharing.

I would be interested in seeing a regression of the annual excess returns of the in-sample results against the annual excess returns of the out-of-sample results.

Generally, P123 wants to de-emphasize the in-sample backtest, which is good. However, if there were a moderate (neither high nor low) correlation, it could both contribute to realistic expectations and show that there is some value to a well-performing sim (preferably based on the DDM), validating what we all do.

I wonder whether the regression to the mean might be reasonably quantifiable. If, on the other hand, it turns out there is a negative correlation between the best-performing (possibly most optimized) sims and the ports of those sims, that would be a lesson for all of us.

Mike

Thank you for sharing. You are right to be worried about over-fitting. That certainly is a road well traveled in this community.

I have tested momentum to a considerable extent. Probably everyone in this community has. My conclusion is that “big things have momentum, small things don’t.” For example, if you look at the total earnings of the S&P 500, you will find that they can be used as a global market timer. If you look at the SPY, once again you can use it as a timer. If you look at sector ETFs, they do not work so well, either as timers or as stock-picking rules. Using momentum on small individual stocks is largely futile. There may be some computer-driven traders that use momentum on short time frames, but that is a different dimension of time from P123.

I took a look at your data. Frankly, your data is of little interest if you don’t post the rules in detail. There are 1000 “proprietary models” out there that are worthless. Don’t worry about someone stealing your secret sauce. The devil is in the details. If you really want to learn something about modeling, you should present the data and models openly. Believe me. Momentum has been looked at in detail. You might have a great idea, but you also might have ten mistakes that have been worked out before.

A good starting point is Fred Piard’s book “Quantitative Investing,” which discusses some ETF momentum models. Fred is one of the best modelers in the business, and he has been generous enough to openly share many of his ideas in his books.

Many of us would love to review your models if you’d like to share them.

Michael Sherrard

Jrinne,

When you say “I would be interested in seeing a regression of the annual excess returns of the in-sample results against the annual excess returns of the out-of-sample results,” what are you looking for? The first table in both attachments is in-sample and the second table is out-of-sample. I probably should have labeled them better.

Michael,

There is no secret sauce. All of the screens I used are from academia or from P123 ranking systems that are public and authored by others. I guess I could share them, but it seems like a giant pain in the butt to go through and make everything public, and I think it is of limited value. Maybe I could do it over time. Originally I did this analysis on the AAII screens and the results were very similar; those are all complicated multi-factor models as well. The point of sharing this is that others should think about their own systems, the cyclicality of performance, and how they are developing their screens. I have read Quantitative Investing several times over and have seen much more extensive work done on momentum. Most data indicates that 13-week to 52-week momentum is the sweet spot for price momentum; after 52 weeks it falls way off. The first cut of this work on multi-factor models indicates a 3-4 year sweet spot, and I think that is the big takeaway.
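
To make the “sweet spot” point concrete, here is a sketch of how one could scan trailing-window lengths against the next 4-week return. The file and column layout are hypothetical, as in the earlier sketch; the real inputs would be the weekly screen returns.

```python
# Sketch: scan trailing-window lengths and see which is most predictive of the
# next 4-week return.  File and column layout are hypothetical.
import pandas as pd

returns = pd.read_csv("screen_weekly_returns.csv", index_col=0, parse_dates=True)

HORIZON = 4
forward = returns.rolling(HORIZON).sum().shift(-HORIZON)

for weeks in (13, 26, 52, 104, 156, 208, 260):   # one quarter out to five years
    trailing = returns.rolling(weeks).sum()
    ic = trailing.corrwith(forward, axis=1, method="spearman").mean()
    print(f"{weeks:>3}-week lookback -> avg rank correlation {ic:.3f}")
```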

Mike

Mike,

I meant matched in a scatter plot. I did not see how to match the in-sample and the out-of-sample results to form a single point for each individual screening method, with the x-value being the in-sample result and the y-value being the out-of-sample result.
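
Something along these lines, where each row is one screen (the file and column names are just placeholders):

```python
# Placeholder sketch of the scatter I have in mind: one point per screen,
# x = in-sample annualized excess return, y = out-of-sample annualized excess return.
import pandas as pd
import matplotlib.pyplot as plt

df = pd.read_csv("screen_summary.csv")   # columns: screen, in_sample, out_of_sample

plt.scatter(df["in_sample"], df["out_of_sample"])
plt.xlabel("In-sample annualized excess return")
plt.ylabel("Out-of-sample annualized excess return")
plt.title("One point per screen")
plt.show()
```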

I am probably missing something; this data may already be there. I think this is something separate from what you are doing and you may not be interested, so please do not bother if it does not interest you or if it is difficult. I appreciate what you have done and shown us!

Thank you.

Regards,

Jim

Jim,

Now I get it. Very easy to put together. See the attached.

Mike


1 Week Screens In vs Out as of 3-28-16.pdf (71.7 KB)


4 Week Screens In vs Out as of 4-18-16.pdf (71.9 KB)

Mike,

Thank you!

Personally, I think that is impressive. Those two graphs speak volumes. First, they say you are putting together good systems.

But also, for experienced members like you, backtesting clearly has something to say about what the future results are likely to be. You proved it in spades!!!

It is interesting that your 4-week screens have a higher (and, I think, very high) R^2 than the 1-week screens.

So for your 4-week systems, someone looking at one of your sims could predict that the port might have a return that is about 80% of the sim’s. That would be the smart prediction for a similar sim not in this data set.

One can quantify how much regression to the mean your systems have in general.
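
For example (purely illustrative numbers and column names), the slope of the in-sample vs. out-of-sample regression is the shrinkage factor you would apply to a new sim:

```python
# Illustrative only: the regression slope quantifies the regression to the mean.
import numpy as np
import pandas as pd

df = pd.read_csv("screen_summary.csv")   # columns: in_sample, out_of_sample
slope, intercept = np.polyfit(df["in_sample"], df["out_of_sample"], 1)

new_sim = 0.40   # a hypothetical new sim showing 40% annualized excess return
expected_port = intercept + slope * new_sim
print(f"slope {slope:.2f}: expect roughly {expected_port:.0%} out of sample")
```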

I understand Mike is just one developer and that his out-of-sample results were historical. But P123 could do this for multiple developers and the results would be out-of-sample and prospective.

P123 should do this for their Smart Alpha. They should look at the results first, see if they like them, and decide whether they want to make them public at all. If they are good (but not so good that they give unrealistic expectations), they might even want to advertise them.

I appreciate it!

Regards,

Jim