Changing models over time

I’ve been running models I created for the last year, investing in 50 stocks. My book is beating the market by a significant amount over that period, and I’ve beaten the market in 10 of the last 12 months.

To develop these models I looked at variables that have been working over the last two years. I’ve seen that variables don’t always work in prior time periods, but they’ve been working really well over the last two to three years. The models are run with mid cap companies, so there is plenty of liquidity. I rebalance every four weeks and have about 400% turnover. Through my testing I’ve found that models in this range of companies can produce consistent excess returns.

I chose to concentrate on developing buy and sell rules instead of focusing on the ranking system. I spent a lot of time determining which combinations of variables work well. Generally, I’ve found that if the top four deciles of a variable work by themselves, then it will improve a model that includes a few deciles of another variable.

I expect that I’ll need to change these models at some point. However, when I look at the individual variables I’m not seeing that the market is changing yet. During the last quarter of 2018 the models bounced around with the market, but since the new year they have done really well.

I just wanted to share an approach that has worked for me and see if others have used a similar approach or have ideas on ways to improve the models. Also, I’m wondering if any of you anticipate changing your models over time to reflect what’s currently working in the market. It seems like a lot of the discussion in these forums is about finding variables that work for longer periods of time. Frankly, I’m not sure I care if it worked 10 years ago, but maybe I’m missing something.

One other thing: I completely agree with others that the fundamental variables should be used. While price related variables can produce good returns at times, the fundamental variables produce much more consistent returns over time for this size range of companies. I have found that price related variables generally can work better for smaller companies than for larger companies.

Erik

Congratulations!

In my opinion–and my research bears this out–you’ll have a much more reliable model if you look at what’s been working well over the last eight or more years.

I think this is a mistake. You’re basically screening rather than ranking. I’ve written an article on the difference here: https://backland.typepad.com/investigations/2017/06/-the-paradox-of-stock-screeners.html (also on Seeking Alpha as “The Paradox Of Stock Screening”). In my opinion, and in the various experiments I’ve conducted, ranking is almost always preferable to screening.
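To make the screening-versus-ranking distinction concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the tickers, the factor scores, the 0.5 thresholds, and the equal-weight rank average are made up, and this is not how Portfolio123 computes ranks. It only shows the general idea that a hard threshold can throw out a stock that a combined rank would keep.

```python
# Toy data: hypothetical tickers with two made-up factor scores (higher is better).
stocks = {
    "AAA": {"value": 0.90, "quality": 0.49},
    "BBB": {"value": 0.55, "quality": 0.55},
    "CCC": {"value": 0.20, "quality": 0.95},
    "DDD": {"value": 0.60, "quality": 0.70},
    "EEE": {"value": 0.30, "quality": 0.30},
}


def screen(stocks, min_value, min_quality):
    """Screening: keep only the stocks that pass every hard threshold."""
    return sorted(
        ticker
        for ticker, f in stocks.items()
        if f["value"] >= min_value and f["quality"] >= min_quality
    )


def percentile_ranks(stocks, factor):
    """Map each ticker to a 0-100 rank on one factor (100 = best)."""
    ordered = sorted(stocks, key=lambda t: stocks[t][factor])
    denom = max(len(ordered) - 1, 1)  # avoid dividing by zero for one stock
    return {t: 100.0 * i / denom for i, t in enumerate(ordered)}


def rank_top(stocks, factors, top_n):
    """Ranking: average the per-factor ranks, then take the best top_n."""
    per_factor = [percentile_ranks(stocks, f) for f in factors]
    combined = {t: sum(r[t] for r in per_factor) / len(per_factor) for t in stocks}
    return sorted(combined, key=combined.get, reverse=True)[:top_n]


screened = screen(stocks, min_value=0.5, min_quality=0.5)
ranked = rank_top(stocks, ["value", "quality"], top_n=2)
print("screen:", screened)
print("rank:  ", ranked)
```

In this toy data AAA misses the quality cutoff by a hair, so the screen drops it, while the ranking still places it second because its value score is the best in the group.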

I do like tinkering with my systems and try to keep up with what’s working in the market. But the question is one of persistence. Let’s say you develop twenty different models. Some of them have done well for eight years, and some have done well for only two. The ones that have done well for eight years are more likely to continue to do well than those that have done well for two.

I have found the opposite! I’ve found that momentum works better as a factor for large than for small stocks. But that’s a quibble. It depends on how you measure things and what tools you use.

Hi Yuval,

Good stuff.

I am interested in your present thinking about the 2000 to 2004 period.

I remember both you and Marc commenting about this period being different. I forget which of you said what exactly, which is okay because I am just interested in your present thinking.

I am involved in a project outside of P123 where the data is harder to get, and I am thinking it is not worth getting and could even be misleading for this project. But I would be interested in your thoughts.

Thank you in advance.

-Jim

I wasn’t actually investing during that period, but from the backtests I’ve run, it seems like kind of a golden age. You could make a lot of money simply by investing in cheap small caps. I still invest in cheap small caps, but I have to be very careful. Back then, it seems like you didn’t have to be careful. Cheap small caps just made money. But, like I said, I wasn’t investing back then, so Marc might have a better answer.

Thanks for the feedback. I’ve learned a lot from you and the others here. Just to clarify, I have been using several of the ranking systems such as Fiscal Momentum, Sentiment, Basic Value, etc. Are you saying that I should be relying almost solely on one ranking system, or would you say that using a combination of several ranking systems would be a good approach? I use ranges of each of them in combination along with some variables by themselves. Additionally, I had mentioned that I didn’t find price related variables very predictive. I have found fiscal momentum, sentiment, and a few of the volume and EMA variables to be predictive. I haven’t found recent price changes, such as four-week relative price change, to be predictive.

Thanks again.

I think combining ranking systems is definitely a good approach. I’m very fond of the sentiment ranking system myself, but they all have stuff that will inspire you, and they all can be fruitfully tinkered with.

What are some ways to combine ranking systems? Do you combine the various factors from multiple ranking systems into one, creating a top-level Composite node for each ranking system? Or do you keep the ranking systems separate but mandate a minimum threshold in the Buy Rules?

Thanks - always looking to expand my P123 knowledge.

I don’t know how most people do it, but my normal approach is to combine various factors (copy-paste composite nodes) into a single model. This “usually” works better for me than restricting buy rules. There are certainly a few cases, however, where limited buy rules did better than any ranking system I could devise. (I’m thinking specifically of a case where I was working on highly indebted companies.) My default starting universe usually incorporates minimum constraints that might otherwise go into buy rules (like minimum liquidity, mktcap, price, market restrictions, etc).

To one of your points: I think I recall examples Marc Gerstein has posted where he’ll require a passing company to be above a minimum threshold of an individual factor like Quality or Momentum. This uses the “Rating()” function in the buy rules.

I’ve found it’s not always true that the best individual factor will work the best within a multifactor model. I think there’s value in finding the best Value, Quality, Momentum, or “Whatever” factor you can build. There are themes that will hold. However, strange things can happen when factors are combined, and certain things that work by themselves are not helpful in combination, and other things that are useless by themselves can work well in combination. So I tend to save ideas within my factor models - even if I give them 0% weight initially - and go back to them from time to time to see the result in combination with something else. I recently worked on a sector specific model (Utilities) and it required a lot of backtracking and testing out basic ideas from the ground up. Some things apply to that specific sector that I’d not used anywhere else. I’d guess that general rule is true in many cases.
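As a rough illustration of combining factors into one model while keeping an idea parked at 0% weight, here is a small Python sketch. The factor names, tickers, raw values, and weights are all hypothetical, and the weighted average of percentile ranks is only one simple way to build a composite. It is not a description of how P123 composite nodes work internally.

```python
def percentile_ranks(values):
    """Map {ticker: raw value} to {ticker: 0-100 rank}, 100 = highest value."""
    ordered = sorted(values, key=values.get)
    denom = max(len(ordered) - 1, 1)  # avoid dividing by zero for one ticker
    return {t: 100.0 * i / denom for i, t in enumerate(ordered)}


def composite_rank(factor_values, weights):
    """Weighted average of per-factor ranks; weights need not sum to 100."""
    total = sum(weights.get(name, 0.0) for name in factor_values)
    if total <= 0:
        raise ValueError("at least one factor needs a positive weight")
    ranks = {name: percentile_ranks(vals) for name, vals in factor_values.items()}
    tickers = next(iter(factor_values.values())).keys()
    return {
        t: sum(weights.get(name, 0.0) * ranks[name][t] for name in ranks) / total
        for t in tickers
    }


# Hypothetical raw factor values (higher is better for every factor here).
factor_values = {
    "value":    {"AAA": 0.9, "BBB": 0.5, "CCC": 0.2, "DDD": 0.7},
    "quality":  {"AAA": 0.3, "BBB": 0.8, "CCC": 0.6, "DDD": 0.7},
    "momentum": {"AAA": 0.1, "BBB": 0.4, "CCC": 0.9, "DDD": 0.6},
    "idea_x":   {"AAA": 0.5, "BBB": 0.1, "CCC": 0.3, "DDD": 0.2},  # parked idea
}

# idea_x stays in the model at 0% weight so it is easy to revisit later.
weights = {"value": 40, "quality": 40, "momentum": 20, "idea_x": 0}

scores = composite_rank(factor_values, weights)
best = max(scores, key=scores.get)
print(best, round(scores[best], 2))
```

Raising the weight on `idea_x` and re-running is the quick check on whether a parked idea helps in combination, even if it did little on its own.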

I’ve also found screen-of-screens to be effective, depending on the situation. Ex: Screen(“Screen1”,30) AND Screen(“Screen2”,30) where Screen1 and Screen2 use different, but similar, ranking systems. For whatever reason, a screen-of-screens of similar approaches is sometimes able to filter better than an individual ranking system. Not always - but in some cases I’ve been unable to get a single ranking system to outperform screen-of-screens. Unsure why - I wonder if the duplication of similar ideas in different ranking systems might in some way be reducing noise? Not sure, just a stray thought. You have to be careful when doing this, because the reduced number of passing companies will usually increase the standard deviation of the results, making comparisons difficult even when controlling for the average number of holdings. Still, I think there might be something to it. Screen-of-screens can’t be put in the simulator, though, so they have to be monitored manually.
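The screen-of-screens idea can be sketched outside P123 as a set intersection. This Python snippet assumes that each screen returns its top N names by some score, which is my reading of the notation above rather than something the post states. The tickers, scores, and the choice of N = 3 are made up.

```python
def top_n(scores, n):
    """Return the set of the n highest-scoring tickers."""
    return set(sorted(scores, key=scores.get, reverse=True)[:n])


def screen_of_screens(scores_a, scores_b, n):
    """Keep only the tickers that make the top n under both scoring systems."""
    return top_n(scores_a, n) & top_n(scores_b, n)


# Hypothetical scores from two similar, but not identical, ranking systems.
system_a = {"AAA": 95, "BBB": 90, "CCC": 85, "DDD": 40, "EEE": 30}
system_b = {"AAA": 50, "BBB": 92, "CCC": 88, "DDD": 91, "EEE": 20}

picks = screen_of_screens(system_a, system_b, n=3)
print(sorted(picks))
```

Each system picks three names here, but only two survive the intersection, which is the reduced number of passing companies that makes results harder to compare.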