Identifying negative interactions

All,

Most of our factors interact.

Does momentum work better with growth stocks or with value stocks?

If a stock ranks high on Value, Quality, Growth and Momentum, is it a good stock, or have you just diluted everything and found a stock that is bound to be mediocre? Not that I know the answer to this.

The EASIEST way to find a system that IS GUARANTEED TO WORK would be to find some factors that are proven to work out-of-sample. And if those factors did not interact, you could just pick the stocks that have the greatest number of the positive factors you have identified.

I have been thinking about this in the context of Naive Bayes. This sounds complex but the main point is that Naive Bayes assumes there are no interactions among factors.
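To make the "no interactions" assumption concrete, here is a minimal sketch of how Naive Bayes would combine factors. The function name, the prior, and the likelihood ratios are all hypothetical and chosen only for illustration; nothing here is estimated from real data.

```python
import math

def naive_bayes_posterior(prior, likelihood_ratios):
    """Posterior P(outperform | factors), treating factors as independent.

    prior: base rate of outperforming, e.g. 0.5 (assumed)
    likelihood_ratios: for each factor the stock has,
        P(factor | outperform) / P(factor | not outperform)
    """
    log_odds = math.log(prior / (1.0 - prior))
    # Independence assumption: each factor's evidence simply adds in log-odds.
    # No term depends on which OTHER factors are present.
    for lr in likelihood_ratios:
        log_odds += math.log(lr)
    return 1.0 / (1.0 + math.exp(-log_odds))

# Hypothetical numbers: three factors, each with a likelihood ratio of 1.2.
p = naive_bayes_posterior(prior=0.5, likelihood_ratios=[1.2, 1.2, 1.2])
```

Because the evidence only ever adds up, a negative interaction between two factors cannot be represented: having both can never count for less than having one.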

Examples of methods looking to manage (and benefit from) interactions include: endless sims, spreadsheets for multiple universes (as some do), Random Forests, most ML/AI in general (which I have done).

In other words, most of the things most of us do at P123. But, as I said, this does not apply to Naive Bayes.

I think the surest way to get out of the P123 overfitting trap is to switch to a classification problem, and find a few factors that are independent and proven out-of-sample. Buy stocks with those factors. Simple.

You could use Naive Bayes, to be fancy, but if the factors were few enough you would just find stocks with all of the factors you have identified as proven to work out-of-sample.
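A minimal sketch of the simple version, assuming each factor has already been reduced to a yes/no flag per stock. The tickers and factor names are made up for illustration.

```python
# Hypothetical tickers and factor flags, for illustration only.
stocks = {
    "AAA": {"factor_1": True, "factor_2": True, "factor_3": True},
    "BBB": {"factor_1": True, "factor_2": False, "factor_3": True},
    "CCC": {"factor_1": False, "factor_2": False, "factor_3": True},
}

def factor_count(flags):
    """Number of factors a stock has."""
    return sum(1 for present in flags.values() if present)

# Stocks that have every factor.
all_factors = [t for t, flags in stocks.items() if all(flags.values())]

# Or rank by number of positive factors, highest first.
ranked = sorted(stocks, key=lambda t: factor_count(stocks[t]), reverse=True)
```

With equal weights this is the same ordering Naive Bayes would give if every factor carried the same likelihood ratio.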

The trick is being sure that the factors do not interact. Or if they do, that they do not interact in a negative way. But it would be NAIVE (as the name Naive Bayes implies) to assume there are no interactions.

So how can these interactions be managed? How do you identify negative interactions? Would it be as simple as removing factors that are negatively correlated? Not quite, I think, but that might be a start.
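One possible way to put a number on a negative interaction between two yes/no factors is sketched below. It assumes an additive definition: the interaction is the joint effect minus the sum of the two stand-alone effects. The function name and the sample returns are hypothetical.

```python
from statistics import mean

def interaction_effect(rows):
    """rows: iterable of (has_a, has_b, excess_return).

    Returns the joint effect minus the sum of the two stand-alone effects.
    Negative means the pair does worse together than the parts suggest.
    """
    cells = {(a, b): [] for a in (False, True) for b in (False, True)}
    for has_a, has_b, ret in rows:
        cells[(bool(has_a), bool(has_b))].append(ret)
    # Mean return per cell; statistics.mean raises if any cell is empty.
    m = {k: mean(v) for k, v in cells.items()}
    effect_a = m[(True, False)] - m[(False, False)]
    effect_b = m[(False, True)] - m[(False, False)]
    joint = m[(True, True)] - m[(False, False)]
    return joint - (effect_a + effect_b)

# Made-up returns (in %): each factor helps alone, but together they add little.
rows = [
    (False, False, 0.0),
    (True, False, 2.0),
    (False, True, 3.0),
    (True, True, 1.0),
]
effect = interaction_effect(rows)
```

This measures something different from the correlation between the factors themselves: two factors can be uncorrelated across stocks and still combine badly in terms of returns.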

Any thoughts appreciated.

Jim

Very few factors don’t interact with each other.

The clearest way to identify factors that interact or don’t interact is to look at the root of the factors. All fundamental factors interact with each other: they’re all products of the way accountants parse the company’s performance in the quarterly and annual statements, and changing just one item will change many others. On the other hand, while estimates are clearly closely related to fundamentals (earnings estimates are related to earnings), short-term changes in estimates are unrelated to fundamentals as long as they happen outside the release of the latest fundamentals. Price changes (momentum, technical factors) are very obviously related to fundamentals and estimate changes. Volatility is, as I’ve shown in a blog post, closely related to the stability of a company’s fundamentals. Industry-based factors are mostly unrelated to a company’s fundamentals and estimates, unless it has a large market share. But price changes are certainly affected by industry-based factors. Size-based factors are based on price, volume, and sometimes sales or assets, so those relationships should be taken into account.

I think this kind of analysis must be done prior to any sort of statistical analysis (correlation, clustering).

If you’re looking for factors that clearly do not interact, I would start by classifying factors as follows: based on earnings revisions outside announcement periods; based on industry (e.g. industry momentum, industry asset growth, etc.); and company fundamentals that have nothing to do with price (growth, quality). That gives you three distinct categories of factors that are extremely unlikely to interact with each other.

Price-based factors (momentum, technical, value) will interact with all other factors, so those would have to be excluded.

Yuval,

Thank you. IMHO exactly!!!

And in fact, thank you very much for confirming that some of my thinking may not be too far off base.

I would have said the same: that estimates are not too correlated with fundamentals, and that one could add estimate revisions to growth or value without much interaction.

You have some nuanced ideas on this, e.g., “…short-term changes in estimates are unrelated to fundamentals as long as they happen outside the release of the latest fundamentals.” Obviously true and something I want to think about.

Very interesting and something I might use: “Industry-based factors are mostly unrelated to a company’s fundamentals and estimates, unless it has a large market share.”

And just yes:

FWIW, for now I am using basically 2 classifications: one based on earnings estimate revisions and one based on fundamentals, with some insider trading thrown in, knowing that the insider trading is probably correlated with one or both of the first two.

And now that you mention it, earnings estimates would be correlated with fundamentals around the time of announcement.

A lot of good points. I am missing more than a few good points at this time, I am sure. But I will be spending some time on this.

For the future, you mention clustering. One could (and I have) use something like PCA to make factors independent. But I question this as factor interdependence may be constantly changing (not that I have any evidence on this either way).
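A quick way to look at the "constantly changing" concern, before committing to a one-time PCA, is to compare the correlation between two factors across separate time windows. This is a sketch of that check only, not of PCA itself; the helper names and the two series are made up.

```python
import math

def pearson(xs, ys):
    """Pearson correlation; assumes neither series is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)

def windowed_correlations(xs, ys, window):
    """Correlation of two factor series in consecutive non-overlapping windows."""
    return [
        pearson(xs[i:i + window], ys[i:i + window])
        for i in range(0, len(xs) - window + 1, window)
    ]

# Hypothetical series: positively related in the first half, negatively in the second.
factor_x = [1, 2, 3, 4, 1, 2, 3, 4]
factor_y = [2, 4, 6, 8, 8, 6, 4, 2]
corrs = windowed_correlations(factor_x, factor_y, window=4)
```

If the windowed correlations swing widely, components fitted on one period may no longer be independent in the next.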

Appreciated.

Jim

I agree with both of you. The above is why the investment problem is a non-stationarity problem, right?

RT,

I think that is right. For sure it is non-stationary and certainly changing interactions contribute. I agree with that.

Jim

FWIW, I had to discard the low price volatility factor when building models seeking more growthy companies. (Not sure if most consider low vol a factor, but I do.) I guess I can’t speak to a process for uncovering this, other than noting that the kinds of growth companies I was trying to get the model to select did not benefit from ranking based on low price volatility.

I guess the other thing that pops to mind is how a negative equity approach seemed to have something completely different going on. It didn’t seem to play well when mixed with many/most factors, except ability to service the debt. That makes it an interesting approach, but a lot of the benefit of quant seems to be the marginal gain from factor stacking.