Practical Use of P123 Sector Themes

I’ve been experimenting with the P123 sector themes, or groupings of GICS sectors and industries:

https://www.portfolio123.com/doc/side_help_item.jsp?id=37

I have applied one of my ranking systems (similar to the P123 QVGS system) to a custom universe (US All Fundamentals with liquidity minimums, including ADRs, remove utilities, etc).

I then also applied the same ranking system to the universe above, filtered for each of the sectors/industries in each theme.

Here are backtest results, 18 stocks, weekly rebalance, period from 01/01/1999 to today.

CAGR / Max drawdown / Sharpe / Gain to Pain ratio (# of positive months over # of negative months):

All Stocks: 60%, -45%, 1.87, 4.8
Macro Theme: 41%, -58%, 1.44, 3.25
Population Growth theme: 38%, -51%, 1.45, 3.3
Special theme: 21%, -73%, 0.75, 1.83
Financial theme: 18%, -69%, 0.98, 2.35
Innovation theme: 53%, -52%, 1.47, 3.5
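For anyone wanting to reproduce the last column, here is a minimal Python sketch of the count-based ratio as defined above (positive months over negative months). The sample returns are hypothetical, and ignoring flat months is my assumption. Other definitions of this ratio use summed returns rather than month counts, so compare like with like.

```python
def gain_to_pain(monthly_returns):
    """Ratio of positive months to negative months, as defined in the post.

    Months with a return of exactly zero are ignored (an assumption).
    """
    positive = sum(1 for r in monthly_returns if r > 0)
    negative = sum(1 for r in monthly_returns if r < 0)
    if negative == 0:
        return float("inf")  # no losing months at all
    return positive / negative


# Hypothetical monthly returns, for illustration only.
sample_returns = [0.04, -0.02, 0.03, 0.01, -0.05, 0.06, 0.02, 0.00, 0.03, -0.01]
ratio = gain_to_pain(sample_returns)
print(f"Gain to Pain: {ratio:.2f}")
```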

While the “all stocks” sim shows the greatest performance, it would still be helpful to see if manipulating the themes, or weighting stocks within a given theme, could improve performance even further, or reduce volatility, etc.

I created a book of the 5 theme simulations above, equally weighted, and then weighted the Special and Financial themes lower than the other 3. In all cases I could not improve on the “all stocks” simulation.

I would like to see if/how anyone else has worked with themes. Any wisdom shared would be much appreciated!

Other ideas I’ve considered, but would be helpful to get everyone’s thoughts on if they’ve already gone down this road:
• Adjust the ranking system for each of the themes (at the risk of overoptimizing), then use in a book
• Is it possible at all to “theme” rotate, like sector rotation with ETFs? If so, how?
• Weight themes somehow in the ranking system, or simulation?

Any thoughts or assistance would be greatly appreciated!

Cheers,
Ryan

I have experimented extensively along these lines–with themes, with sectors, and with “clusters” of my own devising (see https://seekingalpha.com/article/4243233-sector-classification-wrong ).

I have found two ways to use the segmentation practically, which are along the same lines as those you’re thinking about.

First, create a node in the ranking system that favors some industries over others, depending on how well they react to your system. But it’s super important to adjust for overall performance of the industry/sector/theme/cluster. If a particular industry has performed very well overall, that doesn’t mean it will continue to do so. You want to invest most highly in those industries whose performance in your system exceeds their overall performance. This may mean investing more in industries with poor past performance, which will lower your backtest results.

Second, develop individualized ranking systems for each theme/sector/cluster, and then, using Excel, combine those ranks with the overall rank of your system. That’s what I do myself. It’s terribly time-consuming, though. And there’s no way to simulate it since you’d have to use at least six different universes with different ranking systems at once. The closest I got to simulating it was to do the following. I ran nine simulations, one for each cluster, for the very top ranked stocks in each cluster according to my individualized ranking system. I then ran a simulation that bought and sold the stocks in those nine “portfolios.” I did the same for the not-quite-so-highly ranked stocks in each cluster and ran another simulation that bought and sold those stocks. Then, using formula weight rebalancing, in a new simulation, I assigned weights to stocks according to whether they were in the top portfolio, the second portfolio, or neither (negative weights), and added to those weights another weight based on the rank position of my main ranking formula. The result was good, if a little imprecise.
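To make the last step concrete, here is a rough Python sketch of a weight built from portfolio membership plus rank position. The tier weights, the rank scaling, and the tickers are all hypothetical placeholders, not the values actually used, and this is plain Python rather than P123 formula syntax.

```python
# Hypothetical tier weights: positive for the two portfolios, negative for neither.
TIER_WEIGHT = {"top": 2.0, "second": 1.0, "neither": -1.0}


def formula_weight(ticker, main_rank, top_portfolio, second_portfolio, rank_scale=0.01):
    """Tier weight from portfolio membership plus a term based on the main rank.

    main_rank is assumed to be a 0-100 rank from the main ranking system;
    rank_scale is a hypothetical knob for how much the rank contributes.
    """
    if ticker in top_portfolio:
        tier = "top"
    elif ticker in second_portfolio:
        tier = "second"
    else:
        tier = "neither"
    return TIER_WEIGHT[tier] + rank_scale * main_rank


# Hypothetical holdings and ranks, for illustration only.
top = {"AAA"}
second = {"BBB"}
ranks = {"AAA": 90.0, "BBB": 95.0, "CCC": 50.0}
weights = {t: formula_weight(t, r, top, second) for t, r in ranks.items()}
print(weights)
```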

Here are my findings: I trade only microcaps, and performance goes up if you exclude OTC stocks (I use the non-OTC universe), stocks whose country is China, and financials.

Also, if you time the market by regime (Quad 1: GDP going up, inflation going down; Quad 2: GDP going up, inflation going up; Quad 3: GDP going down, inflation going up; Quad 4: both going down) and change your sectors accordingly (or exclude sectors that do not do well in the market regime we are in at a given time, which is what I do), you improve your overall performance considerably!
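The quad numbering above is simple enough to write down as code. This is only a sketch in Python: the growth and inflation readings, the sector names, and the exclusion lists are hypothetical placeholders, and it says nothing about how the regime calls themselves are produced.

```python
def classify_quad(gdp_rising: bool, inflation_rising: bool) -> int:
    """Map the two directions to the quad numbering used in the post."""
    if gdp_rising:
        return 2 if inflation_rising else 1
    return 3 if inflation_rising else 4


def direction(previous: float, latest: float) -> bool:
    """True when the latest reading is above the previous one."""
    return latest > previous


def allowed_sectors(all_sectors, excluded_by_quad, quad):
    """Keep only the sectors that are not excluded for the given quad."""
    excluded = excluded_by_quad.get(quad, set())
    return [s for s in all_sectors if s not in excluded]


# Hypothetical readings: growth slowing (2.9 -> 2.3), inflation rising (1.8 -> 2.1).
quad = classify_quad(direction(2.9, 2.3), direction(1.8, 2.1))

# Hypothetical exclusion lists; real ones would come from your own regime research.
EXCLUDED_BY_QUAD = {3: {"SECTOR_A"}, 4: {"SECTOR_B", "SECTOR_C"}}
sectors = allowed_sectors(
    ["SECTOR_A", "SECTOR_B", "SECTOR_C", "SECTOR_D"], EXCLUDED_BY_QUAD, quad
)
print(quad, sectors)
```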

The timing is done by Hedgeye, which I can 100% recommend. Their calls are very good (not 100% right, but they correct fast based on nearcasting, constantly incorporating new data).


By the way, we are in Quad 3 now.
What is not in the picture: big-cap secular growth outperforms too (the reason why we are heading to new all-time highs),
and small caps have a hard time in Quad 3 (the reason why I could not outperform the Nasdaq 100 over the last 6 months).

Here are the calls from Hedgeye that I observed over the last 2 years.

Quad 3 in Q1 / Q2 2016 (so they went long energy and stuff)
Quad 1 and 2 from Q2 - Q3 2018 → They were long tech!
Then a very good call on Quad 4 in Q4 2018 → Health care outperformed!
Then they called a Quad 3 for Q1 2019.

Besides P123, this is the only service I use, and incorporating their macro market regime calls and the sector performance associated with each regime has helped me a lot to get a better understanding of the market.

It takes some time to get it, though. As with P123, the learning curve is steep.
What I like about them: they are data driven and backtesting based.

Ryan,

Interesting Results. I think you may have discovered a pricing anomaly.

There appears to be a sixth theme that I’ll refer to as “none of the above” (nota). Nota is all stocks less the five themes. Based on your results, it should outperform the six categories (the five themes plus all stocks).

I believe a causal relationship explains the nota anomaly’s ability to hone in on securities that are more likely to be mispriced.

But before I go any further as to my beliefs, I’d like to verify. Can you try your system on nota and share your results? Can you share these themes’ criteria so I can validate this myself?

Thx!

Andreas,
My SuperTimer is currently still at 80%, meaning go for equity.
The iM-SuperTimer uses the iM-Stock Market Confidence Level (iM-SMC level), which comes from a combination of 15 unrelated market indicator models, updated weekly. According to backtests, a high probability for up-market conditions exists when the iM-SMC level is greater than 50%. About half of the 15 models run in P123, the others in Excel.
https://imarketsignals.com/2019/the-im-supertimer-update-no-2/

I just constructed a type of SuperTimer running in P123. It uses a unique Stock|Bond Recognition System. See the performance of the most aggressive model below. This uses 4 ETFs only: SPY, UPRO, IEF, and UST. Turnover is moderate at about 500%.

BTW, the Total Market Value comes from an initial investment of $5,000.



Yuval,

I’ve read your cluster article, a great read. I’m planning to delve into this once I get the P123 themes sorted.

To add some context, let’s use the P123 QVGM rank system as an example. Let’s say the “optimum” ranking system weighting for the “all stocks” universe is 25%/25/25/25 for Quality, Value, Growth and Momentum. Each factor group (Q, V, G, M) has say 5-10 factors each.

Assume the P123 theme “population growth” (PG) is sensitive to growth factors, and we want to tailor the above ranking system to the population growth universe. The growth factors in the PG ranking system would then get a higher weight; the weighting could be 25/15/35/25, for example.

If you repeat this for each P123 theme, you end up with 5 universes and 5 ranking systems.
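As a rough illustration of what the reweighting does to a single stock, here is a Python sketch using the two weightings mentioned above. The factor-group ranks for the stock are hypothetical, and a real ranking system would re-rank the weighted score across the universe, which this ignores.

```python
# Weights from the example above: Quality / Value / Growth / Momentum.
ALL_STOCKS_WEIGHTS = {"quality": 25, "value": 25, "growth": 25, "momentum": 25}
THEME_WEIGHTS = {
    "population_growth": {"quality": 25, "value": 15, "growth": 35, "momentum": 25},
}


def composite_score(factor_ranks, weights):
    """Weighted average of factor-group ranks (weights need not sum to 100)."""
    total = sum(weights.values())
    return sum(weights[name] * factor_ranks[name] for name in weights) / total


# Hypothetical factor-group ranks (0-100) for one stock.
stock = {"quality": 80, "value": 40, "growth": 95, "momentum": 70}

all_stocks_score = composite_score(stock, ALL_STOCKS_WEIGHTS)
pg_score = composite_score(stock, THEME_WEIGHTS["population_growth"])
print(all_stocks_score, pg_score)
```

A high-growth, low-value stock like this one scores higher under the theme-specific weights, which is the intended effect.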

I am trying to break out the outperforming components of the overall ranking system. Speaking purely in terms of rank performance in ventiles, some of the themes do better than “all stocks”. When I transfer over to a simulation however, I’ve yet to see that outperformance. I’ve tried different combinations of # of stocks, sell points, etc.

The stats listed in my original post are for sims. Here are my top performing ventiles for “all stocks” compared to each theme and the respective optimized ranking system and universe (Jan 1999 – today, 4 week rebalance):

All stocks: 39%
Macro: 39%
Population Growth: 44%
Special: 35%
Financial: 25%
Innovative: 44%

As noted in my initial post, if you create a sim for each of 5 themes, you can run as a book, but performance of the book is generally lower than if you just run a single ranking system on the “all stocks” universe.

For your first point:

“create a node in the ranking system that favors some industries over others, depending on how well they react to your system”

Is my example above of adjusting the growth over value what you’re referring to? Or are you adding a specific node for population growth?

"You want to invest most highly in those industries (or themes?) whose performance in your system exceeds their overall performance (in ranking system?). This may mean investing more in industries with poor past performance, which will lower your backtest results. "

By “investing most highly in those industries” above, are you referring to “industries” generically as “themes” as well, or industries within the theme?

The “all stocks” ranking system and sim takes the top performers of each theme, but uses no discretion. It could theoretically take all stocks from one theme at a given rebalance.

I would like to be able to test and control for the themes. “Special” for example produces the most volatile results in simulation and ranking performance, so I would like to limit this. But at the same time if I remove this theme from my “all stocks” (so “all stocks less special”) I achieve neither a higher return nor lower volatility.

Which leads to your second point.

“Second, develop individualized ranking systems for each theme/sector/cluster, and then, using Excel, combine those ranks with the overall rank of your system.”

Can you elaborate on this item, particularly what is being done in Excel? If we’ve developed individualized ranking systems for each theme as described above, I’m not quite sure what you mean by “combining those ranks with the overall rank of your system”. Is this not the same as using the “book” function to run 5 separate simulations (one per theme, with individualized ranking system and universe), investing in say the top 5-10 stocks of each (or at any other weighting per theme sim)?

And your last point:

“The closest I got to simulating it was to do the following. I ran nine simulations, one for each cluster, for the very top ranked stocks in each cluster according to my individualized ranking system. I then ran a simulation that bought and sold the stocks in those nine “portfolios.” I did the same for the not-quite-so-highly ranked stocks in each cluster and ran another simulation that bought and sold those stocks. Then, using formula weight rebalancing, in a new simulation, I assigned weights to stocks according to whether they were in the top portfolio, the second portfolio, or neither (negative weights), and added to those weights another weight based on the rank position of my main ranking formula. The result was good, if a little imprecise.”

I will try this. For the common simulations that buy stocks of the 5 themes, I have a question/clarification on the “portfolio” buy command. I use portfolio(“macro”)=true etc for each simulation as 5 separate buy rules; however, the sim is not buying any stocks. Any clues?

Thanks,
Ryan

David,

Interesting thoughts. As it turns out, with the breakdown I’m using there is no “NOTA” with the P123 themes compared to All Stocks. At last check, # of stocks per universe (w/ basic liquidity rules, removing stale statements, etc):

All stocks: 3359
Macro: 964 (29%)
Population Growth: 593 (18%)
Special: 313 (9%)
Financial: 596 (18%)
Innovative: 925 (28%)

My GICS codes used for each:
Macro:
gics(151010,151020,151030,151050,15104010,15104020,15104050,201020,201030,201040,201050,201060,201070,2020,25,203010,203030,203040,203050)=1

Pop Growth gics(30,3510,50,55)=1

Special gics(10,15104030,15104040,15104045,203020,201010)=1

Financial gics(4040,4010,4020,4030)=1 & gics(601010,402040,404020)=0 (removed REITS)

Innovative gics(352010,352020,352030,4520,4530,4510)=1
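A quick way to check for gaps (NOTA) or overlaps between these definitions is to test codes against them outside P123. The Python sketch below copies the code lists above and assumes that gics() matches by prefix, so that gics(30) covers every code starting with 30. That matching rule is my assumption about how the function behaves, and the sample codes are only examples.

```python
# Theme definitions copied from the post; "exclude" mirrors the "=0" clause.
THEMES = {
    "Macro": {"include": ["151010", "151020", "151030", "151050", "15104010",
                          "15104020", "15104050", "201020", "201030", "201040",
                          "201050", "201060", "201070", "2020", "25", "203010",
                          "203030", "203040", "203050"], "exclude": []},
    "Pop Growth": {"include": ["30", "3510", "50", "55"], "exclude": []},
    "Special": {"include": ["10", "15104030", "15104040", "15104045",
                            "203020", "201010"], "exclude": []},
    "Financial": {"include": ["4040", "4010", "4020", "4030"],
                  "exclude": ["601010", "402040", "404020"]},
    "Innovative": {"include": ["352010", "352020", "352030", "4520",
                               "4530", "4510"], "exclude": []},
}


def matches(code, prefixes):
    """Assumed gics() behaviour: a code matches if it starts with any prefix."""
    return any(code.startswith(prefix) for prefix in prefixes)


def themes_for(sub_industry_code):
    """Return every theme an 8-digit GICS code falls into under the prefix rule."""
    code = str(sub_industry_code)
    return [name for name, rule in THEMES.items()
            if matches(code, rule["include"]) and not matches(code, rule["exclude"])]


print(themes_for("45103010"))  # a software code
print(themes_for("40101010"))  # a bank code
```

An empty list means the code is in none of the themes, and more than one name means the definitions overlap for that code.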

I’m curious to hear your thoughts on the NOTA.

Thanks judge! I’ll dig deeper into hedgeye.

This is what I would expect, because now you’re buying equal amounts of each theme, which isn’t necessarily a good idea.

Neither. Create a node called, say, themerank, with a formula like this: eval(gics(xxx,xxx,xxx)=1, 5, eval(gics(xxx,xxx,xxx)=1, 4, eval(gics(xxx,xxx,xxx)=1, 3, 2))), where xxx is the gics code. Basically, assign scores to each theme (or sector, or cluster, or group industries yourself).

Either one.

Let’s say a stock has a rank of 99 in your overall ranking and 97 in your theme ranking. Your average rank might then be 98. Another stock ranks 100 in your overall ranking and only 94 in your theme ranking, so its average rank might be 97. A third stock might be 100 in your theme ranking but only 80 in your overall ranking, so its average rank would be 90. You then buy the highest ranked stocks and sell the lowest ranked ones. This is an extremely simplified version of what I do, with the overall ranking weighted more heavily than the cluster ranking, and with different adjustments for each.
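For those who would rather script this than do it in Excel, here is a minimal Python sketch of the simplified version, using the three stocks from the example. The ticker labels and the 0.7 weight in the second pass are hypothetical; the actual adjustments described above are not shown here.

```python
def combined_rank(overall_rank, theme_rank, overall_weight=0.5):
    """Blend the overall rank with the theme rank; 0.5 gives a plain average."""
    return overall_weight * overall_rank + (1 - overall_weight) * theme_rank


# (overall rank, theme rank) for the three stocks in the example.
stocks = {"first": (99, 97), "second": (100, 94), "third": (80, 100)}

# Plain average of the two ranks.
equal = {name: combined_rank(o, t) for name, (o, t) in stocks.items()}

# Overall ranking weighted more heavily; 0.7 is a hypothetical choice.
tilted = {name: combined_rank(o, t, overall_weight=0.7) for name, (o, t) in stocks.items()}

# Buy candidates are the names at the front of this list.
buy_order = sorted(equal, key=equal.get, reverse=True)
print(equal, tilted, buy_order)
```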

You need to either use OR commands or group them in one command like so: portfolio(“macro,special,financial”) = true

Thanks for clarifying the 2nd approach Yuval, now I see how to combine the various ranks in Excel. Agreed, this could be tricky to test.

For the 1st approach, just to clarify a couple of things:

I’ve been playing with this but am not sure how or where the score node is applied in the ranking system. If themerank is a standalone node in your overall ranking system, where does the score of, say, 5, 4, 3, or 2 get applied? Is the score applied to each factor for each theme/sector, or is it an overall score on the entire system? If you have a simple public ranking system showing this, that would be most helpful!

Which leads me to clarify your other point:

Is overall performance, say, the performance of the entire theme (all stocks in the theme universe) outside of the ranking system, for example buy-and-hold of all stocks in the theme, or 52-week price appreciation? On that basis, say the Macro theme has returned on average 15% per year since 1999, while within the ranking system the Macro theme has returned 25% per year since 1999. Macro clearly outperforms in the ranking system, so more weight should be given to stocks in that theme. Correct?

Not sure what you mean by “investing more in industries with poor past performance”?

Thanks in advance for all of the insight.

Yes, tricky. Is there a potential problem of information leakage?

After all the clusters are based on returns or correlation of returns. Then you go back (using that information in the backtest) to look at the returns (which you have used to construct the backtest).

Looks like Yuval tries to normalize the returns for the categories—perhaps recognizing a potential problem with information leakage. Is this enough?

I will assume that it is enough to mitigate the problem of information leakage. But I do not see how you could call it PIT anymore.

PIT (point-in-time) data is a basic principle and is the best insurance against information leakage. There are numerous examples in the literature of serious problems with information leakage due to seemingly innocuous things–like normalizing all of the data first.

P123 needs to adhere to the principle of using only PIT data to the extent that this is possible with the data Capital IQ (and others) provides… What people want to do with that data in their own spreadsheets for their personal use is all fine by me.

-Jim

In your ranking system are a number of nodes. ONE of those nodes will look like what I wrote. That node could be called “industry rank,” for instance.

Let’s say your ranking system without this node tends to bring up a lot of banks as potential buys, and not many staples at all. If you assign banks a score of 1 and staples a score of 5, and then give this node a weight of 5% (subtracting 5% from other weights in your ranking system), you’re going to see a few more staples and a few less banks showing up in your transactions.

You’re onto it. Let’s say that using your ranking system the Macro theme does 25% and the Financials theme does 20%. And let’s say that without using your ranking system, the Macro theme does 15% and the Financials theme does 0%. Your ranking system improves Financials by 20% but improves Macro by only 10%. So you’d want to more heavily weight the Financials theme (give it a higher score). However, because with your ranking system Financials underperform Macro, your backtest will suffer. The idea here is that one theme would not by its very nature have a better chance of doing well in the future (in the overall picture). So you want to overweight themes that your ranking system IMPROVES rather than themes with strong results.
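Here is the same arithmetic as a small Python sketch, using the Macro and Financials numbers from the example. Turning the improvement into 1, 2, 3... scores by simple ordering is my own assumption; any monotonic mapping would express the same idea.

```python
# Annual returns (%) from the example: with the ranking system vs. the theme as a whole.
theme_results = {
    "Macro": {"with_system": 25.0, "baseline": 15.0},
    "Financials": {"with_system": 20.0, "baseline": 0.0},
}


def improvement(result):
    """How much the ranking system adds over the theme's own baseline."""
    return result["with_system"] - result["baseline"]


def theme_scores(results):
    """Order themes by improvement; the most improved theme gets the highest score."""
    ordered = sorted(results, key=lambda name: improvement(results[name]))
    return {name: position + 1 for position, name in enumerate(ordered)}


scores = theme_scores(theme_results)
print(scores)
```

Financials gets the higher score here even though its return with the system is lower than Macro's, which is exactly why the backtest suffers.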

Jim, you’re absolutely right, and I want to thank you for pointing it out to me. Using the cluster approach is indeed NOT P.I.T., and I should therefore rethink it to some extent. I wish this had occurred to me earlier.

Yuval,

I want to thank you.

First, that clustering and K-means stuff is awesome.

Second, you got me thinking about the use of even and odd universes.

Even/odd universes are often used for out-of-sample validation or testing (depending on how you use them). Great for data that is IID (and stationary). But should I have been using them? Maybe for some uses (e.g., finding irreducible noise assuming a single target function).

Speaking strictly for what I was doing, perhaps I should have been using OUT-OF-TIME validation and testing for most of what I was doing. Walk forward validation being one example of out-of-time validation/testing.
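For reference, a walk-forward split can be sketched in a few lines of Python. The window lengths and the year range are hypothetical; the only point is that every test window lies strictly after its training window, which an even/odd split of the universe does not guarantee.

```python
def walk_forward_splits(periods, train_size, test_size):
    """Yield (train, test) windows where the test window always follows the training window."""
    start = 0
    while start + train_size + test_size <= len(periods):
        train = periods[start:start + train_size]
        test = periods[start + train_size:start + train_size + test_size]
        yield train, test
        start += test_size  # roll the whole window forward by one test block


# Hypothetical setup: 20 years of data, 10-year training, 2-year testing.
years = list(range(1999, 2019))
splits = list(walk_forward_splits(years, train_size=10, test_size=2))

for train, test in splits:
    print(f"train {train[0]}-{train[-1]}  test {test[0]}-{test[-1]}")
```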

My concern being that there could be information leakage with what I was doing.

I won’t go into this in any depth now, but I would love to get people’s thoughts on it in the future. My point is that the question, let alone any answers, never would have come up without your posts and this discussion in particular.

Much appreciated.

-Jim

Yuval,

Got it now, thanks. The score is actually a node weight applied to all respective industries/themes/sectors, etc.

Thanks for clarifying. The relative performance over absolute performance is counter-intuitive, I’ll have to delve into this more.

Jim & Yuval:

Very good point for the clusters. For the P123 themes, however, which are grouped more by inspection and are somewhat subjective, I don’t believe the above would be an issue (at least not as much as with the correlated returns of clusters). Thoughts?

Taking Jim’s point to heart, I decided to use the industry returns from Ken French’s website in order to classify industries according to the correlation of their returns over the two decades BEFORE P123’s data begins–in other words, from 1979 to 1999. I also improved my k-means++ clustering algorithm to be completely non-discretionary. Here are the results. Ten smaller industries were outliers, pretty uncorrelated with the others, so they each went into the group with which they had the highest average correlation; as a result there are a few truly ill fits. I’ve added asterisks by those. The names of the new clusters are mine, but they could use improvement.

Addict: Beer, Drugs
Commerce: Agric*, Hlth, MedEq, Ships*, PerSv, BusSv, Whlsl, RlEst*, Fin
Energy: Oil
Tech: Toys, Auto, Aero, Guns*, HardW, SoftW, Chips, LabEq
Materials: Books, Clths, Chems, Rubbr, Txtls, BldMt, Cnstr, Steel*, Mach, Gold*, Mines*, Coal*, Paper
Safe: Food, Smoke*, Util, Telcm, Banks, Insur
Consume: Soda*, Fun, Hshld, ElcEq, Boxes*, Trans, Rtail, Meals
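The outlier step described above (put each outlier into the group with which it has the highest average correlation) can be sketched in Python as follows. The groups are cut down and the correlation values are made up for illustration; this is not the actual data or the clustering code itself.

```python
def mean_correlation(industry, members, corr):
    """Average correlation between one industry and the members of a group."""
    values = [corr[frozenset((industry, member))] for member in members]
    return sum(values) / len(values)


def assign_outlier(industry, groups, corr):
    """Pick the group with the highest average correlation to the outlier."""
    return max(groups, key=lambda name: mean_correlation(industry, groups[name], corr))


# Hypothetical, cut-down groups and made-up pairwise correlations.
groups = {"Safe": ["Food", "Util"], "Energy": ["Oil"]}
corr = {
    frozenset(("Smoke", "Food")): 0.45,
    frozenset(("Smoke", "Util")): 0.35,
    frozenset(("Smoke", "Oil")): 0.20,
}

print(assign_outlier("Smoke", groups, corr))
```

Using frozenset keys means the lookup does not depend on the order of the pair.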

You can read the industry definitions here:

https://mba.tuck.dartmouth.edu/pages/faculty/ken.french/ftp/Siccodes48.zip

Good luck fitting today’s GICS codes with these old SIC codes! I didn’t even include two of the industries (FabPr and Other), as they just don’t fit with GICS classification at all.

One last note: the fewer the number of companies in an industry, the more likely it is to be an outlier in terms of correlation of returns. This, I think, may be the worst flaw in this approach.

Yuval,

Awesome!

-Jim

It seems there is a way to automate the process of blending theme/industry-specific ranks into a Book composed of theme/industry-specific portfolios, each with its own ranking system. Then use the “Portfolio” function to make sure the portfolio holdings match up against a “master portfolio” with the universal rank. That would give you the chance to backtest. I dunno, thinking out loud here…