How I Rank R2G Ports

I seem to have different ideas about what is important to me than many of the Pros at P123 so I thought I would summarize my approach to ranking the R2G Ports.

First, I want to invest across all the MktCap ranges so I turn off the Buy Daily Average factor. That way I can review the highest ranked stocks in all MktCap ranges.
Second, I like Alpha (who doesn’t?) and I prefer Sortino over Sharpe, so I set Alpha and Sortino to higher is better.
Next, since I don’t prefer high drawdowns I set Max Drawdown to lower is better.
Next, I set Average Return Winners and Winners % to higher is better, and Average Return Losers to lower is better. I look for winners/losers to be > 2, and Winners % > 60.
Next, I set Annualized Inception % to higher is better and I set Cost to lower is better.
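
For readers who want to try this kind of ranking outside the site, here is a minimal sketch. It is not P123's ranking engine: the port names and numbers are invented, the equal weighting of factors is an assumption, and losses and drawdowns are entered as positive magnitudes so that "lower is better" is unambiguous.

```python
# Hypothetical port statistics; every value is made up for illustration.
# avg_loss and max_dd are positive magnitudes (assumption), so lower is better.
ports = {
    "Port A": {"alpha": 45.0, "sortino": 3.1, "max_dd": 22.0, "avg_win": 14.0,
               "avg_loss": 6.0, "win_pct": 64.0, "annual_ret": 70.0, "cost": 50.0},
    "Port B": {"alpha": 30.0, "sortino": 2.4, "max_dd": 18.0, "avg_win": 9.0,
               "avg_loss": 5.0, "win_pct": 58.0, "annual_ret": 48.0, "cost": 25.0},
    "Port C": {"alpha": 38.0, "sortino": 2.0, "max_dd": 35.0, "avg_win": 16.0,
               "avg_loss": 7.0, "win_pct": 66.0, "annual_ret": 62.0, "cost": 90.0},
}

# True means "higher is better", False means "lower is better".
DIRECTION = {
    "alpha": True, "sortino": True, "max_dd": False, "avg_win": True,
    "avg_loss": False, "win_pct": True, "annual_ret": True, "cost": False,
}


def rank_scores(values, higher_is_better):
    """Score each port from 0.0 (worst) to 1.0 (best) by rank position.

    Ties are not handled specially; needs at least two ports.
    """
    ordered = sorted(values, key=values.get, reverse=not higher_is_better)
    n = len(ordered)
    return {name: i / (n - 1) for i, name in enumerate(ordered)}


def composite_rank(ports, direction):
    """Average the per-factor scores (equal weights) and sort best first."""
    totals = {name: 0.0 for name in ports}
    for factor, higher in direction.items():
        column = {name: stats[factor] for name, stats in ports.items()}
        for name, score in rank_scores(column, higher).items():
            totals[name] += score
    k = len(direction)
    return sorted(((total / k, name) for name, total in totals.items()), reverse=True)


def meets_win_criteria(stats):
    """The two thresholds from the post: winners/losers > 2 and Winners % > 60."""
    return stats["avg_win"] / stats["avg_loss"] > 2.0 and stats["win_pct"] > 60.0


ranking = composite_rank(ports, DIRECTION)
for score, name in ranking:
    print(f"{name}: score={score:.3f} passes={meets_win_criteria(ports[name])}")
```

A port can rank well on the composite and still fail the winners/losers or Winners % thresholds, which is why the sketch keeps the two checks separate.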

All other factors are turned off. Maybe in a few years Annualized Launched % may be meaningful, but with only 6 months since R2G became available I don’t see any value there and it may keep some of the best Ports from rising to the top. I would like an option to remove all Ports that are full from the list, or move them to the bottom.

From the above ranking a few of my favorite Ports rise to the top. They are (not counting the ones that are full):

TWY 5 Stks Mktcap>$100M Liquidity>$1M VIX Mkt Timed

TWY 5 Stks with Mktcap > $100M Liquidity > $1M

Porfolio Series: PSMVM- Small-Micro-MidCap Value+Momen. ~300%Turnover

J3th - Liquidity > $800k, 5 stocks

Best(SPY-SH) Gains for Up & Down Markets

Alpha Max - Large Cap (S&P500) 10 Liquid Stocks w/ Improved Metrics-V4

I really like “Don’s Classic SC, Liquidity > 750K” but it doesn’t quite meet my desire of winners/losers > 2 or winners % > 60.

All of the rest are either too expensive, full, or don’t meet my desired criteria. (Except for my 2 largest MktCap Ports; they meet my criteria also, but I am already invested in them :slight_smile:)

I would like to hear your thoughts,
Denny :sunglasses:

Well, I think for me the most important question is, how does it perform “out of sample”, i.e. without the benefit of hindsight?

Also, is it just a gambling system that has managed to fool by simply predicting a few outcomes in a row?

I suggest a three factor rank, all “higher is better”

  1. Annualized Excess Return Since Launch

The period after the launch date is of prime interest, as this shows how the model performed when it was outside the control of the designer.

  2. Days since launch

Obviously, the higher the better. The more time you have to see how the model performs “in the wild” the better.

  3. Number of positions

Any portfolio with only 5 stocks may have just “got lucky”. However, the more positions there are, the harder it is to get a good result by luck alone. It’s called statistical sampling. A portfolio that has, say, 20 or more positions is unlikely to be a pure fluke, because probabilistically that is much less likely.
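
A rough way to see the sampling point is a coin-flip model. The sketch below assumes every pick is an independent 50/50 bet, which real stocks are not (they are correlated), so treat it as an illustration only. It asks how often at least 60% of the picks would be winners by pure chance.

```python
from math import comb


def prob_at_least(n, k, p=0.5):
    """P(X >= k) for X ~ Binomial(n, p): chance of k or more winners out of n."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


five = prob_at_least(5, 3)      # at least 3 winners out of 5 picks
twenty = prob_at_least(20, 12)  # at least 12 winners out of 20 picks

print(f"5 picks:  {five:.3f}")
print(f"20 picks: {twenty:.3f}")
```

Under these assumptions the 5-pick case clears the 60% bar by luck about half the time, while the 20-pick case does so roughly a quarter of the time.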

Oliver

Denny,

Clearly everyone is free to value systems as they choose.

For me, I am interested in ‘building a book’, not just ‘the best’ systems in stand alone. So:

  1. Liquidity.
  2. A ‘fair and reasonable’ assessment of total system capacity and real limits on sub’s that reflect likely system constraints. A real commitment to a hard cap on the number of sub’s the system will be offered to, that the system developer adheres to.
  3. Lack of overlap with other systems offered by the same developer and/or other developers where total ‘universe’ capacity may be an issue.
  4. 10-20 positions in a system strongly preferred. 10 if turnover is ‘higher’, up to 20 if it’s lower. Too easy to curve fit at 5 positions to me.
  5. Lower turnover preferred. Not a must for technical breakouts and other systems. But strongly preferred. Within a space, I would usually pick lower turnover over higher turnover, unless the system disclosures justified the reasons for the higher being likely to be more robust.
  6. Significant disclosure of system testing and sensitivity analysis of various system parameters. Including disclosure with and without market timing, with and without hedging, at 5, 10 and 20 positions, even / odd testing, varying rules, disclosure of the number of ‘factors’ or degrees of freedom in a system, etc.
    In fact, I am extreme. I place zero value on a sim with no disclosures of this testing.
  7. A clear understanding of the ‘return drivers’ so that I can build portfolios and books of systems that are not that likely to overlap. A clear ‘system focus’ that the designer understands and communicates. This one is key for me. Historical correlations are not enough. I want to know that the systems are doing very different things. And are unique. I am only likely to invest in something I don’t think I can build. Something conceptually different and unique and very robust in testing. Even if returns are much lower.
  8. Likely correlation with other systems I have…particularly in times of serious ‘system’ market stress.
  9. Ideally an accurate ‘benchmark’, so that ‘alpha’ is meaningful. For many systems, this isn’t currently possible.
  10. Something ‘unique’ within the R2G offerings, so that it is less likely to be ‘fighting for’ the same stocks.
  11. Fewer sub’s, lower cost.
  12. A designer I know. Or with a ‘pedigree’ in some way. Real-world trading experience. Audited results WAY BIG. And/or at a min. a designer who has posted multiple models and very detailed disclosures.
  13. The ‘family test.’ Would I ask my parents or relatives to put money in this?
  14. How much can I put in? The more money I can invest in a system and the ‘more different it is’, the more it’s worth to me. A system unlike any that I have, that I can put $150,000 in, is worth a lot more than one similar to what I have, that can only take $20k. Even if the returns, win rate, ‘alpha’ and sortino are lower. A zero correlation, 20 ETF system with high capacity and 20% AR is worth way more to me than ‘another’ 50% microcap system.
  15. Consistency of performance. Similar St.dev and returns across most samples of ‘randomly chosen’ subperiods.
  16. DD within reason. For me…this is a piece. Lower is better, but I am very skeptical of ‘miracle’ DD’s…systems doing 80% with 15% DD’s. Some effort at DD control. But disclosure of rough idea of how system does without it.
  17. Coinvestment from the designer.

For me 60% winners or 2/1 win loss doesn’t matter. They are very arbitrary. It’s nice to have high numbers, but I am aware of many pro’s who have added in 40% win rate systems so long as the win/loss amount of profitability makes sense. I have many systems I trade and am very confident in that don’t meet these criteria, but they add something to a portfolio.
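
The point about 40% win rate systems can be made concrete with the usual per-trade expectancy formula. This is a minimal sketch with made-up numbers; `avg_loss` is entered as a positive magnitude, and costs and slippage are ignored.

```python
def expectancy(win_rate, avg_win, avg_loss):
    """Expected return per trade, in the same units as avg_win and avg_loss.

    avg_loss is a positive magnitude (assumption of this sketch).
    """
    return win_rate * avg_win - (1.0 - win_rate) * avg_loss


# Hypothetical systems: a low hit rate with large winners vs. a high hit rate
# with winners and losers of equal size.
low_hit = expectancy(0.40, 12.0, 4.0)
high_hit = expectancy(0.60, 6.0, 6.0)

print(f"40% winners, 3:1 win/loss size: {low_hit:.2f}% per trade")
print(f"60% winners, 1:1 win/loss size: {high_hit:.2f}% per trade")
```

With these illustrative inputs the 40% system has the higher expectancy, so a win-rate threshold alone would reject the better of the two.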

I don’t change ‘sell rules’ to try to get them here usually. I just try to trade a ‘more pure’ version of the system.

As far as ‘out of sample return’…Return out of sample may make me feel better, but will be really meaningless for several years. And most systems lack adequate real ‘style’ benchmarks to gauge how they are doing at all. I might invest in a large cap growth system for example that had strong diversification and risk controls and was purely growth focused. Because I don’t have one. Or a very strong, Asian Markets Long-short ETF system that showed serious robustness and beat an Asian market equity benchmark. Or an emerging markets ETF system that had 10 years of history and beat its bench. But these benchmarks don’t exist.

Point is…I don’t want my systems doing the same thing…or in the same spaces.

There are many systems I like. None that I like enough to put money in yet. But, many of the builders have done a good job. I think Steve (Stitts) has done a very good job of disclosing testing and sensitivity to assumptions. That’s what drew me to his work. I think Georg Vrba has put up many great looking sim’s. If his new releases of timing systems hold out of sample for 2 years, they will likely be full, but will prove to be very valuable to a portfolio and book as they give ‘short signals’ in addition to long and have huge system capacity. They don’t need to be the best on an absolute basis to add a lot of value in a book. I trust and like Don and his work. And Wuu Yean. He’s put up many great looking systems and been doing this a long time. I just feel that they may be too similar and may have too many sub’s for my tastes. And Aurelaurel has put up several very good systems. As has Shiguang. But…I think many of these systems are in a very crowded space in terms of liquidity and ‘breakouts’ and would worry about system overlap. But they seem to be building good systems. Kurtis Hemmerling also builds great looking sims. But too many subs and no disclosure of testing. So unlikely to invest on my end.

I don’t think I would sign up for a microcap system from any other developer. I am okay with what I’ve built.

I am also very comfortable with the systems I have opened so far (only 2 to date). In fact, I am downright proud of them. I plan to release 1-2 new ones a month or so, so I can see how people react and learn from each release. I think I offer fair value. The market seems to agree on one, not on the other. I may lower this price as a result. We’ll see.

I think my market timing SP500 systems are very comparable with what Georg has put up. I trade mine, because I know what’s inside. I think my ‘breakout system’ with $1MM liquidity is a very strong system. I’m not sure I will release it yet, because at $1.2MM lower end liquidity and ‘only 6-7’ stocks, I worry it lacks real liquidity and my real world returns on my own trading will suffer. I also believe in the other stock systems I have posted, but will not release any if the ‘watchers’ are not high enough and nobody clicks on them. My 20 stock system has been very robust in testing - and is well diversified and liquid - at least as far as R2G’s go. Much more so than many other systems. And I think the disclosures on all my systems give you a real sense of what you may be signing on for. Although, nobody knows…

Sorry none of my systems work for you. But, I guess we value different things.

Best,
Tom

I will keep my list short as it is getting late. It is as follows:
(1) I’m interested in “out of sample” only.
(2) Low liquidity stocks are OK so long as there isn’t a market panic. But once that happens it becomes very difficult to dump them. (Many hedge funds fail simply due to lack of liquidity during a crisis.) Liquidity is important to me so I would discard any model with liquidity less than ~$5M/day. I’m being generous here as there can be many subscribers to an R2G model. If it was a private port for which I was the only trader I would be happy with $1M/day.
(3) I would toss out any model that has had multiple single-day losses of more than 4% out of sample. In my opinion the markets have been very stable since R2G started. One high loss could be bad luck, but two or more is a disaster waiting to happen.
(4) I would toss out any system that has had a drawdown of more than 10% (excess) out of sample. Again the market has been very generous. There are some market segments that have taken a hit such as the DJIA and utilities so excess return is important.
(5) I like to see how all of the developer’s models have performed out of sample. And I like to see a variety of models. A developer who only generates low liquidity models is a one hit wonder. One who can make Large Cap, Mid Cap, and Small Cap models work out of sample is the equivalent of the Beatles. I am still in awe of Kurtis Hemmerling’s out of sample record. His models cover the full range of liquidity and seem to defy gravity. They don’t have large drawdowns. I’m sure there are others with really good out of sample records as well but I haven’t had the time to watch as closely as I was.
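
Rules (2) to (4) can be checked mechanically from a model's out-of-sample equity curve. The sketch below is one possible reading, not a P123 feature: "excess" drawdown is interpreted as the drawdown of the model's value divided by the benchmark's value, and all the data are hypothetical.

```python
def max_drawdown(curve):
    """Largest peak-to-trough decline of a value series, as a fraction."""
    peak = curve[0]
    worst = 0.0
    for value in curve:
        peak = max(peak, value)
        worst = max(worst, (peak - value) / peak)
    return worst


def big_loss_days(curve, threshold=0.04):
    """Count days whose close-to-close loss exceeds the threshold."""
    return sum(1 for prev, cur in zip(curve, curve[1:])
               if cur / prev - 1.0 < -threshold)


def passes_screen(model, benchmark, liquidity_per_day):
    """Apply the three numeric rules from the list above."""
    # Assumption: excess drawdown = drawdown of model relative to benchmark.
    relative = [m / b for m, b in zip(model, benchmark)]
    return (liquidity_per_day >= 5_000_000
            and big_loss_days(model) < 2
            and max_drawdown(relative) <= 0.10)


# Hypothetical daily values since launch.
model = [100.0, 103.0, 98.0, 101.0, 106.0, 104.0]
benchmark = [100.0, 101.0, 99.0, 100.0, 102.0, 101.0]

print(big_loss_days(model), round(max_drawdown(model), 4))
print(passes_screen(model, benchmark, liquidity_per_day=6_000_000))
```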

Steve

Hi Tom:

I don’t have a “breakout” of my own so I hope you will consider making yours available on R2G.

As a developer you don’t have to worry about subscribers affecting liquidity. Just create a private port with identical settings to your R2G port and trade your private port on Friday. That will let you buy and sell 80% of the stocks your system picks before the R2G subscribers get to trade. As a potential subscriber, I’m fine with the developer having a head start on the trades.

Brian

Brian,

Thanks for your interest. I believe that this type of trading ahead of subscribers by a day is front-running and is both illegal and unethical? Am I wrong? I know that it is definitely happening in the world of finance (and perhaps on this site), but I don’t intend to do it…at least not without full disclosure to sub’s and/or legal clarification.

If I am wrong about this, enlighten me. I’m just a private trader.

As far as breakouts, I am still reviewing this system around both liquidity and total number of stocks. I am not sure yet if I will open it. I may only open it with 10 stocks.

see:
http://en.wikipedia.org/wiki/Front_running

Best,
Tom

Many of the responses to my statements at the start of this thread list out of sample performance as the most important criterion.

OK, I agree that out of sample is one of the most important factors, but we don’t have any out of sample data that is anywhere near long enough to be meaningful. So what are potential subs to do? Wait 3 years before signing up for a R2G Port? That obviously won’t work. And what if the market is subpar during the 3 years and the Port underperforms its prior history? Is it the Port’s fault or the market’s fault?

So lacking meaningful out of sample data we are left with what we have. We have to evaluate the Ports on their face value, and the designer’s descriptions and testing of their system. If a Port doesn’t hold up out of sample, the subs will move to Ports with a better record. The subs will have no tie to developers whose Ports don’t hold up. So I see NO reason to wait for out of sample data. So I can’t put much value in that most important factor, at least not yet.

I find it interesting that many of the responders to this thread that list out of sample data as most important also have R2G Ports with many subs and NO meaningful out of sample data for their ports. So out of sample data is most important for the R2G Port designers, but not for their subs? Obviously they don’t expect their subs to hold out for meaningful out of sample data before they sign up. That seems a little hypocritical.

I started this thread to try and come up with a simple list of things that a retail member of P123 can use to select good R2G Ports to follow. I am referring to those who are either struggling with developing their own private Sims, or just don’t have enough time, knowledge or inclination to do so.

So I have 2 new questions:

First, how many members who have developed R2G Ports also sub to Ports that are not their own? If so, what criteria did you use to choose the Ports you are following?

Second, since we don’t have any meaningful out of sample data yet, how do you recommend retail members (who are not designing R2G Ports) determine reasonable Ports to follow?

Denny :sunglasses:

Denny,

Many people will not agree with me. But this is what I would suggest if a family member of mine asked about using this site:

For retail investors who can’t build their own systems I would suggest:
BE CAREFUL!

  1. Limit the annual cost of the model to 1% of the amount to be invested.
  2. Don’t subscribe to anything that looks ‘too good to be true.’ You know what your gramma, momma or daddy taught you about ‘too good to be true.’
  3. Limit the total amount of money invested in any R2G’s to some ‘at risk’ portion of your total ‘balanced’ portfolio.
  4. Carefully consider the disclosures of the designer. They should be thorough.
  5. Don’t invest in anything with less than $5MM daily liquidity or a cap on sub’s of under 10, until 1-2 years has gone by.
  6. Don’t invest in anything with daily liquidity under $1MM no matter what.
  7. Know that as fixed annual costs of the system go up (higher turnover) and liquidity falls, the likelihood that you will lose money rises by a lot.
  8. Know that disclosed DD’s as currently exist are close to meaningless. And you can probably only withstand 1/2 the DD you think you can.
  9. Finally…If you can’t build your own systems and can’t understand the disclosures of people who do, stay away from all R2G’s until you can. Unless you personally know the designer. Just either find a good financial advisor and/or buy 2-5 broad market, low cost ETF’s - X % stock and Y% bond until you can understand the above. Weight them based on your risk tolerance. And rebalance annually.

I haven’t invested money in any other R2G system. I have considered it. I considered your small cap system initially (and signed up for several months, until I had a better handle on my own feelings on the liquidity issue), but never put $ in. I considered one or two others. Including your large cap system. Also signed up, but never put money in. Very hard for me to invest in a system I didn’t build and backtest.

Best,
Tom

“I find it interesting that many of the responders to this thread that list out of sample data as most important also have R2G Ports with many subs and NO meaningful out of sample data for their ports. So out of sample data is most important for the R2G Port designers, but not for their subs? Obviously they don’t expect their subs to hold out for meaningful out of sample data before they sign up. That seems a little hypocritical.”

Denny - I raised the original feature request for this (i.e. blind ports). One of the criteria I asked for up front was a clear separation between back-test data and “out of sample” performance. Ultimately, P123 came out with R2G but I have to say that I was extremely disappointed with the fact that the inception date is used for filters and 5 year performance for graphs as default. I’m not saying that back-test data should not be provided but that it should be made hard to get at, and definitely should not be used for model comparison by way of sorting and filtering.

So let’s say P123 were to take out all of the columns in the summary pages relating to model inception, delete all of the filters, and set the graph defaults to “since launch”. Then what would happen?

Subscribers would be slow to sign up to any model. They would wait until they see how the performance is shaping up. They would review whatever the model developer puts out in the way of documentation. In short, they would be forced to consider the model providers’ reputation, capability, documentation and initial performance. Collective2 has shown that from launch, subscribers start trickling in after a few months, or if the performance is exceptional then they will jump on the system quite quickly. In fact, a person with your reputation (Denny) would probably have models fully subscribed immediately whether you had backtest data or not. I had people EMAILing me to tell me that one of your models would be opened to subscribers in an hour and that I should keep a lookout because it will be filled up immediately! So I don’t believe there would be a problem with lack of out of sample performance data. In fact, subscribers should be cautious but they can’t be when models are filling up within an hour of being opened.

The biggest fear I have is that people will get sucked in to high flyers that don’t live up to their promise, they lose a lot of money and then not only exit the model but the site as well. P123’s reputation is in many ways tied to the credibility of the R2G models. So the fact that there may be good reliable models here might not matter in the next bear market as the baby gets thrown out with the bath water.

You can say that there is no “meaningful” out of sample data for ports but I have to argue that back-test data is not “meaningful”. Try starting a hedge fund based on backtest data. It won’t happen. I believe that Portfolio123 is unique in how back-test data is portrayed as “since inception” and this is not right. Many of these models could go for four years easily with no profit but still show a remarkable annualized return of 70% or more. For me at least, I’ll take six months out of sample over 14 years of back-test data any day.

“That seems a little hypocritical.”

Well perhaps this is true. I did develop models with a defensive posture and no market timing. But guess what? Subscribers went en-masse to the models with wonderful backtests.

We all have different life situations. I am not comfortably retired as you are. In fact I’m an aging engineer that has been unemployed for three years, with probably no hope for future employment. So this opportunity is important for me :slight_smile: I took a gamble by upgrading my membership and re-tooling all of my models that didn’t have subscribers. So far I am happy with the results and the fact that subscribers are expressing faith in me. Is this right? I don’t know but these are the cards that have been dealt and I have to play them.

Steve

Steve,

A lot of really good points.

I completely agree on your issue of P123 potentially losing credibility and all sub’s down the road. I have posted about this at length. I don’t think model builders should be short sighted. They will be losing a lot of money long-term by misleading people now.

Unless real standards are enforced in some way, this P123 experiment will end badly. I have posted about this way too often, but I would like to see R2G stick around. I would like the extra income, too. For doing a top-notch job. Not for anything other than that. Over time. Proving my worth. But that requires P123 to maintain and enforce professional level standards on a) testing, b) liquidity and c) disclosure. At a bare minimum.

I know of several funds that have raised money based on simulations. It’s hard to come by, but it does happen. But only when the people running it have a lot of money invested and good rep’s.

Best,
Tom

Hi Denny:

I’ve got three of my own Ports which generate 20-25 unique picks. Two of these are very robust and the third is robust but rather bouncy. Their annual returns are good (45-65%) without market timing. Adding market timing does not affect the annual return much, but it does reduce maximum draw downs from 30-55% down to 27-35%. These are all small cap Ports with the 20% lowest liquidity in the range of 120k-250k. This low liquidity is OK for me (I trade on Fridays to avoid any R2G rush) but these ports would be no good to offer as R2G items due to the low liquidity.

I’m looking at subscribing to a half dozen R2G portfolios. In general my criterion is to get Ports that complement what I already have.

Larger Cap Portfolios
. The ranking system I have for my small caps does “work” for larger caps, but not nearly as well as some large cap R2G ports, so I’ve signed up for two larger cap portfolios. I haven’t put any money into their picks and likely will not for a few more years. I’m content to pay now so I’ll have a spot when I want it. If I get time to figure out my own large cap portfolios, then I’ll unsubscribe to the large cap R2G ports and save a few dollars. Neither of these two ports are fully subscribed so I’m not crowding anyone out who wants it (at least not yet).
. Which large cap R2G ports?
. Your MidCap-LargeCap is one and the other is by Stitts.
. I chose your Mid-Large Port instead of your pure LargeCap because the former has a slightly higher annual gain and the latter only has a bit more liquidity once I took into account the difference in maximum subscribers and number of stocks.
. I chose Stitts’ because it has nice annual gain and a hedging feature. I might or might not use the hedging feature - it all depends on how that part does “out of sample”.
. I also like Tom’s DX10 SP500/400 because of its super high liquidity (given current maximum subscribers - which the designer could change) and because it is built on the concept of “mean reversion” which, as best I can tell, is not a major factor in the ones by you or Stitts. Also aurelaurel’s S&P 500 (5 stocks) is appealing because it is built on yet another concept (pull backs). To add these two would cost me nearly $1,000/year which would be fine if I were going to be putting money into them right away. If I don’t start to use them for 4 years, that would be $4,000. However, I might panic and sign up if either gets almost full. That will likely not happen for a long time for aurelaurel with its 500 subscriber limit, but Tom’s might get filled up within a few months.

Smaller Cap Portfolios
. I like my own small cap ports, but they are built on similar concepts. I’d be happier having more diversity of concepts. Also, sometimes my own ports pick duplicate stocks so that I don’t have enough picks to work with. My own generally generate between 20-25 unique picks, but sometimes it is fewer than 20. I’d really like to have my money into 30 or 40 stocks.

. So which smaller cap R2G ports?
. I developed my own liquidity metric so I could exclude R2G ports that appear (to me) to be easily compromised by subscriber trading pressure. The ones that passed my liquidity metric were then examined for duplicates. I’ll tolerate 1 or 2 duplicates, but any more and I’m not interested. I’ve signed up for several ports so I can see the current holdings. Then I use Excel to find duplicates with my own ports (plus any R2G ports I decide to keep). I’ve donated a month’s subscription to several designers just to see their current holdings for duplicate stocks. Here are the ports I’ve decided to keep along with a running count of unique stocks that each contributes to my “Book”.

. Unique
. Stocks__Ports
. 23______My own three ports
. 33______The above + Sherman’s Liquid Power + Don’s Classic (both great returns plus ok liquidity). The 15 stocks of these two ports had 5 duplications with my own, so the total unique picks is 33. I especially like the absence of market timing in Sherman’s since I believe back test stats are more robust without timing, other things being equal. I like Don’s liquidity (given the number of subs and 10 stocks). I’m trusting Don’s experience in the hopes market timing has not inflated the stats.
. 40______The above ports + TWY 10stocks adds 7 more unique stocks. This port’s annual returns are in the same league as my own and the liquidity looks OK given the number of maximum subscribers.
. 47______The above ports + z8735’s Fundamental Rockets. Although this one costs less, has a slightly higher annual gain and does not appear to use market timing (and thus I consider its stats to be more robust), its liquidity is 1/2 of TWY 10. That is why I added TWY 10 to my collection first.
. 51______The above ports + Denny’s Small Cap. This has a bit better liquidity than TWY’s and z8735’s but its “out of sample” results suggest it is currently out of favor with the market (but I expect it will come back because its micro cap sister Port is doing very well out of sample).
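
The running count of unique stocks in the list above can also be produced with a few lines of code instead of a spreadsheet. This is a minimal sketch; the port names and tickers are placeholders, not real holdings.

```python
def running_unique(ports):
    """For each port in order, report how many new tickers it adds
    and the cumulative number of unique tickers so far."""
    seen = set()
    rows = []
    for name, holdings in ports:
        new = set(holdings) - seen   # tickers not already held elsewhere
        seen |= new
        rows.append((name, len(new), len(seen)))
    return rows


# Placeholder holdings, listed in the order the ports are added to the book.
book = [
    ("Own ports", ["AAA", "BBB", "CCC", "DDD"]),
    ("R2G port X", ["CCC", "EEE", "FFF"]),
    ("R2G port Y", ["AAA", "EEE", "GGG"]),
]

rows = running_unique(book)
for name, added, total in rows:
    print(f"{total:>3}  {name} (+{added} unique)")
```

The number of duplicates a candidate port brings is its holding count minus the "added" figure, which is the quantity the 1-or-2-duplicates rule looks at.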

The “weighted” average annual gains for the above collections of ports are 60, 74, 69, 66, 64, and 59 respectively for the above (a port with twice as many stocks gets twice the weight). So if I use all the 51 picks, I would get about the same gain as if I just used the 23 picks from my own three ports.
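
The weighting described here is a plain stock-count-weighted average. A minimal sketch with invented numbers follows; whether the count should be total holdings or unique holdings is not stated in the post, so that choice is left to the caller.

```python
def weighted_gain(ports):
    """Average annual gain weighted by each port's number of stocks.

    ports: list of (stock_count, annual_gain_pct) pairs.
    """
    total_stocks = sum(count for count, _ in ports)
    return sum(count * gain for count, gain in ports) / total_stocks


# Hypothetical book: a 10-stock port at 60% and a 20-stock port at 45%.
example = weighted_gain([(10, 60.0), (20, 45.0)])
print(f"{example:.1f}%")
```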

The cost for the above “book” of small cap R2G ports is $267/month. If I got rid of Sherman’s and TWY’s the cost would be $188 less, so those two ports will need to show they are earning their keep.

What do I get for my $267/month ($80/month if I drop the two most expensive)?
. I don’t get any extra gain since the complete “book” of 5 R2G ports plus my own 3 ports has a gain of 59% which is virtually identical to the 60% of my own three ports. If I went with a Book of the first two R2G ports plus my own, the gain would increase from 60% to 74%.
. Hopefully having twice the unique stock picks will enable me to stay trading in the small cap world for an extra year or two before needing to move some of my funds up to large cap ports which have returns that are 10% to 30% lower than the small caps.

How to implement this Book of R2G and Private Ports?
. It looks like I’ll be doing trades on Mondays and Fridays. The simplest would be to do all the trading on the same day but I am reluctant to trade my private ports on Monday when all the R2G ports get traded. The R2G ports in my Book have very little overlap with my private ports, but I’ve rejected other R2G ports that have significant duplicate picks with my private ports. Also, I don’t want to delay trading the R2G ports for 4 days (I trade my own ports on Friday because I assume 4/5th of the time, my private ports will be ahead of the R2G ports that generate duplicate trades).
. I’ll be using Excel to keep track of, and exclude, duplicates. I could use P123’s Book feature, but since I’ve got 5 R2G and 3 Private Ports, I’d have to upgrade to the “Manager” level so the book would have enough “asset” spots. That would be an increase of $1,000/year over my current level. If Excel becomes too time consuming, I’ll upgrade.

Brian

We have little out of sample data so, for me at least, it is important to have robustness testing documented. Having standards for robustness testing would be even better. Another idea would be to give potential subscribers the ability to take a model for a test drive - provide something like the ‘book’ function to allow people to run the model with different parameters (dates, even/odd, market timing off, slippage, #holdings,etc.)

-Debbie

Brian,

An excellent post! But I expect as much from someone who has been using P123 for as long as you have. You have obviously figured out how to best use the tools for your needs.

I have a few questions. Have you compared your Friday buys/sells to the Monday open price? How similar are the stocks recommended on Friday compared to the stocks recommended on Monday? I am curious to know if the addition of Saturday fundamental data improves the selections on Monday over Friday. I did that a few years ago and my Sim tests implied that Monday’s open had the best performance.

I have been checking all of the stock recommendations for my MicroCap R2G Port and, so far, there was only one stock that may have been affected by excessive R2G subs buys or sells. And that stock had excessive volume between 10:15 and 10:30. It’s not very likely that a number of subs bought at that specific time, but instead, one sub or some other trader not necessarily a member of P123 bought a large number of shares. So far, it looks like you may not be competing with other R2G subs.

Since a lot of the best public ranking systems, Sims, & Ports have been around for a number of years and the high number of P123 members that have used them prior to R2G haven’t seemed to affect their performance yet, we may be more concerned about excessive trading than is justified by the data.

So far I haven’t traded any R2G Ports besides my own. That is because I am trading 11 of my Ports now, and I am satisfied with their diversity and performance. If at a later date I decide that P123 isn’t as much fun as it has been in the past, or I want to do something else with my time, I will probably become a sub. So far, it is more fun and a lot more rewarding than video games!

Denny :sunglasses:

While I very much appreciate your trust, I was hoping I had provided enough information that the market timing was transparent. The document on the general tab for my classic strategy provides results without any hedge, as well as several other variations. I have offered to provide my specific hedge rules on request to any paid subscriber, and because of questions early on, I sent the hedge information to all Classic subscribers on 9/16. There is no other market timing in the Classic strategy. If you or any other subscriber of my strategies wants the details of the hedge used in my Classic strategy, please let me know and I will provide them.

Don

I rank ports not so much on annualized return but on rolling 1-year returns starting each day. A calendar-year return is not necessarily any more or less important to consider than any other 12-month period. Few investors buy on New Year’s Eve and then sell exactly 1 year later.

So what I am looking for are ports with rolling 1-year returns that are always positive. One can assemble a few ports in a book to achieve this easily. I used 5 ports, including 2 of mine, in a combo having a min 1-yr rolling return of 15.2%, CAGR = 35%, and max DD = -12%.


Combo large caps.pdf (98.6 KB)

Georg,

I also like to evaluate models by rolling 1-year returns. It provides a feeling of safety, together with the Sortino ratio and MaxDD.

Did you do your analysis manually? I have been looking for an option to evaluate rolling 1-year returns or even better, I would be very interested in a benchmark like “number of days underperforming the benchmark”. I think many models generated outstanding returns in early years of the backtest, which count towards the overall annualized return. However, we as human beings become very anxious quickly, if we buy into a new model and see that the model underperforms the market for a significant period of time.

Just like any other in-sample statistic, rolling returns or “number of days underperforming the benchmark” are not foolproof, but they might add a supplementary margin of safety in evaluating consistency.
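
One possible reading of “number of days underperforming the benchmark” is to count the days on which the model's trailing return is below the benchmark's trailing return. The Python sketch below does that under stated assumptions: both series are daily values aligned row by row, and `window` is the look-back in rows (for example about 252 trading days for one year). The function name and the sample numbers are made up for illustration and are not P123 output.

```python
def days_underperforming(model, benchmark, window):
    """Count days where the model's trailing `window`-row return is below the benchmark's.

    Returns (days_under, days_measured). Both series must be aligned row by row.
    """
    if len(model) != len(benchmark):
        raise ValueError("series must be the same length")
    under = 0
    total = 0
    for i in range(window, len(model)):
        model_ret = model[i] / model[i - window] - 1.0
        bench_ret = benchmark[i] / benchmark[i - window] - 1.0
        total += 1
        if model_ret < bench_ret:
            under += 1
    return under, total


# Made-up values, only to show the call (a real run would use a 1-year window).
model = [100, 102, 101, 105, 104, 110]
bench = [100, 101, 103, 104, 106, 107]
under, total = days_underperforming(model, bench, window=2)
print(f"{under} of {total} days underperforming")
```

Dividing the first number by the second gives a share of days that can be compared across models with different history lengths.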

Best,
fips

fips,
I do this in Excel. You download the performance spreadsheet from P123. Then for every day you look up what the price was 1 year earlier and calculate the %-change. Then you plot the graph.
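
The same steps can also be scripted. Here is a minimal Python sketch, assuming the downloaded performance data has already been read into two parallel lists (sorted dates and portfolio values). The function names are hypothetical, and “1 year ago” is taken as the last available row on or before the same calendar day one year earlier, which is one reasonable choice among several.

```python
from bisect import bisect_right
from datetime import date


def one_year_before(d):
    """Same calendar day one year earlier (Feb 29 maps to Feb 28)."""
    try:
        return d.replace(year=d.year - 1)
    except ValueError:
        return d.replace(year=d.year - 1, day=28)


def rolling_one_year_returns(dates, values):
    """Return [(date, pct_change)] versus the last value on or before one year earlier.

    `dates` must be sorted ascending; days without a full year of history are skipped.
    """
    out = []
    for d, v in zip(dates, values):
        i = bisect_right(dates, one_year_before(d)) - 1
        if i >= 0:
            out.append((d, (v / values[i] - 1.0) * 100.0))
    return out


# Tiny made-up series (not real performance data).
dates = [date(2012, 1, 3), date(2012, 7, 2), date(2013, 1, 3), date(2013, 7, 2)]
values = [100.0, 110.0, 120.0, 121.0]
rolling = rolling_one_year_returns(dates, values)
worst = min(r for _, r in rolling)
print(f"minimum rolling 1-year return: {worst:.1f}%")
```

The minimum of the resulting series is the “min 1-yr rolling return” figure, and the full list is what gets plotted.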

I agree with you that there are many models with huge returns in the early years which push up the annualized return. I calculate Sharpe ratios of rolling 3-month returns. The higher the Sharpe, the more consistent the model’s return. Then I also plot histograms of monthly returns for the model and benchmark, similar to what P123 has under “Performance-Stats”. You want to see a normal distribution with the model’s histogram shifted to the right relative to the benchmark. A plot of terminal values of recurring $1.00 annual investments is also useful. It simulates saving over time.
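
A rough Python sketch of two of these measures follows. The details are assumptions, since the post does not spell them out: month-end values as input, a zero risk-free rate, no annualization of the Sharpe ratio, and one deposit at each year-end except the last. All names and numbers are illustrative only.

```python
from statistics import mean, stdev


def rolling_returns(values, window):
    """Return of each point versus the point `window` rows earlier."""
    return [values[i] / values[i - window] - 1.0 for i in range(window, len(values))]


def sharpe_of_rolling(values, window=3, risk_free=0.0):
    """Mean over standard deviation of rolling returns (not annualized)."""
    r = rolling_returns(values, window)
    return (mean(r) - risk_free) / stdev(r)


def terminal_value_of_annual_deposits(year_end_values, deposit=1.0):
    """Value at the last date of `deposit` invested at each year-end except the last."""
    last = year_end_values[-1]
    return sum(deposit * last / v for v in year_end_values[:-1])


# Made-up month-end values: similar end result, very different consistency.
steady = [100, 101, 102.5, 103, 104.5, 105, 106.5, 107]
choppy = [100, 108, 97, 110, 95, 112, 99, 107]
print(sharpe_of_rolling(steady), sharpe_of_rolling(choppy))

# $1.00 invested at the end of year 0 and year 1, valued at the end of year 2.
print(terminal_value_of_annual_deposits([100, 110, 121]))
```

Both made-up series end at the same value, but the steadier one scores a much higher Sharpe ratio on its rolling 3-month returns, which is the consistency effect described above.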

Here is an example: Combo3 which is the book of my 3 ETF R2G models. http://imarketsignals.com/2014/im-best-combo3/
Figure-3 shows the rolling 1-year return for Combo3 and SPY. There were only two short periods when the combo underperformed SPY, in 2004 and 2006.
Best,
Georg

I don’t objectively rank R2G ports, yet, but I do have rejection criteria.

I look at excess performance for 2014, a proxy for OOS, and compare it to previous years. All too often 2014 is the worst year ever for an R2G port, sometimes by an embarrassingly wide margin. In that case, IMHO, the risk of curve-fitting is far too high, and I reject the port from consideration for subscription, no matter how stellar the overall performance and OOS have been in the past, even if 2014 has been an OK year.

I also usually reject R2G ports where 2014 is one of its worst years, and 2014 excess is significantly worse than the overall excess average.

I’ll consider a port with a bad 2014 if it has had at least a couple of years in the past that were worse.

I am now seeing a few R2G ports from the time of R2G launch where 2013 and 2014 are the worst years ever, strongly suggesting curve-fitting.
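
These rejection criteria can be written down as a simple screen. The Python sketch below is only one interpretation: it treats “one of its worst years” as fewer than two earlier years being worse, and “significantly worse than the overall excess average” as a hypothetical `margin` in percentage points. Neither threshold comes from the post, and the yearly figures are made up.

```python
def screen_port(excess_by_year, latest_year, margin=5.0):
    """Classify a port from its yearly excess returns (in %) over the benchmark.

    `margin` (percentage points) is an assumed threshold for
    "significantly worse than average"; it is not from the original post.
    """
    latest = excess_by_year[latest_year]
    earlier = [v for y, v in excess_by_year.items() if y != latest_year]
    worse_years = sum(1 for v in earlier if v < latest)
    average = sum(excess_by_year.values()) / len(excess_by_year)

    if worse_years == 0:
        return "reject"  # latest year is the worst year ever
    if worse_years < 2 and latest < average - margin:
        return "usually reject"  # among the worst and well below average
    return "consider"  # at least a couple of past years were worse


# Made-up yearly excess returns for three hypothetical ports.
port_a = {2010: 20, 2011: 15, 2012: 30, 2013: 25, 2014: 5}
port_b = {2010: -3, 2011: 15, 2012: 30, 2013: 25, 2014: 2}
port_c = {2010: -3, 2011: -8, 2012: 30, 2013: 25, 2014: 2}
for name, port in [("A", port_a), ("B", port_b), ("C", port_c)]:
    print(name, screen_port(port, latest_year=2014))
```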

Randy

Rallan,

I agree with caution in investing in any R2G or system you did not build and test yourself, including mine. But… just wanted to ‘waste’ some time and mull over the possible factors involved.

‘Over optimization’ is only one source of potential underperformance and it’s probably way too simplistic to view that as the sole cause of any system that’s underperformed.

Look at, for example, the Wisdom Tree ETFs. They are based on the research of the very prominent (and methodologically rigorous) Prof. Jeremy Siegel. They are basically ‘fundamentally weighted’ indexes. So, for example, EPS weights the SP500 on earnings instead of market cap. There is a very long backtest history to suggest this single factor has alpha. 500 positions or so are held. Since launch in 2007, the index has matched the ETF SPY (64% or so returns), but significantly lagged an equal-weighted SP500 index. The actual ETF has had some style drift and has underperformed (60% vs. 65%). Even with over 7 years of history and 500 holdings, style drift has occurred. Many of the Wisdom Tree ETFs have lagged - with hundreds of stocks and one factor. I don’t think they are ‘over fit.’

Or, look at GTAA, an ETF based on the work of Mebane Faber and very simple in theory. It’s returned 3.45% total since launching in 10/2010, vs. 69% for SPY. That makes investors unhappy, but doesn’t make it curve fit. It does mean an investor likely did poorly, but it doesn’t negate the validity of the research that grounds it.

Or Cambria (Mebane Faber’s) Foreign shareholder yield, FYLD, has returned -4.08% in 2 years vs. 14+% for SPY. But, SPY is the wrong index.

Or USCI - which is based on a simple system of commodity excess returns accruing to momentum and value factors. The underlying index is sound research, but has done poorly since launching 4-5 years ago. I’ve looked at the core research. I think it’s good research.

So… re: R2G’s, one or two years means close to nothing in assessing a system and determining if it’s ‘overoptimized’ or not. Especially systems with 5 or 10 holdings. And market timing. And/or hedging.

For example, do you think Hemmerling’s “Hemmerling Value Rockets” is not curve-fit (32% out-of-sample excess return), but his Russell 1000 rockets is (-14% out-of-sample excess return)?

Or, looking at Shiguang, do you think his “Alpha Max - 10 Large Cap Stocks w/ Improved Metrics-V4 - No Hedge” with 7% out of sample excess return is not curve fit, but his “Alpha Max - 10 Large Cap Value Stocks w/ Improved Metrics-V5 - No Hedge” with -10% out of sample excess return is curve fit?

Or using my own launched models. Do you think my “*Tom’s SX20 - 20 Value-Quality SP1500 Stocks $2MM liquidity + Hedge” (7% excess return since launch) is not curve fit, but my “*Tom’s SX10 - 10 Contrarian US Mid & Large Cap SP Stocks + Hdge” is curve fit? I don’t think either one is curve fit. I’ve reviewed what’s ‘inside’. That’s me. They may not make people money. But, I don’t think overfitting is the issue.

In terms of the Contrarian, I’ve ‘decomposed the sources of poor performance.’ They come down to -8% from a hedge loss, -5% from the ranking system factors lagging this year and -2% or so from ‘style drift’ of the 10 stock system around the ranking. I’ve reviewed everything inside and think the system remains sound. I have updated the hedge to weekly rather than monthly, but I’ve also looked back at the basic underpinnings of the system and find them sound.

All R2G’s, for the most part, may be curve fit, but poor out-of-sample performance is as likely or more likely to come from a) the use of market timing and hedging, b) the small number of holdings, c) varying start dates and random fluctuations, d) the short times since inception and e) (often) the lack of proper benchmarks. These are much greater sources of year-to-year variation.

If I had it to do over again, I’d probably launch mostly R2G’s with 20-50 holdings. No one would sign up, but style drift around the underlying indexes would be much less. And, I’d separate all timing and hedging into stand-alone modules. And limit rankings to 10 factors and remove most buy and sell rules - except for a small number of quality and liquidity filters. But, that’s just me. That’s what I’ve done on most of my systems with updates since launch. That’s where most of my money is invested. But, there are many other ways to build and test profitable systems - including 1,000’s of factors. So, I don’t claim that’s the only way. It’s just where my personal conviction is heaviest.

Some systems work after launch and some don’t. But, very good short-term performance likely shouldn’t give that much more confidence that a system is not overfit than mediocre performance. The holdings are too low and time frame too short.
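
To put a rough number on “holdings too low and time frame too short”, here is a back-of-the-envelope Python sketch. It relies on a strong simplifying assumption that is not from the post: each stock's return around the strategy's true edge is independent noise with the same annual standard deviation. The 40% figure is a made-up input, and real stocks are correlated, so treat the output as illustrative only.

```python
from math import sqrt


def tracking_noise(stock_vol, holdings, years=1.0):
    """Std. dev. of the annualized average return of an equal-weight port.

    Assumes independent, identically distributed stock-level noise, so the
    noise shrinks with the square root of (holdings * years).
    """
    return stock_vol / sqrt(holdings * years)


stock_vol = 40.0  # assumed annual stock-specific volatility in %, hypothetical
for n in (5, 10, 20, 50):
    print(f"{n:>2} holdings, 1 year: +/- {tracking_noise(stock_vol, n):.1f}%")
print(f"10 holdings, 4 years: +/- {tracking_noise(stock_vol, 10, years=4):.1f}%")
```

Under these assumptions a 10-stock port observed for one year carries noise of roughly 13 percentage points, so a single year of excess return in either direction says little about whether a system is overfit.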

Best,
Tom

I would disagree. Technical market timing such as the MA crossover that Mr Faber uses came, by definition, from curve fitting. It has random chances of success. This shows up clearly if you backtest these strategies on other market indexes such as the French CAC40, where they produce wildly different results depending on the starting period (same for the SP500).
Conceptually it’s easy to understand that markets follow macroeconomics. A change in economic activity, exchange rates, interest rates will affect capital flows more than MA crossovers ever will.
Thus basing a tactical asset allocation strategy on trend following is, for me, incomprehensible. It will be slightly better than flipping a coin, since trend following does capture investor sentiment to a small extent, but that is about it.
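
For readers who have not seen one, a moving-average timing rule of the kind being criticized can be sketched in a few lines of Python. This is a generic illustration, not the rule of any particular ETF or R2G port: the 10-period default and the price list are assumptions. Running it over windows with different start dates is one way to check the start-date sensitivity described above.

```python
def sma_timing_signal(prices, window=10):
    """For each period with enough history, True when price is above its
    `window`-period simple moving average (i.e. "stay invested")."""
    signals = []
    for i in range(window - 1, len(prices)):
        sma = sum(prices[i - window + 1 : i + 1]) / window
        signals.append(prices[i] > sma)
    return signals


# Made-up month-end prices and a short window, only to show the mechanics.
prices = [10, 11, 12, 11, 9, 8, 10]
print(sma_timing_signal(prices, window=3))
```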