Evaluating steady returns of an R2G model

Hi all,

P123 has developed a great way to screen for available R2G models and to compare them. Has anyone found a way to evaluate how steady a model's performance is without visual inspection?

To explain what I am after, I attach the figure below. It shows two random R2Gs with similar average annual performance. Model A’s performance stays close to the hypothetical average (green line), so the capital curve is steady. Model B, in contrast, strongly outperforms from 1999 to 2001, but then its performance drops off somewhat. I would describe its capital curve as steep at first and then flattening out a bit.
Do we have a statistical value (standard deviation from mean performance) which expresses the shape of the capital curve for an R2G?


[quote]
Do we have a statistical value (standard deviation from mean performance) which expresses the shape of the capital curve for an R2G?
[/quote]
I don’t think that it’s available without downloading the data and doing it ourselves. It’s my favorite metric and if you make a feature request then you have my vote!

I think R-Squared does that - and it is available. Otherwise I’d simply use the Sharpe Ratio.
Steve

I do this manually in Excel by analyzing the performance listing downloaded from P123. Here is an example of a book with steady performance over the backtest period.

I want to see a straight, upward-sloping performance curve when performance is plotted on a log scale (Fig-1). The ratio of model performance to benchmark performance should also slope upward consistently. The Sharpe of the rolling 3-month return should be high, and the number of months and quarters with negative returns should be low.
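For anyone who prefers a script to Excel, the checks above could be sketched roughly as follows. This is only an illustration: the function names are made up, the "Sharpe" here is simply mean divided by standard deviation of the rolling 3-month returns (no risk-free rate, no annualisation), and "quarters" are taken as rolling 3-month windows rather than calendar quarters. The input is assumed to be a plain list of monthly returns as decimals.

```python
from statistics import mean, stdev

def rolling_compound(returns, window):
    """Compounded return over each trailing window of `window` periods."""
    out = []
    for end in range(window, len(returns) + 1):
        growth = 1.0
        for r in returns[end - window:end]:
            growth *= 1.0 + r
        out.append(growth - 1.0)
    return out

def steadiness_summary(monthly_returns):
    """Hypothetical helper: a few steadiness figures from monthly returns.

    Needs at least 4 months of data so that there are at least two
    rolling 3-month windows and stdev() is defined.
    """
    rolling_3m = rolling_compound(monthly_returns, 3)
    return {
        # Assumption: plain mean/stdev, not annualised, no risk-free rate.
        "sharpe_3m": mean(rolling_3m) / stdev(rolling_3m),
        # Fraction of months with a negative return.
        "neg_months": sum(r < 0 for r in monthly_returns) / len(monthly_returns),
        # Fraction of rolling 3-month windows with a negative return.
        "neg_quarters": sum(r < 0 for r in rolling_3m) / len(rolling_3m),
    }
```

Calling `steadiness_summary` on the monthly returns from the downloaded performance listing gives three numbers that can be compared across models; the higher the first and the lower the other two, the steadier the model by these criteria.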

I look at Fig-2, the calendar year annual returns. I am not interested in models where the deviation from high to low is large. It does not help to have 200% for one year and the remainder low.

Also I want to see a 12-months rolling return that is always positive, Fig-3.

Then I look at Fig-4 where I want to see a normal distribution of the monthly returns.
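One rough way to put a number on "looks normal" instead of judging the histogram by eye is to compute the skewness and excess kurtosis of the monthly returns; both are close to zero for a normal distribution. The sketch below is an assumption about how one might do it, not what the spreadsheet behind Fig-4 does, and the function name is invented for illustration.

```python
from statistics import mean, pstdev

def skew_and_excess_kurtosis(returns):
    """Population skewness and excess kurtosis of a list of returns.

    Both values are near 0 for normally distributed data. Requires the
    returns to have non-zero spread, otherwise the division fails.
    """
    mu = mean(returns)
    sigma = pstdev(returns)
    n = len(returns)
    skew = sum(((r - mu) / sigma) ** 3 for r in returns) / n
    kurt = sum(((r - mu) / sigma) ** 4 for r in returns) / n - 3.0
    return skew, kurt
```

A strongly positive skew would point to a few outsized winning months carrying the result, and a large excess kurtosis to fat tails. With only a couple of hundred monthly observations these estimates are noisy, so they are better used to compare models than as a strict test.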



Combo5 fig1.png


Combo5 fig2.png


Combo5 fig3.png


Combo5 fig4.png

r^2 on P123 (and in CAPM) is a measure of the correlation between your model’s returns and the benchmark’s returns. Neither it nor Sharpe will suggest much about consistency over time. We have played with calculating r^2 for the Ln of the equity curve as a way to measure consistency. While not completely satisfying, it’s helpful. A higher r^2 here would suggest greater consistency. I’ve attached an example spreadsheet that uses weekly returns from the public ranking system R7_Filip’s Super Value 76.0.
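For those without Excel handy, the same idea can be sketched in a few lines of Python. This is my reading of the approach, not a copy of the attached spreadsheet: it assumes the Ln of the equity curve is regressed against a simple period index (0, 1, 2, ...) with ordinary least squares, and the function name and `start_value` parameter are made up for illustration.

```python
import math

def log_equity_r_squared(period_returns, start_value=100.0):
    """r^2 and slope of ln(equity) regressed on the period index.

    `period_returns` are decimal returns per period (e.g. weekly).
    An r^2 close to 1 means the log equity curve is close to a straight
    line, i.e. growth was roughly constant over time.
    """
    # Compound the returns into an equity curve, then take natural logs.
    value = start_value
    log_equity = []
    for r in period_returns:
        value *= 1.0 + r
        log_equity.append(math.log(value))

    n = len(log_equity)
    x = range(n)
    x_bar = sum(x) / n
    y_bar = sum(log_equity) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in log_equity)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, log_equity))
    if sxx == 0 or syy == 0:
        raise ValueError("need at least two periods and a non-flat curve")

    slope = sxy / sxx            # average log growth per period
    r_squared = sxy * sxy / (sxx * syy)
    return r_squared, slope
```

A curve that compounds at a constant rate scores essentially 1.0, while a front-loaded curve (strong early years, flat afterwards) scores noticeably lower even if the end value is similar. Note that a steadily falling curve also scores high, so the slope needs to be checked alongside r^2.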


Filips76_Example_Equity_Curve.xls (183 KB)

Hi all,

Thanks for the comprehensive feedback. I hope the steady capital curve of Model A speaks for a good out-of-sample performance. I launched this model earlier today:

https://www.portfolio123.com/app/r2g/summary/1326976

:slight_smile:

Statistics are generally based around linear models.

The R-Squared is relative to the benchmark, which is what the port should be trying to outperform. The benchmark will increase in a non-linear fashion; over time it may look exponential (or it may not).

It really depends on how you define consistency. If you feel that your port should be consistent through 2000-2002 or through 2007-2008 regardless of what the benchmark is doing then your expectations may be a little bit too high. Attempting to get this sort of performance out of backtest data is simply data mining at the system level (in my opinion).

I think I know what your real issue is… You are optimizing your system and ending up with great performance for the early years but poor recent performance. My solution for that problem is to ignore anything before 2007. Optimize your ranking system and overall system for 2007+. I have found that nine times out of ten you will still end up with good performance prior to 2007, but the system is more relevant to today’s market.

Steve

Hi Steve,

I’m certainly happy with my model A. However, the question in the end is: How predictive is a steady capital curve in predicting steady out-of-sample performance?

Below I plot the growth of $100 (log2 y-axis) vs time for Model A against two R2Gs. These R2Gs were launched over a year ago and have a P123 out-of-sample rank > 90. My idea was: if Model A has a similarly steady capital curve, perhaps that is a good attribute to look for when trying to assess out-of-sample performance. Time will tell…


Have you tried plotting the excess capital curve, i.e., subtracting a benchmark’s weekly returns from those of the model and then plotting the resulting equity curve? Doing so might better reveal the volatility inherent in the model vs that in the market.
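In case it helps, here is a minimal sketch of that calculation. It assumes two equally long lists of weekly returns as decimals, aligned by date, and it compounds the simple difference of the two; the function name and the starting value of 100 are arbitrary choices for illustration.

```python
def excess_equity_curve(model_returns, benchmark_returns, start_value=100.0):
    """Equity curve built from (model return - benchmark return) per week."""
    if len(model_returns) != len(benchmark_returns):
        raise ValueError("return series must have the same length")
    curve = []
    value = start_value
    for m, b in zip(model_returns, benchmark_returns):
        # Compound only the part of the return not explained by the benchmark.
        value *= 1.0 + (m - b)
        curve.append(value)
    return curve
```

Plotting the result on a log axis shows how steadily the model has added return over the benchmark, with most of the market's own swings taken out.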

“How predictive is a steady capital curve in predicting steady out-of-sample performance?”

When we figure out the answer to this question, let’s shutter P123 and go take over the world’s financial markets.

Why 2007 in particular? Is that just a good rule of thumb?

I generally look at 1 and 3-year rolling alpha. I have noticed that all of my models did exceptionally well in the early 2000s but alpha has dropped since then. However, alpha has also picked up in recent years across the board starting in 2013 or so. This may be due to stock dispersion increasing around the same time.

Hi all,

For model A the performance is purely based on the ranking system. Hence, the steady capital curve is achieved by using a mix of factors that worked well over the entire period. If I change some factors or introduce new ones, the performance becomes skewed, typically towards better performance in the early 2000s (similar to what is shown in model B). Whether the factors work well in the future is another matter.
Interestingly, though, models with steady simulation returns seem to be better suited for out-of-sample performance. This is at least the case for Aurelien’s Aggressive Value R2G and Wuu’s Twywy 5 stock HG EMA SYS R2G, both plotted in the above graph ‘model vs out-of-sample performance’.
If anybody has a better way to judge if a model is fit for out-of-sample performance, please let me know!

Makes sense,

I would normally pick a model that has done well over the Max period, then check that it has also done well over the last 3-5 years to make sure there hasn’t been a change in the market. That should select the same ports most of the time.

I don’t believe there is a way to automatically do this with P123. All existing statistical figures are comparisons to a benchmark. I believe you would need to ask Marco and his team of wizards to add this with a new feature request.

I have now added a feature request. Please support:

https://www.portfolio123.com/feature_request.jsp?view=&cat=-1&featureReqID=1162

Sister - which of the three graphs you displayed previously does not incorporate market timing?

If this were to be implemented it would simply encourage a new generation of market timers / backtest optimizers. We are trying to eliminate that, not encourage it.

Steve

Good point, Steve! I equally discourage market timing. Here are the various models for comparison:

Model A, no market timing - https://www.portfolio123.com/app/r2g/summary/1326976
Model B, no market timing - https://www.portfolio123.com/app/r2g/summary/1193626
Series 2 out-of-sample performer (red line): no market timing - https://www.portfolio123.com/app/r2g/summary/1049452
Series 3 out-of-sample performer (green line): gone to cash Sept 2008 - March 2009 - https://www.portfolio123.com/app/r2g/summary/1129179

As per 7sisters’ request, this is the equity curve for “Aggressive Value” without market timing for the years 2008 and 2009.


Thanks for sharing, Aurelien!

I plot mine (Model A) for comparison below. You beat me both in Sharpe and overall winners, so I still have a lot to learn…


And lastly, Model A with a slightly modified ranking system and no buy rule.
= 2008 return +61%.
= annual return 1999-2015 +86% at a 1500% annual turnover.

I intend to trade this as a live portfolio in the future.