NEW: Rolling Test Tool is now available

Dear All,

Our new tool for running hundreds of rolling simulations is live. Please note that this is a “beta” release and service might be interrupted.

In short, it runs hundreds of 1-year simulations (the period can be anywhere from 0.25 to 3.0 years), collects the main statistics and lets you see them in charts. These “Rolling Tests” run on a brand new server with 64 processors, so they do not affect normal operations. We’re currently limiting the number of simulations per test to 100, but we should be able to easily run rolling tests of 500+ simulations in a couple of minutes.
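For readers who want a concrete picture of what "rolling" means here, the following is a minimal sketch of how the windows could be laid out. It is not the site's actual implementation: the function name `rolling_windows`, the 4-week step between start dates and the sample dates are assumptions for illustration. Only the 0.25 to 3.0 year period range and the cap of 100 simulations come from the announcement.

```python
from datetime import date, timedelta

def rolling_windows(first_day, last_day, period_years=1.0, step_weeks=4, max_tests=100):
    """Return (start, end) date pairs for a rolling test (hypothetical helper)."""
    if not 0.25 <= period_years <= 3.0:
        raise ValueError("period must be between 0.25 and 3.0 years")
    period = timedelta(days=round(period_years * 365.25))
    step = timedelta(weeks=step_weeks)  # assumed spacing between start dates
    windows = []
    start = first_day
    # Keep adding windows while they fit in the history and the cap is not hit.
    while start + period <= last_day and len(windows) < max_tests:
        windows.append((start, start + period))
        start += step
    return windows

# Made-up history: 15 years, 1-year windows, one new start every 4 weeks.
windows = rolling_windows(date(1999, 1, 2), date(2014, 1, 2))
print(len(windows), windows[0], windows[-1])
```

Each window would then be run as its own simulation, and the per-window statistics are what the charts summarize.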

The goal of this tool is to understand whether a system is curve-fitted, or to get a sense of whether a system that simulated 30% annualized for 15 years “got lucky”. We are still trying to figure out what to do with all the data that this tool produces and how to interpret it. You can also download the data to do your own analysis.

To run a rolling test for one of your Live Portfolios or Simulations, go to TOOLS->ROLLING TESTS and click on New->Rolling Test. You can also select one of your systems, then select “Run Rolling Test” from the system’s menu labelled either Portfolio or Simulation.

See images below for sample output and how to run the tool.

Thank you for your comments.

Cheers



We’ll prepare documentation by next week.

Great addition!!
I currently receive the following message: “No server available for request. Please try again later”

Marco,
Thanks for keeping us busy over the weekend.
Still getting “No server available for request. Please try again later”.

Terry

Hi Marco,

What a great feature!!! :)

Please note:

  1. I get the same error: “No server available for request”
  2. I am unable to modify the start date of the test. For a 3-year period with 4-week intervals, the start date should be February 2002 if my model starts in January 1999. However, the start date is displayed as 2008.

Looking forward to seeing this tool in action and implemented in the R2G standard displays, as it will expose the market timers…

It is an OK feature. However, the rolling backtest as implemented doesn’t truly reflect how the model functions, as the model shouldn’t be in a continuous startup phase. When the model starts, it is 100% in cash and buys the full number of stocks required to use up that cash. This causes the model to choose stocks much lower down in the ranking system than it normally would. Repeating the startup phase over and over again doesn’t capture the true function of the model.

The other problem of course is that there is no in-sample marker to distinguish the point in time when the model was released/launched.

Steve

Marco,
this is a great feature, thank you !

Is this new “Rolling Test” also available for R2Gs ?

Yahoooooo! R2G please, please, please!

Server is back. Thanks

Marco,
I like it! More variance in my sims than I would have thought: interesting and probably useful for my private sims. I can see how it might be particularly important for R2Gs.

Things I particularly like:
Does not have to be run every time for private sims.
New servers!!! Whoo hoo!
I like the histogram.

Suggestions for use of all that information:

I find myself wanting to find the percent of rolling returns > xx% annualized return. This could be put in table or graph form.

I would be interested in the median rolling return. A different color for the histogram bar for median return would be nice–a dotted line if the median is not represented by one of the bars. Maybe hash marks for < 10th percentile and for > 90th percentile of returns.
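Until something like this shows up in the tool, the numbers suggested above can be computed from the downloaded data. This is only a sketch: the function name `summarize`, the sample returns and the 20% threshold are made up, and it assumes the download gives one annualized return (in percent) per rolling test.

```python
from statistics import median, quantiles

def summarize(returns, threshold):
    """Share of rolling returns above a threshold, plus median and 10th/90th percentiles."""
    # quantiles(..., n=10) returns the nine decile cut points.
    deciles = quantiles(returns, n=10, method="inclusive")
    return {
        "share_above": sum(r > threshold for r in returns) / len(returns),
        "median": median(returns),
        "p10": deciles[0],
        "p90": deciles[-1],
    }

# Made-up rolling returns in percent, one per rolling test.
sample = [-5.0, 2.0, 8.0, 12.0, 15.0, 18.0, 22.0, 30.0, 35.0, 41.0]
stats = summarize(sample, threshold=20.0)
print(stats)
```

Running `summarize` over a range of thresholds would give the table form of "percent of rolling returns > xx%".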

Good the way it is!

Crashed again. Sorry

Yes, I think it would be helpful to separate out the IS and the OOS. Basically like the spreadsheet that was posted a while back, where you had to manually download the data, and where the IS histogram and the OOS histogram are overlaid on top of each other.
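A rough sketch of that split, working from downloaded data. Everything here is an assumption for illustration: the tool has no launch-date marker today, so `launch_date` has to be supplied by hand, and the tuples of (start, end, return in percent) are invented. Windows that straddle the launch date are dropped rather than forced into either group.

```python
from collections import Counter
from datetime import date

def split_by_launch(results, launch_date):
    """Split (start, end, return) rows into in-sample and out-of-sample returns."""
    in_sample = [r for start, end, r in results if end <= launch_date]
    out_of_sample = [r for start, end, r in results if start >= launch_date]
    return in_sample, out_of_sample

def histogram(returns, bin_width=10.0):
    """Count returns per bin, keyed by the bin's lower edge."""
    return Counter(int(r // bin_width) * bin_width for r in returns)

# Made-up rolling test rows and a made-up launch date.
results = [
    (date(2010, 1, 4), date(2011, 1, 4), 35.0),
    (date(2011, 1, 3), date(2012, 1, 3), 28.0),
    (date(2012, 6, 1), date(2013, 6, 1), 12.0),  # straddles launch, excluded
    (date(2013, 1, 7), date(2014, 1, 7), 9.0),
    (date(2013, 2, 4), date(2014, 2, 4), -4.0),
]
is_returns, oos_returns = split_by_launch(results, date(2013, 1, 1))
print(histogram(is_returns), histogram(oos_returns))
```

The two counters use the same bin edges, so they can be plotted on top of each other.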

Hi Marco,

I was finally able to run the tool starting from 1999. I chose a model that is purely based on a ranking system and has no entry limitation such as Rank>99 or the like. It has a simulated performance of over 100% in 2008 and 2009, yet the backtest shows a negative rate of return in 2009. How is that possible?

The only explanation for me is that the rolling test takes random stock picks and not the top ranked stocks that the R2G picks by default. If this is the case, this type of test is meaningless and should be modified to honor a model’s ranking criteria.

Furthermore I notice that the rolling test suggests a much better performance from 2003-2007 than from 2009-present. This is in contrast to the R2G which has a straight capital curve all along when plotted on a log scale.

Perhaps I just don’t understand what this test is really doing…



Seven,

Awesome sim!

Click the data button to see where the negative returns are and then run your sim over the same dates. This will confirm any error.

Thank you for adding this tool. I look forward to seeing the documentation next week (I am still waiting on adequate documentation for the series tool).

Scott

I am trying to figure that out too.
Should the histogram show a normal distribution ranging from negative to positive returns?
Or should the histogram only show positive returns?
What does the excess return histogram below mean?


Um yeah I don’t really think this tool helps you determine the likelihood of the model being curve fitted.

What this tool does is answer the question: if I were to pick a random start date in the history of this simulation, what is the likelihood that my 1-year return will exceed X%?

I also do not understand the outputs from this tool and am hoping that the documentation next week will be extensive enough to answer our questions.

I agree with MisterChang. The histogram only answers the question: if one were to pick a random start date in the history of this simulation, what is the likelihood that the 1-year return will exceed X%? It does not tell you whether the model was overfitted. Only out-of-sample returns over a full business cycle may provide the answer.

Geov, Hoyt,

I think you are right. Like in any programming: Garbage in, Garbage out. As good as the sim itself: no better. Looking at the lower range is possibly more realistic because it is lower and you are mindful that there is a range. If the sim is good it may give you a real understanding of the range. But it says nothing about over-fitting.