Backtesting dates and periods: in sample vs. out of sample

I just learned the value of out of sample backtesting and would like some suggestions for starting and ending dates and period duration for in sample and out of sample testing.

I read that the overall sample period should be divided in 3 parts, 1/3 in sample and 1/3 out of sample with no overlap. With the database going back to 1999, that tells me 5 year periods would be practical, but I am at a loss as to how to choose them. I am working with long strategies. My initial feeling is to pick a period with no net gain (S&P 500) beginning to end for both periods (in/out). Any help is appreciated.

Mark,

I always like to start Sim development with 50% out of sample, but over 100% of the data period available. To do that I use EvenID =1 in a buy rule or in a universe rule (I prefer the universe rule). That way I am able to develop my Sims using only 1/2 of the stocks available, but test over all the market conditions available in the data. After I think I have a good Sim, I change it to EvenID =0 and retest. If the retest shows similar results, I remove the EvenID rule. There are 2 advantages to this approach:

First, if the two Sims are similar, I am able to confirm that I have avoided nearly 100% of stock-specific data mining. Second, I am able to compare the 2 Sims over the same market conditions. That is a big problem with developing a Sim “in sample” over part of the data available and then comparing it over a different time period “out of sample”. With that approach, it is difficult to determine whether any differences are due to data mining or to different market conditions.
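A rough sketch of the half-universe idea described above, in Python rather than in the platform's own rule language. It assumes that EvenID splits stocks by whether an internal numeric ID is even or odd; the IDs, the return figures, and the function names are all made up for illustration and are not part of any real API.

```python
from statistics import mean

def split_by_id_parity(stock_ids):
    """Split stock IDs into (even, odd) halves.

    Assumption: this mimics what an EvenID rule does; the real
    platform's ID scheme may differ.
    """
    even = [i for i in stock_ids if i % 2 == 0]
    odd = [i for i in stock_ids if i % 2 == 1]
    return even, odd

def average_return(returns_by_id, ids):
    """Average the (hypothetical) per-stock returns for the given IDs."""
    return mean(returns_by_id[i] for i in ids)

# Made-up placeholder returns over the SAME full time period.
returns_by_id = {101: 0.12, 102: 0.08, 103: -0.03,
                 104: 0.15, 105: 0.05, 106: 0.10}

even_ids, odd_ids = split_by_id_parity(returns_by_id)

# Develop on one half, then retest on the other half.
dev_result = average_return(returns_by_id, even_ids)
check_result = average_return(returns_by_id, odd_ids)

print(f"development half: {dev_result:.4f}")
print(f"retest half:      {check_result:.4f}")
```

Because both halves cover the same dates, a large gap between the two numbers points toward stock-specific fitting rather than a change in market conditions.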

I contend that within the 15 years of data available, there are 4 distinctly different market condition periods: the dot-com era 1999>2002, the recovery from that recession 2003>2007, the banking derivative and mortgage collapse 2007>2008, and the recovery from that recession 2009>2014 (are we in a 5th period now?). It could be argued that the 2 recoveries are similar, but they are certainly not the same. How do you develop over one part of the data and check your Sim for robustness over a different time period without much of the difference in the comparison being due to market changes? About the only tool you have available is the alpha relative to the benchmark, since most of the other stats aren’t relative.
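To illustrate why a benchmark-relative number is easier to compare across periods, here is a minimal Python sketch. The period labels follow the post above; every return figure is a made-up placeholder, and "excess return" here is simply strategy return minus benchmark return, a stand-in for alpha rather than a regression-based alpha.

```python
# (label, strategy return, benchmark return): all values are
# hypothetical placeholders, not real backtest results.
periods = [
    ("1999-2002", 0.10, -0.20),
    ("2003-2007", 0.60, 0.50),
    ("2007-2008", -0.25, -0.40),
    ("2009-2014", 0.90, 0.80),
]

def excess_returns(rows):
    """Return strategy-minus-benchmark return for each period."""
    return {label: strat - bench for label, strat, bench in rows}

result = excess_returns(periods)

for label, excess in result.items():
    print(f"{label}: excess return {excess:+.2f}")
```

The raw strategy returns swing widely from period to period, but the benchmark-relative figures can be set side by side, which is the point being made about relative stats.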

Thank you, Denny. Sounds like a good action plan. I’ll give it a go.

Hello Denny,

How do you use EvenID in a universe rule? I do not see an input field for this (a technical how-to question).

Thank you

Andreas

Andreas,

Use ‘EvenID=True’ or ‘EvenID=False’ in a rule.

Walter