Discrepancy between live and simulated (backtest) results

I am confused and a bit worried: there is a huge difference between the simulation and the real live portfolio, even with exactly the same parameters and the same start date. The difference ranges from 20 points to as much as 40 points in a year.

Is anyone else facing that? Any advice, guidance, or help would be greatly appreciated! Thanks!

“Huge” is a relative term. What do you mean by a difference in “points”? A 20-40 difference from what total?

Changes are usually caused by past data that is fixed, or by changes/fixes to our engine. Simulations should be interpreted as one possible outcome of many. This is especially true with systems that buy a handful of stocks. Reasonable differences should not be a concern for robust systems. Differences could also be a good “canary in the coal-mine” sign of “curve-fitting” where a system has been tuned to produce the best results for past data.

Sun, you could also have differences if you do not select/trade the recommended stocks in your live ports but select others. I run 5 live ports that I trade and I frequently do not select the one recommended at that time but select another. I know that over time this creates a disconnect with my sim versus what my live port does. My book is 65 stocks and the effect is negligible for me. If you were just trading one port that had just 5 stocks, I could see where substituting a stock every once in a while could cause a sizable difference compared to the sim.
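To put rough numbers on the 5-stock versus 65-stock comparison, here is a minimal sketch. It assumes equal-weighted positions, and the 20% return gap between the recommended stock and the substitute is a made-up figure for illustration only; `substitution_impact` is a hypothetical helper, not anything from the platform.

```python
def substitution_impact(n_positions: int, return_gap: float) -> float:
    """Portfolio-level return difference caused by swapping ONE position.

    Assumes equal weights, so each position is 1/n_positions of the book.
    return_gap is the return of the recommended stock minus the substitute.
    """
    if n_positions <= 0:
        raise ValueError("n_positions must be positive")
    return return_gap / n_positions


# Hypothetical 20% gap between the recommended stock and the one actually bought.
for n in (5, 65):
    impact = substitution_impact(n, 0.20)
    print(f"{n} positions: {impact:.2%} portfolio impact per substitution")
```

Under those assumptions a single substitution moves the 5-stock port by a few percentage points but the 65-stock book by a fraction of one, and repeated substitutions add up from there.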

I’m not seeing that problem.

I have a DM with 3 years of out-of-sample data and the resim results are very similar - annualized return of 17.13% (DM) vs 18.25% (new sim). More importantly, the three-year equity curves are almost identical.

Walter

“Points” means percent, i.e. a 20% to 40% annual deviation, with the live system underperforming the simulation. With that kind of difference, I stopped using that live system. However, when I run a backtest on the same system, with all the same parameters, the backtested results look much better. It’s a 10-stock sim, and I understand that the lower the number of stocks, the higher the discrepancy is likely to be. But I was curious whether the outcome of the simulation changes when data is back-filled or updated in the system, or when Compustat is updated. Just a curious thought; I could be wrong.

I do not have this problem either and have been running live systems for several years (I play in the large / mid-cap segments).

Discrepancies are minor (for example, a compounded return after several years of 45% vs 46%). They come from getting filled at prices that differ from the simulated prices (open, close, mid-point), or even on different days when I do not have the chance to rebalance on Monday.

We have seen this concern in a few posts in the past and the conclusion has always been that the backtest data is reliable. I think I remember a few rare cases where the odd micro / small stock had incomplete / incorrect data but that was it.

If your live strategy algo is strictly identical to your simulated algo today, is there any chance it was changed while it was running live? (If so, your sim today is actually using a different algo than the one that ran live at the time, even if the live algo today is identical to the sim.)

Otherwise, as others have suggested above, there are chances this could easily be happening for a strategy that has just a few stocks and / or has a very high turnover and/or trades small/micro stocks: you might not get the fills, or get them at very different prices than simulated, or on a different day etc… Over time it can quickly make a big difference.
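As a back-of-the-envelope check on how quickly fills can add up, here is a small sketch. Everything in it is an assumption for illustration: the 40% gross return, the 10x (1000%) annual turnover, and the per-trade cost levels are invented numbers, and the model simply charges a fixed one-way cost on every buy and every sell.

```python
def net_annual_return(gross_return: float, annual_turnover: float,
                      one_way_cost: float) -> float:
    """Rough net return after slippage, compounding a cost on every trade.

    annual_turnover: times the whole portfolio is replaced per year
                     (10.0 means 1000% turnover).
    one_way_cost:    slippage per buy or per sell, as a fraction (0.005 = 0.5%).
    Each unit of turnover means one sell plus one buy, hence the factor of 2.
    """
    trades_per_year = 2.0 * annual_turnover
    return (1.0 + gross_return) * (1.0 - one_way_cost) ** trades_per_year - 1.0


gross = 0.40  # hypothetical simulated return with perfect fills
for cost in (0.0, 0.0025, 0.005, 0.01):
    net = net_annual_return(gross, annual_turnover=10.0, one_way_cost=cost)
    print(f"one-way cost {cost:.2%}: net {net:.1%}, gap {gross - net:.1%}")
```

With these illustrative inputs, a 0.5% one-way cost already opens a gap of more than ten points between the sim and the live result, which is the kind of drag a high-turnover micro-cap strategy could plausibly see.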

I hope this helps

The delta between actual vs theoretical will largely depend on the liquidity of the holdings versus the size of your portfolio. This is especially the case with illiquid microcaps. Secondarily, the delta is attributable to execution. For example, market orders will consistently achieve close to the theoretical but will never do better and are likely to do slightly worse. Limits, on the other hand, will diverge more from the theoretical but may do better if they capture the spread.

I do not advocate for using any specific liquidity or size filters, but these factors should be considered as part of the investment criteria.

Suneet,

That is enough of a discrepancy that I can understand your concern.

But it is also large enough that I think you have to really be sure there is no difference between the sim and port. Consider:

  1. When you convert a port to a sim, the sim uses the previous close for prices, which causes a huge error.

  2. If you have checked #1, then with buy and sell rules, any difference in the holdings at any point can put the sim and port out of sync forever. Random things like slippage differences can trigger the initial difference in holdings. You should only compare a port against its sim this way if your only buy/sell rule is RankPos > 10 or Sell = 1, as this resets any discrepancies and brings the port and sim back in sync.

  3. If you have copied a ranking system too (and think they are the same), make sure that the NAs are handled the same way.
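Point 2 can be illustrated with a toy model. This is only a sketch, not how the platform's engine works: the 100-stock universe, the random scores, the `noise` parameter, and the helpers `rebalance` and `periods_out_of_sync` are all hypothetical. The sim starts with the top 10 stocks, and the port starts with one different holding (as if one fill was missed). Both then follow the same sell rule, "sell when rank position is worse than `sell_rank`".

```python
import random


def rebalance(holdings, ranking, n_hold, sell_rank):
    """Sell holdings ranked worse than sell_rank, then refill from the top."""
    pos = {ticker: i + 1 for i, ticker in enumerate(ranking)}  # 1 = best rank
    kept = {t for t in holdings if pos[t] <= sell_rank}
    for ticker in ranking:  # best-ranked first
        if len(kept) >= n_hold:
            break
        kept.add(ticker)
    return kept


def periods_out_of_sync(sell_rank, noise=0.05, n_periods=52,
                        n_stocks=100, n_hold=10, seed=0):
    """Count rebalances at which sim and port hold different stocks."""
    rng = random.Random(seed)
    scores = {f"S{i:03d}": rng.random() for i in range(n_stocks)}
    ranking = sorted(scores, key=scores.get, reverse=True)
    sim = set(ranking[:n_hold])
    # One-time discrepancy: the port holds the 11th-ranked stock
    # instead of the 10th-ranked one.
    port = set(ranking[:n_hold - 1]) | {ranking[n_hold]}
    count = 0
    for _ in range(n_periods):
        for ticker in scores:  # ranks drift a little each period
            scores[ticker] += rng.gauss(0.0, noise)
        ranking = sorted(scores, key=scores.get, reverse=True)
        sim = rebalance(sim, ranking, n_hold, sell_rank)
        port = rebalance(port, ranking, n_hold, sell_rank)
        if sim != port:
            count += 1
    return count


print("strict rule (sell if rank > 10):", periods_out_of_sync(sell_rank=10))
print("loose rule  (sell if rank > 30):", periods_out_of_sync(sell_rank=30))
```

With the strict rule, both portfolios are forced back to the top 10 at every rebalance, so the initial difference disappears immediately. With the loose rule, the mismatched holding can survive for as long as it stays inside the sell threshold; how long that is in this toy depends on the seed and the `noise` setting, and with `noise=0.0` (ranks never change) the two never resync.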

The concerns about S&P correcting/changing data have been more theoretical than a real problem, in my limited experience.

Don’t know if this helps much.

-Jim