backtesting universes and GICS groups seems to give much too rosy results

If my only rule in a screen is <universe(prussell1000) = true> and I run a backtest from 1999 to today against a benchmark of the Russell 1000 index with dividends, rebalancing every 3 months, why is my result an annualized return of 7.91% while the benchmark gets 4.76%? The same discrepancy happens no matter what universe I run: whenever I run it against the matching benchmark, I get much better results. This goes for NASDAQ, S&P 500, etc. And that’s with 0.68% slippage and a 1.5% carry cost. It doesn’t matter whether the benchmark includes dividends or not; the discrepancy is the same.

I ran across this when I was backtesting various industries using the GICS rule, and every single industry I tested beat the S&P 500 handily. In fact, I can’t find ANYTHING that UNDERPERFORMS the S&P 500, even the worst-performing sector over the last 10 years (financials). Try, for example, <universe(sp500) = true and gics(40) = true>.

An index’s methodology is usually more involved than defining a universe and tracking the daily performance of an equal-weighted portfolio.
Because the screener can currently hold only an equally weighted portfolio, most indices cannot be reproduced by backtesting a screen.
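
To make the weighting point concrete, here is a minimal sketch in Python (not the screener's own code) comparing one period's return for the same three stocks held equal-weighted versus capitalization-weighted. The returns and market caps are made-up numbers chosen only for illustration, and the function names are hypothetical.

```python
def equal_weight_return(returns):
    """Portfolio return when every holding gets the same weight."""
    return sum(returns) / len(returns)


def cap_weight_return(returns, market_caps):
    """Portfolio return when each holding is weighted by its market cap."""
    total_cap = sum(market_caps)
    return sum(r * cap / total_cap for r, cap in zip(returns, market_caps))


# Hypothetical data: one large stock that lags and two smaller ones that do better.
returns = [0.02, 0.10, 0.12]   # period returns per stock (assumed)
caps = [800.0, 150.0, 50.0]    # market caps in arbitrary units (assumed)

print(f"equal-weighted: {equal_weight_return(returns):.2%}")      # 8.00%
print(f"cap-weighted:   {cap_weight_return(returns, caps):.2%}")  # 3.70%
```

With these numbers the equal-weighted portfolio looks far better simply because the smaller names happened to outperform. In a period where the largest stocks lead, the gap would run the other way.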

But why are ALL the indexes underperforming the backtests, and by such a huge degree? If it were just a matter of not being able to accurately reproduce the index, the discrepancy wouldn’t be so big or so one-sided. I can understand that with some indexes, the backtests include dividend payments and the indexes don’t. But that’s not the case with the Russells, nor with the S&P 500, which includes dividends.

Other reasons…

The S&P 500 index does not include dividends. Try using the SPDR S&P 500 ETF (SPY) as the benchmark.

The Russell universes are reconstituted only once a year.

THE MAIN REASON:

Equal weighting handily outperformed prior to 2008. In fact, papers were written about it: "the alpha in plain sight." If you try the past 5 years, that alpha has disappeared.

You are seeing outperformance of equal weights from the past.

That makes sense now. Indeed most of the improved results are in the 2000-2010 period. Thanks!

Here’s a quick example of how to replicate the benchmark using the custom series tool: Russell1K.