
Posts: 168    Pages: 17    1 2 3 4 5 6 7 8 9 10 Next
This topic has been viewed 5042 times and has 167 replies
ustonapc
All that glitters is not gold: Comparing backtest and out-of-sample performance on a large cohort of trading algorithms

https://blog.curtisnybo.com/comparing-backtes...rt-of-trading-algorithms/

Attachment All that Glitter is not Gold.pdf (544757 bytes) (Download count: 43)


Jan 1, 2020 4:51:11 AM       
regallow
Re: All that glitters is not gold: Comparing backtest and out-of-sample performance on a large cohort of trading algorithms

Thanks, James. I found it interesting and informative.

Bob

"Logic is a systematic method of coming to the wrong conclusion with confidence." - Unknown

Jan 1, 2020 2:12:44 PM       
ustonapc
Re: All that glitters is not gold: Comparing backtest and out-of-sample performance on a large cohort of trading algorithms

Thanks Bob, I am glad you found the paper interesting and informative.

In fact, because of the very weak correlation (R² < 0.25) between in-sample (IS) and out-of-sample (OOS) performance reported in the paper, which is based on Quantopian's data, the scoring system at Quantopian for submitted algos now includes a 6-month out-of-sample evaluation period.

At Quantiacs, another crowdsourcing quant site, the scoring system first calculates the Sharpe ratio of your submitted algo, which is your first score. They then simulate your system for three months on live data, which gives your second Sharpe ratio score. The lower of the two is your final score. The arrangement is meant to rule out overfits (in the backtest) and lucky wins (in the three months of live data).
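
The "lower of the two" rule described above can be sketched in a few lines of Python. This is only an illustration, not Quantiacs' actual code: the function names, the 252-day annualization factor, the zero risk-free rate, and the return series are all assumptions made up for the example.

```python
import math
import statistics

TRADING_DAYS = 252  # assumed annualization factor, not taken from Quantiacs


def sharpe_ratio(daily_returns, periods_per_year=TRADING_DAYS):
    """Annualized Sharpe ratio of daily returns, assuming a zero risk-free rate."""
    mean = statistics.fmean(daily_returns)
    stdev = statistics.stdev(daily_returns)  # sample standard deviation
    if stdev == 0:
        raise ValueError("returns have zero variance; Sharpe ratio is undefined")
    return mean / stdev * math.sqrt(periods_per_year)


def final_score(backtest_returns, live_returns):
    """Hypothetical final score: the lower of the backtest and live Sharpe ratios."""
    return min(sharpe_ratio(backtest_returns), sharpe_ratio(live_returns))


# Made-up daily returns, for illustration only.
backtest = [0.004, -0.001, 0.003, 0.002, -0.002, 0.005]
live = [0.001, -0.003, 0.002, -0.001, 0.001, 0.001]

score = final_score(backtest, live)
print(f"backtest Sharpe: {sharpe_ratio(backtest):.2f}")
print(f"live Sharpe:     {sharpe_ratio(live):.2f}")
print(f"final score:     {score:.2f}")
```

Taking the minimum means a strong backtest cannot compensate for a weak live period, and a lucky live period cannot compensate for a weak backtest.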

James

Jan 1, 2020 3:45:41 PM       
Jrinne
Re: All that glitters is not gold: Comparing backtest and out-of-sample performance on a large cohort of trading algorithms

Hi James,

I made a New Year’s resolution not to discuss statistics on this forum. But this seems to be a topic of interest for at least one person reading the forum today so I may make a small exception (while my neural-net-training runs on Python).

That is an extremely high R^2 for what we do!!!

For comparison, take something that we all can agree upon (probably). We all agree, I think, that the AvgDailyTotal is correlated to the slippage.

When I look, there is a correlation, as one would expect, but not THAT MUCH of a correlation. Also, none of the papers about slippage report a correlation that high.

Still, I need to digest this. The average Alpha is near zero in the study. The IR correlation is negative and of course the annualized returns are negatively correlated. A high Sharpe Ratio, by itself, is not what I am interested in.

This all leads to one rhetorical (but serious) question: will this help me make money? Rhetorical because I do not want to be put on the spot of having to answer my own question.

Maybe the lesson is to find a low-variance portfolio (with a higher Sharpe Ratio that may persist) and leverage the portfolio. That is a strategy that led to disaster for Long-Term Capital Management, which was low variance until it wasn’t. But as I said, I am still thinking about what the paper means.
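
To make the leverage idea concrete, here is a minimal sketch under idealized assumptions: borrowing at the risk-free rate, no financing costs or margin calls, and volatility that stays constant. The numbers and function names are hypothetical. Under those assumptions leverage scales excess return and volatility by the same factor, so the Sharpe ratio does not change.

```python
def levered(excess_return, volatility, leverage):
    """Scale annual excess return and volatility by a leverage factor.

    Idealized: assumes borrowing at the risk-free rate with no costs.
    """
    return excess_return * leverage, volatility * leverage


def sharpe(excess_return, volatility):
    """Sharpe ratio from annual excess return and annual volatility."""
    return excess_return / volatility


# Hypothetical low-variance portfolio: 2% excess return, 1% volatility.
base_excess, base_vol = 0.02, 0.01
lev_excess, lev_vol = levered(base_excess, base_vol, leverage=10)

print(f"unlevered: excess={base_excess:.2%} vol={base_vol:.2%} "
      f"Sharpe={sharpe(base_excess, base_vol):.2f}")
print(f"10x:       excess={lev_excess:.2%} vol={lev_vol:.2%} "
      f"Sharpe={sharpe(lev_excess, lev_vol):.2f}")
```

The sketch leaves out exactly what matters in the LTCM case: the volatility input is an estimate, and if realized volatility jumps, the levered losses scale by the same factor.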

Thank you for the post and perhaps my resolution should have been to not post statistics unless the thread is about statistics;-)

-Jim

From time to time you will encounter Luddites, who are beyond redemption.
--de Prado, Marcos López on the topic of machine learning for financial applications

Jan 1, 2020 5:47:39 PM       
Edit 6 times, last edit by Jrinne at Jan 1, 2020 6:11:04 PM
ustonapc
Re: All that glitters is not gold: Comparing backtest and out-of-sample performance on a large cohort of trading algorithms

Hi Jim.

You are right about not focusing too much on the Sharpe ratio.

As you can see in the abstract of the paper, they actually find that the Sharpe ratio (by itself) offers little value in predicting out-of-sample performance (R² < 0.025). However, it also mentions that the latest-year (IS) Sharpe ratio is one of the better evaluation metrics in Quantopian's data.

I believe they found that the overall predictive power, using all the backtest evaluation metrics together, is R² < 0.25, which implies that less than 25% of the variance in out-of-sample performance can be explained by the in-sample data. I do not think this level is high at all.
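
As a rough illustration of what an R² of this size means, the sketch below computes R² for a one-variable least-squares fit, which equals the squared Pearson correlation. The data is synthetic and is not the Quantopian data from the paper; the 0.2 slope, the noise level, and the variable names are arbitrary assumptions.

```python
import random
import statistics


def r_squared(x, y):
    """R^2 of a one-variable least-squares fit of y on x.

    For a single predictor this equals the squared Pearson correlation.
    """
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy * sxy / (sxx * syy)


# Synthetic example: out-of-sample Sharpe depends only weakly on in-sample Sharpe.
rng = random.Random(0)
is_sharpe = [rng.gauss(1.0, 1.0) for _ in range(1000)]
oos_sharpe = [0.2 * s + rng.gauss(0.0, 1.0) for s in is_sharpe]

r2 = r_squared(is_sharpe, oos_sharpe)
print(f"R^2 = {r2:.3f} -> about {r2:.1%} of OOS variance explained by IS")
```

An R² of 0.25 would correspond to a correlation of 0.5, and an R² of 0.025 to a correlation of roughly 0.16, which helps when comparing against correlations reported elsewhere.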

James

Jan 1, 2020 6:11:47 PM       
Jrinne
Re: All that glitters is not gold: Comparing backtest and out-of-sample performance on a large cohort of trading algorithms

Hi Jim.

You are right about not focusing too much on the Sharpe ratio.

As you can see in the abstract of the paper, they actually find that the Sharpe ratio (by itself) offers little value in predicting out-of-sample performance (R² < 0.025). However, it also mentions that the latest-year (IS) Sharpe ratio is one of the better evaluation metrics in Quantopian's data.

I believe they found that the overall predictive power, using all the backtest evaluation metrics together, is R² < 0.25, which implies that less than 25% of the variance in out-of-sample performance can be explained by the in-sample data. I do not think this level is high at all.

James

I disagree. I think it is very high.

Quite a bit higher than for the Designer Models' annualized returns, for example (a result that is duplicated by the study you link to). There is a negative correlation for the Designer Models too.

For me it is just that you COULD FIND you have a high Sharpe Ratio buying bonds that pay 2%. High Sharpe Ratio but I would rather be in the market long term. BTW, Long-Term Capital Management thought leveraging the bonds was a winning strategy (which it was not).

-Jim

From time to time you will encounter Luddites, who are beyond redemption.
--de Prado, Marcos López on the topic of machine learning for financial applications

Jan 1, 2020 6:24:39 PM       
Edit 4 times, last edit by Jrinne at Jan 1, 2020 6:28:28 PM
ustonapc
Re: All that glitters is not gold: Comparing backtest and out-of-sample performance on a large cohort of trading algorithms

Jim,

If you disagree, I guess you are disagreeing with the findings in the paper and with Quantopian's data, not with me.

I just agree with their view that more focus should be put on the out-of-sample evaluation period when comparing trading algos.

James

Jan 1, 2020 6:35:47 PM       
Jrinne
Re: All that glitters is not gold: Comparing backtest and out-of-sample performance on a large cohort of trading algorithms

Jim,

If you disagree, I guess you are disagreeing with the findings in the paper and with Quantopian's data, not with me.

I just agree with their view that more focus should be put on the out-of-sample evaluation period when comparing trading algos.

James


Actually, not disagreeing.

Just asking if you are aware of anything with a higher positive correlation than what is cited in the study. I guess I could get a more negative correlation than the annualized returns—cited in the study—by burning my money.

But, personally, I cannot think of a study (in finance) with a higher positive correlation. As I said the correlation of ADT with slippage is an example of a study with a lower positive correlation. I sincerely cannot think of one that had that high of a correlation.

Probably just me.

-Jim

From time to time you will encounter Luddites, who are beyond redemption.
--de Prado, Marcos López on the topic of machine learning for financial applications

Jan 1, 2020 6:41:46 PM       
Edit 3 times, last edit by Jrinne at Jan 1, 2020 6:45:28 PM
ustonapc
Re: All that glitters is not gold: Comparing backtest and out-of-sample performance on a large cohort of trading algorithms

Instead of leveraging bonds, which led to the collapse of LTCM, another way to achieve high return and a high Sharpe ratio is to follow in the footsteps of the Medallion Fund (Renaissance Technologies) and run a leveraged stat arb (statistical arbitrage) stock trading strategy, which remains a secret sauce at RT.

Jan 2, 2020 2:02:00 AM       
Jrinne
Re: All that glitters is not gold: Comparing backtest and out-of-sample performance on a large cohort of trading algorithms

Instead of leveraging bonds, which led to the collapse of LTCM, another way to achieve high return and a high Sharpe ratio is to follow in the footsteps of the Medallion Fund (Renaissance Technologies) and run a leveraged stat arb (statistical arbitrage) stock trading strategy, which remains a secret sauce at RT.

Yes! That is a VERY interesting method.

From time to time you will encounter Luddites, who are beyond redemption.
--de Prado, Marcos López on the topic of machine learning for financial applications

Jan 2, 2020 4:31:15 AM       
Edit 1 times, last edit by Jrinne at Jan 2, 2020 4:31:46 AM