Slope is (Top Bucket - Bottom Bucket)/Constant

I am sure everyone has been riveted by the discussions of whether to use top bucket minus the bottom bucket or to use the slope of the rank performance test.

THIS ISSUE IS NOW DEAD FOR LINEAR DATA. That is because they are the same thing. Whatever weakness one has, the other has.

The slope is just the top bucket minus the bottom bucket over the delta X for a linear function (by definition). Delta X will be the same for each calculation as long as you use the same number of buckets each time, so it is a constant.

Point: people who advocate slope ARE USING TOP BUCKET MINUS BOTTOM BUCKET.

Those who have been watching this riveting debate sponsored by P123 should change to another channel. I hope everyone enjoyed it.

Please tune in tomorrow when everyone randomly changes their opinion.

Nonlinear stuff is different.

-Jim

Jim,

Our data isn’t linear. If you have more than two data points and they’re not in a straight line, the slope of all the data points together is going to be different from the slope taken from just the first and last (or top and bottom) data points.

Let’s say you have five buckets with the following returns: 2%, 8%, 10%, 4%, 6%. The slope would be 0.004, the difference between the largest and smallest buckets would be 0.08, and the difference between the first and last buckets would be 0.04. Now let’s say you switched the second and fourth buckets, so that your five buckets were 2%, 4%, 10%, 8%, 6%. The top and bottom and biggest and smallest buckets are unchanged. But the slope has tripled to 0.012.

Let’s say X is the average of all x values and Y is the average of all y values. Then the slope is the sum of (x - X) * (y - Y) divided by the sum of (x - X)^2.
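The formula above can be checked against the five-bucket example. This is a sketch that assumes the x values are simply the bucket numbers 1 to 5; the function name `slope` is just for illustration.

```python
def slope(returns):
    """Least-squares slope of bucket returns against bucket number 1..n."""
    n = len(returns)
    xs = range(1, n + 1)
    x_bar = sum(xs) / n
    y_bar = sum(returns) / n
    numerator = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, returns))
    denominator = sum((x - x_bar) ** 2 for x in xs)
    return numerator / denominator

original = [0.02, 0.08, 0.10, 0.04, 0.06]
swapped = [0.02, 0.04, 0.10, 0.08, 0.06]  # second and fourth buckets exchanged

for name, r in (("original", original), ("swapped", swapped)):
    # slope, last-minus-first bucket, largest-minus-smallest bucket
    print(name, round(slope(r), 4), round(r[-1] - r[0], 4), round(max(r) - min(r), 4))
```

Under that assumption the slopes come out at 0.004 and 0.012, while the two bucket differences stay at 0.04 and 0.08 in both cases.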

I’m not saying that slope is the best measure of a ranking system, but it does take into account more data than the difference between the top and bottom buckets. And unless the buckets are in a straight line, it’s not the same.

- Yuval

I prefaced my comments with the assumption that the data was linear (and said nonlinear data is different).

I am not sure we disagree on the calculations.

I was just focusing on one particular situation, where things were pretty linear, while highlighting that it did not address all of the situations. I guess it was implicit that one might not be using a line if things were nonlinear enough.

Perhaps the biggest problem with the slope of a regression, as any textbook will tell you, is outliers. And the worst place an outlier can be is at either extreme end. And in my experience the biggest outlier is usually the top bucket, with the bottom bucket being the second-worst outlier. Walter has an example of that and there are other examples in the public ranking systems. But I suspect you only have to look at your own systems for verification.

As it is, we at P123 generally have the worst situation possible for using slope. This is not separate from the problem of nonlinearity and of skew.

I think the outlier problem is about as serious as it gets. One may still want to use slope, but you have to be in denial to not be concerned.

When the outliers are severe enough you are getting VERY close to taking the top bucket minus the bottom bucket anyway. I will not try to quantify that here but anyone can see that the other buckets lose their impact on the slope if the outliers are significant enough.
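For anyone who does want a rough illustration: the least-squares slope can be written as a weighted sum of the bucket returns, where each bucket's weight is (x - X) divided by the sum of (x - X)^2. The sketch below assumes equally spaced buckets numbered 1 to n, and the returns are made up to show the effect, not taken from any real ranking system.

```python
def slope_weights(n):
    """Weight each bucket's return receives in the least-squares slope (buckets 1..n)."""
    x_bar = (n + 1) / 2
    sxx = sum((x - x_bar) ** 2 for x in range(1, n + 1))
    return [(x - x_bar) / sxx for x in range(1, n + 1)]

weights = slope_weights(5)
print(weights)  # largest magnitude at the two ends, zero for the middle bucket

# Hypothetical returns: flat middle buckets, outliers at both ends.
returns = [-0.10, 0.05, 0.05, 0.05, 0.30]
slope = sum(w * r for w, r in zip(weights, returns))

# Part of the slope that comes from the bottom and top buckets alone.
end_contribution = weights[0] * returns[0] + weights[-1] * returns[-1]
print(slope, end_contribution)
```

With five buckets the weights are -0.2, -0.1, 0, 0.1 and 0.2, so the end buckets count twice as much as their neighbours. In this constructed case the flat middle buckets cancel and the whole slope comes from the two ends.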

Are we in agreement that we do not want to forget the middle buckets and we might at least look at a metric that actually measures these buckets when outliers are a bad problem? I do not mean to imply that they are always a bad problem. I do not have perfect insight into this.

In my case I would just as soon dispense with the fantasy of the slope being a line that fits anything and take the top minus the bottom from the outset. After all, slope is even a little bit off on how the top and bottom buckets are doing. What it tells me is highly dependent on the specifics of the skew and outliers. And it could literally be telling me nothing I want to know.

And again, I do not have perfect insight or knowledge regarding numbers on this but that does not make me love slope. Instead, it makes me dislike it.

I recommend you keep slope nonetheless.

I am not recommending removing anything and you do not need my input for that.

Spearman’s Rank Correlation can pick up some of the slack regarding how rank is affecting those middle buckets, for those who do not like slope or think their particular ranking system has outliers that are too extreme. AND IT COULD SERVE AS CONFIRMATION THAT THE SLOPE IS A USEFUL METRIC FOR SOME SYSTEMS. In other words, it can show that the parametric methods are not too far off from what the nonparametric measures are telling you. I almost always look at both, myself.
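A minimal sketch of Spearman's rank correlation applied to bucket returns, assuming no tied returns (so the simple difference-of-ranks formula applies) and using made-up numbers. The helper names `ranks` and `spearman` are illustrative only.

```python
def ranks(values):
    """Rank values from 1 (smallest) to n; assumes no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result

def spearman(returns):
    """Spearman rank correlation between bucket number and bucket return (no ties)."""
    n = len(returns)
    return_ranks = ranks(returns)
    d_squared = sum((b - r) ** 2 for b, r in zip(range(1, n + 1), return_ranks))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))

print(spearman([0.02, 0.08, 0.10, 0.04, 0.06]))  # buckets partly out of order
print(spearman([0.02, 0.04, 0.10, 0.08, 0.06]))  # closer to rank order
print(spearman([0.01, 0.02, 0.03, 0.04, 0.50]))  # perfectly ordered, big top outlier
```

Because only the ordering matters, the size of the top-bucket outlier in the last series does not change the result. That is the sense in which it complements slope.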

What we are doing is ordering and ranking stocks. If that corresponds to an ordering of returns that is a VERY, VERY good thing. I would argue that is what we are about. But for sure, it is worth knowing how we are doing in that regard.

Anyway, good discussion. Your points are good ones. I am learning something even if we are boring Nisser. Nisser, no worries, I am working on a presentation on my iPad now. You can print it out and put it on the refrigerator.

The good discussion is more important than what finally happens regarding the metric.

-Jim

There is an alternative to slope by linear regression. It’s called the Theil-Sen estimator. You take the slope of all possible pairs of points and then take the median. This is computationally intensive but much less sensitive to outliers. It’s a cousin to Kendall’s tau, which works in a similar manner. This will have little resemblance to top-bucket-minus-bottom-bucket. And it’s free from the problems that plague linear regression.
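A minimal sketch of the Theil-Sen idea as described above, assuming buckets numbered 1 to n and made-up returns with one large top-bucket outlier. The function name is illustrative; this is not any library's implementation.

```python
from itertools import combinations
from statistics import median

def theil_sen_slope(returns):
    """Median of the slopes between every pair of buckets (buckets numbered 1..n)."""
    points = list(enumerate(returns, start=1))
    pair_slopes = [
        (y2 - y1) / (x2 - x1)
        for (x1, y1), (x2, y2) in combinations(points, 2)
    ]
    return median(pair_slopes)

# Hypothetical returns: a steady 1% step per bucket, then a large top-bucket outlier.
returns = [0.01, 0.02, 0.03, 0.04, 0.20]

top_minus_bottom_slope = (returns[-1] - returns[0]) / (len(returns) - 1)
print(theil_sen_slope(returns), top_minus_bottom_slope)
```

With this series the pairwise median stays near 0.01 per bucket, while top-minus-bottom over delta X is about 0.0475. On cost: n buckets give n(n-1)/2 pairs, so 200 buckets would mean 19,900 slopes.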

But whether any of these tools–top bucket minus bottom bucket, slope, Theil-Sen, correlation measures–should be applied to bucket returns of a ranking system is another question. If you’re going long, what you’re really interested in is the performance of the top-ranked stocks, and if you’re going short, you’re really interested in the performance of the bottom-ranked stocks. In neither case are you necessarily interested in both. Beautiful bucket ranks that are in perfect order are very impressive to the eye, but do we want to base a strategy on them?

Here are the problems with doing so.

1. Bucket ranks ignore buy and sell rules, which are absolutely key to any strategy’s functionality.
2. Bucket ranks ignore transaction costs. You can include transaction costs, but that just basically makes all the buckets lower.
3. The performance of the bottom bucket is more or less irrelevant to the performance of the top bucket.

Let’s say you created two multi-factor ranking systems whose top buckets both get 30% annual returns, one whose bottom bucket gets 6% and another whose bottom bucket gets -16%. Which one will actually perform better out-of-sample? I doubt the answer lies in that bottom bucket. Or in any of the buckets in-between. The reason is that what works well for a long strategy is not necessarily what will work well for a short strategy. For a long strategy you probably want your buckets to look like an exponential curve, and for a short strategy you probably want your buckets to look like a logarithmic curve. And different combinations of factors will produce different curves. (And annualization or “monthlization” will give you completely different results for all of these than total returns.)

I think the bucket system for measuring rank performance is absolutely great. It allows you to visually interpret the strengths of factors and combinations of factors. By changing the time period and factor weights, you can come up with all sorts of fascinating results. By looking at the curve of bucket ranks, you can come up with ways to deal with factors whose best range lies in the middle or slightly to the right or left. But in my opinion–and here I’m not speaking as a Portfolio123 manager but as an investor who enjoys designing strategies–I think that applying quantitative analysis to these results may be less useful–and more subject to the risks of overinterpretation–than looking at the returns from a rolling backtest. And that’s because I’ve done that (applied quantitative analysis to bucket returns) myself, intensively, and have found little persistence in the results.

Yuval,

I appreciate your interest and the discussion.

-Jim

“There is an alternative to slope by linear regression. It’s called the Theil-Sen estimator. You take the slope of all possible pairs of points and then take the median.”

Unless NAs are eliminated, there is not much point in getting too elaborate.

“If you’re going long, what you’re really interested in is the performance of the top-ranked stocks, and if you’re going short, you’re really interested in the performance of the bottom-ranked stocks. In neither case are you necessarily interested in both.”

This is where I have to disagree. I am interested in whether the ranking node or ranking system is behaving as I expect it to behave.

“Here are the problems with doing so. 1. Bucket ranks ignore buy and sell rules, which are absolutely key to any strategy’s functionality. 2. Bucket ranks ignore transaction costs. You can include transaction costs, but that just basically makes all the buckets lower. 3. The performance of the bottom bucket is more or less irrelevant to the performance of the top bucket.”

Try to think of this as a development tool, not a trading strategy. You don’t trade a ranking system (RS) directly, so why are you trying to impose trading strategies here?

“I think the bucket system for measuring rank performance is absolutely great. It allows you to visually interpret the strengths of factors and combinations of factors.”

The bucket system for measuring rank performance isn’t great, not without a lot of screwing around by selecting different time periods and frequencies. This is why I want to see a graph over time, to see whether the ranking system is “behaving” unto itself, putting aside top bucket and trading strategies.

An alternative that I would be happy with is to show the rank bucket performance as a Sharpe or Sortino Ratio instead of an average. This would be more meaningful.
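A minimal sketch of what that could look like for a single bucket's per-period returns. The assumptions here are a zero risk-free rate, a zero target return, no annualization, and made-up return series. Definitions of both ratios vary, so treat this as one common convention rather than how P123 would compute it.

```python
from math import sqrt
from statistics import mean, stdev

def sharpe_ratio(returns, risk_free=0.0):
    """Mean excess return divided by the sample standard deviation of excess returns."""
    excess = [r - risk_free for r in returns]
    return mean(excess) / stdev(excess)

def sortino_ratio(returns, target=0.0):
    """Mean return over target divided by downside deviation (below-target moves only)."""
    excess = [r - target for r in returns]
    # Raises ZeroDivisionError if no return falls below the target.
    downside_deviation = sqrt(mean([min(0.0, e) ** 2 for e in excess]))
    return mean(excess) / downside_deviation

# Hypothetical per-period returns for two buckets.
steady = [0.01, 0.02, -0.01, 0.015, 0.01, -0.005]
volatile = [0.10, -0.08, 0.12, -0.09, 0.07, -0.06]

for name, r in (("steady", steady), ("volatile", volatile)):
    print(name, mean(r), sharpe_ratio(r), sortino_ratio(r))
```

In this made-up case the volatile bucket has the higher average return but the lower Sharpe and Sortino ratios, which is the distinction a plain average hides.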

Steve

These discussions constantly muddle up two different issues:
(1) that several P123 users would like to be able to see how a ranking system factor has been performing over time;
(2) what is a good measure of ranking system performance.

Unfortunately we always end up getting so lost in point 2 that point 1 gets forgotten about.

Very true. But we do get to see how a ranking system factor has been performing over time if we click on the “performance” button rather than the “annualized return” button. There’s a lot you can do with these results, especially if you download them. It’s a treasure trove of data. What other data would P123 users like to see besides what’s in that chart and download? Should P123 offer interpretations of the data that we do provide? (That seems like a dubious project to me, but maybe I can be persuaded.) Providing more data might be good. I myself would like to see a combination of rolling returns and bucket ranks. But I have no idea if that’s what other folks want, nor how feasible such a thing might be.

IR: the information ratio of the last bucket and/or of the sim. This was Parker’s (Miro’s) idea.

Could be applied to single factors and multiple factors alike.
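A minimal sketch of an information ratio for one bucket (for example the top one), assuming per-period returns for the bucket and for a benchmark, no annualization, and made-up numbers. The function name is illustrative.

```python
from statistics import mean, stdev

def information_ratio(portfolio_returns, benchmark_returns):
    """Mean active return divided by the sample standard deviation of active returns."""
    active = [p - b for p, b in zip(portfolio_returns, benchmark_returns)]
    return mean(active) / stdev(active)

# Hypothetical per-period returns.
bucket = [0.03, -0.01, 0.04, 0.02, -0.02, 0.05]
benchmark = [0.02, -0.02, 0.03, 0.02, -0.03, 0.03]

ir = information_ratio(bucket, benchmark)
print(ir)
```

The same function works whether the return series comes from a single-factor or a multi-factor ranking system, since it only sees the returns.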

-Jim

I would like to see alpha for the top bucket (or all buckets) on a rolling basis.

That way I can determine when exaggerated alpha may have existed and whether there’s any alpha left. The attached plot shows a strategy’s alpha. And since there was an unusual amount pre-2006, I reran that simulation from 2006 onwards. That way, the early over-performance doesn’t overwhelm/distort the overall equity curve. Hope that makes sense.
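A minimal sketch of rolling alpha, assuming alpha is taken as the intercept of a least-squares regression of strategy returns on benchmark returns within each window, with a zero risk-free rate and per-period (not annualized) figures. The data are constructed so that 2% per-period alpha exists early and none later. This is not the calculation behind the attached plot.

```python
def alpha_beta(strategy, benchmark):
    """Intercept (alpha) and slope (beta) of strategy returns regressed on benchmark returns."""
    n = len(strategy)
    s_bar = sum(strategy) / n
    b_bar = sum(benchmark) / n
    cov = sum((b - b_bar) * (s - s_bar) for b, s in zip(benchmark, strategy))
    var = sum((b - b_bar) ** 2 for b in benchmark)
    beta = cov / var
    return s_bar - beta * b_bar, beta

def rolling_alpha(strategy, benchmark, window):
    """Alpha for each consecutive window of the given length."""
    return [
        alpha_beta(strategy[i:i + window], benchmark[i:i + window])[0]
        for i in range(len(strategy) - window + 1)
    ]

# Hypothetical benchmark returns; the strategy tracks it with beta 1,
# plus 2% alpha in the first four periods and none afterwards.
benchmark = [0.01, -0.02, 0.03, 0.00, 0.02, -0.01, 0.01, 0.02]
strategy = [b + (0.02 if i < 4 else 0.0) for i, b in enumerate(benchmark)]

alphas = rolling_alpha(strategy, benchmark, window=4)
print(alphas)  # first window shows the early alpha, last window shows none
```

Windows that straddle the change mix the two regimes, so a real series would need a window long enough to estimate beta but short enough to show the decay.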

Walter


Walter,

I like this. I would like it if rolling were an option rather than the only mode, so that one could also measure the alpha for the top bucket over the whole period selected.

I like it a lot. Doesn’t stop me from liking IR too.

-Jim

Yuval,

I have a question. Is all of this computer intensive and hard to implement?

Or is it a matter of P123 watching out for our best interest? Knowing what is best for us?

I am happy to consider just the pros’ ideas if it is hard to implement. And maybe not even then if it is too computer intensive. I understand the setting of priorities. But these are good ideas requested by pros.

Walter seems like he is a pro and I know Parker is.

I just have a BS in mathematics and chemistry, and I attended UC Berkeley for a while intending to get a degree in physics (I did not get my undergraduate degree from there). I do not have an advanced degree in mathematics or anything related. I took the easy route and went to medical school, so I am not too smart. I can see why you would not want to use any of my ideas on metrics.

Walter has made this request before as did Parker regarding IR.

Of course, P123 may know better.

-Jim

I would like to see some of these things happen, too. But, from P123’s perspective, who will use these new features? Probably just a few of us. The same few that spend time on the forums. On the other hand, P123 hasn’t made any significant additions to its core functionality in a long time. I have submitted feature requests that I think are more important for moving P123 to a fuller, more robust platform. I wish I could see P123’s development roadmap. It would ease some of my angst.

Walter

PS I’m no pro. :)

Walter,

I share your angst.

Who are we attracting on the research side?

-Jim

There are probably at least a hundred ways we can improve our product for existing members and to attract new members. Priorities have to be assigned, and members of the P123 team (including myself) are assigning them. You’ll definitely see a lot of new things rolling out in the next twelve months, most of which will please you a great deal.

My own thinking, which may not reflect that of others on the team, is that the highest priority should be given to projects that are not within the current capabilities of the user. A user CAN figure out the information ratio or the alpha for a simulation using Excel without much difficulty, or can subtract the bottom bucket from the top one, but CAN’T upload customized data, assign dividends to realized transactions, access European data, perform buy-driven simulations, hedge in ETF simulations, add a universe to the “rating” command, access data for maintenance capex or executive compensation, use the optimizer with a formula-weight simulation, etc. etc. etc. I’m hoping that all of those capabilities will be possible at some point down the road, but I certainly can’t promise anything. Everybody has their hands full. And there are many improvements that need to be made to the Invest side of the platform as well, and lots of marketing initiatives, and all the tutorials and guides need to be updated, and we’re starting a blog . . . And I’m not even mentioning another major new feature that will probably roll out within the next month . . .

- Yuval

Hi Yuval,

Did I read you right “access European data”…?

;)

Jerome

It’s still a hope and a possibility.