What's wrong with Price/Book???

As you know, price-to-book is the quintessential value factor. So why does the highest Book/Price (the inverse of Price/Book) bucket show a return of negative 38%?!?



Chaim,
I think this has always been the case. That is why Piotroski only buys the high book/price companies that pass 8 or 9 of his nine tests. Note that he also shorted the companies in the highest book/price bucket that passed only a few of those tests, i.e., a low Piotroski score. These companies are in real trouble.

O’Shaughnessy, Zacks and others have published backtest results showing a drop-off for the highest buckets. For Zacks, it is only when you couple a high Zacks Rank with a high book/price that the highest buckets outperform. However, I do not believe that they used that many buckets and I don’t think their tests showed negative numbers. So I can’t be sure that there isn’t something else going on.

Regards,

Jim

Thanks Jim for taking the time to reply. You alluded to two theories. I had a third. Let us lay out the theories and test them.

Theories:
A. The history of the data is too short to reach a reliable conclusion.
B. These companies are in trouble. ‘In trouble’ can be measured by:
a low PScore, negative EPS, high DbtTot2EqQ, low CurRatioQ, low cash flow, or downward estimate revisions.
C. The accounting of book value is imperfect. This shows up in the highest book/price buckets only.

Evidence:
If C were true, the accounting issues would also show up as randomness in the middle buckets, not just as a collapse confined to the top. Therefore C is not likely the reason for the steep drop-off in the top buckets.
I would reject A as well. Others have seen this pattern in other tests (Jim mentioned a few), and the pattern is too clear over fifteen years to be random.
That leaves us with B; these companies are in distress. Thanks.
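
For concreteness, here is a rough sketch (in Python, outside of p123) of how theory B's "in trouble" tests might be combined into a single score. The column names and thresholds are illustrative placeholders, not p123 factors:

    import pandas as pd

    def distress_score(df: pd.DataFrame) -> pd.Series:
        """Count how many of theory B's 'in trouble' tests a stock fails.
        Column names and thresholds are illustrative placeholders only."""
        return (
            (df["piotroski_score"] <= 3).astype(int)       # low F-Score
            + (df["eps_ttm"] < 0).astype(int)              # negative earnings
            + (df["debt_to_equity"] > 2.0).astype(int)     # heavy leverage
            + (df["current_ratio"] < 1.0).astype(int)      # weak liquidity
            + (df["cfo_ttm"] < 0).astype(int)              # negative operating cash flow
            + (df["eps_est_revision_4w"] < 0).astype(int)  # estimates being cut
        )

    # Example: flag names in the top book/price bucket that fail 3 or more tests.
    # troubled = bucket_df[distress_score(bucket_df) >= 3]

The evidence question then becomes whether stocks in the top book/price buckets score noticeably worse on a flag like this than stocks in the middle buckets do.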

Actually, I would not go quite so far as to describe Price/Book (or Book/Market) as the quintessential value factor. It was viewed that way for a long time and it is heavily used in a lot of academic research. But lately, it's been losing some steam in favor of other factors. It's still OK. But I'm not sure it's quintessential anymore.

Now, back to your observation: it's not unusual at all, and theory B is most likely closest to the truth. But I'd broaden it to the point where almost any "extreme" factor value is questionable. It can reflect the influence of unusual accounting items. It can reflect extreme fundamental situations (such as your example in B). And we have to pay homage to plain-vanilla mean reversion.

That's why it can be so valuable to combine ranking systems with screens/buy rules that go much further than defining trading liquidity thresholds. An important role for the screen/buy rules is to carve out a smaller list of candidates, pre-qualified to weed out the situations that can make a rank less relevant.

Marc, I just confirmed that many other value factors also show a drop in the rightmost bucket. Could you walk us through how you would go about figuring out the cause, and how you would screen these stocks out using rules?

Thanks.

Chipper6, I realize I just started a post on virtually the same subject (before reading this one). Sorry about that… Here it is: https://www.portfolio123.com/mvnforum/viewthread_thread,8981 . Should we merge the posts?

First off, all extremes are suspect. The easiest way to see this is to do a single-factor sort on a broad universe, check the top few companies, and look into why they are so extreme on that particular factor. If it's a value factor, it may well be that the Street expects the company to fall off a cliff. And in today's world, the Street is not nearly as ignorant as folklore suggests. So if you see a PE of 7, for example, your first gut reaction should be: Dog. This isn't going to be true 100% of the time. But it has to be the starting point.

There is no science to screening this away. It's the reason why I prefer portfolios of at least 20 stocks when it comes to investing my money. The goal of the screen/buy rules is to try to find clues that tilt the probabilities in the direction of not-dog. The number of potential ways to do this is countless, but some things one might want to consider working with are: good ROEs, a trend of rising ROEs, increasing estimates, increasing LTG forecasts, net insider buying, good earnings quality (accruals tend to be less persistent, so you may want to look for lower TATA, which is total accruals to total assets), technical factors that suggest increased buying, etc.
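
On the earnings-quality point, TATA is simply total accruals scaled by total assets. Here is a minimal Python sketch, using one common cash-flow-based definition of accruals and hypothetical column names (not p123's built-in fields):

    import pandas as pd

    def tata(df: pd.DataFrame) -> pd.Series:
        """Total accruals to total assets, using one common cash-flow definition:
        accruals = net income - cash from operations, scaled by average assets.
        Column names are hypothetical placeholders."""
        accruals = df["net_income_ttm"] - df["cfo_ttm"]
        avg_assets = (df["total_assets"] + df["total_assets_prior_yr"]) / 2.0
        return accruals / avg_assets

    # Lower TATA generally means earnings are backed by cash rather than accruals,
    # i.e. better earnings quality; it is one of the "not-dog" clues listed above.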

That's the basic approach. The mindset is: this is likely to go sour more often than not. What fundamental/technical characteristics might suggest this company is an exception?

By the way, this is exactly the way the Piotroski F-Score came into being. His starting point was the widely accepted and generally accurate idea that companies with low PB (or, as he and other academicians tend to phrase it, high BM) are dogs. His goal was to see if accounting data (the sort of items we now accept as standard factors or formulas) could help one identify exceptions.
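
For reference, the nine F-Score signals can be sketched roughly as follows. This is a simplified rendering with hypothetical column names, not Piotroski's exact variable construction:

    import pandas as pd

    def f_score(df: pd.DataFrame) -> pd.Series:
        """Simplified Piotroski F-Score (0-9); one point per passing signal.
        Column names are hypothetical current/prior-year fields."""
        s = pd.DataFrame(index=df.index)
        # Profitability signals
        s["roa_pos"]     = df["roa"] > 0
        s["cfo_pos"]     = df["cfo_to_assets"] > 0
        s["roa_up"]      = df["roa"] > df["roa_prior"]
        s["accrual"]     = df["cfo_to_assets"] > df["roa"]   # cash flow exceeds accrual earnings
        # Leverage, liquidity and dilution signals
        s["leverage_dn"] = df["ltd_to_assets"] < df["ltd_to_assets_prior"]
        s["curr_up"]     = df["current_ratio"] > df["current_ratio_prior"]
        s["no_dilution"] = df["shares_out"] <= df["shares_out_prior"]
        # Operating efficiency signals
        s["margin_up"]   = df["gross_margin"] > df["gross_margin_prior"]
        s["turnover_up"] = df["asset_turnover"] > df["asset_turnover_prior"]
        return s.sum(axis=1)

    # Piotroski applied this within the high book/market (low PB) universe,
    # going long scores of 8-9 and shorting the low scores.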

Thanks Marc.

While it is hard to refute the facts, I am struggling to understand the theory. If there is a reason why bucket #200 is so cheap, then why not bucket #175? Why is there an upward slope until the rightmost buckets?

The only theory available is that higher B2M should lead to higher future returns (based on a chain of logic that passes through the relationship between BM and ROE, from ROE to earnings growth, from earnings growth to superior future dividend-paying potential, and on to a higher value in an abstraction of a dividend discount model). It doesn't work in individual situations where something tells us that what the data appears to show isn't really so, for one reason or another. So essentially, everything we do is an attempt, in one of the probably infinite number of possible ways, to point our models toward situations where we really are seeing what we think the data is telling us.
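
To make that chain concrete: in a constant-growth dividend discount model with clean-surplus accounting, the justified price-to-book works out to (ROE - g) / (r - g), so a higher sustainable ROE supports a higher P/B while a collapsing ROE justifies trading below book. A toy calculation (the inputs are made up purely for illustration):

    def justified_pb(roe: float, growth: float, required_return: float) -> float:
        """Justified price-to-book from a constant-growth dividend discount model
        with clean-surplus accounting: P0/B0 = (ROE - g) / (r - g)."""
        return (roe - growth) / (required_return - growth)

    # Illustrative inputs only: a 15% ROE with 4% growth and a 9% required return
    # "deserves" about 2.2x book; cut ROE to 6% and the same model says ~0.4x book.
    print(justified_pb(0.15, 0.04, 0.09))  # ~2.2
    print(justified_pb(0.06, 0.04, 0.09))  # ~0.4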

Extremes are the easiest places to find false signals, extremes in many things, not just B2P.

In quant terms, the best way I can put it is that in the basic formulation of y = a + bx + e, what you're seeing in your top bucket is a big value for e, the error term. In theoretical formulations, e is presumed to be random. But often, if you study the companies that fall within the e bucket, you may find more regularity than the simple formula supposes.
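
A minimal sketch of that idea: fit the simple linear model, then inspect the residuals of the extreme-bucket names rather than assuming they are noise. The data below is simulated just to show the mechanics; in practice x and y would be your B/M percentiles and forward returns:

    import numpy as np

    # y = a + b*x + e: forward returns regressed on book-to-market percentile.
    rng = np.random.default_rng(0)
    x = rng.uniform(0, 1, 500)                      # stand-in for B/M percentile
    y = 0.02 + 0.05 * x + rng.normal(0, 0.03, 500)  # stand-in for forward returns

    b, a = np.polyfit(x, y, 1)                      # slope, intercept
    residuals = y - (a + b * x)

    # In theory the residuals are random noise. The suggestion above is to take
    # the names in the extreme bucket, sort them by residual, and read the filings:
    # the "noise" often turns out to be inflated book, one-off gains, or looming
    # write-downs rather than randomness.
    top_bucket = residuals[x > 0.95]
    print(top_bucket.mean(), top_bucket.std())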

You can't address this by going any further with the B2M model. You'd basically have to start by looking individually at the companies in the extreme nightmare portfolio to see what it is that's inflating book to a non-representative level. Accounting irregularities are possible (a long string of accruals that's about to run dry). Unusual gains can inflate book. So, too, can expectations of big reductions in upcoming book.

If I recall your sim graph correctly, it seems you had a walloping decline early on and then moved pretty much sideways. What was the early portion of the sim? 1999-2002? If so, we know what a mess things were back then, with many situations involving unsustainably inflated book values. The question for further study is whether these situations are predictable to the point where a short-oriented trading strategy can be created.

So looking for rational reasons:

If you run a 5-stock screen with a quick rank on Pr2BookQ (lower is better), a huge percentage of the stocks no longer exist. Bankrupt?

If I know my bankruptcy law (probably not, by the way), any remaining value from the sale of a company's assets after it declares bankruptcy goes to the bondholders and other debt holders first.

My theory might be that if the price of the stock gets too far below book value, the market is betting that any value the company has (i.e., book value) will end up in the bondholders' hands and the stock will end up worthless. So a very low price (think top bucket of 200 buckets) implies the market thinks there is a high bankruptcy risk. Sometimes the market gets it right.

Speculation but not a huge leap, I think.

Regards,

Jim

[quote]
You can't address this by going any further with the B2M model. You'd basically have to start by looking individually at the companies in the extreme nightmare portfolio to see what it is that's inflating book to a non-representative level. Accounting irregularities are possible (a long string of accruals that's about to run dry). Unusual gains can inflate book. So, too, can expectations of big reductions in upcoming book.
[/quote]Marc, can you walk us through an example? I have this simple screen that selects the cheapest 30 stocks from the PRussell 3000. How would you go about weeding this list: (a) which tools would you use, and (b) how would you decide if the results are meaningful?

Thanks!

It needs to be pointed out that the highest ranked B2P stocks that don't go bankrupt can explode to the upside when the market comes out of a recession. I ran a backtest of Chaim's screen for the 2 months after the market bottomed on March 9, 2009, at the end of the last recession, and I got a 450% return. That's not an annual return, but the total return over 2 months! Here is the backtest:


Nice find, Denny. Here is a screenshot of the holdings during this period. Notice how nine out of the thirty stocks lost more than 30% during this bounce.


And look how many stocks are no longer trading!!!

Denny,

Any idea how you could time an entry into ports that behave like this? I have one that is too volatile for me to trade but when timed properly the returns are extreme.

I hate to rain on any parade, but notice that this was an unusual, once-in-who-knows-how-long period (the initial recovery burst following a crash) and that most of the positions are penny stocks that might or might not have been genuinely tradable at the prices indicated in the database.

Here's a more mainstream strategy I whipped together in a couple of minutes:

https://www.portfolio123.com/app/screen/summary/144567?st=1&mt=1

The main thing here is a screen which, run against the whole Russell 3000 universe, produces about 200 stocks selected based on factors I expect would reduce the likelihood of PB giving a distorted message:

Rule 1 is: rating(“Basic: Quality”)>80

ROE figures prominently here (along with other things related to ROE), and since higher ROE justifies a higher PB, that gets us off to a good start.

Rules 2/3 address analyst expectations and are inconsistent with the notion of a company expected to soon deteriorate.

LTGrthMean > LTGrth4WkAgo
or
CurFYEPSMean > CurFYEPS1WkAgo
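
Outside of p123, the same pre-qualification idea might be sketched like this in Python. The column names are hypothetical stand-ins for the rating and estimate factors above, not real field names:

    import pandas as pd

    def prequalify(df: pd.DataFrame) -> pd.DataFrame:
        """Carve a broad universe down to names where a low P/B is more likely to
        be a valid signal. Column names are hypothetical stand-ins; the thresholds
        simply mirror the rules above."""
        quality_ok = df["quality_rating"] > 80                         # Rule 1: high composite quality
        estimates_ok = (
            (df["ltg_mean"] > df["ltg_mean_4wk_ago"])                  # Rule 2: LT growth forecast rising
            | (df["cur_fy_eps_mean"] > df["cur_fy_eps_mean_1wk_ago"])  # Rule 3: current-year estimate rising
        )
        survivors = df[quality_ok & estimates_ok]
        # Rank the survivors by book/price (higher = cheaper) and take the top 30.
        return survivors.sort_values("book_to_price", ascending=False).head(30)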

The current stock list produced by the model is investable. The smallest company has a market cap of $182.4 million, and 22 of the 30 stocks have market caps above $1 billion.

Because I know lower PB is better assuming there's nothing there to distort the signal, and because the ranked list was very heavily pre-qualified (3,000 stocks down to about 200) by factors logically related to the idea that whatever PBs we're seeing are likely to be delivering valid signals, this is the sort of approach I'd be willing to take out of sample.

A very, very high book-to-market to me means a value trap: a company that is very cheap to buy, but there is a reason for it. There are many possible reasons, but they all point to a lack of investor confidence in the long-term prospects of the company. I think that is why Marc mentions a trend of rising ROEs, increasing estimates, increasing LTG forecasts, net insider buying… They all point to a rosier future. Without a good future, a company is not worth buying today, at any price.

Having said that, I created a simple rank based on just Earnings Yield using the SP500 as the universe. It does better, in general, than the SP500 in toto since 1999. Why wouldn't this just highlight value traps? I think the difference is that the criteria the Standard & Poor's index committee uses to select stocks for the SP500 weed out some losers, e.g., "Companies should have four consecutive quarters of positive as reported earnings, where as-reported earnings are defined as GAAP Net Income excluding discontinued operations and extraordinary items." So they pre-screen the stocks, and therefore some of the value-trap dogs disappear. But even with this, my simple rank fails to beat the market for long stretches of time. I guess the bottom line is to look not only at valuation but also at the other aspects that create long-term company growth.
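
For what it's worth, here is a bare-bones sketch of that earnings-yield rank in Python, with hypothetical column names; the second function roughly mimics the four-quarters-of-positive-earnings part of the committee's pre-screen:

    import pandas as pd

    def earnings_yield_rank(df: pd.DataFrame) -> pd.Series:
        """Percentile rank (0-100) by trailing earnings yield, higher = cheaper.
        Column names are hypothetical placeholders."""
        ey = df["eps_ttm"] / df["price"]
        return ey.rank(pct=True) * 100

    def four_positive_quarters(df: pd.DataFrame) -> pd.Series:
        """Roughly mimic the committee's pre-screen: the last four quarterly
        as-reported EPS figures must all be positive."""
        return (df[["eps_q1", "eps_q2", "eps_q3", "eps_q4"]] > 0).all(axis=1)

    # Example usage: ranks = earnings_yield_rank(sp500_df[four_positive_quarters(sp500_df)])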

Humm…

So I wondered whether these 9 stocks were among the very highest ranked stocks when bought, and still probable bankruptcies.
So I omitted the highest 0.2% ranked stocks with the rule Rank < 99.8. That improved the total return to 518%.
I then changed the backtest to weekly instead of 4 weeks and that improved the total return to 631%.
I then changed the number of stocks from 30 to 10 and that improved the total return to 695%.

So now I realize that all I have to do is put all my money under my mattress until the next recession is over (however I determine that) and then put all of it into the highest ranked 10 B2P stocks (except for the highest ranked 0.2%), rebalanced weekly for 2 months! I will then achieve a Gain/Stock/Day of 695/62 = 11.2, and then I can retire! Oh, never mind, I am already retired. :smiley:

Yes, that is true, but if I change the universe from PRussell3000 to tradable stocks with AvgDailyTot(60) > 1000000 & Close(0) > 2.5 then the total return is still 248%.

If we want to envision quick gains of 400%-660% as what's available inside an exclusive club, we can do that. But to get in, you have to get past the big thug manning the door, who goes by the nickname Dr. DD, which is short for Dr. Draw Down. (It looks like the timing models used by many on p123 were on the sidelines during the entire period of the super test.) If you're not willing to show some love to Dr. DD, you're not going to get an opportunity to enjoy the good times inside the club. (And even if you get in, you'll need special club currency because, as I mentioned, these are penny stocks that are not likely to have been available for purchase at those prices if you try to redeem using U.S. dollars.)
:slight_smile:

Marc, thanks for the sample. Is this the way you do it: by theorizing and testing various rules, or do you look for some common denominator among the chaff first, using some other tools?

Chaim

I’m not sure I understand the question.

But yes, this is how I work. It starts with an idea of what should work. Actually, it's known for sure that the "idea" does work. No testing is needed for that. It comes entirely from the logic behind the pricing of stocks. But what we know is the logic. All the uncertainties come from (1) the inputs and (2) the expression.

We don't deal so much with inputs at p123 because we're not using expected-return-type formulations that are computed from factors and coefficients. For example, I might have a model that computes expected return as a + (.12 * 5Y EPS growth) - (.83 * PE) . . . and whatever. For that to work, you'd have to input EPS growth, PE, etc. with a credible degree of accuracy.

For us, expression is the main thing. How can we translate ideas into language the server can process using the data it has? The question is simple. The answer is not. (If it were, we'd all be thumbing our noses at Donald Trump, viewing him as a pauper.) When I test a model, I'm not testing the idea. I'm only testing the reasonableness of the way I expressed it in p123/Compustat language.

So in this sample, I know low PB is desirable (the lower the better) if it's not picking up what David correctly described as a value trap. Expressing low PB is easy. We have a pre-built PB factor and can easily put it in a screening rule, a ranking system or a quick rank. The challenge is figuring out how to tell p123 to eliminate value traps. The screening rules in my sample model are one of countless possible ways of doing it. The test suggested it's got at least some credibility.

Another way I verify the reasonableness of the expression is to look at a sample of a stock list produced by the model. I need tradability and consistency with the spirit of the law as per the model. The final check is to break the period down into smaller backtests. I don't typically like to work with the Max period because the market changes over time and I want to check whether my expressions remain satisfactory to Mr. Market now and potentially going forward. (I tend to ignore 1999-2002 and 2008 because they don't answer these latter questions.)

That’s it.

  1. The idea - traceable to logical market/financial theory
  2. The expression of the idea in p123 language (by far the hardest step)
  3. Testing the reasonableness of the expression of the idea
  4. Reviewing the right-now relevance of what I’m trying to do.

Bear in mind I'm not an academician looking to discover universal or near-universal truths. So the biggest-possible, as-many-different-scenarios-as-possible sampling approach doesn't apply. Non-probability (purposive) sampling is more appropriate when it's just some schnook trying to make a buck in the market.