IMPORTANT: new estimates are now live. Historical data was rebuilt.

Dear All,

We previously announced a problem with the logic for estimates. In short, we were deciding “CurrentY” and “CurrentQ” based on which filings Compustat had processed. However, analysts listen in on conference calls and most of them revise future estimates right away, regardless of SEC filings and providers like Compustat. Therefore we decided to change the concept of “CurrentQ” and “CurrentY” to be more in step with that reality. During this project we also identified the other issues listed below.

All the fixes are now live. Past simulations will produce different results, but the differences should not be significant. Naturally this varies depending on the specifics. The data affected is the ESTIMATE data. The historical point-in-time data has been rebuilt with these changes:

1 - The period pointed to by Current Y/Q and Next Y/Q is now aligned with recent announcements, not with what Compustat has processed. This may include press releases and preliminary reports.

2 - CapitalIq may have multiple estimates for parent company and consolidated numbers. Logic was added to guarantee that the consolidated estimates are used first.

3 - There may be multiple estimates co-existing point-in-time for a company that is changing fiscal month. Logic was added to guarantee picking the estimate with the newest fiscal month.

4 - There may be multiple estimates using different accounting standards. Logic was added to pick the ‘preferred’ estimate (defined by CapitalIq on a period by period basis).

5 - For companies dual-listed in the USA & CAN there may be multiple consensus estimates based on different numbers of analysts. Logic was added to pick the consensus estimate based on the larger pool of analysts, doing the proper currency conversions where necessary.

6 - Stocks that split on Monday in the past caused problems since CapitalIq adjusts estimates for splits on Sunday.

7 - The S&P500 weekly estimates were also rebuilt. This is what’s used in the “Fed Model” charts. I included a comparison of the previous estimates vs the new ones. As you can see, for SP500 stocks the differences seem minimal.
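
As a rough illustration of the selection rules in items 2 to 5, here is a minimal Python sketch. It is not the actual implementation: the `Estimate` fields, the `pick_estimate` function, the sample values, and above all the precedence order among the rules are assumptions made for the example. Currency conversion is left out.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Estimate:
    value: float                  # consensus value (currency conversion not modeled)
    consolidated: bool            # True = consolidated numbers, False = parent company only
    fiscal_month_effective: date  # hypothetical: when this fiscal month definition took effect
    preferred: bool               # provider's 'preferred' accounting-standard flag
    analysts: int                 # number of analysts behind the consensus


def pick_estimate(candidates):
    """Pick one estimate from several that co-exist at the same point in time."""
    if not candidates:
        raise ValueError("no candidate estimates")
    # Tuples compare left to right, so earlier fields outrank later ones.
    # This precedence order is an assumption, not documented behavior.
    return max(
        candidates,
        key=lambda e: (e.consolidated, e.fiscal_month_effective, e.preferred, e.analysts),
    )


# Made-up candidates for one company and one period.
candidates = [
    Estimate(1.10, consolidated=False, fiscal_month_effective=date(2014, 6, 30), preferred=True, analysts=12),
    Estimate(1.25, consolidated=True, fiscal_month_effective=date(2014, 6, 30), preferred=True, analysts=4),
    Estimate(1.30, consolidated=True, fiscal_month_effective=date(2014, 6, 30), preferred=True, analysts=9),
]
chosen = pick_estimate(candidates)
print(chosen.value, chosen.analysts)
```

In this sketch the parent-only estimate loses even though it has the most analysts, because the consolidated flag is compared first.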

Let us know of any issues.


newestimates.PNG

Past simulations are now producing significantly different results.

Questions:

Does this affect quarterly estimates too?

Not sure what you mean by “parent company” and “consolidated numbers”. Can you explain or give an example of such a company (or companies)?

Thanks.

Thank you Marco!!!

Marco & All:

Maybe I misunderstood you. You said something to the effect that you were going to introduce some new functions that we could use to handle (if we chose to do so) the various data issues. That was a good solution. What you’ve done is quite different, however.

On balance my Sim’s performance under the new regime is not in the toilet. Annual return is degraded by about 4%. But this still sucks. It sucks because a 4% annual return compounded over say 10 years is 48%. But the situation is much, much worse…
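
The 48% figure follows from compounding 4% a year for 10 years. A quick check in Python (the variable names are just for illustration):

```python
annual_drag = 0.04  # 4% per year, as quoted in the post
years = 10

# Compound growth over the period, minus the starting 100%.
compounded = (1 + annual_drag) ** years - 1
print(f"{compounded:.1%}")  # prints 48.0%
```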

P123 has a 10-year history of making these so-called data improvements. The cumulative result is that it is impossible to create a trading system and actually implement it. Gradually the effect of 4% here, 4% there adds up, and the system degrades past the point where it’s robustly viable.

Apologies for the double post.

Bill

This change has definitely impacted the Hedge Market Timing rules that were based on EPS estimates.

I just randomly tried a few simulations with the same setup. The average annual return drops. Does that mean the simulation is closer to reality? Also, Marco, can you provide one example of what you mean by:
1 - The period pointed to by Current Y/Q and Next Y/Q is now aligned with recent announcements, not what Compustat processed. This may include press releases and preliminary reports.
It’s pretty hard to understand the description without an example.

Marco,

I also have a few questions.

  1. “The period pointed to by Current Y/Q and Next Y/Q is now ALIGNED with recent announcements”. Does this mean there is some form of guesswork? Has this been “tuned”?
  2. Is the reduction in performance caused by additional jitter in the estimates?
  3. Is the reduction in performance uniform over 10 years, or concentrated into recent years (since we can’t now access the old data, we can’t measure this)?
  4. How confident are you that this will improve real accuracy?

David

I’ve seen both increases and decreases in simulated performance. Overall, a much appreciated effort to make data more timely.
Thanks,
Walter

Any revisions to historic data make this data not “point-in-time” anymore, and therefore it is not “more timely”. I don’t think P123 should make revisions to data at all. There is no point in doing so, because an algorithm that works with revised historic data may not work equally well with current unrevised data.

For example, a hedge entry rule is based on current data. Now the data gets revised, and under the revised data the hedge may no longer have been activated. Thus a portfolio or SmartAlpha model with a recent hedge in place may not be hedged at all if one runs the model as a simulation now.

If P123 wants to revise historic data then it should be made clear that this is not point-in-time data anymore, and a list of such data should be published so that one can avoid this data in one’s models. Simulations which incorporate data subject to revision will not match the corresponding portfolios after a few weeks, because portfolio trades are frozen in time, whereas simulation trades are not.
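
To make the distinction concrete, here is a minimal sketch of a series that keeps every revision together with the date it became known. It is not how P123 stores data: the `VintageSeries` class, the made-up values, and the hedge threshold are all assumptions for illustration.

```python
from bisect import bisect_right
from datetime import date


class VintageSeries:
    """Keeps every revision of one data point, keyed by the date it became known."""

    def __init__(self):
        self._known_on = []  # dates, kept sorted
        self._values = []    # value that became known on the matching date

    def record(self, known_on, value):
        i = bisect_right(self._known_on, known_on)
        self._known_on.insert(i, known_on)
        self._values.insert(i, value)

    def as_of(self, when):
        """Value a user would have seen on `when` (None if nothing was known yet)."""
        i = bisect_right(self._known_on, when)
        return self._values[i - 1] if i else None

    def latest(self):
        """Most recently recorded value, regardless of the date being simulated."""
        return self._values[-1] if self._values else None


# Made-up numbers: an estimate published on Dec 1 and revised on Dec 20.
series = VintageSeries()
series.record(date(2014, 12, 1), 100.0)
series.record(date(2014, 12, 20), 95.0)

HEDGE_BELOW = 97.0          # hypothetical hedge threshold
trade_date = date(2014, 12, 10)

live_hedged = series.as_of(trade_date) < HEDGE_BELOW   # what the live portfolio saw
rerun_hedged = series.latest() < HEDGE_BELOW           # what a rerun on revised data sees
print(live_hedged, rerun_hedged)
```

With these made-up values, the live portfolio on the trade date is not hedged, while a simulation rerun against the revised value is. That is the kind of mismatch described above, in one direction or the other.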

There are a lot of different important issues in what Georg is saying. Too many for me to address in one post, really. His points are particularly important for something like Smart Alpha.

But I don’t see how one could argue for permanently constraining the live information available in a port forever. And ideally one would like to backtest the use of that new information.

Failure to adapt is just failure.

The way I understand the issue, this change removes (or shortens) the process lag sources like Compustat may add. In that sense, it does make the data more timely. And of course, this will affect current models but future models should benefit from it. I still see it as progress.

Walter

EDIT: If it’s possible, perhaps P123 could offer a selector to the EPS estimate functions to return the original functionality. I don’t know how dynamic the DB is currently.

I might add to what Walter said. At times the information wasn’t just late, it was actually incorrect.

Tell me again why late, incorrect information is ever good. Even if there were a glitch introduced that needed a fix (not proven) why not make progress?

Thank you Marco and team for this improvement and especially considering the dates!!
I find it can even be a good tool to identify potential curve-fitting in a strategy that uses estimates.

geov:

Thank you for your post. There's a lot of folk that just don't get it or don't care.

Bill

People are arguing about whether these changes make the data more timely or not. For me, this is evidence that people don’t really know what “changed” or how the data are generated. We just need more information and clarity to understand what’s going on.

Will all data with revisions be point-in-time? I have an indicator using various data subject to revisions (estimates and economic). I got backtest results for the last 2 months that are different from the signals I got in real time, with both the former and the new versions.

J:

You will never have correct data. It just doesn't exist, and it never will. What I'm advocating for is a data set that we can use to actually make money.

Bill

It sounds as if improvements are being made. p123 gave notice there would be a change, too. What users didn’t have was a chance to integrate the new (and hopefully improved) data into the models they are currently using and relying upon to trade.

So, for example, a user could be hedged using the old data and plugging the new data in without adjustment indicates the user shouldn’t be hedged. Is that because the data is better or because the user hasn’t had an opportunity to change the manner in which it should be used?

Related, I have a concern about this week’s current S&P 500 earnings estimate. On December 19th the number was 117.82. On December 26th the number is/was 116.74. Were enough analysts really working Christmas week to lower this year’s collective estimate on the nation’s 500 largest companies by almost 1%?

This 1% drop is almost certainly a reflection of how the data is being collected, processed and communicated, wouldn’t you think? What’s going on?

At the risk of repeating myself, I am a huge fan and very appreciative of p123. But these unilateral changes are very hard - and time consuming - to handle.

Hugh

I’m a bit concerned that some users appear not to have known about the change. Maybe P123 could implement a run-time check that would report (under a new tab) changes like this that were made over the last several months. That wouldn’t address all the issues, just the “Hey, my sims are different … what happened?” kind.

Walter

EDIT: Or how about change notifications made at the time of first login post-deployment. I see the CNN website doing that all the time with their terms of service notices.
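
A notification check of this kind could be quite small. The following Python sketch is only an illustration of the idea: the `DataChange` record, the `unseen_changes` function, and the dates are hypothetical, not an existing P123 feature.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class DataChange:
    deployed_on: date
    summary: str


def unseen_changes(change_log, last_login):
    """Return changes deployed after the user's last login, oldest first.

    A last_login of None (first-ever login) returns the whole log.
    """
    pending = [
        c for c in change_log
        if last_login is None or c.deployed_on > last_login
    ]
    return sorted(pending, key=lambda c: c.deployed_on)


# Hypothetical change log and login date.
change_log = [
    DataChange(date(2014, 12, 22), "Estimate data rebuilt; CurrentY/CurrentQ logic changed"),
    DataChange(date(2014, 9, 1), "Some earlier change"),
]
for change in unseen_changes(change_log, last_login=date(2014, 12, 1)):
    print(change.deployed_on, change.summary)
```

The same function could feed either suggestion: a tab listing recent changes, or a one-time notice at the first login after a deployment.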