A Troubling Response

dnevin123 · May 29, 2017, 1:39pm

I recently had a somewhat frustrating/troubling interaction with Portfolio123 staff, and thought I would put it out to the community to get others’ opinions.

Last week I noticed that a number of simulations I had run in the Jan 2017 timeframe were generating fairly different results when rerun at present. I have been using P123 long enough now that I have come to expect this from time to time for a variety of reasons (changes in the underlying Compustat data, changes in the calculation of some financial metric I happen to use, etc.). However, while I expect it, I still don’t like it and I like to try to understand the source of the change for education and error-checking purposes.

To that end, I requested from P123 a log that listed the changes to the calculation engine from Jan 2017 to present. This seemed a reasonable request to me as change logs are software engineering 101, and in fact with most software products they are distributed with any rollout of a new version. Unfortunately, in response to my request I was told that such a change log does not exist.

How can this be the case? This seems both ridiculous and deeply troubling to me if the software engineering practices of the company we are using to manage financial assets don’t include the ability (or perhaps the willingness) to generate a change log. Frankly, the change log should be available for public consumption by paying customers at all times through the website.

What do you guys think about this? Are my expectations unrealistic or outside the norm?

Perhaps this is just a misunderstanding? Any P123 staff have a clarifying response?

Quantonomics · May 29, 2017, 2:29pm

I think your expectations are unrealistic for simulations. Every time you press re-run you have to expect slight changes.

If you want your stuff to be set in stone then create a portfolio R2G that you don’t open to public instead. Then you are guaranteed it won’t change.

mgerstein · May 29, 2017, 3:00pm

I’m not the p123 person with whom you interacted and your post is the first time I’m seeing your concern.

As you accurately noted, changes in data, changes in metrics, etc. do happen and they have been discussed in the forums extensively in the past.

We have also made many changes to the platform over the years. Consistent with what you suggest about Engineering 101, we do know what we have done to our product over time, and in fact, you can easily trace it back in time through the “Recent Feature Releases” thread that’s always visible.

The changes about which you are really concerned now, however, are of a different sort. They have occurred because of decisions in the world outside Portfolio123, and often outside of Compustat that we do not and cannot influence or control.

The central issue here revolves around the differences between engineering (and the physical sciences in general, as I understand them) and finance. In finance, the past, insofar as we can articulate and use it, is not fixed. There is no equivalent of laws of physics or chemistry on which we can count.

How did the economy perform in 2016? I don’t know. Nobody does, and nobody can. The best we can do is refer to the output of some models that were built to describe the economy as best we can (GDP, etc.). But we don’t know if these are really as firm as what we know about mixing certain chemical compounds or properties of electricity or water flow. GDP is simply a model we agree to use unless and until somebody comes up with something we think may be better.

And we don’t even really know what GDP was last year. We only know what the Commerce Department thought it was when they finally decided to stop revising based on new information. Ditto for every other economic indicator.

We apply the same thing to company financials. Here, there are certain things we can know with reasonable comfort. What were Apple’s dollar sales in 2016? But it takes time to get the number right, although at least technology is helping internal accountants go from the field to a HQ aggregate a lot more quickly over time. But items like sales and debt are unusual in their certainty. Many others are a lot harder to pin down and many derive purely from corporate modeling. So when you get down to it, we really don’t know what Apple earned in 2016. Nobody does, not even Tim Cook. All we and he know is what the company’s internal and auditor-approved and auditor-enhanced models say the company earned.

And then, of course, databases model the data they receive from the companies. As explained in a data white paper I posted in the Help area when we switched to Compustat, raw “accurate” data would be useless to us on Portfolio123 because it would be impossible for us to compare any two companies. The difference from one data provider to the next isn’t so much accuracy (if a company reports sales as 983.6 million, we can be pretty comfortable assuming all will pick up that number, and if they don’t, a collection error will get fixed incredibly rapidly) but in the quality of the models they build (data sanitization protocols).

Logging every change in every “model” that can impact your results is a brutal, if not impossible, task and I’m not even sure who would do the logging and who would compile and log the compilation of all the logs. And if it could be done and you were to receive it, you’d probably discard it quickly due to information overload.

None of this, however, should discourage you. While there is plenty of bad news about the absence of precision regarding the past, there is also good news – better news. We don’t need precision and in fact, we have all the precision we need to build successful models of the future, and then some. Is it enough precision to fly an airplane at 35,000 feet or send humans into space? No. But we’re not trying to do that. We’re trying to use the past as best we can model it to increase our probabilities of success in the unknowable and potentially very different future and for this, Portfolio123 is magnificently positioned.

The key is how we build that bridge connecting the modeled past to the unknowable future. That’s the role played by our models.

From having screened and modeled since the ‘80s, when data-bearing floppy disks went into drive B as the program disk sat in drive A and the drive-door lights blinked on and off as the PC read back and forth between them, I will tell you that I have never ever encountered a situation where the success of my effort or lack thereof was traceable to anything regarding the kinds of issues described above re: how the past is modeled. Never. Not once in many thousands of models over the years.

So, putting all of this together and bringing it back to the changes in sim that caught your notice, and recognizing your desire and need to learn from the situation, I will tell you that even if the sort of log you seek could be produced, it would not help you. Economic, company and data provider models of the past are such that if the sort of changes that occur throw an already-created simulation out of whack, the problem is likely to be that the model left itself too exposed to randomness. Minor changes are part of the game. Big changes should not be.

I recommend you look more closely at the model for areas of potential vulnerability. The principles under which successful models work are explained in the Strategy Design on-line course posted in the Tutorials section. Also, you are welcome to show me your model – off line in order to preserve the privacy of your work – and we can work together to diagnose the situation.

davidbv · May 29, 2017, 3:25pm

I worked in the semiconductor industry for my career and my company developed its own logic block place and route tools. The tool was an inherent part of the customer design flow and required enormous ongoing investment. Almost 10 million lines of code.
The company’s revenue is over $2B a year and they deal with all of the major electronics OEMs worldwide.
They did regularly publish information on changes with the tools but only to a certain level.
Their concerns about disclosure were protection of patents and trade secrets, but also a business decision about supporting the potentially infinite number of support questions that would inevitably arise.

As P123 grows, I hope they do support specific issues like you are raising.
But, in the meantime, I have been happy with the verifiable and consistent results for stock selection. I run multiple ports and have not run into any major, inexplicable situations where the SEC data did not match with P123 or the engine’s logic itself broke down.

I would expect P123 to constantly run their tools to test for any internal changes they make BEFORE they publish those changes. I would feel better if P123 published a document, with revision control, that talked about how they test their tool and how they test with their databases. They don’t have to publish logs of actual results and they don’t have to offer support to answer further questions.

I have had good luck with P123 so far. For me, I would rather they spend their resources on product enhancements.

My 2 cents.

Jrinne · May 29, 2017, 4:06pm

I cannot speak to this issue. I do a little R (making great use of the packaged programs) and I used to do a little Fortran and DOS. I added more memory to my computer once—so maybe I am more qualified than I think. Ha ha!

But P123 makes fewer mistakes than I do when I enter data into my Excel spreadsheets, it seems.

More importantly, they have literally considered tens of thousands of things in setting up the software and hardware that I have never considered. Answered questions that I would not have known to ask in doing what they do.

That does not mean that some things cannot be improved—and I am sure they will be. But there is a cost for this—one that I hope can be spread among an increasing number of P123 members. IMHO, P123 is about a great starting idea followed by continuous quality improvement. That continuous improvement has not stopped since the beginning (based on my experience and the old posts).

Which, I know, does not really relate to this particular concern. I do not know if some procedure could/should be modified, as I said. I just wanted to step back and look at the big picture for a moment.

BTW, does anyone know whether I still need to defrag my hard drive?

-Jim

dnevin123 · May 29, 2017, 5:01pm

Thanks to everyone for your responses. In particular, Marc, thank you for your detailed and thoughtful comments. I agree with effectively all of the points you have made.

Maybe it would help if I was a little more specific in my thoughts. The question I was trying to answer with my initial request was whether a change had been made in the way P123 is calculating a particular financial metric (say ROI%TTM). This was fresh in my mind as I have had it bite me in the past, and by really digging into the reason for the change I was able to improve my models. My hope was that by referring to a Change Log detailing whether changes had been made to this or that variable calculation I would be able to quickly rule out this type of change influencing model behavior.

Marc, you mentioned the Recent Feature Releases Log, but in my experience changes of the above type are not necessarily included in that log. Is that a correct understanding?

If it is true that changes of this type are not listed in the Recent Feature Releases log, does a list exist of those changes? In my head it was something as simple as the following

04/15/2017 - Changed calculation of ROI%TTM to better handle case XYZ
03/07/2017 - Fixed bug in EMA calculation

I feel like this is a reasonable subset of the myriad possible changes that can affect P123 simulations, but a nice subset to understand and rule out. I wouldn’t be surprised if this was actually a null set over the last six months, but even that piece of data would be valuable.
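
For illustration only, here is a minimal Python sketch of how a log in the two-line format above could be parsed and queried to rule a metric in or out for a given date window. The names `ChangeEntry`, `parse_change_log`, and `changes_between` are hypothetical, the two entries are just the made-up examples from this post, and nothing here reflects an actual Portfolio123 tool or record.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import date, datetime


@dataclass(frozen=True)
class ChangeEntry:
    """One line of a hypothetical calculation change log."""
    when: date
    description: str


def parse_change_log(text: str) -> list[ChangeEntry]:
    """Parse lines shaped like 'MM/DD/YYYY - description'; blank lines are skipped."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        stamp, _, description = line.partition(" - ")
        when = datetime.strptime(stamp, "%m/%d/%Y").date()
        entries.append(ChangeEntry(when, description))
    return entries


def changes_between(entries, start, end, keyword=""):
    """Return entries dated within [start, end] whose description mentions keyword."""
    needle = keyword.lower()
    return [
        e for e in entries
        if start <= e.when <= end and needle in e.description.lower()
    ]


# Made-up sample entries, copied from the example format in this post.
LOG_TEXT = """\
04/15/2017 - Changed calculation of ROI%TTM to better handle case XYZ
03/07/2017 - Fixed bug in EMA calculation
"""

entries = parse_change_log(LOG_TEXT)
hits = changes_between(entries, date(2017, 1, 1), date(2017, 5, 29), keyword="ROI%TTM")
for e in hits:
    print(e.when.isoformat(), e.description)
```

An empty result for a given metric and window would be the "null set" answer, which is itself useful for ruling out this class of change.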

mgerstein · May 29, 2017, 11:23pm

I’m not in a position to address the question re: the sort of log you demonstrate. But I am in a position to know that no changes of this nature capable of having a significant impact on models have been made in as far back as I can recall, if at all.

While I’m unable to speak more about the issue of logs, I do still believe your efforts can be more productively directed to the model itself, and I’d be happy to help any time you feel you want or need another set of eyes. . . but not tonight; as I said in another post, I have things to do before the Bachelorette comes on TV. Yes, I’m pathetic. I know. I know.