NEW: Data Miner App & P123 API -- v1.0 (beta)

Dear All,

The new Data Miner stand-alone app (built on top of the Portfolio123 API) is now available for use. Data Miner is a Windows application for non-programmers. It can run thousands of unattended operations with ease, speed and reliability. Currently it features several data mining operations, such as rolling screens, rank performance tests, and rank downloads. Data Miner can also be used to download point-in-time factors (data license required). We’ll be adding several operations soon, so let us know what you think. In addition, we’re also releasing it as an open source project so you can create your own versions or, if you like, contribute to the official release.

This is version 1.0 so bear with us. We think it’s worth releasing it now because it has many nice features that can help you run comparisons between FactSet & Compustat.

You can download Data Miner from the link below. Be sure to download the samples and read the PDF.

Dropbox - Data Miner - Simplify your life

Programmers: the API documentation can be found here
https://api.portfolio123.com:8443/docs/index.html

The Open Source project will be available soon here:

NOTE: You will need your own private API key. To generate it click on your picture on top right, then Account Settings → Subscriptions → API and click ‘Create’.

Awesome! Thanks for the hard work putting this together.

Marco,

Thank you! This has awesome potential (whether I end up being able to use it or not).

I have downloaded it and I have been able to use one of the samples (Ranks-inlined ranking system).

I have a question about labels. None of your samples provide labels: i.e., the returns.

Is that something that can be obtained without a data provider license?

Ultimately to be useful I will need the returns (or labels for supervised learning) and I will have to learn the indexing method to concatenate the returns of a ticker (for a specific week) with the ranks (for that week).

How is this indexed? I do not see what I normally consider an index. Will the P123 UID function as an index?

Ideally, the data would have a hierarchical row index of the date and the ticker for download. The factor ranks would be the column index (along with the label or returns of the next week). Ultimately, I would probably prefer to download the data and run it through Jupyter Notebooks, Colab or Spyder.
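To make the layout described above concrete, here is a minimal sketch in plain Python of rows keyed by (date, ticker), with a next-period return attached as the label. The sample rows, the column names (`date`, `ticker`, `rank`, `close`) and the helper names are all hypothetical; nothing here reflects the actual Data Miner output format, and the label is computed from made-up prices rather than downloaded.

```python
from collections import defaultdict

# Hypothetical long-format rows: one row per (date, ticker).
ranks = [
    {"date": "2020-01-06", "ticker": "AAA", "rank": 91.2},
    {"date": "2020-01-06", "ticker": "BBB", "rank": 40.5},
    {"date": "2020-01-13", "ticker": "AAA", "rank": 88.0},
    {"date": "2020-01-13", "ticker": "BBB", "rank": 47.3},
]
# Hypothetical weekly closing prices used to derive the labels.
prices = [
    {"date": "2020-01-06", "ticker": "AAA", "close": 10.0},
    {"date": "2020-01-13", "ticker": "AAA", "close": 11.0},
    {"date": "2020-01-20", "ticker": "AAA", "close": 12.1},
    {"date": "2020-01-06", "ticker": "BBB", "close": 20.0},
    {"date": "2020-01-13", "ticker": "BBB", "close": 19.0},
    {"date": "2020-01-20", "ticker": "BBB", "close": 19.0},
]


def forward_returns(price_rows):
    """Map (date, ticker) -> return over the following period."""
    by_ticker = defaultdict(list)
    for row in price_rows:
        by_ticker[row["ticker"]].append((row["date"], row["close"]))
    labels = {}
    for ticker, series in by_ticker.items():
        series.sort()  # ISO dates sort chronologically as strings
        for (d0, p0), (_d1, p1) in zip(series, series[1:]):
            labels[(d0, ticker)] = p1 / p0 - 1.0
    return labels


def attach_labels(rank_rows, labels):
    """Inner-join rank rows with labels on the (date, ticker) key."""
    joined = {}
    for row in rank_rows:
        key = (row["date"], row["ticker"])
        if key in labels:
            joined[key] = {"rank": row["rank"], "next_ret": labels[key]}
    return joined


dataset = attach_labels(ranks, forward_returns(prices))
for key in sorted(dataset):
    print(key, dataset[key])
```

The same join would work with a P123 UID in place of the ticker, if that column turns out to be a stable identifier; that is an assumption, not something confirmed in this thread.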

I could probably even hire a graduate student to help me with this if need be. So the details of how to do this may not be important in this thread.

Anyway, this is great! And thank you in advance for any information. If I cannot ultimately use this that is probably okay: the price I pay for not taking enough courses in programming. Although, I think you will be rewarded for making this usable for the average graduate with a finance degree (at the undergraduate level). I think you will want to attract people who want to run econometrics models that they learned getting undergraduate finance degrees which may not have involved a lot of programming.

For now my only question is whether a license is required to get data on returns (the label for supervised learning). If a license is required, I will probably continue using what P123 already offers without spending a lot of time on learning how to use this addition now.

Thank you.

Best,

Jim

I get this error:

2020-05-12 11:39:27,152: API request failed: Your current membership only gives you access to historical data after 2015/01/01.

How come I can run backtests from 1999-2020 on the website and with the Data Miner it’s only 2015?

Other than that, seems really promising, excellent work.

I get the same error

“2020-05-12 12:15:19,888: API request failed: Your current membership only gives you access to historical data after 2015/01/01”

What’s the operation? The ‘Data’ operation is the only one not covered by your P123 membership, since it requires a data license.

Main:
    Operation: RankPerformance
    On Error: Stop

Default Settings:
    Engine: Legacy
    Vendor: Compustat
    PIT Method: Prelim
    Buckets: 5
    Start Date: 2000-01-01
    End Date: 2020-01-01
    Rebalance Frequency: 13Weeks
    Universe: S&P 500
    Benchmark: SPY
    Minimum Price: 1.0

Iterations:
    -
        Name: EBITDA/EV
        Ranking:
            Formula: EBITDATTM/EV

I was trying out the sample rank performance only. I posted after I got the same error and figured there was a bug.
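Since the input file is just a list of iterations, a long list can be generated rather than typed. The sketch below builds an `Iterations:` section from a list of formulas, using only the keys that appear in the sample above (`Name`, `Ranking`, `Formula`). The function name is hypothetical, and it assumes Data Miner reads standard YAML: `json.dumps` is used for quoting because a JSON string is also a valid YAML double-quoted scalar.

```python
import json


def iterations_block(formulas, indent="    "):
    """Render an 'Iterations:' section with one entry per formula."""
    lines = ["Iterations:"]
    for formula in formulas:
        # JSON string syntax doubles as a YAML double-quoted scalar,
        # so special characters such as # and " are safely escaped.
        quoted = json.dumps(formula)
        lines.append(indent + "-")
        lines.append(indent * 2 + "Name: " + quoted)
        lines.append(indent * 2 + "Ranking:")
        lines.append(indent * 3 + "Formula: " + quoted)
    return "\n".join(lines)


example_formulas = ["EBITDATTM/EV", "#AnalystsCurQ", 'GR%PQ("Sales")']
block = iterations_block(example_formulas)
print(block)
```

The output can be pasted under the `Main:` and `Default Settings:` sections of an existing sample file.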

Can you try again? Should be fine now

You fixed it, have a small question tho:

  -   Name: EBITDA/EV
      Ranking:
          Formula: EBITDATTM/EV

Is there a way to specify RankType (i.e. Lower) and Scope (i.e. Industry)?

Working here as well.

Thanks and this looks very interesting

Please take a look at the reference section in Dropbox : setting_ranking_nodes.yaml

NOTE: To edit input files in YAML format, we recommend Notepad++.

All,

In the “RanksPeriod” example a member can substitute the name of a ranking system that they have already created as well as a universe they have already created.

This can be done over an extended period as the name implies (with dates in the column).

That is a lot of information that can be downloaded all at once.

It looks like this has a lot of potential. Some data wrangling will be required with version 1.0, but a lot of information can be downloaded already and there seems to be a lot of potential for future versions.

Thank you Marco.

Best,

Jim

Question:

A symbol in the formula keeps it from being processed by the tool.

How do I process this factor:
#AnalystsCurQ

Do I just remove the # in front and it’s going to work correctly? If I leave it on, the way it is shown online on P123 I get this message:

2020-05-12 17:10:48,646: Invalid value for “Ranking” property in iteration #1: “Formula” property is invalid

To specify this, you’ll need to put the expression in quotes: "#AnalystsCurQ"
If the expression also has double quotes in it, you’ll need to prefix those quotes with a backslash for it to work: "FRank(\"#AnalystsCurQ\")"
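For background: in standard YAML, a `#` at the start of a value begins a comment, so an unquoted `#AnalystsCurQ` leaves the `Formula` property empty, which would be consistent with the error above. The quoting rule can be sketched as a small Python helper. The function name is hypothetical, and it assumes Data Miner follows standard YAML double-quote escaping.

```python
def yaml_double_quoted(formula: str) -> str:
    """Wrap a formula in double quotes, escaping backslashes and quotes."""
    # Escape backslashes first so the backslashes added for the
    # double quotes are not escaped a second time.
    escaped = formula.replace("\\", "\\\\").replace('"', '\\"')
    return '"' + escaped + '"'


simple = "Formula: " + yaml_double_quoted("#AnalystsCurQ")
nested = "Formula: " + yaml_double_quoted('FRank("#AnalystsCurQ")')
print(simple)
print(nested)
```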

I can’t get your second tip to work. Take this example, which gets this error:
2020-05-12 22:50:41,089: API request failed: Element type “StockFormula” must be followed by either attribute specifications, “>” or “/>”. (on line 2)

Main:
    Operation: RankPerformance
    On Error: Stop

Default Settings:
    Engine: Legacy
    Vendor: Compustat
    PIT Method: Prelim
    Buckets: 10
    Start Date: 2000-01-01
    End Date: 2020-03-31
    Rebalance Frequency: 4Weeks
    Universe: S&P 500
    Benchmark: SPY
    Minimum Price: 1.0

Iterations:
    -
        Name: GR%PQ("Sales")
        Ranking:
            Nodes:
                -
                    Type: StockFormula
                    Formula: GR%PQ("Sales")
                    Rank: Lower
                    Scope: Universe

YAML uses a few special characters. Documentation on how to deal with them when they appear in property values (e.g. formulas) can be found in the README.txt on Dropbox.
A new release that addresses a bug exposed by Quantonomics’ example has also been uploaded.

Hi Marco,

This is a very interesting development and shows p123’s continuing commitment to expanding its capabilities.

Regarding your following comment…

“Data Miner can also be used to download point in time factors (data license required).”

…can you say anything more about what Compustat-related license terms might look like? With whom at S&P can this be discussed?

Thank you.

Hugh

They are not overly open to discussion to be honest. Expect a flat $24k licence fee.

Ok, I saw the README but I’m still puzzled. Can you please provide one example with this one below:

Aggregate("EarnYield",#Industry,#Avg,16.5,#Exclude,False,True)

What would be the right syntax in the Data Miner for this formula?

Thanks as this will make things more clear.
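One possibility, not confirmed by the README or by anyone in this thread, is to use YAML single quotes. In standard YAML a single-quoted value needs no backslashes: the only character that must be escaped is the single quote itself, which is doubled. A minimal sketch, assuming Data Miner accepts standard YAML quoting (the helper name is hypothetical):

```python
def yaml_single_quoted(formula: str) -> str:
    """Wrap a formula in single quotes, doubling any embedded single quote."""
    return "'" + formula.replace("'", "''") + "'"


formula = 'Aggregate("EarnYield",#Industry,#Avg,16.5,#Exclude,False,True)'
line = "Formula: " + yaml_single_quoted(formula)
print(line)
```

With this style the double quotes around EarnYield and the # symbols stay exactly as they appear on the P123 website.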