
Posts: 127    Pages: 13    Prev 2 3 4 5 6 7 8 9 10 11 Next
This topic has been viewed 3190 times and has 126 replies
InspectorSector
Re: Python code for calling 123 API

Thanks Dan - I understand that this activity may place a strain on P123 systems. I'm hoping that the new API that P123 comes up with will concatenate all historical dates into one call. If that happens then API call usage is minimal and not a big concern.

Right now, I am evaluating 5 years of monthly frequency to see how that performs with the AI models and maybe that will be enough. But I am assuming weekly data is required...

The following usage is for model development. Once the model is developed and deployed then API calls will be minimal, like maybe once or twice per week per neural net. I am expecting to develop ~10 neural networks in total over time. Once the first NN is developed then I will be moving on to the next.

Per neural net:
- Preferred - 52 weeks x 5 years (number of API calls with weekly frequency)
- Less desirable - 12 months x 5 years (number of API calls with monthly frequency)
- Inputs or rank nodes: 5-10
- Target: 1
- 300 tickers
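
The request budget implied by the list above can be worked out with a few lines of Python. This is only a rough sketch: it assumes one API call per historical date, and that a single call returns every ticker and every input node for that date. If each node needed its own call, the totals would scale accordingly. The function names are illustrative and not part of any P123 API.

```python
# Rough request budget for one neural net, using the figures from the list above.
YEARS = 5
TICKERS = 300


def calls_needed(periods_per_year: int, years: int = YEARS) -> int:
    """One API call per historical date (assumption: a call covers all tickers)."""
    return periods_per_year * years


def rows_collected(periods_per_year: int, tickers: int = TICKERS, years: int = YEARS) -> int:
    """One row per ticker per date."""
    return calls_needed(periods_per_year, years) * tickers


weekly_calls = calls_needed(52)    # 52 weeks x 5 years
monthly_calls = calls_needed(12)   # 12 months x 5 years
weekly_rows = rows_collected(52)
monthly_rows = rows_collected(12)

print(f"weekly:  {weekly_calls} calls, {weekly_rows} rows")
print(f"monthly: {monthly_calls} calls, {monthly_rows} rows")
```

Under these assumptions, weekly data costs 260 calls per pass over the history and monthly data costs 60, so each extra iteration on the inputs is roughly four times more expensive at weekly frequency.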

Note: The problem is that development is iterative: finding and testing inputs. I don't know how many iterations it will take to arrive at the exact inputs that I want to use in the final model. I could instead guess at which inputs might be useful and collect many more than I think I need, thus minimizing the overall API-call usage. In that case, I would increase the number of input nodes to something substantially higher than 10.

And as I said, in the long run I hope to develop on the order of 10 NNs. I don't know how long each will take. It could be spread out over months or if I burn the midnight oil it could be a few weeks.

Hope this helps!

Nov 20, 2020 1:52:51 PM       
InspectorSector
Re: Python code for calling 123 API

Dan - I don't know where the exact issues lie, whether P123 is paying a third party usage fees, or if it is a question of taxing the P123 servers. I just want to mention that Quandl dealt with the problem of servicing massive numbers of API calls by introducing a two-step process. The request for data is made, then sometime later the data is available for download. Initially they were talking about a 12-hour wait for the data to be ready, but it eventually turned out to be more like 30 seconds to a minute. Something like that would be OK for me for historical data, even if it took 12 to 24 hours after making an API request (though not for the current week).
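
The two-step process described above (submit a request, then collect the data later) can be sketched as a submit-and-poll loop. Everything here is hypothetical: `FakeDataService` is an in-memory stand-in written for illustration, not the Quandl or P123 API, and the method names and payload shape are assumptions.

```python
import itertools
import time


class FakeDataService:
    """In-memory stand-in for a two-step data API (hypothetical)."""

    def __init__(self, polls_until_ready: int = 2):
        self._ids = itertools.count(1)
        self._jobs = {}
        self._polls_until_ready = polls_until_ready

    def submit(self, query: dict) -> int:
        """Step 1: register the request and return a job id immediately."""
        job_id = next(self._ids)
        self._jobs[job_id] = {"query": query, "polls_left": self._polls_until_ready}
        return job_id

    def fetch(self, job_id: int):
        """Step 2: return None while the job is pending, then the data."""
        job = self._jobs[job_id]
        if job["polls_left"] > 0:
            job["polls_left"] -= 1
            return None
        return {"query": job["query"], "rows": []}  # placeholder payload


def wait_for_result(service, job_id, poll_seconds=0.0, max_polls=10):
    """Poll until the data is ready, or give up after max_polls attempts."""
    for _ in range(max_polls):
        result = service.fetch(job_id)
        if result is not None:
            return result
        time.sleep(poll_seconds)
    raise TimeoutError(f"job {job_id} not ready after {max_polls} polls")


service = FakeDataService()
job = service.submit({"frequency": "weekly", "years": 5})
result = wait_for_result(service, job)
print(job, result["query"])
```

The point of the design is that the server can queue and batch heavy historical requests on its own schedule, while the client only pays a small cost for each status check.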

Keep in mind that literally every software company in the world is introducing AI into its product stream. I think that P123 should be thinking along those lines as well. But to be successful, P123 needs to have the resources for Big Data available.

Nov 20, 2020 2:33:05 PM       
danparquette
Re: Python code for calling 123 API

Steve - We are still finalizing the formulas for calculating request costs, and I have not yet seen any requirements for the new API endpoint you and Marco have been discussing. Based on how we are calculating costs for the other endpoints, though, 300 tickers and 10 nodes should not be a problem, even with multiple iterations. Worst case, if you increase the scope of your project and need more requests allocated to your account during the initial development phase of your NNs, you could purchase additional requests with an AddOn that will be available soon at a reasonable cost.

Dan

Nov 20, 2020 3:31:21 PM       
InspectorSector
Re: Python code for calling 123 API

That sounds good! I am going to use monthly data to start and when I get to finalizing my NN I'll switch to weekly. That should reduce any demands on the system.

Nov 20, 2020 3:42:59 PM       
mm123
Re: Python code for calling 123 API

Glad I found this thread. I'd like to share the results of my Strategies and Books, updated daily via some api, etc. What's the best way to do this?

Nov 21, 2020 11:22:23 AM       
marco
Re: Python code for calling 123 API

Quote from mm123: "Glad I found this thread. I'd like to share the results of my Strategies and Books, updated daily via some api, etc. What's the best way to do this?"


We don't have an API endpoint to get the stats yet. It should be pretty straightforward to add. What stats do you need?

Portfolio123 Staff.

Nov 22, 2020 8:29:29 AM       
piard2
Re: Python code for calling 123 API

Marco,
Recommendation to include if you do an educational doc about how to use the P123 ML API:
Users must be very careful how they split the dataset into training and validation sets. Training and validation sets (whether a simple split or K-fold) MUST be on SEPARATE TIME PERIODS.

Some ML tools split data by default into RANDOMLY INTERTWINED sets. For example, I think it is the default behavior in sklearn for one of the most used dataset-splitting functions. That is the best way for many ML applications, but not when using timeseries! P123 ML users have to set the correct parameters or write their own function to prepare data so that training and validation sets are not intertwined in time and, if possible, are SEPARATED by an UNUSED data period. Otherwise the model will be trained with massive data leakage, resulting in a wildly overestimated predictive ability.
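
The difference between a shuffled split and a chronological one can be checked directly on the sample dates. The sketch below uses only the standard library and made-up weekly dates. The helper names are illustrative. For reference, scikit-learn's `train_test_split` shuffles by default (`shuffle=True`), which is the kind of behavior the shuffled helper imitates.

```python
import random
from datetime import date, timedelta

# 100 consecutive weekly dates standing in for the sample dates of a dataset.
dates = [date(2015, 1, 3) + timedelta(weeks=i) for i in range(100)]


def random_split(items, valid_fraction=0.2, seed=0):
    """Shuffle-then-cut split, the kind many ML tools apply by default."""
    shuffled = items[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * (1 - valid_fraction))
    return shuffled[:cut], shuffled[cut:]


def chronological_split(items, valid_fraction=0.2):
    """Earliest dates for training, latest dates for validation."""
    ordered = sorted(items)
    cut = int(len(ordered) * (1 - valid_fraction))
    return ordered[:cut], ordered[cut:]


def is_separated_in_time(train, valid):
    """True when every training date precedes every validation date."""
    return max(train) < min(valid)


rand_train, rand_valid = random_split(dates)
chrono_train, chrono_valid = chronological_split(dates)

print("random split separated in time:       ", is_separated_in_time(rand_train, rand_valid))
print("chronological split separated in time:", is_separated_in_time(chrono_train, chrono_valid))
```

A check like `is_separated_in_time` is cheap to run before every training job and catches an accidentally shuffled split early.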

To understand the problem, consider that you will have lots of records with almost identical features and labels in the train set and the validation set if they are randomly intertwined on a daily basis: same fundamental features (rank), almost the same technical features... and almost the same label (forward return).

To be sure to get rid of data leakage, the training and validation sets should ideally be separated by an unused period of one quarter (so that ranks in both sets do not use fundamentals from the same earnings reports) and/or the longest look-back period used in technical features (maybe half of it would be enough). Hope it's clear, else let's have a call ;-)
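
A minimal sketch of a split with an unused gap, assuming weekly sample dates and a gap of 91 days as an approximation of one quarter. The function name, the gap length, and the dates are illustrative choices, not a P123 recommendation. For technical features, the gap would be set from the longest look-back period instead.

```python
from datetime import date, timedelta


def split_with_gap(sample_dates, valid_start, gap_days=91):
    """Split dates into train / unused / validation.

    Training keeps dates more than gap_days before valid_start,
    dates inside the gap are left unused, and validation starts at valid_start.
    """
    train_end = valid_start - timedelta(days=gap_days)
    train = [d for d in sample_dates if d < train_end]
    unused = [d for d in sample_dates if train_end <= d < valid_start]
    valid = [d for d in sample_dates if d >= valid_start]
    return train, unused, valid


# Five years of made-up weekly dates.
weekly = [date(2015, 1, 3) + timedelta(weeks=i) for i in range(260)]
train, unused, valid = split_with_gap(weekly, valid_start=date(2019, 1, 5))

gap = (min(valid) - max(train)).days
print(len(train), len(unused), len(valid), gap)
```

The same function can be applied repeatedly with a moving `valid_start` to build walk-forward folds, each with its own unused gap.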

Nov 22, 2020 11:00:13 AM       
Edit 5 times, last edit by piard2 at Nov 22, 2020 11:20:05 AM
InspectorSector
Re: Python code for calling 123 API

There is certainly lots to think about with regards to use of the data. But let's not have P123 bogged down in writing application notes at this point in time. Let them focus on getting the data out in one simple 2D array with column headers. The ML application of the data can be left to third parties but that won't happen without the data first.
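
One possible reading of "one simple 2D array with column headers" is a flat table with one row per date and ticker, one column per input, and one column for the target. The sketch below is only an illustration of that layout: the column names and all values are made-up placeholders, not the output of any P123 endpoint.

```python
import csv
import io

# Hypothetical flat layout: one row per (date, ticker), one column per input plus the target.
HEADER = ["date", "ticker", "rank_1", "rank_2", "target"]
rows = [
    ["2020-01-04", "AAA", 91.2, 40.5, 0.012],   # placeholder values
    ["2020-01-04", "BBB", 55.0, 73.1, -0.004],
    ["2020-01-11", "AAA", 90.8, 42.0, 0.007],
    ["2020-01-11", "BBB", 56.3, 71.9, 0.001],
]

# Write the table as CSV text, then read it back by column name.
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(HEADER)
writer.writerows(rows)

buffer.seek(0)
table = list(csv.DictReader(buffer))
dates_in_table = sorted({row["date"] for row in table})

print(table[0])
print(dates_in_table)
```

Keeping the date as an explicit column matters for any later train/validation split, since the split has to be made on dates rather than on row positions.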

Nov 22, 2020 11:23:29 AM       
mm123
Re: Python code for calling 123 API

So Marco, basically the stats on the Summary, Holdings, Statistics & Charts pages. Is there somewhere on this website where I can find instructions on how to easily grab that data?

Nov 22, 2020 11:28:19 AM       
Edit 1 times, last edit by mm123 at Nov 22, 2020 12:31:50 PM
piard2
Re: Python code for calling 123 API

Quote from InspectorSector: "There is certainly lots to think about with regards to use of the data. But let's not have P123 bogged down in writing application notes at this point in time. Let them focus on getting the data out in one simple 2D array with column headers. The ML application of the data can be left to third parties but that won't happen without the data first."

Not lots. Data leakage is a major issue with timeseries in ML. I know Marco is very interested in how data will be used, because he told me so. A nice API without clear guidelines may do more harm than good.

Nov 22, 2020 11:50:00 AM       
Edit 2 times, last edit by piard2 at Nov 22, 2020 11:59:18 AM