
Posts: 127    Pages: 13    Prev 1 2 3 4 5 6 7 8 9 10 Next
This topic has been viewed 3186 times and has 126 replies
InspectorSector
Re: Python code for calling 123 API

Marco - the more I think about it, the standard deviation rank would probably be a good thing to bring out. Thanks!

Nov 19, 2020 11:43:53 PM       
InspectorSector
Re: Python code for calling 123 API

Marco - I have another idea... instead of having this API call based on ranking systems, why not create a new factor for the screener and have the API call the screener instead? The new screener factor would be like ShowVar() except you could call it APIVar(). APIVar() would give an error when purely fundamental data is called out. But anything else would be allowed. The API call would only return data specified by APIVar(). The variable name specified in APIVar() would be the column header for the data returned by the API. Just a thought, I don't want to slow down your efforts though. Expediency is important.

With this solution, you aren't building in custom solutions to handle technical data.

Nov 20, 2020 1:00:48 AM       
Edit 1 times, last edit by InspectorSector at Nov 20, 2020 1:01:42 AM
piard2
Re: Python code for calling 123 API

Marco,
Your idea of adding extra data for technical features and labels is great. Some possible tweaks:

- It would be nice to allow the InList() function in extra data formulas. I understand we cannot export Sector and Industry without a license, but we should be allowed to simulate it by creating lists of tickers from ETF holdings (not PIT, but better than nothing for users who can't buy a license). I think I have also read in the forum that a kind of inETF() function to get PIT ETF holdings was in your projects. If it's true, it would be great to allow it in extra data formulas too.

- For convenience, it would be better to add the extra data columns on the right, after the rank's columns. Reason: in examples, tutorials and courses on ML platforms I have tried (python/sklearn and Azure ML studio), labels are usually in the last column of datasets. It makes table manipulation a bit easier.

- I don't know if someone will need the id column. It's easy to drop later, but dropping it in the output format may avoid taking it mistakenly as a feature (unless Steve or someone else thinks it is useful).
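The last two tweaks (label in the last column, id column dropped) are also easy to apply on the user side after download. A minimal sketch in plain Python, assuming the returned data has already been parsed into one dict per row; the column names `id`, `Ticker`, `Factor1`, `Factor2` and `ExcessReturn` and the values are made up for illustration, not the API's actual output:

```python
def tidy_rows(rows, label_col, drop_cols=("id",)):
    """Drop unwanted columns and move the label column to the end of each row."""
    tidy = []
    for row in rows:
        # Keep every column except the dropped ones and the label...
        out = {k: v for k, v in row.items() if k not in drop_cols and k != label_col}
        # ...then re-add the label so it becomes the last column.
        out[label_col] = row[label_col]
        tidy.append(out)
    return tidy


# Made-up rows: the label sits in the middle and an id column is present.
rows = [
    {"id": 1, "Ticker": "AAA", "ExcessReturn": 0.02, "Factor1": 55.0, "Factor2": 80.1},
    {"id": 2, "Ticker": "BBB", "ExcessReturn": -0.01, "Factor1": 12.5, "Factor2": 43.0},
]
clean = tidy_rows(rows, label_col="ExcessReturn")
print(list(clean[0].keys()))  # ['Ticker', 'Factor1', 'Factor2', 'ExcessReturn']
```

Doing this in the output format itself would still save every user from repeating the step.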

A clarification: I post my ideas to help; I will not be an ML user in the short term, but possibly in the future.

Nov 20, 2020 3:00:44 AM       
Edit 9 times, last edit by piard2 at Nov 20, 2020 4:01:01 AM
Jrinne
Re: Python code for calling 123 API

Marco,

So P123 is mostly there, I think. I do not really like DataMiner, mostly because I can only use it at my office. And I had some coding issues for a while that Steve has helped me with. He says this code does a lot of what I want:

RankData = client.rank_ranks({'pitMethod': 'Complete', 'rankingSystem': RankingSystemID, 'asOfDt': RunDate, 'universe': 'Digital Transformation', 'includeNodeDetails': 'true'})

I cannot confirm that this will work because I am not at the office with a Windows machine. But I am pretty sure that I can get something to work with Steve’s help.

The only change I might want would be excess returns as discussed above. And honestly, I think you have a point. You might want to make it so that someone could use this with JASP and not have to ask Steve about the Python code. He is not paid to help everyone—not yet anyway.

I have asked for some help on this in the forum many times and Yuval was not able to offer anything more than me paying for a one time download of data. I think you should figure something else out if you want to attract a lot of new customers.

You are probably aware that as long as the data is there in a column, Python can find it.

For example here is the code of setting up the training data:

f1_train = f1[['Factor2', 'Factor3', 'Factor4', 'Factor5', 'Factor6', 'Factor7']].values
f1_label = f1['ExcessReturn'].values

Notice I have left out Factor1 as I have found it isn’t really predictive. But I can easily add it back and remove Factor5 if I want to. You understand this better than I do.

But I do not think where you put the columns is an issue.

If there were no download issues you could have multiple labels. I could easily change the label using Python to f1_label=f1['P123sFavoriteTechicalLabel'] on the same download.
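To illustrate the point that columns are found by name and not by position, here is a minimal sketch using only the standard library. The CSV text, its values and the second label name `OtherLabel` are made up for illustration; the same by-name selection is what the DataFrame code above relies on.

```python
import csv
import io

# Made-up CSV standing in for a downloaded file; the label column is deliberately not last.
CSV_TEXT = """Factor1,ExcessReturn,Factor2,Factor3,OtherLabel
10.0,0.02,55.5,70.1,1
20.0,-0.01,43.2,12.9,0
"""


def split_features_label(rows, feature_cols, label_col):
    """Pick feature and label values by header name, ignoring column order."""
    features = [[float(row[c]) for c in feature_cols] for row in rows]
    labels = [float(row[label_col]) for row in rows]
    return features, labels


rows = list(csv.DictReader(io.StringIO(CSV_TEXT)))
X, y = split_features_label(rows, ["Factor2", "Factor3"], "ExcessReturn")
# Switching the label is a one-argument change on the same download.
_, y_alt = split_features_label(rows, ["Factor2", "Factor3"], "OtherLabel")
```

Factor1 is simply left out of the feature list here, the same way it is left out in the training code above.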

Again, I understand that you know this better than I do.

Anyway, I appreciate that you are there or almost there. I agree that making more labels available might be helpful. I do not know if DataMiner is the best program or not.

I suspect that there are a lot of people already using this. For whatever reason people highly skilled in Python do not post much. And they NEVER discuss what they are doing.

But for every person using it here I think there are thousands better skilled at Python than your average P123 user.

I guess it remains to be seen whether you can market to them. But I would not use your average forum poster as a gauge for the potential of this business model.

I cannot imagine that there is a great cost to this but if it does not bring in new customers and is not worth it you should abandon it.

It is only my personal opinion that there are a good number of people like Steve, Frederic, the silent people on the forum and me waiting to be marketed to. My apologies if I am wrong about that.

But I would recommend perfecting this in whatever way you think is best and try some marketing of this. You can always abandon it later.

FWIW in gauging interest in this, Steve is not the first member who has asked me to help him build a neural net. Steve is just the first who wanted to discuss it in the forum.

You have already thanked Steve for sharing—rightfully so.

Best,

Jim

From time to time you will encounter Luddites, who are beyond redemption.
--de Prado, Marcos López on the topic of machine learning for financial applications

Nov 20, 2020 4:51:30 AM       
Edit 2 times, last edit by Jrinne at Nov 20, 2020 5:03:52 AM
InspectorSector
Re: Python code for calling 123 API

Marco / P123 staff - I get a "request quota exceeded" error when I try to run my code now. This is happening on the first API call and has been happening since yesterday afternoon. What are the actual resource limitations? I thought it was 100 per hour or something like that?

Nov 20, 2020 7:12:49 AM       
InspectorSector
Re: Python code for calling 123 API

It would be nice to allow the InList() function in extra data formulas.


Marco - the problem here is that you have introduced a new piece of functionality with the extra data. The requests for new data items are going to grow with time. I have several that I would like as well, including FMedian(), FOrder(), Aggregate(), etc.

I originally based this exercise on the Ranking System module, because it allows historical access without a data license. I thought that would be the minimum impact from P123's perspective. But since you are going to this level of customization, I think that my preference would be to use the screen module, with the ability to use APIVar() as I described in a previous post. APIVar() would be identical to ShowVar() in every respect except it would only allow items that are not raw fundamentals. It would allow such things as FRank(), FOrder(), FMedian(), Aggregate(), FSum(), industry factors, any technical formula, any InList() operation, and other APIVar() variables as part of a formula.

This provides maximum flexibility and we won't be coming back for more requests for other types of data, because we already have access to everything except for raw fundamentals. To make this work, the API call would be able to run on historical dates. Also, the column header should have the @ stripped off at the front of the variable name. There are some programs that don't allow such a symbol. MySQLi for example croaks when presented with an @ in the column header.
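Until the API strips the symbol itself, the cleanup is a one-liner on the user side. A minimal sketch, assuming the headers arrive as a list of strings; the names `@Mom` and `@Vol20` are made-up examples:

```python
def clean_headers(headers):
    """Strip a single leading '@' from each column header; leave other headers alone."""
    return [h[1:] if h.startswith("@") else h for h in headers]


print(clean_headers(["Ticker", "@Mom", "@Vol20"]))  # ['Ticker', 'Mom', 'Vol20']
```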

Also, my opinion is that 5 years of weekly data is needed as a minimum to make AI work. So either one API call should concatenate all of the dates into one returned array or think about how you can structure resources versus P123 membership to make it happen.
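If the concatenation has to happen on the user side instead, it could look roughly like the sketch below. It assumes each API call has already been turned into a list of row dicts for one date; the `Ticker` and `Rank` columns and the values are made up, and `asOfDt` is reused here only as a column name for the date.

```python
def concat_by_date(results_by_date):
    """Flatten {date: [row, ...]} into one list of rows with an added date column."""
    combined = []
    # ISO-formatted date strings sort chronologically.
    for as_of, rows in sorted(results_by_date.items()):
        for row in rows:
            combined.append({"asOfDt": as_of, **row})
    return combined


# Made-up per-date results standing in for the output of separate API calls.
results_by_date = {
    "2020-11-14": [{"Ticker": "AAA", "Rank": 91.2}, {"Ticker": "BBB", "Rank": 40.0}],
    "2020-11-07": [{"Ticker": "AAA", "Rank": 88.7}],
}
table = concat_by_date(results_by_date)
```

The result is one long table with a date column, which is the shape most ML tools expect for this kind of panel data.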

Thanks!

Nov 20, 2020 8:39:52 AM       
Edit 4 times, last edit by InspectorSector at Nov 20, 2020 8:55:07 AM
marco
Re: Python code for calling 123 API

Steve, I bumped your API limits to 5K. We have not formally launched the API and are still tinkering with limits.

Your proposals can create lots of backdoors for downloading data. We can't have backdoors (it would not remain a secret for long). So we are trying to keep this simple initially. We should start thinking about what an API specifically for generating input data for ML should look like. But right now the quickest way to get most of the way there is to make some mods to the existing rank API. Using the ranking system API simplifies things a lot for us since there's no way to download raw data with it.

Let's see where this goes. I'd also want to try it myself. Maybe you want to do a demo for me and others? SIDE NOTE: we want to find ways for expert users on p123 to do way more than just Designer Models. This stuff is not easy, DIY investing is not easy, and having an army of experts (that get paid somehow of course) is the way to get 123 going somewhere at last.

Jrinne, you lost me there a bit with a few things. You don't need a Windows machine. Python and DataMiner run on Macs, Linux, and Windows. Also, not sure what you are referring to re. JASP & one-time data download.

Thanks

Portfolio123 Staff.

Nov 20, 2020 11:11:38 AM       
InspectorSector
Re: Python code for calling 123 API

Thanks for upping my API limit. Is there an hourly limit? My optimal application would be for 5 years of weekly data. With my current software, I have to make 5 x 52 calls so that is 260 API calls to collect a set of data. I can change this to monthly frequency but the end result won't be as good.
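The 5 x 52 = 260 figure can be sketched as a list of weekly dates with one request per date. This is only an illustration: the parameter keys are the ones quoted earlier in the thread, while the ISO date format, the end date and the ranking system id 12345 are assumptions, and no API call is actually made.

```python
from datetime import date, timedelta


def weekly_dates(last_date, years=5, weeks_per_year=52):
    """Return weekly as-of dates ending at last_date, oldest first."""
    n = years * weeks_per_year
    return [last_date - timedelta(weeks=i) for i in range(n - 1, -1, -1)]


def build_requests(dates, ranking_system_id, universe):
    """Build one parameter dict per date, using the keys quoted earlier in the thread."""
    return [
        {
            "pitMethod": "Complete",
            "rankingSystem": ranking_system_id,
            "asOfDt": d.isoformat(),  # assumed date format
            "universe": universe,
            "includeNodeDetails": "true",
        }
        for d in dates
    ]


dates = weekly_dates(date(2020, 11, 14))
request_params = build_requests(dates, ranking_system_id=12345, universe="Digital Transformation")
print(len(request_params))  # 260
```

Switching to monthly frequency would cut the list to about 60 requests, which is the trade-off described above.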

As for help, I will gladly help in exchange for the perks such as the increased API limit. Is there any chance of setting up a separate forum so that the regular P123 folk are not bothered by this activity? I don't know about a demo, but I can supply everyone with most of my Python code and can assist with any problems with Python or Google Colaboratory.

I would like you to think about the possibility of a new type of membership for P123, Big Data or ML. Dedicate much higher resources to API calls and defocus other resources such as portfolios and sims. Right now, I am only using a handful of portfolios for example.

Nov 20, 2020 11:59:28 AM       
Jrinne
Re: Python code for calling 123 API


Jrinne, you lost me there a bit with a few things. You don't need a Windows machine. Python and DataMiner run on Macs, Linux, and Windows. Also, not sure what you are referring to re. JASP & one-time data download.

Marco,

Thank you for your response.

I did not know that DataMiner runs on a Mac!!!! So thank you very much.

I already have Python (using Anaconda) up and running on my Mac.

So with JASP I just need an Excel CSV file, which I now understand I can probably get this weekend at home using DataMiner, without staying late at the office.

And pretty good column headers thanks to Steve’s help. Frankly, however, I doubt that it will be useful until I can get excess returns as a label. Still, this is all good news to me.

I'd also want to try it myself. Maybe you want to do a demo for me and others? SIDE NOTE: we want to find ways for expert users on p123 to do way more than just Designer Models.


I understand that this was directed toward Steve and I assume he can help you.

But I was thinking today that I might be able to give a simple demonstration with JASP that you should be able to duplicate. I think I can, and you could move to XGBoost or TensorFlow later. If I get something that I can easily demonstrate then you can upload my Excel file or duplicate what I have done with your own factors.

My main question with JASP is how large a data volume it can handle. But I think it is capable enough to hint at what XGBoost can do. The main advantage for now is that you should be able to get up and running quickly and test it for yourself without having to use any data that may be cherry-picked by me.

I do machine learning pretty well (I think) but I am not as good at some Python munging tasks. I appreciate your help. Thank you again for the information about DataMiner on Macs!!!!!!!

Best,

Jim

From time to time you will encounter Luddites, who are beyond redemption.
--de Prado, Marcos López on the topic of machine learning for financial applications

Nov 20, 2020 12:08:22 PM       
Edit 10 times, last edit by Jrinne at Nov 20, 2020 1:17:00 PM
danparquette
Re: Python code for calling 123 API

Hi Steve, it would be helpful if you could give me more details on your typical use case so I can use that to test the calculations for the cost in 'requests' and make sure it is reasonable. You mentioned 5 years of weekly data. How many tickers? How many rank nodes (aka factors/formulas)?
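For sizing a use case like this, the raw volume is just dates times tickers times nodes. A rough sketch follows; the 500 tickers and 20 nodes are placeholder numbers, not figures anyone in the thread has given, and how P123 converts volume into 'requests' is not assumed here.

```python
def estimate_volume(years, weeks_per_year, tickers, nodes):
    """Estimate API calls and data points for one full weekly data set."""
    calls = years * weeks_per_year       # one call per weekly date
    points_per_call = tickers * nodes    # one value per ticker per rank node
    return {
        "calls": calls,
        "points_per_call": points_per_call,
        "total_points": calls * points_per_call,
    }


# Placeholder inputs: 5 years of weekly data, 500 tickers, 20 rank nodes.
est = estimate_volume(years=5, weeks_per_year=52, tickers=500, nodes=20)
print(est)
```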

Dan

Nov 20, 2020 1:27:53 PM       