
Posts: 127    Pages: 13    Prev 1 2 3 4 5 6 7 8 9 10 Next
This topic has been viewed 3179 times and has 126 replies
marco
Re: Python code for calling 123 API

Question.

Since we can only let you download ranks, not the data points themselves: what if we added more ways to rank, such as a normal distribution, a Pareto distribution, etc., and let you specify which ranking method to use for each factor/formula?

Wouldn't this be just as good as having the data itself? Would this facilitate the ML studies you are doing?
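
To make the "normal distribution" ranking idea concrete, here is a minimal sketch of mapping a 0-100 percentile rank onto a standard normal score. This is only an illustration and not how Portfolio123 computes anything: the function name `rank_to_normal_score`, the clipping value `eps`, and the sample ranks are all assumptions.

```python
from statistics import NormalDist


def rank_to_normal_score(rank, eps=0.005):
    """Map a 0-100 percentile rank to a standard normal score (hypothetical helper)."""
    p = rank / 100.0
    # inv_cdf is undefined at exactly 0 and 1, so clip into (eps, 1 - eps).
    p = min(max(p, eps), 1.0 - eps)
    return NormalDist().inv_cdf(p)


# Hypothetical ranks for one factor.
ranks = [0, 10, 50, 90, 100]
scores = [rank_to_normal_score(r) for r in ranks]

for r, s in zip(ranks, scores):
    print(f"rank {r:>3} -> normal score {s:+.3f}")
```

The mapping is increasing, so the ordering of the stocks on that factor is unchanged; only the spacing between them changes.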

Portfolio123 Staff.

Nov 18, 2020 12:48:47 PM       
Jrinne
Re: Python code for calling 123 API

Marco,

XGBoost is amazing with regard to what you are asking. IT JUST NEEDS TO HAVE THE ORDER OF THE FACTORS PRESERVED.

So whether the factors are raw data, ranks, Gaussian distributions, etc., IT MATTERS NOT AT ALL.

It is a non-linear and non-parametric method: the factors can be transformed in any way, as long as the order is preserved, AND IT WILL GIVE THE EXACT SAME ANSWER.

In theory this applies to neural nets too, but for a regression tree method like boosting it is GUARANTEED BY THE WAY THE TREES WORK, since each split depends only on the ordering of a factor's values.

In short, FOR BOOSTING, RANKING IS ALL ANYONE EVER NEEDS!!!! That, as well as the returns and a usable index.

Sorry about the caps, but if one is not excited by boosting, they just do not understand it. IT IS EXCITING TO SEE PEOPLE LIKE STEVE USING THIS.

Thank you for your interest.

Best,

Jim

From time to time you will encounter Luddites, who are beyond redemption.
--de Prado, Marcos López on the topic of machine learning for financial applications

Nov 18, 2020 12:58:17 PM       
Edit 5 times, last edit by Jrinne at Nov 18, 2020 1:43:55 PM
InspectorSector
Re: Python code for calling 123 API

Marco - I would have to think about that. The biggest issue for me is the target, not the inputs. The current ranking system is probably adequate for inputs, but some of your suggestions are worth experimenting with. If your ideas help with generating a target that is closer to the real data, then I'm all for it. I don't know enough at present to say whether your ideas will help or not.

I would also like to remind you of my previous feature request: allow negative time period offsets. That would make it much more straightforward to use one set of equations both for generating the training data and for generating inputs for future predictions.

Thanks!

Nov 18, 2020 1:02:39 PM       
Jrinne
Re: Python code for calling 123 API

Steve will correct me on this if I am not aware of everything he is doing. When we correspond I am not always aware of the nature of his inputs and targets. They are often labelled Input1, Input2, ... and Target. This is good. I am not looking to learn other people's factors.

I think he uses different targets than I do, however.

But one usable target is just the next week's returns (for a weekly rebalance). The predictions can then be sorted, and one can buy the 5 to 25 stocks with the best predicted returns for next week.
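
The selection step described above can be sketched in a few lines. The tickers, the predicted values, and the helper name `pick_top_n` are made up for illustration; in practice the predictions would come from whatever model is being used.

```python
def pick_top_n(predicted_returns, n):
    """Return the n tickers with the highest predicted next-week return."""
    ranked = sorted(predicted_returns.items(), key=lambda kv: kv[1], reverse=True)
    return [ticker for ticker, _ in ranked[:n]]


# Hypothetical model output: ticker -> predicted return for next week.
predicted = {"AAA": 0.012, "BBB": -0.004, "CCC": 0.031, "DDD": 0.007, "EEE": 0.019}

buy_list = pick_top_n(predicted, n=3)
print(buy_list)
```

With a real universe, `n` would be somewhere in the 5 to 25 range mentioned above.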

Steve does seem to agree that the ranks (for inputs or predictors) are all that are needed for the factors. I am just going to say this is absolutely true.

This is easy to prove.
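
A minimal sketch of the argument, using one squared-error split on one factor rather than XGBoost itself: the search for the best split only ever looks at which rows fall on each side, so any order-preserving transform of the factor (ranks, logs, and so on) selects the same rows. The data and helper names below are invented for illustration, and the sketch assumes no tied factor values.

```python
import math


def sse(values):
    """Sum of squared deviations from the mean."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_split_left_group(x, y):
    """Row indices sent left by the squared-error-optimal split on x."""
    order = sorted(range(len(x)), key=lambda i: x[i])
    best_cost, best_left = math.inf, frozenset()
    for k in range(1, len(order)):
        left, right = order[:k], order[k:]
        cost = sse([y[i] for i in left]) + sse([y[i] for i in right])
        if cost < best_cost:
            best_cost, best_left = cost, frozenset(left)
    return best_left


def to_ranks(x):
    """Scale positions in sorted order to 0-100 (assumes distinct values)."""
    order = sorted(range(len(x)), key=lambda i: x[i])
    ranks = [0.0] * len(x)
    for position, i in enumerate(order):
        ranks[i] = 100.0 * position / (len(x) - 1)
    return ranks


# Hypothetical raw factor values and next-week returns.
raw = [0.3, 12.0, 1.5, 250.0, 4.2, 0.9]
y = [0.01, 0.04, 0.02, 0.05, 0.03, -0.01]

left_raw = best_split_left_group(raw, y)
left_rank = best_split_left_group(to_ranks(raw), y)
left_log = best_split_left_group([math.log(v) for v in raw], y)

print(left_raw == left_rank == left_log)
```

Because every split in every tree is chosen this way, the same reasoning carries over to a full boosted model fitted on the training data.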

Best,

Jim


Nov 18, 2020 1:05:21 PM       
Edit 8 times, last edit by Jrinne at Nov 18, 2020 1:13:56 PM
InspectorSector
Re: Python code for calling 123 API

One of the pre-processing ideals for neural networks is to prepare the inputs so that there is equal distribution across the range of input, or as close as possible. The ranking system algorithm probably does a good job in that regard. And as Jim says, xgboost doesn't care about linearity. It's more of a concern for tensorflow. The main issue that I have is recovering a target that somewhat resembles what I am trying to predict, not a 0-100 rank. So maybe there is some distribution trick that can be used to get better results.

Nov 18, 2020 1:15:11 PM       
Jrinne
Re: Python code for calling 123 API

Correct. Which is already done by a Rank!!!! Ranks do as Steve suggests: "there is equal distribution across the range of input." Or perhaps it is better to say that all inputs are scaled or normalized the same way.

There are some who would divide the rank by 100 to put all of the inputs between 0 and 1. This can be done in Colab, in Python, or in a spreadsheet, so P123 would not need to make this a priority.
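
As a rough illustration of how little work that is in Python (the input names follow the Input1, Input2 labelling mentioned earlier in the thread; the values are made up):

```python
# Hypothetical 0-100 ranks for one stock on one date.
row = {"Input1": 87.5, "Input2": 12.0, "Input3": 100.0}

# Divide each rank by 100 so every input lies between 0 and 1.
scaled = {name: rank / 100.0 for name, rank in row.items()}

print(scaled)
```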


Nov 18, 2020 1:18:48 PM       
Edit 2 times, last edit by Jrinne at Nov 18, 2020 1:46:28 PM
marco
Re: Python code for calling 123 API

Any thoughts on other ML platforms that may be more suitable or easier to use for non-AI experts (me, for example)? Chaikin uses https://www.r2.ai/ and he must have just learned it.

Portfolio123 Staff.

Nov 18, 2020 1:48:33 PM       
Jrinne
Re: Python code for calling 123 API

JASP is a free download.

It provides boosting, K-Nearest Neighbors, Random Forests, and regularized linear regression, for both regression and classification, as well as unsupervised machine learning methods.

In addition it provides Bayesian methods.

I use it for Factor Analysis, where it does as well as SPSS.

JASP is updated frequently, and the ML portion is constantly being improved.

It is already usable. And is menu driven. NO PYTHON PROGRAMMING!!!

Marco, you should provide an easy download into a csv file that is immediately ready for upload into JASP.
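
One possible shape for such a file, sketched with Python's csv module: a header row followed by one row per stock per date. The column names, dates, tickers and values are purely hypothetical (the thread does not specify a layout, and whether the file should cover one period or many is left open here).

```python
import csv
import io

# Hypothetical rows: factor ranks as inputs, next-week return as the target.
rows = [
    {"date": "2020-11-13", "ticker": "AAA", "Input1": 87.5, "Input2": 12.0, "Target": 0.012},
    {"date": "2020-11-13", "ticker": "BBB", "Input1": 40.0, "Input2": 66.3, "Target": -0.004},
]

fieldnames = ["date", "ticker", "Input1", "Input2", "Target"]

# Write to an in-memory buffer; a real export would write to a file instead.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=fieldnames)
writer.writeheader()
writer.writerows(rows)

csv_text = buffer.getvalue()
print(csv_text)
```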

Jim


Nov 18, 2020 1:55:02 PM       
Edit 2 times, last edit by Jrinne at Nov 18, 2020 1:59:09 PM
marco
Re: Python code for calling 123 API



From where? Time series? What data? Multiple periods?

Example, please.

Portfolio123 Staff.

Nov 18, 2020 2:03:11 PM       
InspectorSector
Re: Python code for calling 123 API

There isn't really enough technical info on the r2.ai site to say whether it is a good product to use. I was in the same position as you are now, and reached out to Jim because I knew he was using tensorflow and xgboost.

It looks to me as if r2.ai has a certain price but you may find that AutoML has usage fees in addition to what r2.ai charges. Another consideration is whether you want to base your efforts around a company that may disappear in the future... Think Quantopian.

The big decision is whether r2.ai or another company allows you to run an external program (your project code) that invokes their product and gets results delivered back to your program. With some products that I have seen, you actually have to power up the application, run the neural net, and save the results to a file; then you run your external application and read the file. It is doable but will be a pain in the neck long term, especially if you are running a lot of NNs.

The advantage of tensorflow and xgboost is that they are free, with none of the usage fees that AutoML may charge. There is little danger that Google will disappear. I believe that xgboost is open-source, or is at least a standard import for Colab. The big thing is that you can run your own project code, whatever that may be (I am using Python), invoke the AI software, and have predictions returned directly for your code to process and use. I can use the results directly because they come back as arrays, not stored on disk somewhere. There is no need for a multi-step manual process.

I was in your situation a few weeks ago. I wanted a NN program with a nice GUI that allowed me to run things confidently. But I have gotten past that. I have Python code. If you want, I can give you that (if Jim is OK with that) and it will get you up and running quickly. You have to learn Python, though. It's not much different from any other programming language.

The nice thing about xgboost is that it is orders of magnitude faster than tensorflow, making some applications possible that I can't do with tensorflow. I also find that it is pretty robust relative to tensorflow, based on the tests that I have been performing. xgboost is a gradient-boosted decision tree algorithm, as opposed to a NN. I just treat both of them as black boxes.

Nov 18, 2020 2:17:15 PM       