marco
Nice ML integration on FactSet

Here's a webinar of DataRobot on FactSet

Very slick ML integration. Looks like the "AutoPilot" mode just goes through all 87 (gasp!) models to find the best (isn't this curve fitting??). Nice presentations of feature impact. Obviously running the data through all 87 models is excessive, but I'm guessing that's how DataRobot makes money: by enticing you to test more models.

Another good webinar is this one Machine Learning for Quant Investing with DataRobot on FactSet

A good primer is this one Five Lessons On Machine Learning-based Investment Strategies

We will be investigating our own ML integration, so let us know what you think. In simple terms, initially I see it as an alternative to the ranking system. If we rely on an ML cloud service, like DataRobot, the integration will be much faster. Most of what you see in the video is made by DataRobot. But there are big advantages to hosting our own ML system. For example, you could use the data points directly w/o a data license since the data would never leave our servers, and no downloading would be needed. BTW, DataRobot offers a $500 credit with a registration.

So FactSet seems to be embracing ML. Not sure what the costs are. What are others doing? Have you seen any other integrations of ML with investing?

Portfolio123 Staff.

Jan 11, 2021 12:34:52 AM       
Jrinne
Re: Nice ML integration on FactSet

Marco,

Thank you for the link.

Notice DataRobot uses XGBoost! "Enterprise-grade open source" in the video. One of only 6 programs (and computer languages) included. Looks like Steve is onto something.

And XGBoost is one of the models that he uses in his example.

BTW, is P123 now the premier AI/ML site for retail investors? I think so, but I would not stop here. I do not know exactly where to go but I am sure there are some great ideas at P123 and in the community.

And then there is marketing. Maybe P123 could have a video of DataSet’s data with XGBoost for retail investors if DataRobot can do it for institutions. Maybe Steve and Hem could moderate it.

Jim

From time to time you will encounter Luddites, who are beyond redemption.
--de Prado, Marcos López on the topic of machine learning for financial applications

Jan 11, 2021 4:17:42 AM       
Edit 11 times, last edit by Jrinne at Jan 11, 2021 5:21:24 AM
Jrinne
Re: Nice ML integration on FactSet

Looks like the "AutoPilot" mode just goes through all 87 (gasp!) models to find the best (isn't this curve fitting??).

Marco,

I think you need to get beyond this.

It is okay if people posting on P123 do not get it and do not want to use P123’s full machine-learning potential—including the methods to reduce overfitting. P123 has other things to offer.

But done right, machine learning reduces overfitting, unlike the methods used in creating most of the designer models.

The video talks about "out-of-sample validation." And he uses a hold-out test set in the video. As the owner of the premier machine-learning site for retail investors in the world, you might consider promoting P123’s ability to reduce overfitting.

Best,

Jim

Jan 11, 2021 4:42:56 AM       
Edit 5 times, last edit by Jrinne at Jan 11, 2021 5:14:36 AM
Jrinne
Re: Nice ML integration on FactSet

BTW, I find what Steve has done over at Colab to be a little more user friendly in many ways. As you would expect from the premier machine-learning platform (for retail investors) in the world. Although what they have done is both incredible and pretty intimidating if we actually want to keep up.

I started to promote using XGBoost and machine learning here at P123 years ago when I saw that FactSet was already using XGBoost. I realized that keeping it a secret at P123 was not going to give me any advantage.

The smaller user base at P123 is not my competition. It is the large institutions, with orders of magnitude more money and resources, that are our real competition. The only way for any of us to compete is easier access to the data.

Easier than it is now should be a goal, I would think. The API is getting better almost daily and I hope improvements keep coming.

Jim

Jan 11, 2021 5:00:24 AM       
Edit 3 times, last edit by Jrinne at Jan 11, 2021 5:25:33 AM
Jrinne
Re: Nice ML integration on FactSet

Wow!!!

Just focus on XGBoost for now. And your strength: fundamentals. You have some work to do.

Jim

Jan 11, 2021 6:02:48 AM       
Edit 1 times, last edit by Jrinne at Jan 11, 2021 6:05:40 AM
InspectorSector
Re: Nice ML integration on FactSet

I didn't have an in depth look at what they are doing but while they separated In Sample from Out of Sample, they did not mention anything about data leakage. They should be separating the training set and validation set by at least the prediction period. And the same goes for the In Sample versus Out of Sample. They are not doing the latter at least. It becomes an expensive toy.
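
A minimal sketch of the separation described above, assuming time-ordered periods indexed 0..n-1 and a label that looks `gap` periods ahead. The function name `split_with_gap` and the numbers in the example are hypothetical, chosen only for illustration; nothing here is taken from DataRobot or FactSet.

```python
def split_with_gap(n_periods, train_end, gap):
    """Return (train_idx, valid_idx) for time-ordered periods 0..n_periods-1.

    Periods train_end .. train_end+gap-1 are dropped, so the forward-looking
    label of the last training period ends before validation begins.
    """
    if train_end + gap >= n_periods:
        raise ValueError("gap leaves no validation periods")
    train_idx = list(range(0, train_end))
    valid_idx = list(range(train_end + gap, n_periods))
    return train_idx, valid_idx


# Hypothetical example: 20 weekly periods, label = 4-week forward return.
train_idx, valid_idx = split_with_gap(n_periods=20, train_end=12, gap=4)
print(train_idx[-1], valid_idx[0])
```

The same idea applies one level up: leave the same gap between the end of the In Sample data and the start of the Out of Sample data.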

Jan 11, 2021 9:46:25 AM       
Jrinne
Re: Nice ML integration on FactSet

I didn't have an in depth look at what they are doing but while they separated In Sample from Out of Sample, they did not mention anything about data leakage. They should be separating the training set and validation set by at least the prediction period. And the same goes for the In Sample versus Out of Sample. They are not doing the latter at least. It becomes an expensive toy.

Steve and Marco,

So I want to go with this. I think they did mention leakage at one point. And we do not know what options there may be in the program for controlling the train, validate and test sets.

BTW, they did show a walk-forward validation, which I think is adequate. P123 can add an "embargo" if Steve and P123 feel a need.
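
For anyone who wants to see the idea concretely, here is a rough sketch of walk-forward folds with an optional embargo. It treats the embargo simply as a number of skipped periods between each training window and its test window, which is a simplification. The function name and the window sizes are hypothetical and are not what the video uses.

```python
def walk_forward_splits(n_periods, train_size, test_size, embargo=0):
    """Yield (train_idx, test_idx) pairs that roll forward through time.

    Each fold trains on `train_size` consecutive periods, skips `embargo`
    periods, then tests on the next `test_size` periods.
    """
    start = 0
    while start + train_size + embargo + test_size <= n_periods:
        train_end = start + train_size
        test_start = train_end + embargo
        yield (list(range(start, train_end)),
               list(range(test_start, test_start + test_size)))
        start += test_size  # roll the window forward by one test block


# Hypothetical example: 24 periods, 12 to train, 2 skipped, 4 to test.
folds = list(walk_forward_splits(n_periods=24, train_size=12,
                                 test_size=4, embargo=2))
for train_idx, test_idx in folds:
    print(train_idx[0], train_idx[-1], test_idx[0], test_idx[-1])
```

Setting `embargo=0` gives plain walk-forward validation; a value at least as long as the prediction period keeps training labels from overlapping the test window.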

But I AM GOOD WITH THE IDEA that we can do better.

Marco, Steve is talking about things here that mitigate and can eliminate overfitting.

And again, agreeing with Steve, let's beat our flawed competition! Or seek to provide the equivalent with XGBoost.

Steve can show all of us that XGBoost is not really that hard of a programming issue and show how easy it is to optimize with a grid search.
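
To give a feel for how little code a grid search needs, here is a bare-bones sketch. The parameter names `max_depth`, `learning_rate` and `n_estimators` are real XGBoost settings, but the grid values are made up, and `validation_score` is only a placeholder with a toy formula: in practice it would fit a model on the training periods and score it on the validation periods. No XGBoost call is made here.

```python
from itertools import product

# Hypothetical grid; the values are illustrative only.
param_grid = {
    "max_depth": [2, 4, 6],
    "learning_rate": [0.05, 0.1],
    "n_estimators": [100, 300],
}


def validation_score(params):
    """Placeholder scorer. A real one would train on the training periods
    with `params` and return the score on held-out validation periods."""
    # Toy surface that peaks at max_depth=4, learning_rate=0.05, n_estimators=300.
    return (-abs(params["max_depth"] - 4)
            - 10 * abs(params["learning_rate"] - 0.05)
            + params["n_estimators"] / 1000)


def grid_search(grid, score_fn):
    """Try every combination in `grid` and keep the best-scoring one."""
    names = list(grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(grid[n] for n in names)):
        params = dict(zip(names, values))
        score = score_fn(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


best_params, best_score = grid_search(param_grid, validation_score)
print(best_params, best_score)
```

Because the search picks the best of many tries, the winning score is optimistic; the final check belongs on a hold-out set that the search never touched.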

Best,

Jim

Jan 11, 2021 10:13:42 AM       
Edit 1 times, last edit by Jrinne at Jan 11, 2021 10:27:34 AM
philjoe
Re: Nice ML integration on FactSet

Just FYI, I haven't been able to use XGBoost to improve any of my strategies :(

Jan 11, 2021 10:30:39 AM       
marco
Re: Nice ML integration on FactSet

Steve, pretty sure I saw a demo where they handle leakage, or have ways to do it. DataRobot is very well funded and the real deal.

Jim, not sure we should aspire to "be better". Just different, with a subset of the features for sure, and more useful and affordable for investors.

In other words our goal should be to add ML to our arsenal in a similar way that P123 is doing for mechanical, factor & rule based investing.

Portfolio123 Staff.

Jan 11, 2021 10:32:36 AM       
Jrinne
Re: Nice ML integration on FactSet

Steve, pretty sure I saw a demo where they handle leakage, or have ways to do it. DataRobot is very well funded and the real deal.

Jim, not sure we should aspire to "be better". Just different, with a subset of the features for sure, and more useful and affordable for investors.

In other words our goal should be to add ML to our arsenal in a similar way that P123 is doing for mechanical, factor & rule based investing.

Marco,

I agree completely. I do think that XGBoost is a known thing and not really that complex. I do think you could duplicate what they do with XGBoost (and add some additional methods). Especially if you focus on fundamentals. Or Steve could over at Colab. Or whatever direction is best with that. But getting XGBoost to produce good solid models with fundamental data from P123 can be done.

Jim

Jan 11, 2021 10:43:49 AM       
Edit 2 times, last edit by Jrinne at Jan 11, 2021 10:46:26 AM