Key Projects Update: AI, Europe, and Request For Feedback

Dear All,

Here are a few key project updates.

AI/ML

As you may be aware, we are going to add Machine Learning soon. In a nutshell, it will let you specify any number of factors or formulas, select a universe, and train a model to predict a future factor (for example, the 3-month return, future sales growth, etc.). The trained model can be used in different ways: it could completely replace your ranking system or work alongside it. You will also be able to access the predicted values in rules, the screener, and historically. This project is well underway and should have a working prototype very soon. The AI will be completely integrated into the existing website.
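To make the idea concrete, here is a rough sketch of the general workflow in Python. This is purely illustrative, not our actual implementation; the file, column, and factor names below are made up.

```python
# Illustrative sketch only: factors as features, a future return as the target.
import pandas as pd
from sklearn.ensemble import GradientBoostingRegressor

# Hypothetical DataFrame: one row per stock per date, with factor columns
# and the realized 3-month forward return the model should learn to predict.
data = pd.read_csv("factor_history.csv")                      # placeholder file name
features = ["earnings_yield", "sales_growth", "momentum_6m"]  # any factors/formulas
target = "fwd_return_3m"
data = data.dropna(subset=features + [target])

train = data[data["date"] < "2018-01-01"]
test = data[data["date"] >= "2018-01-01"]

model = GradientBoostingRegressor()
model.fit(train[features], train[target])

# The predicted values could then be used like any other factor:
# as a ranking system replacement, or alongside one in screen rules.
test = test.assign(predicted_return=model.predict(test[features]))
print(test[["date", "ticker", "predicted_return"]].head())
```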

Europe

We’re gearing up to add European data from FactSet. We’ve been working on adding multi-region/currency support to our pages and will be getting the European feed soon. We’ve worked out most issues and do not anticipate a long delay once we start getting the live data (1-2 months max).

Request For Feedback

What are your thoughts on being able to use ML to label a chart? For example, training a model to tell you whether you “like” or “dislike” a chart, or whether a certain pattern is present. Doing this with formulas is very difficult; neural nets should be able to achieve much better results.

The general tool we’ve been brainstorming is a way for you (or a collaborative group) to quickly browse through many charts and apply labels. These labelled charts then become the training data for a NN model. Several features are still undecided. For example, if you are training a model to find a “price consolidation after a run-up” pattern, all you probably want and need is a simple line chart (the simpler the better). But for other patterns you may need volume, indicators like Bollinger Bands, etc. It’s quite possible that the right chart for an ML system is something entirely new that is not very useful to humans.
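As a very rough illustration (not necessarily how we will do it), the labelled charts could be represented for a model as simple normalized price windows:

```python
# Illustrative sketch only: turning a labelled chart into NN training data.
# Each "chart" here is just a fixed-length window of closing prices,
# normalized so the model sees the shape rather than the price level.
import numpy as np

def chart_to_example(closes, label):
    """closes: e.g. 126 daily closes; label: 1 = pattern present, 0 = not."""
    x = np.asarray(closes, dtype=float)
    x = (x - x.mean()) / (x.std() + 1e-9)   # normalize the shape of the line chart
    return x, label

# Labelled examples collected from the browsing/labelling tool would be stacked
# into arrays X (n_charts x window_length) and y (n_charts,) for training.
```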

We are exploring this now and would like to hear your thoughts on this and anything else you have in mind.

Cheers, and Thank You

I hope that the ML features will be usable for non-programmers, and that it’s worth the effort: if it works at all, and most trading ends up being done on the basis of such programs, the worst and slowest will fall by the wayside.

Matthias

Marco and P123,

WOW!!! This seems incredibly advanced and includes neural-nets. Just Wow. I am otherwise speechless.

I like the chart-pattern recognition idea. The broader question is: how would I use pattern recognition in a way that P123 might possibly find interesting?

Personally, if I were part of a group of great programmers (which I guess I am on this forum), I would want to look at trying some web scraping and having the NN get an idea of sentiment.

This is not an example of sentiment exactly, but a concrete example that de Prado mentions is web scraping different sites to get real prices of real things people buy, to understand how much real inflation there is.

I understand that (in this post) you are talking about pattern recognition and not word-reading. Trying to think of real (probably not too usable) ideas: Fidelity has a graphic of what its users are buying and selling, with only the need to recognize a ticker.

Zacks’ site is not too graphic, but just being able to recognize the ticker will tell you what some people following the Zacks rank and Zacks’ VGM will be buying today. I could not quickly find what Robinhood is doing on their site.

And to be clear, the neural-net may (after scraping the net) decide some of Zacks’ recommendations are good shorts. I am not recommending Zacks here.

For your AI programmer: I am currently investing 20% of my portfolio based on what could best be called a categorical Naive Bayes classifier. It sounds complex, but this is one of the easiest and simplest AI methods with the least amount of computer usage (Occam’s Razor from my point of view). He could easily use a Naive Bayes classifier with whatever sentiment data he collects (wherever he may get it). And your server probably would not notice the usage.
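For anyone curious, here is a toy sketch of that kind of classifier with scikit-learn. The example texts and labels are made up; in practice the text would come from whatever gets scraped.

```python
# Toy Naive Bayes sentiment classifier (illustrative data only).
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

texts = ["strong buy, estimates raised", "guidance cut, downgrade", "record quarter, beats"]
labels = ["bullish", "bearish", "bullish"]

clf = make_pipeline(CountVectorizer(), MultinomialNB())
clf.fit(texts, labels)

print(clf.predict(["analyst downgrade after weak guidance"]))  # -> ['bearish']
```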

This is in the spirit of what Fidelity does with Starmine. I do not fully understand Starmine’s proprietary method, but I am also not clear that it is any better than Naive Bayes at the end of the day. It is clearly quite similar. You would not have to reinvent the wheel if you use this approach, and there is no proprietary interest in the Naive Bayes method. Just call it [color=coral]GoldMine.[/color]

Of course, [color=coral]GoldMine[/color] would allow one to select the input (of sentiment) used for the Naive Bayes classifier, as Starmine has some ability to do. The output could then be put into an InList, perhaps. Or not; it’s just an idea at this point. But I would recommend looking at Starmine in depth and getting some ideas there.

This is just an idea, one that is almost certainly not all that useful at this point. But if you have a neural-net that can recognize charts, you have the ability to get pretty creative!!! A little brainstorming may be appropriate. And members will almost certainly submit some ideas that will ultimately be usable for P123 as we learn more about the direction P123 is heading.

And for those who are not sure about AI: DIDN’T YOU JUST SAY EUROPEAN DATA?

Speechless (well, for me). It should work for P123, I would think.

Jim



🙂

Marco, I’m interested to see how the training works and would be interested in participating in it. For my part, I’m not a big chart user, nor conversant or schooled enough in technical terms to describe what I sometimes see, and I could just be seeing things that are irrelevant or just randomness. But sometimes it does seem like I see recurring patterns in price data.

I am interested in concepts such as “support” and “resistance” and probably use these ideas as much as any, even though I’ve had no way to really test them. Also, I very much like a log scale on charts so I can better gauge a constant rate of growth, especially over longer periods of time for fast growers.

Attached are some screenshots of a pattern I feel like I see more often than I’d expect to. I don’t know what it’s called; it’s sort of a several-month period of sideways price movement characterized by lower highs but generally bouncing off of a price low. I don’t know if it’s predictive of anything, but it seems like it often resolves in a definitive way, with price moving up or down in a non-incremental way. (Charts: COKE, ABST, SPSC.)

I’ve also attached an example of a support-like look I sometimes see (or its inverse, a resistance-like look). Again, I don’t know if there’s anything predictive to them, but it seems breakdowns (or breakouts) from those established levels may have meaning, and it would be nice to train/test this. (Chart: QGEN.)

Also, strong trenders with low variance in the trend are an interesting type of chart to me that I see sometimes. (Charts: GOOGL and FTNT.)







Hi Marco, I wanted to add this example of a possible support/resistance level using CDNS. If I’m reading it right, it shows what might be seen as an initial resistance line from Feb/Mar/April, then a breakout in August, a retest shortly after in August, then another pullback and retest again in October. Again, I have no idea if these possible resistance/support levels have some predictive power, but occasionally I see stuff like this and wonder, because it seems some price levels are more important than others when this plays out. It could just be chance, and I could just be forcing some unwarranted pattern or order on things.


Marco,

My apologies for getting away from your question about pattern recognition of charts. I am excited about this and got off of the subject.

You have some questions about labeling chart (technical) data or the best way to do it.

I think I would take a different approach than what you suggest in this post. Or supplement this approach.

I would search the data you have and find a period where a stock has a good return. That return is the label.

I would then move backward in time and let the neural-net find the pattern that preceded the period of good return. You may have to include some periods where returns were average or poor too, or select some random periods, but neural-nets do not always need completely balanced data. (A rough sketch of this follows the list below.)

This has several advantages:

  1. Neural-nets need a lot of data. You can get more data by automating this.

  2. P123 members may not like (and label) the best patterns or even know what they are.

  3. A good neural-net will be able to find patterns that we members have not read about yet.

  4. Perhaps this could supplement any human labelling you use. Not an either/or thing. More data is always better with neural-nets. The improvement with more data never stops; it just slows down. There may be some data on other sites the computer could look at too. More is better. It would be wrong to reject any source of data.
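Roughly, the automated labelling could look something like this. It is an illustrative pandas sketch; the file and column names are made up and the thresholds are arbitrary.

```python
# Illustrative sketch: label each date by the stock's forward 3-month return,
# then hand the model the price window *before* that date.
import pandas as pd

prices = pd.read_csv("daily_closes.csv", parse_dates=["date"])  # date, ticker, close
prices = prices.sort_values(["ticker", "date"])

horizon = 63      # ~3 months of trading days (the "good return" period)
lookback = 126    # ~6 months of history the NN gets to see

def make_examples(df):
    fwd_ret = df["close"].shift(-horizon) / df["close"] - 1
    examples = []
    for i in range(lookback, len(df) - horizon):
        window = df["close"].iloc[i - lookback:i].values
        label = 1 if fwd_ret.iloc[i] > 0.10 else 0   # "good return" threshold (arbitrary)
        examples.append((window, label))
    return examples

all_examples = [ex for _, grp in prices.groupby("ticker") for ex in make_examples(grp)]
```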

Neural nets are a thing mainly because we can now get huge amounts of data (which they generally require). Think of all the data on the web, whether it is searches at Google or even pictures of cats on YouTube. Or think of reinforcement learning, where a computer can play itself in a game forever (or as long as it takes).

The possibility of having a NN find new patterns that are not generally known is pretty interesting, as is objectively defining the best patterns free from any human labelling bias.

Very cool whatever you do with this.

Jim

I’m very happy to hear about these new additions - especially the European data. Thanks!

SpaceMan, try using weekly bars and Heikin-Ashi candlesticks. I think they make trends more obvious to a human, which can only be a good thing for AI. Here are a few of your examples I did on stockcharts (we will be adding weekly bars & technicals on P123 very soon).
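For reference, Heikin-Ashi bars are just a smoothing transform of ordinary OHLC bars. A quick pandas sketch of the standard formulas:

```python
# Heikin-Ashi candles from ordinary OHLC bars (standard formulas; pandas sketch).
import pandas as pd

def heikin_ashi(df):
    """df has columns open, high, low, close, one row per (weekly) bar."""
    ha = pd.DataFrame(index=df.index)
    ha["close"] = (df["open"] + df["high"] + df["low"] + df["close"]) / 4
    ha_open = [(df["open"].iloc[0] + df["close"].iloc[0]) / 2]
    for i in range(1, len(df)):
        ha_open.append((ha_open[i - 1] + ha["close"].iloc[i - 1]) / 2)
    ha["open"] = ha_open
    ha["high"] = pd.concat([df["high"], ha["open"], ha["close"]], axis=1).max(axis=1)
    ha["low"] = pd.concat([df["low"], ha["open"], ha["close"]], axis=1).min(axis=1)
    return ha
```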

Jrinne, we’ll be looking into unsupervised or automated learning as well, but initially it’s probably best to do manual labelling to get a sense of what is happening. As for textual AI for headlines or sentiment, that is also something down the road. Need to learn how to walk first.

Thanks



I am sure P123 already knows this:

But this study suggests that, among charts used by humans, candlestick charts did better than line charts: https://arxiv.org/pdf/1808.00418.pdf

On a more technical level, the CNN (convolutional neural network) did not have adequate recall (0.73) according to the authors. The LSTM (long short-term memory) had a much better recall at 0.97.

Recall is also called sensitivity: true positives / (true positives + false negatives). IMHO, they should have told us the precision or positive predictive value (true positives / (true positives + false positives)), but I could not find it. Precision would tell you how much you can trust the patterns the model calls, even if it misses a few. I can stand to miss a few patterns if the ones I am calling a certain pattern are correctly labelled.
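Both metrics are one-liners in scikit-learn if you want to check them on your own labels (toy numbers here, just for illustration):

```python
# Quick reference for the two metrics discussed (toy labels for illustration).
from sklearn.metrics import precision_score, recall_score

y_true = [1, 1, 1, 0, 0, 0, 1, 0]   # 1 = pattern actually present
y_pred = [1, 0, 1, 0, 1, 0, 1, 0]   # model's calls

print("recall   :", recall_score(y_true, y_pred))    # TP / (TP + FN) = 3/4
print("precision:", precision_score(y_true, y_pred)) # TP / (TP + FP) = 3/4
```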

General summary of the article: Not easy.

-Jim

Jim, we are learning too. Our data scientist has never used his skill set for finance, so this paper is very useful.

Thanks for sharing.

Wow, that sounds very, very good. Thank you!!!

It would be great if we could put in the chart pattern plus fundamental data, especially earnings estimates (this quarter, next quarter, next year, etc.)…
My best pattern, which I trade discretionarily right now, is a strong momentum stock that had a recent pullback, and within the pullback earnings estimates get revised up hard…

https://qullamaggie.com/my-3-timeless-setups-that-have-made-me-tens-of-millions/

It would be great if we could train the AI with chart patterns like this (above).
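Roughly quantified, the setup would be something like this (a pandas sketch with made-up column names and arbitrary thresholds, not a P123 rule):

```python
# Rough quantitative version of the "momentum + pullback + estimate revision" setup.
import pandas as pd

snap = pd.read_csv("snapshot.csv")  # hypothetical: one row per stock as of today

is_momentum = snap["return_6m"] > 0.50                      # strong prior run-up
is_pullback = snap["off_high_pct"].between(-0.25, -0.10)    # 10-25% below the recent high
est_revised = snap["eps_est_ny_chg_4w"] > 0.05              # next-year estimate up >5% in 4 weeks

candidates = snap[is_momentum & is_pullback & est_revised]
print(candidates["ticker"].tolist())
```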

Andreas, my guess is those charts have too much information. See the section “PROBLEMS ENCOUNTERED” in the paper that Jrinne shared.

Couldn’t anything on a chart easily be turned into quantitative data? Support is just a bunch of similar lows over different time periods, resistance is a bunch of similar highs over different time periods, a trend is just Brownian motion with a drift, and so on.
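A crude sketch of what hand-coding “support” might look like (the 2% tolerance is arbitrary):

```python
# Hand-coded "support" feature: count how many local lows over a window
# fall within a small band of one another.
import numpy as np

def support_strength(lows, tolerance=0.02):
    """lows: prices of local lows over some window; returns the size of the
    largest cluster of lows that sit within `tolerance` (2%) of one another."""
    lows = np.asarray(lows, dtype=float)
    best = 0
    for level in lows:
        cluster = int(np.sum(np.abs(lows - level) / level <= tolerance))
        best = max(best, cluster)
    return best

# Three lows near 50 plus one outlier -> a "support" count of 3
print(support_strength([50.1, 49.8, 50.3, 44.0]))
```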

Philip,

Not after a serious look at this in general, or even at the paper.

They may have tried that with “the hard-coded recognizer.”

But I was also initially puzzled by the success of the LSTM. There is no visual encoding for an LSTM net like there is in a 2D or 1D (used more for audio) CNN architecture. Or at least I have not encountered an LSTM used for visual information (but I would not be surprised to find that I just have not been looking at this long enough).

My guess: the NN is finding what you are describing with its own internal… AI thing (not vision for sure, and not the same language definitions). Just like a NN never really uses the word “cat” (or even “hair-ball”) unless it is instructed to print the word.

Guaranteed not to be totally accurate; happy for correction or expansion.

Edit: So maybe CNN and LSTM architectures can be combined, as suggested here: Gentle introduction to CNN LSTM recurrent neural networks with example Python code.

It does not look like the original paper does this, however. It does not seem to use a CNN as a front end to an LSTM architecture. It seems the LSTM in the original paper is doing this without a CNN’s visual feature extraction.
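For reference, here is a minimal Keras sketch of that CNN-front-end-plus-LSTM combination, along the lines of that article. The shapes and layer sizes are arbitrary; it is just to show the structure.

```python
# Minimal CNN-LSTM sketch: a Conv1D front-end extracts features per sub-sequence,
# then an LSTM reads the sequence of those feature vectors.
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Input, TimeDistributed, Conv1D, MaxPooling1D, Flatten, LSTM, Dense

model = Sequential([
    Input(shape=(26, 5, 1)),          # 26 weeks x 5 days x 1 feature (arbitrary shape)
    TimeDistributed(Conv1D(16, kernel_size=3, activation="relu")),  # per-week feature extraction
    TimeDistributed(MaxPooling1D(pool_size=2)),
    TimeDistributed(Flatten()),
    LSTM(32),                         # reads the sequence of weekly feature vectors
    Dense(1, activation="sigmoid"),   # pattern present / not present
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
```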

Jim

Marco, in the initial implementation, how many examples of a pattern do you expect are needed for the machine learning to abstract and identify the general case of a pattern? Is the idea to generate charts with random start/end windows (maybe a fixed time period?) and keep showing examples until we see something that looks like the pattern? Or to show a larger chart and have us isolate a particular sub-portion as relevant? I’m sure you’re considering this, but I guess I’m trying to think about fair sampling, so we are identifying a pattern without too much extraneous data surrounding it to influence the labeling. (Seeing a stock crater following a favorable pattern, for example, might cause someone not to label an otherwise typical case of the pattern we’re trying to identify in an unbiased way.)

No idea how many labels will be needed. I guess it all depends on the pattern and chart used. The initial release will be all about making labeling as easy, fast, and error-free as possible. These are some of the key decisions we’re contemplating:

  1. The starting point is the screener, where you’d loosely code the pattern you are looking for and select an appropriate starting universe.

  2. You will then run the screen for several as-of dates and flip through the charts to label. Your results should include a good distribution of your label values; if not, the rules should be relaxed.

  3. The period and chart type are fixed constants for the model. For example, “simplified weekly candlesticks” for a 6-month period.

It’s all being discussed. Thanks for your feedback.

If I understand correctly, the basic idea is to train neural networks or some form of AI to recognize chart patterns and utilize them on a time series to identify price patterns or generate trading signals. Do I understand correctly?

If that is the case then I would avoid this feature like the plague.

I worked with neural nets years ago and no longer do. They are the ultimate curve fitting tool, and we all know the dangers of fitting parameters to past data and assuming past patterns will somehow repeat themselves and thus have some predictive value. If I’ve learned nothing else in 40 years of investing it’s that markets are simply too chaotic and there is too much randomness in them for something like this to work. Maybe you could have limited success with a broad market index but I would never bet my portfolio on it. I learned a LONG time ago that you can’t do this successfully with individual stocks. Once you get a system or neural network trained on, say, AAPL, it may well fall completely to pieces on AMZN. AI and neural nets may have value with phenomena which are predictable and repetitive, such as weather forecasting (?) and are not so random and chaotic. There is no human involvement to speak of in weather phenomena, but markets are run by humans and that’s what makes them so chaotic. Who knows what those crazy humans will do next?

My brother is an engineer and uses an AUM-based financial advisor. She showed him a Monte Carlo simulation and he was dazzled by it. He thinks it’s a crystal ball, a deus ex machina. I snicker to myself whenever he talks about it.

Sorry to rain on your parade, but if my understanding of this undertaking is correct, then I view it as a fool’s errand.

With P123’s vast historical database, I think a more productive avenue would be to use AI to determine whether correlations exist between fundamental data and future price appreciation. For example, is it universally true that earnings, valuation, market cap, etc. are dependable determinants of price appreciation? Or maybe they aren’t and have no place in making investment decisions.
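That kind of check is straightforward to sketch, for what it’s worth (illustrative only; the file and column names are made up):

```python
# Sketch: does a fundamental factor correlate with future price appreciation?
import pandas as pd
from scipy.stats import spearmanr

data = pd.read_csv("factor_history.csv").dropna()  # date, ticker, earnings_yield, fwd_return_3m

# Rank correlation ("information coefficient") per date, then its average:
ics = data.groupby("date").apply(
    lambda g: spearmanr(g["earnings_yield"], g["fwd_return_3m"]).correlation
)
print("mean IC:", ics.mean(), " share of positive dates:", (ics > 0).mean())
```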

You asked for feedback, and there is my 2 cents’ worth.