You can now use negative values in FHist() to evaluate in the future

Dear All,

You can now use negative offsets in FHist for the weeksAgo parameter. This can be quite handy, for example, to generate data for AI/ML training. To calculate the one-year future performance you can write

FHist("close(0)/close(251)",-52)
or
FHist("Ret1Y%Chg",-52)

If you run this with an "as-of date" less than one year in the past, the above will return N/A values, since it resolves to a date in the "real" future, which is still unknown.
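Outside P123, the same idea can be sketched in pandas: a forward-return label computed by shifting prices backward, which is NaN for the most recent rows because the future price is not yet known. The data and horizon below are hypothetical, just to show the mechanic.

```python
import pandas as pd

# Daily closes indexed by date (toy data).
closes = pd.Series(
    [100.0, 101.0, 103.0, 102.0, 106.0],
    index=pd.date_range("2020-01-01", periods=5),
)

# Forward return label, analogous to FHist("close(0)/close(251)", -52),
# but with a 2-row horizon so the toy example stays small.
horizon = 2
label = closes.shift(-horizon) / closes - 1

# The last `horizon` rows have no known future price, so the label is NaN --
# the same reason FHist with a negative offset returns N/A near the present.
print(label)
```

The NaNs at the end play the same role as the N/A values FHist produces when the as-of date is too recent.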

Next week we will add more functionality to the APIs to easily generate ML/AI label and feature data, as discussed in this thread.

Thank you for your feedback.

Is it normal that this doesn’t return anything?

FHist("BenchClose(0)*100",-52)

How are you testing it?

You guys keep saying we should use excess return over the benchmark for forecasting, so my idea was to make something like this in the screener:

FHist("(Close(0)/Close(21)-1)-(Close(0,GetSeries("$SP500"))/Close(21,GetSeries("$SP500"))-1)*100",-4)

I made a custom formula named $XS and tested one component:

FHist("BenchClose(0)*100",-4)

Go to the screener, set the date back at least one year, and type $XS

FHist("Close(0,GetSeries("$SP500"))",-4)

It appears blank.

This alone works:
Close(0,GetSeries("$SP500"))

I can’t remember if FHist("",-1) worked everywhere or only in Ranking Systems…

@Marco is there something else I can use to look forward on the screener? It’s kinda hard to work only with the ranking system without seeing the values on the screen “on the side”.

I think FHist doesn’t pick up the benchmark being used by the screener. It should be a quick fix. Sorry about that.

But we’ll have much easier ways to do excess return (for past & future) so you don’t have to write such convoluted functions.

Marco - thanks for implementing this so fast.

I am wondering about the normal distribution ranking system. I don’t really understand the processing behind it, since there still appears to be a 0-100 ranking score. Could you elaborate a bit on what the algorithm actually is? It looks like it applies to all of the ranking nodes; one can’t be selective. Ideally I would like to apply some different processing algorithm to the target node but keep the input nodes ranked normally, if that makes any sense, but maybe that isn’t possible with the algorithm.

Steve, the normal distribution ranking method is just a z-score translated to 0-100. If the data is normally distributed you should see the 0-100 scores also normally distributed. And yes, it applies to all the nodes. NOTE: this feature is something we have not touched in years for lack of interest.

The node score should be similar to what the ZScore() function returns but normalized to 0-100. Once we add the enhancements to the ranks API, so that you can specify “additional technical data”, we could perhaps also allow using any data as long as it’s used inside functions like ZScore or FRank. This way you can get your inputs in either percentile ranks or z-scores, and whatever other transformation functions we come up with.
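As a rough illustration of "a z-score translated to 0-100": a linear rescaling of the z-score (rather than a CDF mapping) would preserve the shape of the distribution, which matches the statement that normally distributed data yields normally distributed 0-100 scores. The exact scaling P123 uses is not documented here, so the ±3σ span below is purely an assumption.

```python
import numpy as np

rng = np.random.default_rng(0)
values = rng.normal(loc=10.0, scale=2.0, size=1000)  # hypothetical raw factor values

# Z-score each value against the cross-section.
z = (values - values.mean()) / values.std()

# One plausible translation to 0-100: center at 50, spread z over ~3 sigma,
# and clip the tails. (The actual scaling P123 applies is an assumption.)
scores = np.clip(50.0 + z * (50.0 / 3.0), 0.0, 100.0)

# Because the map is linear, normally distributed inputs give
# (approximately) normally distributed scores centered near 50.
print(scores.mean())
```

A percentile rank (like FRank) would instead produce uniformly distributed 0-100 scores, which is the key practical difference between the two methods.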

Awesome - there is one thing I am just discovering: items that resolve to NA fall to the bottom or the middle rank score, depending on the method. Apart from FHist("",-1), which gives an NA by design, NAs have unintended consequences, specifically giving AI model outputs that are too good to be true. For example, if there are no analysts providing estimates for a company, then the sales estimate comes up as NA and the rank for the sales estimate would be zero. But it would be zero for both the past and the future. The AI model will of course home in on that, and the result will look fantastic when it shouldn’t.

So I am wondering if it would be possible to have an NA flag as part of the extra data. The NA flag would be set if there is an NA for any item in the row (the set of node ranks for the stock on a specific date), and zero if there are no NAs.

There is a second point here. FHist("",-1) will generate an NA of course, and it will be processed into the rank score. But it would be nice if there were a second NA flag that only flags "look into the future" NAs. This would allow discriminating training versus prediction data sets. It may also come in useful for finding inadvertent look-ahead data in the rules, by looking at the generated output.
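The two flags described above can be sketched in pandas. All column names here are hypothetical; the point is just the mechanic: one flag for any missing feature, and a separate flag for a missing forward-looking label, which also cleanly splits training rows from prediction rows.

```python
import numpy as np
import pandas as pd

# Toy feature/label table. NaN in SalesEstimate means no analyst coverage;
# NaN in FwdRet1Y means the future return is not yet known.
df = pd.DataFrame({
    "SalesEstimate": [1.2, np.nan, 0.8, 1.5],
    "Momentum":      [0.3, 0.1, np.nan, 0.6],
    "FwdRet1Y":      [0.07, 0.02, 0.11, np.nan],  # forward-looking label
})

feature_cols = ["SalesEstimate", "Momentum"]

# Flag 1: set if any input feature for the row is NA.
df["na_flag"] = df[feature_cols].isna().any(axis=1).astype(int)

# Flag 2: only "look into the future" NAs, i.e. a missing label.
df["future_na_flag"] = df["FwdRet1Y"].isna().astype(int)

# Rows with a known label are training data; label-NA rows are
# the prediction set (the future is still unknown for them).
train = df[df["future_na_flag"] == 0]
predict = df[df["future_na_flag"] == 1]
print(len(train), len(predict))
```

Keeping the feature-NA flag as a model input (rather than silently imputing) also makes any inadvertent look-ahead in the rules visible in the generated output, as suggested above.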