Minds and Machines in Conversation: Learning Equities
Yimou Andrew Li: So, yes, Marija and I are going to present our ongoing new research, Minds and Machines: Learning Equities, where we put together and explain investment ideas on S&P 500 stocks that are coming out of a group of machine learning models and our team of equity strategists. I will be the voice of the machines, and I hope I sound a little better than Alexa or Siri. Marija will share views from the strategists and put it all together. So a few years ago, the concept of machine learning became quite fashionable. And when it comes to applying machine learning to finance or investing, I observe a spectrum of viewpoints. On one end are practitioners who have little confidence in a black box model. They prefer relying on human experience and judgment. On the other end are people who are very enthusiastic about the computational power of AI and machine learning, and they believe that machines should and probably will take over the task of investing. So where do we stand? Well, we don't think it's one versus the other. We want to strike a balance. We want the best of both. And we believe that if we have a tool that can interpret machine learning models, that can help us understand what is going on in the black box, open it a little bit and see what's under the hood, we'll be able to put together the different thinking and decision-making processes of the minds and the machines. We believe that will broaden our investment strategies and deepen our understanding of the capital market.
Yimou Andrew Li: So we developed a model interpretation tool called Model Fingerprint, and we developed this research framework, Mind and Machine. We first put it into practice two years ago by having three machine learning models and our team of macro strategists come up with investment ideas and views on G10 currencies. We use the model fingerprint framework to explain what is going on, why a machine arrives at a particular prediction, and we synthesize those investment ideas and publish them on our Insights platform. What we're doing now is expanding that research framework to equities. This is actually the first time we are sharing externally how we are doing it. And we're going to give you a sneak peek at what the minds and machines are saying about S&P 500 stocks right now. So let me first introduce our team of machines. We used three machine learning algorithms to look at G10 currencies, and these are the algorithms we'll be using for analyzing stocks as well. So we have random forest, boosted trees and neural network. And based on two years of observations and our model interpretability tool, we've come to better understand how each of those machines behaves. It is just like a team of human analysts, where each person has his or her own personality. When we look at this group of machine learning models, although we give them the exact same dataset and there are a lot of similarities between them, they also behave slightly differently.
Yimou Andrew Li: So each machine has its own personality, or human analogy, if you will. Random forest, or Forest, is based on the principle of the wisdom of the crowd. It aggregates predictions made by individual decision trees, and each individual decision tree only looks at a segment of the data set. So to us, that's kind of akin to a bottom-up analyst. Boosted trees, or Boost, is also made up of many small decision trees, but it runs them in sequence and puts a greater emphasis on error correction. So each individual decision tree attempts to improve on the prior one until the best validation fit is achieved. So as an analogy, Boost to us, if it were an analyst, is like a real hard worker that learns from mistakes. And the neural net, or Neuro, attempts to link the inputs together in one network of connections. This may sound like a heroic effort, but doing so does come at the risk of using spurious relationships to find the bigger picture. So the emphasis there is really on the big picture. To us it is analogous to a top-down analyst that attempts to explain everything with an overarching theme. So how do we come to understand those? Well, we use our model fingerprint framework, which some of you may have already heard of. It is a model interpretability tool. What it does is decompose a machine learning prediction.
Yimou Andrew Li: And I want to emphasize it is really important to have a model interpretability tool when it comes to applying machine learning. A lot of our industry really runs on trust, and the lack of trust and transparency is a key challenge in applying machine learning to investing. And we believe the machines really have a lot to say. The key is to figure out how to listen to them. So I will give you an overview of the model fingerprint framework. We published two papers in the Journal of Financial Data Science that apply this framework, and you can find them there. At its core, what we want to achieve is to build beta-coefficient-like intuition for any machine learning model. It could be complex, it could be simple. We don't even need to know how or whether the model can be written as a mathematical formula. We simply ask the question: given the machine learning model, holding all else constant, how do model predictions change as I change one variable, or input, or predictor, x1? That's an expansion of the concept of partial dependence that Jerome Friedman introduced in his 2001 paper. At its core, we first isolate one input variable at a time and say, my predictor x1 has a value of negative one; holding all else constant, what is the average model prediction? We trace out that point and then move to the next value. When predictor x1 has a value of, say, negative 0.8, what would the model prediction be on average, holding all else constant? So I can move along the horizontal axis and trace out a curve. If your model is a linear regression model, what we get is just a straight line with the slope equal to the beta coefficient. If it's something else, well, maybe you'll get something else. It could be a nonlinear relationship, as illustrated in this chart. But essentially, what this gives us is the partial dependence curve.
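The tracing procedure described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the speakers' implementation: the function name `partial_dependence`, the toy `linear_model`, the sample rows and the grid values are all assumptions made for the example.

```python
from statistics import mean


def partial_dependence(predict, rows, feature, grid):
    """For each grid value, force `feature` to that value in every row,
    leave all other inputs untouched, and average the model predictions."""
    curve = []
    for value in grid:
        preds = []
        for row in rows:
            modified = dict(row)       # copy so the original data is unchanged
            modified[feature] = value  # override only the isolated predictor
            preds.append(predict(modified))
        curve.append(mean(preds))
    return curve


def linear_model(row):
    # Hypothetical model: a linear regression with beta 0.5 on x1.
    return 0.5 * row["x1"] - 0.2 * row["x2"]


# Hypothetical training rows and grid of x1 values.
rows = [
    {"x1": -1.0, "x2": 0.3},
    {"x1": 0.2, "x2": -0.7},
    {"x1": 0.9, "x2": 1.1},
]
grid = [-1.0, -0.5, 0.0, 0.5, 1.0]

curve = partial_dependence(linear_model, rows, "x1", grid)

# Slope between neighbouring grid points of the traced curve.
slopes = [
    (curve[i + 1] - curve[i]) / (grid[i + 1] - grid[i])
    for i in range(len(grid) - 1)
]
```

For the linear toy model, every entry of `slopes` equals the beta on `x1`, which matches the remark that a linear regression traces out a straight line. Any callable that maps a row to a prediction could be passed in place of `linear_model`.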
It is a faithful reflection of how the model uses this input variable, how the model thinks about this predictor. So once we get that partial dependence curve, we can do some cool decomposition, to decompose an overall model prediction into more understandable, intuitive subcomponents. We can get the linear effect by fitting a best fit line through the partial dependence curve and capturing the variation in prediction that is modeled by that linear best fit. Intuitively, let's say you have a complicated model, but your understanding is limited to linearity, or you just care about the first-order relationship. The linear subcomponent will give you that. We can go a step further and get the second component.
Yimou Andrew Li: That is the nonlinear effect, which is the variation in prediction captured by the difference between the partial dependence curve and the best fit line. We can go one step further. Instead of looking at one predictor at a time, we can isolate a pair of predictors, traverse the grid of their values and capture the variation in model prediction as I go along the value combinations, of course with careful computations and transformations in place to avoid double counting. And we can zoom in a little bit. All the linear, nonlinear and interaction effects speak at the level of the overall model, averaged across the training data. If we look at one single observation, one single prediction, we can use the partial dependence curve to make attributions locally, to figure out how much of the model prediction can be attributed to each predictor and each interaction effect. So how do we use the model fingerprint framework? Let me give you an example from Mind and Machine FX, which is the G10 strategy piece that we've been publishing for a little more than two years now. Well, first, we can look at a machine learning model and figure out how it thinks about predictors. Here I'm using the neural network model that we trained for Mind and Machine FX. We can use a metric like mean absolute deviation to capture how much variation, or how much impact, a predictor has in the linear, nonlinear and pairwise interaction space. For example, interest rate differential is considered by our neural network model to be a very important factor when it comes to G10 currency pair FX return prediction. And it's important both in the linear space and in the nonlinear space. Flows differential, on the other hand, is also considered a key driving factor by the model.
The neural model thinks most of its effect is already accounted for in the linear space. There is not much nonlinear effect. And if you look at, for example, pairwise interaction effects, Neuro thinks there is a lot of interaction between the interest rate differential and currency market turbulence. We called it model fingerprint because it is really unique: if you use a metric like mean absolute deviation or something else, it's unique to each model. And the framework is model agnostic and pretty computationally efficient to implement. So if you put a chart like this for different models side by side, it is also very easy to distinguish one from the other. We can take one more step. We can zoom in a little bit, look at a particular prediction and ask how each predictor contributes to it. Here in the chart to the left, I'm using the top pick that Neuro came up with for the month of October.
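To make the linear and nonlinear pieces of the fingerprint concrete, here is a minimal sketch that takes an already traced partial dependence curve, fits a least-squares line through it, and summarizes each piece with mean absolute deviation. It is an illustration under assumptions, not the published method: the function names `fit_line` and `fingerprint`, the evenly spaced grid and the two toy curves are made up for the example, and the pairwise interaction effect is left out.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept through (xs, ys)."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def fingerprint(xs, curve):
    """Return (linear effect, nonlinear effect) for one predictor.

    Linear effect: mean absolute deviation of the best fit line
    around its own average.
    Nonlinear effect: mean absolute gap between the partial
    dependence curve and that best fit line.
    """
    slope, intercept = fit_line(xs, curve)
    line = [intercept + slope * x for x in xs]
    centre = mean(line)
    linear = mean(abs(v - centre) for v in line)
    nonlinear = mean(abs(c - v) for c, v in zip(curve, line))
    return linear, nonlinear


# Hypothetical partial dependence curves on a shared grid.
grid = [-1.0, -0.5, 0.0, 0.5, 1.0]
straight = [0.5 * x for x in grid]  # purely linear response
bent = [x * x for x in grid]        # symmetric response with no linear trend

straight_effects = fingerprint(grid, straight)
bent_effects = fingerprint(grid, bent)
```

In this toy setup the straight curve carries all of its effect in the linear component, and the symmetric curve carries all of it in the nonlinear component. Placing such pairs side by side for several predictors and several models gives the kind of bar chart the talk refers to.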
Yimou Andrew Li: That top pick happened to be long British pound, short euro, and we can see what percentage of the predicted return can be explained by each individual predictor and each individual interaction effect. For example, for this month, the 12-month differential factor is really a key driving factor, and there are some pretty significant interaction effects between spot return trend and the differential factor, as well as between trend and holdings differential. Now, if we zoom out a little bit and look at the whole G10 universe, we synthesize those ideas coming from the machines, because now we have this model fingerprint framework that helps us understand how these models arrive at those predictions. And we ask our team of macro strategists to explain their thinking process and their investment ideas, and we put them together and publish this Mind and Machine piece on our Insights platform. So I'll share some, I think, interesting observations from running Mind and Machine FX. First of all, in the chart to the left I am showing the out-of-sample performance so far, since the inception in July 2020. Again, the data that we feed into those machines are exactly the same. But because of the differences in how those algorithms are designed, those models behave differently. They model different relationships. And you can see that translates to out-of-sample performance as well. And Forest is really our star machine.
And what really caught our eyes was the period between June 2020 and June 2021, when it was in general a bear market for the US dollar. Forest made some bold calls to long the US dollar. Forest made four long USD calls during that period, and it turns out those are the exact four months when the US dollar outperformed major G10 currencies between, again, June 2020 and June 2021. And with the model interpretability tool, we can figure out why. It turns out it was the one-month differential factor, a 12-month differential factor actually, and interactions between differential and currency market turbulence that drove those predictions. And I mentioned that Forest made bold bullish USD picks. It's really not bold, because that seems to suggest the machines have some emotions, and they really don't. They only look at data, and that could be an edge or opportunity that machine learning models can bring to us. For human analysts it is kind of inevitable, I would say, to have personal biases, and when the market has a prevailing trend, you could call those brave predictions. But machines don't really care. What they do is only look at the data. Well, there are periods where the machines are not performing so strongly.
Yimou Andrew Li: I guess, just like humans, there is no perfect human and there is no perfect machine. There are periods where the machines make predictions that do not work out. In 2022, that has been the case a few times, when our machines, Forest included, have been making calls to long the Japanese yen, for example. But we don't have to be intimidated by that, because we have this tool to open the black box and see what's under the hood, and it can help us understand, again, how models arrive at those predictions. So what we see is that year to date in 2022, valuation and trend reversal have increasingly become the key driving factors for those machines. And it's understandable. If you look at data in the past ten years, basically between 2010 and 2021, those factors or strategies, value and trend reversal, are pretty good when turbulence is high. And year to date in 2022, currency market turbulence has been persistently high. That explains why our machines arrive at long Japanese yen calls. However, the Japanese yen pretty much lost its safe haven status this year and has been underperforming. So again, when a model makes a prediction, it could be something a little unintuitive. You don't know why, you don't know if it's genius or insanity. That's very risky. But if you know why a model arrives at a particular prediction, if you are able to look at what's under the hood, then I think that will help us understand what is going on in the market and broaden our investment strategies. Now here is how we set up the equity analysis. As a first step, we're looking at US large cap stocks, the S&P 500 minus State Street. We're including a list of fundamental factors, for example size, value, earnings revisions and ROE.
We are including some proprietary indicators as well, equity market turbulence and industry flows for now. We organize the data into a stock-month panel, we calibrate those three machine learning algorithms, Forest, Boost and Neuro, and we use model fingerprints to understand their predictions. In the chart to the right, I'm showing a model fingerprint from the Forest model and how it looks at the ROE factor. You can see that Forest really doesn't like companies that have very low ROE, but there is a diminishing return as we go along the horizontal axis: the average response increases at a diminishing rate. Here I'm showing the long-short test performance. We train our models on data between 2006 and the end of 2018. The charts here show, if I long the top 50 stocks and short the bottom 50 stocks based on the predicted scores of those three machine learning models, how they would have performed since January 2019. And I'm putting OLS there as a not too sophisticated model comparison.
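The long-short test described above can be illustrated with a small sketch for a single month: rank stocks by predicted score, go long the top n and short the bottom n. The equal weighting, the function name `long_short_return`, the tickers and all the numbers below are assumptions for illustration only. The talk does not specify the weighting scheme, and it uses n = 50 on the S&P 500 rather than the toy n = 2 used here.

```python
from statistics import mean


def long_short_return(scores, realized, n):
    """Equal-weighted return of the top-n scored names minus the bottom-n.

    scores:   ticker -> model predicted score for the month
    realized: ticker -> realized return over the same month
    """
    ranked = sorted(scores, key=scores.get, reverse=True)  # best score first
    longs, shorts = ranked[:n], ranked[-n:]
    long_leg = mean(realized[t] for t in longs)
    short_leg = mean(realized[t] for t in shorts)
    return long_leg - short_leg


# Hypothetical scores and realized returns for six made-up tickers.
scores = {"AAA": 0.9, "BBB": 0.4, "CCC": 0.1, "DDD": -0.3, "EEE": -0.8, "FFF": -1.2}
realized = {"AAA": 0.04, "BBB": 0.01, "CCC": 0.00, "DDD": -0.01, "EEE": 0.02, "FFF": -0.05}

spread = long_short_return(scores, realized, n=2)
```

Repeating this calculation month by month over the out-of-sample period and compounding the spreads would give a performance line of the kind shown in the charts, one per model.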
Yimou Andrew Li: And you can see there is some efficacy. And again, you can see that even based on this exact same data set, different models behave differently. We had to prepare these slides a few weeks before the event, and we really want to show you some fresh picks and predictions made by the machines. So these are the monthly machine picks based on data as of October 14th, a couple of weeks ago. You can see the top five picks and bottom five picks by those three machines. There are similarities, but some differences as well. I think you can already see some sector tilts in there, which Marija will go into in more detail later. And with the model fingerprint framework, you'll be able to see what the key driving factors are for each of those machine learning models. And I can report to you that if you look at the top picks and bottom picks and their return from the open on October 17th to 4 p.m. yesterday, the top picks are outperforming the bottom picks by about 4%. But we'll see. It's meant to be a monthly prediction, so there are still two weeks to go, and we'll see how they work. I guess with that, I will turn over to Marija, and she will go into more detail about the mind picks and machine picks and tie it all together.
Marija Veitmane: Thank you, Andrew. So Andrew has so far shown you how machines are picking stocks. I set myself two tasks in this talk. The first one is simple: to tell you how minds pick stocks. That's the job. That's easy. But then I'll try to take the model fingerprint technique and try to open up the black box of the machines, and translate their language of zeros and ones into language that stock pickers and equity portfolio managers understand. Then we can draw some conclusions and have some interesting discussion about it. So with that, let's go to the mind picks. And I have to say, despite being introduced as a super bull, any top-down analyst picking stocks now cannot help but be terrified. We know that inflation remains high. We've shown you that many times, and we see very little sign that core is going anywhere but up. That means that financial conditions will continue to tighten, and that's terrible for equity multiples. We know that current earnings are still okay. We're at a slightly difficult point with tech reporting this week, but so far the earnings themselves are, I mean, not great, but okay. But we know they'll crack next year. We hear that in corporate guidance loud and clear. The most terrifying thing for analysts right now is to see that investors are not prepared for that backdrop. It's not priced in. They're not positioned for it. And that's what we see from our proprietary indicators of investor holdings. So I don't expect anything other than a terrible year next year. So where do you hide? I couldn't help it, I chose the most defensive portfolio I can think of: large cap, low beta, high dividend yield, and I try to find, God help me, some profitable stocks, which is difficult this year. So I'll show you a few charts on how we arrive at this conclusion. First off, inflation.
So I quite like to look at individual sectors in US inflation. There are six lines, and only one of them was going down until a couple of weeks ago. That's transportation. So we know energy prices were rising a little bit more slowly. Thank you for draining the SPR. We're probably not going to get that again, but every other sector is accelerating. So inflation is still high, particularly at core. What are central banks doing about it? We know they remain very, very hawkish. And actually, to me, what's very interesting is that every time the market rallies, they get even more hawkish. We heard Christine Forbes talking about hawks turning into dragons. I mean, what else do you want to hear? So we know multiples have already compressed, they're going to compress more and they're going to stay low.
Marija Veitmane: We know that earnings, again in the top left chart, are fine. The US economy is not in recession. Maybe it will be soon, but right now earnings are strong. Even margins are strong. They're coming down, but they are still very, very high. So currently strong, about to collapse. Look at the right-hand side chart. That's analyst expectations for earnings growth towards the end of this year and next year. They are still expecting earnings to grow. That's crazy. That will come down. What are investors doing in this environment? They're doing exactly what you would expect them to do. This is the change in investor holdings in our custodial database this year. They decreased allocation to stocks, good. They decreased allocation to bonds, good. They went into cash, fantastic. But they're going way, way too slowly. Those charts show you the historical perspective, and the dotted lines are historical averages. They started from a high allocation to risk this year. It's gone down, but they're reducing positions way too slowly. There is still plenty to come. And that's exactly what we see in equities. I think this chart is even more informative for thinking about what factors you should be investing in next year. The dark blue bars on the left-hand side chart show you investors' allocation by style at the beginning of this year. And that's where my previous super bull nature comes in: they started overweight quality, overweight growth, overweight beta and underweight dividend yield. The light blue bars are what investors have done this year. They're selling growth, they're selling quality, they're going into defensives, they're buying dividends, value, large caps, and they bought a lot of energy. But again, the positions are not square. They're still underweight those styles. It is the same thing in terms of cyclical versus defensive allocation. Again, a huge allocation to cyclicals is being drawn down to neutral levels.
Still more to come. So looking at the mind picks, I take the S&P 500 universe. The first thing I do is delete State Street stock, as I don't want to have a compliance conversation. Then I look at which are the largest stocks with the lowest beta and highest dividend yield where there are still some profits, and those are my top ten stocks. Then I reverse all those factors, looking for small caps, high beta, low or no dividend and not many profitable companies on that side. Those are my bottom ten picks. Being a strategist, I really like to aggregate it by sectors. So on the left-hand side you have more defensive type stocks: real estate, and I was quite surprised about financials, but they give you yield and they are large, then utilities, and there are lots of health care stocks. On the right-hand side is what minds really don't like, where we see lots of trouble: tech, industrials, consumer discretionary. Actually, quite happy with that. So that's what minds think about stocks. Now the more difficult part: I'm going to try to understand the machines. The first thing I'll show you is the average coefficient on each of the eight variables we give the machines in the top decile, so what machines like. To me, what is interesting is looking at some similarities between them. The machines chose low volatility with a negative sign, which means that machines want to buy high volatility. You can already hear some alarm bells going off. They're going for reversal also: they want to buy stocks that have recently underperformed, so they expect that to change. And a negative loading on value means they want growth. So effectively machines are going for highly volatile growth stocks that have recently underperformed. It sounds risky to me, but machines are clever. So here I want to show you the top five picks from each machine.
Again, that's the model fingerprint. We are trying to explain why machines pick what they pick. It is actually quite interesting. A couple of examples: if you look at Neuro's picks in the bottom panel, the top picks are Apple and Microsoft.
Marija Veitmane: So you get, obviously, quite a lot of tech, but they're picking profitable tech. You can see those darkish blue panels at the very right, which load on profitability. So machines care about profitability, and they care about reversals. So why did a machine pick Apple, for example? It has recently underperformed, it has a fairly high loading on value and it's profitable, and I think that's very important. Similarly, if you look at Netflix, another tech or tech-like stock, it is again a more profitable one. Again, the machine cares about profitability, it cares about value. And actually Netflix is a bit smaller as a stock, so the reversal in size is quite interesting. We can do exactly the same thing on the downside. And again, I'm looking for similarities between machines. Here machines are selling unprofitable stocks. Great, I like that. But they are still selling value, which is risky, and they're selling large caps, and that's risky. So for example, the top pick there for Forest is Ventas, a real estate company. What is the highest loading? It's value, so it really wants to buy value. It has a bit of reversal, and it wants to buy large caps. So okay, we can understand what machines are picking. Aggregating machine picks into sectors is really interesting. I've done exactly what I did for the minds a few charts before: I calculated how many stocks each machine has in the top decile in each sector. And the machine picks are very punchy. They like industrials, they like consumer discretionary. They really underweight a lot of defensives. And I thought it would be quite fun to try to give them a cartoon character. I'm obsessed with Norse mythology and my kids love Marvel, so I couldn't think of anything better than Thor, God of Thunder: an enormously powerful risk taker with his big hammer who doesn't pull any punches.
If I look at my picks, I mean, that's Loki, the god of mischief: cunning, strategic and, above all, playing it safe. So look at my picks: short tech, short industrials, short consumer discretionary, with lots of more defensive things. So hopefully this gives you an idea of what we can do. So Thor and Loki. The title of this presentation was Minds and Machines in Conversation, and those two can definitely work together sometimes, until there's obviously Ragnarok and it's time to fight to the death, maybe at bonus time or for retribution. Hopefully that's slightly entertaining, but it gives you an idea of how we can work together, how we draw insights from each other, and gives you a new and very structured way of thinking about stock picking. I'll stop with that.
Lee Ferridge: Okay. Thank you both. You're very entertaining. We have a couple of minutes for questions, so I will start in the audience. If anybody has any, please raise your hand. Thanks for the presentation. So just wondering about the backtesting period that you use for the machines. If I recall, 2016 was the starting point for the training of the machines, and the out-of-sample period was more recent. If you use 2016 for the in-sample period, it could be argued that you're picking probably the most abnormal period possible for factor returns in history, because that's really the only period where value and momentum both had a drawdown, right? The so-called quant crisis. And we can see it in your attribution, where it favors reversal and growth for most of the machine learning models. So how come you didn't backtest prior periods, where value has performed much, much better, and chose 2016, which arguably was probably the most abnormal period for factor returns?
Yimou Andrew Li: Well, we started in 2006. I may have misspoken. The training data was between 2006 and the end of 2018.
Lee Ferridge: Nonetheless, though, that is still a period where you're capturing probably the only period in history where you did see growth providing alpha.
Yimou Andrew Li: Well, the reason we chose that period is data availability. We wanted industry flows and good earnings revisions data, and the data that we got was really the bottleneck. We couldn't, at least for now, go back beyond 2006. I guess something that could be helpful to answer your question is that we have regime variables like equity market turbulence and industry flows. Right now, the results we're showing include industry flows as a regime variable. And what Marija and I talked about today are the predictions made on monthly data as of October 14. So they are looking at mid-September to mid-October data and using that to make predictions. In that period, turbulence I think was relatively low compared to year to date, and I think that's why you see the machines favoring growth over value. If you look at the model fingerprint framework more closely, that comes from the interaction effects between the value and growth factor and turbulence. So what we see is not that the machines would favor growth all the time or favor value all the time. Turbulence really plays a very important role in scaling, or timing to some extent, what the machines would pick.
Lee Ferridge: Okay, we have time for one more. Marija, this is for you. So the mind picks sound very bearish. So welcome to the dark side. You really don't like airlines, do you? I noticed that on your pick. So good luck flying home. What other indicators are you looking at to become more optimistic?
Marija Veitmane: Yeah, I have to say I'm not enjoying being a bear, but this stuff starts with that.
Lee Ferridge: Marija, it's fun.
Marija Veitmane: I mean, of the things I'm looking at from State Street, there are two absolutely critical ones. The first is any sign of deceleration in the core inflation series. So sector inflation, the sector PriceStats series, is super, super important, and we track it very closely. The second is investor positioning. To me it is really, really scary how the market talks about it. I mean, with MSCI World 25% down, the argument is that a lot is already priced in. No, it's not. Investors are not positioned for downside. There is still a lot of risk that can be reduced in portfolios. So until that happens, it's very difficult to get more constructive. On the market side, I think the labor market is probably the key indicator for us to get some hope of the Fed being less aggressive. But on our own State Street side, the sector series on PriceStats and investor positioning are by far the most important indicators.
Lee Ferridge: Marija and Andrew, thank you very much. Thank you.
State Street LIVE: Research Retreat offers a wide range of academic expertise and timely market insights.
Our monthly Mind and Machine FX publication has been out for two years. It has allowed us to establish a framework for our human strategists and machine learning models to understand each other and extract insights. We are expanding the framework to form views on equities. Marija Veitmane, senior strategist on the multi-asset class research team, and Yimou Andrew Li, quantitative researcher for the portfolio management research team, share their thoughts on rules and relationships learned from fundamental and proprietary equity indicators.