This article takes a quick look at the daily SPX Sentiment Data (StockTwits). The motivation is purely curiosity. There isn’t enough history yet for me to start using data of this nature for building production models. The data can be downloaded for free on this page:
When you create a Quandl account you get 100 free previews (downloads). History for this sentiment data series begins on 5/14/2010. The download includes a number of fields. I’m going to look at the Bullish Intensity minus the Bearish Intensity, which I’ll refer to as Net Intensity. A graph of the Net Intensity follows.
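As a minimal sketch of the Net Intensity calculation, assuming each downloaded row has been parsed into a dictionary: the field names `bull_intensity` and `bear_intensity` and the sample values are placeholders chosen for illustration, not the actual column names or data from the download.

```python
from typing import Dict, List


def net_intensity(rows: List[Dict[str, float]]) -> List[float]:
    """Return Bullish Intensity minus Bearish Intensity for each row.

    The keys 'bull_intensity' and 'bear_intensity' are hypothetical;
    substitute whatever the downloaded file actually calls these fields.
    """
    return [row["bull_intensity"] - row["bear_intensity"] for row in rows]


# Made-up values, for illustration only.
sample_rows = [
    {"bull_intensity": 2.0, "bear_intensity": 1.5},
    {"bull_intensity": 1.0, "bear_intensity": 1.75},
    {"bull_intensity": 1.5, "bear_intensity": 1.5},
]

net = net_intensity(sample_rows)
print(net)
```

A positive value means the bullish messages were, on balance, more intense than the bearish ones for that day.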
We can see the series isn’t strictly stationary. There are trends, and the variance appears to change over time, as one would expect. The data appears to be of decent quality based on a quick visual inspection.
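One rough way to back up the visual impression is to compute the mean and variance over a rolling window and see whether they drift. The sketch below is not a formal stationarity test, and the window length and sample series are arbitrary choices for illustration.

```python
from statistics import mean, pvariance
from typing import List, Tuple


def rolling_stats(series: List[float], window: int) -> List[Tuple[float, float]]:
    """Return (mean, population variance) for each full window of the series."""
    stats = []
    for end in range(window, len(series) + 1):
        chunk = series[end - window:end]
        stats.append((mean(chunk), pvariance(chunk)))
    return stats


# Made-up series: calm at first, then much more volatile.
example = [1.0, 1.0, 1.0, 1.0, 5.0, -5.0, 5.0, -5.0]

for window_mean, window_var in rolling_stats(example, window=4):
    print(f"mean={window_mean:.2f}  variance={window_var:.2f}")
```

If the rolling mean wanders or the rolling variance changes by a large factor between windows, the series is unlikely to be stationary in the strict sense.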
The first thing I’ll do is use the Net Intensity as a trading signal and have a gander at the equity curve. In the graph below we see the result of trading the SP futures MOC with zero trading delay.
Counter trading the Net Intensity over the available history would have provided a slight edge. In the next graph we see the equity curve when applying a 1 trading day delay.
The slight edge that appeared to be there disappears when applying a 1 trading day delay. That’s not surprising. My guess is that on days when the S&P 500 advances, tweets posted during the day tend to be net positive, and vice versa for declining days. I could look at this more closely, but I don’t want to invest much time in something that I will not be using. The equity curve in the next graph is definitely not achievable. A future leak was created on purpose. The equity curve is the result of trading on the open using the Net Intensity published after the close.
If we had the Net Intensity for the day on the market open then we’d have an excellent trading signal. No surprises there.
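The three timing variants above differ only in how the signal lines up with the return it is traded against. The sketch below makes that explicit under some simplifying assumptions of my own: positions are always one unit long or short based on the sign of the Net Intensity, returns are summed rather than compounded, and costs are ignored. The function name, the `lag` convention, and the sample numbers are illustrative, not the setup used to produce the graphs.

```python
from typing import List


def sign(x: float) -> int:
    """Return +1, -1, or 0 according to the sign of x."""
    return (x > 0) - (x < 0)


def equity_curve(net: List[float], returns: List[float],
                 lag: int, direction: int = -1) -> List[float]:
    """Cumulative (non-compounded) P&L from trading the sign of Net Intensity.

    net[t]     : Net Intensity published for day t
    returns[t] : return earned on day t
    lag        : how many days old the signal is when its return is earned
                 (1 = trade MOC with zero delay, 2 = one trading day delay,
                  0 = future leak, signal used on the same day's return)
    direction  : -1 to counter trade the signal, +1 to follow it
    """
    equity = []
    total = 0.0
    for t in range(len(returns)):
        if t - lag < 0:
            position = 0  # no signal available yet, stay flat
        else:
            position = direction * sign(net[t - lag])
        total += position * returns[t]
        equity.append(total)
    return equity


# Made-up numbers, for illustration only.
net = [1.0, -1.0, 1.0, -1.0]
returns = [0.0, -0.5, 0.25, -0.5]

print(equity_curve(net, returns, lag=1))                # counter trade, zero delay
print(equity_curve(net, returns, lag=2))                # counter trade, 1 day delay
print(equity_curve(net, returns, lag=0, direction=1))   # future leak, follow signal
```

For the future-leak case, `returns` would need to hold open-to-close returns, since the trade is entered on the open. Writing the lag as an explicit parameter makes it harder to introduce a leak by accident.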
Next step is to model the Net Intensity to see if predicting the series has potential. I’m going to use a K-Step Ahead SVM Predictor and apply a 1 trading day delay trading MOC. The settings I’ve used for the predictor appear below.
The Predict Bars parameter was changed from 3 to 2. Other than that, default values were used. Hypothetical trades will be based on the predicted change in the Net Intensity from the next trading day’s close to the close of the day after that.
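To make the prediction target concrete, here is a minimal sketch of how training pairs for such a two-step-ahead change could be laid out. It says nothing about how the K-Step Ahead SVM Predictor works internally. Using the most recent `n_lags` Net Intensity values as inputs is purely my assumption for illustration.

```python
from typing import List, Tuple


def make_training_pairs(net: List[float],
                        n_lags: int = 3) -> List[Tuple[List[float], float]]:
    """Pair lagged Net Intensity values with the change one step further out.

    For each day t the inputs are net[t - n_lags + 1 .. t] (an assumed
    feature set) and the target is net[t + 2] - net[t + 1], i.e. the change
    from the next day's close to the close of the day after that.
    """
    pairs = []
    # Stop two days before the end so the target is always fully known.
    for t in range(n_lags - 1, len(net) - 2):
        features = net[t - n_lags + 1:t + 1]
        target = net[t + 2] - net[t + 1]
        pairs.append((features, target))
    return pairs


# Made-up series, for illustration only.
example_net = [0.0, 1.0, 3.0, 6.0, 10.0, 15.0]

for features, target in make_training_pairs(example_net):
    print(features, "->", target)
```

Because the target for day t depends only on values after t+1, a position based on the prediction can be entered at the next day's close, which matches the 1 trading day delay described above.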
The above results are entirely out-of-sample. The equity curve and % of Perfect are decent enough. Unfortunately, there isn’t enough data available to draw any conclusions or do any serious analysis. I have a feeling that the bullishness / bearishness of the tweets is largely in reaction to the price action. If I’m correct then it should be possible to design a synthetic sentiment indicator that could then be used for some serious model building.