Trader's Journey into Data Science


From Derivatives Trading to Machine Learning

Shortcomings of Using a CNN for Time Series

In a previous post I detailed how I used Convolutional Neural Networks on stock chart images to predict future stock price movement. I would not recommend that anyone risk real money using the process I posted to my GitHub page, but it was successful as a proof of concept. I was able to get statistically significant results in predicting both the direction and magnitude of the stock’s subsequent movement, but frankly it left a lot to be desired. The process was very expensive and time consuming, and there wasn’t much that I could do at the end of it to determine how useful a prediction was. I went in search of a different method for analyzing time series data and found a technique called Dynamic Time Warping (DTW). DTW is not new or particularly sophisticated, but it is different enough from what I did before that it is my next step in the process of accurately predicting stock movements. I think it is most easily explained as the more sophisticated cousin of correlation, one that allows for “flexibility” along the time axis when evaluating how similar two curves are.
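
To make the “flexibility” idea concrete, here is a minimal sketch of the textbook DTW distance in plain Python. It is not the code from my project; the function name and the toy series are made up for illustration, and the absolute difference is just one common choice of point-to-point cost.

```python
def dtw_distance(a, b):
    """Textbook O(len(a) * len(b)) dynamic time warping distance."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = cheapest cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])  # assumed point-to-point cost
            # A point may match one-to-one (diagonal) or be stretched
            # across several points of the other series (up / left).
            cost[i][j] = step + min(cost[i - 1][j],
                                    cost[i][j - 1],
                                    cost[i - 1][j - 1])
    return cost[n][m]


base = [0, 1, 2, 3, 2, 1, 0]
shifted = [0, 0, 1, 2, 3, 2, 1, 0]  # same shape, delayed by one step

# Comparing point by point penalizes the delay...
pointwise = sum(abs(x - y) for x, y in zip(base, shifted))
print(pointwise)                     # 6
# ...while DTW warps the time axis and sees the same curve.
print(dtw_distance(base, shifted))   # 0.0
```

The two toy series have the same shape, offset by one step, which is exactly the case where a rigid point-by-point comparison is misleading and DTW is not.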


Research Paper Review: CNN failings and the CoordConv Solution

In December 2018 a group of data scientists from Uber AI Labs published a paper called “An intriguing failing of convolutional neural networks and the CoordConv solution”. In this paper they identify a set of seemingly simple tasks that include plotting a pixel in space given its coordinates and the reverse of that, identifying coordinates based on an input image.
Surprisingly, standard CNNs performed very poorly on these tasks and took a long time to do so. Models with up to a million parameters were trained for as long as 90 minutes, but the best model still only achieved 83% accuracy. In order to dive deeper into this problem, the authors visualized the area around the target pixel. We see that in the training set, the prediction is fairly confident, as shown by the mostly black pixels with a distinct white one. The middle row shows a test example that is barely correct and has a number of pixels with a similar grey color, indicating a low level of confidence in the prediction. In the bottom row, we see a test example that was wrong, displayed by the red box not being around the whitest pixel. The solution proposed by the authors is a CoordConv layer, which adds two channels to the input of the convolution: one for the i coordinates and one for the j coordinates. In the image below, we can see that adding this layer solves the problem very well and very quickly. The model achieves perfect accuracy after less than 20 seconds of training, compared to the original model, which achieved only 83% accuracy after over an hour. Revisiting the pixel visualization from above, we see that the improved model has full confidence in all of its predictions.
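
The core of the idea fits in a few lines. Below is a rough NumPy sketch of the coordinate-channel step only, not the authors' implementation: the function name, the channels-last layout and the scaling of coordinates to [-1, 1] are my assumptions, and in a real CoordConv layer the result would be fed into an ordinary convolution.

```python
import numpy as np


def add_coord_channels(images):
    """Append an i (row) and a j (column) coordinate channel.

    images: array of shape (batch, height, width, channels).
    Returns an array of shape (batch, height, width, channels + 2).
    """
    batch, height, width, _ = images.shape
    # Assumed normalization: coordinates scaled to the range [-1, 1].
    i_coords = np.linspace(-1.0, 1.0, height).reshape(1, height, 1, 1)
    j_coords = np.linspace(-1.0, 1.0, width).reshape(1, 1, width, 1)
    # Every pixel gets its own row index and column index as features.
    i_channel = np.broadcast_to(i_coords, (batch, height, width, 1))
    j_channel = np.broadcast_to(j_coords, (batch, height, width, 1))
    return np.concatenate([images, i_channel, j_channel], axis=-1)


batch_of_images = np.zeros((2, 4, 5, 1))   # two blank 4x5 single-channel images
with_coords = add_coord_channels(batch_of_images)
print(with_coords.shape)                   # (2, 4, 5, 3)
```

Because each pixel now carries its own position as input, the convolution filters no longer have to infer location from the image content alone.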


Research Paper Review: Feature Fusion Model

In February of 2019, Taewook Kim and Ha Young Kim published a paper titled “Forecasting stock prices with a feature fusion LSTM-CNN model using different representations of the same data”. The basic goal of the paper was to build a combined model that outperformed each individual model as well as the benchmark of the S&P 500 index. The data used for this paper was the open, high, low and close prices from minute-by-minute SPY (SPDR S&P 500 ETF) ticker data from October 14, 2016 to October 16, 2017. The data was split into training, validation and testing sets in the following way in order to avoid using the future to predict the future. For each individual prediction, the previous 30 minutes of data are used to make a prediction 5 minutes into the future, as shown below. The first of the two models tested before fusion was a Convolutional Neural Network, which was run on four types of stock charts to determine the best image type. The combination of (a) and (d) was found to be the best image to use and resulted in input images that looked like (a) shown below. The second model was an LSTM for time series data, which used price and volume data as inputs to create a predicted price output, as shown below.
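
The 30-minutes-in, 5-minutes-ahead setup can be sketched as a simple sliding window. This is my own illustration rather than the authors' code: the function name is hypothetical, and the integer “prices” are dummy values chosen so the indices are easy to follow.

```python
def make_windows(prices, lookback=30, horizon=5):
    """Pair each `lookback`-bar window with the price `horizon` bars later."""
    samples = []
    # t is the index of the last bar the model is allowed to see
    for t in range(lookback - 1, len(prices) - horizon):
        window = prices[t - lookback + 1 : t + 1]
        target = prices[t + horizon]
        samples.append((window, target))
    return samples


minute_closes = list(range(40))        # dummy data: 40 one-minute bars
samples = make_windows(minute_closes)
first_window, first_target = samples[0]
print(len(samples))                    # 6
print(first_window[-1], first_target)  # 29 34
```

Each target sits strictly after the last bar of its window, which is the per-sample version of not using the future to predict the future.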


Challenging Edge Cases in NLP

In my capstone project I explored the use of Natural Language Processing applied to Twitter in an attempt to gain insight into stock price movement. Ultimately I found that the technique is useful but, like most things, it needs more exploration before it can be confidently deployed. I had success using both the TextBlob and Vader algorithms, but when examining a series of individual Tweets and the scores that the algorithms gave them, I found significant room for improvement. Vader in particular is very good at determining sentiment on ordinary Tweets because it is specifically tuned for social media. However, when applied to stock market related Tweets, it was quite lacking. The fact that I was able to get positive results given how wrong Vader often was is somewhat of a miracle and makes me even more inclined to dig deeper into this topic. Vader’s shortcomings fall into two main categories. The first is that there are a large number of words and phrases in finance that carry no sentiment in normal language but have very strong meanings when applied to markets. The second is that traders as a population are remarkably cynical and snarky, and rely heavily on what can best be described as inside jokes.
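
The first category is easy to illustrate with a toy lexicon-based scorer. To be clear, this is not Vader and not how Vader computes its scores: both word lists and all of the weights below are invented for the example, and the scorer simply sums word weights.

```python
# Invented general-purpose sentiment weights (NOT Vader's actual lexicon).
GENERAL_LEXICON = {"great": 2.0, "love": 2.5, "terrible": -2.5, "bad": -2.0}

# Hypothetical finance-specific additions, with made-up weights.
FINANCE_TERMS = {"long": 1.5, "calls": 1.0, "short": -1.5, "puts": -1.0}


def toy_score(text, lexicon):
    """Sum the weights of known words; unknown words contribute nothing."""
    words = (w.strip(".,!?$#") for w in text.lower().split())
    return sum(lexicon.get(w, 0.0) for w in words)


tweet = "Loading up on puts, going short $SPY"
general_score = toy_score(tweet, GENERAL_LEXICON)
finance_score = toy_score(tweet, {**GENERAL_LEXICON, **FINANCE_TERMS})
print(general_score)   # 0.0  -> looks neutral
print(finance_score)   # -2.5 -> clearly bearish
```

A general-purpose word list reads the made-up Tweet as neutral, even though any trader would read it as strongly bearish. The second category, sarcasm and inside jokes, is much harder and is not something a word list alone can fix.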


Forecasting Stock Movements with Image Classification

My goal for this project was to create a Convolutional Neural Network (CNN) that could predict stock movement simply by looking at a stock chart. The charts that I used are called “candlestick” charts in the financial world and display the open, high, low and closing price for each day. I chose this style of chart because it contains the most data of any of the conventional stock chart image types and has been shown in other published literature to be the best for this type of model. I also added a 20 period moving average to the chart to help the model visually determine the trend of a stock. Every chart covered the preceding 100 trading days and the goal was to predict 10 trading days into the future. A chart was generated every 5 trading days from now back to January 2010 for all 30 stocks that currently make up the Dow Jones Industrial Average.
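
The windowing described above can be sketched in a few lines of plain Python. This is an illustration rather than the project code: the function names are hypothetical, the indices refer to positions in a list of daily bars, and skipping the most recent charts whose 10-day outcome is not yet known is my assumption about how labeled examples would be built.

```python
def moving_average(closes, period=20):
    """Simple moving average; None until `period` closes are available."""
    averages = []
    for i in range(len(closes)):
        if i + 1 < period:
            averages.append(None)
        else:
            averages.append(sum(closes[i + 1 - period : i + 1]) / period)
    return averages


def chart_windows(n_days, chart_len=100, horizon=10, stride=5):
    """Return (first_day, last_day, target_day) index triples, newest first."""
    windows = []
    last_day = n_days - 1 - horizon  # newest chart with a known outcome
    while last_day - chart_len + 1 >= 0:
        windows.append((last_day - chart_len + 1, last_day, last_day + horizon))
        last_day -= stride           # step back 5 trading days
    return windows


windows = chart_windows(120)          # 120 trading days of history
print(windows)  # [(10, 109, 119), (5, 104, 114), (0, 99, 109)]
print(moving_average([1, 2, 3, 4], period=2))  # [None, 1.5, 2.5, 3.5]
```

With a stride of 5 and a chart length of 100, consecutive charts share 95 of their 100 days, so neighboring examples are far from independent. That is worth keeping in mind when splitting the charts into training and test sets.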