Forecasting Stock Movements with Image Classification - Trader's Journey into Data Science

My goal for this project was to create a Convolutional Neural Network (CNN) that could predict stock movement simply by looking at a stock chart. The charts that I used are called “candlestick” charts in the financial world and display the open, high, low and closing price for each day. I chose this style of chart because it contains the most data of any of the conventional stock chart image types and has been shown in other published literature to be the best for this type of model. I also added a 20-period moving average to the chart to help the model visually determine the trend of a stock. Every chart covered the preceding 100 trading days, and the goal was to predict the stock’s movement 10 trading days into the future. A chart was generated every 5 trading days from the present back to January 2010 for each of the 30 stocks that currently make up the Dow Jones Industrial Average.
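To make the sampling scheme concrete, here is a minimal sketch of how the chart windows and their 10-day-ahead targets could be laid out from a list of daily closing prices. The function names (`moving_average`, `make_samples`) and the use of a simple close-to-close return as the target are my assumptions for illustration, not the exact code or labels used in the project. The sketch also walks forward through the data rather than backward from today, which selects windows on the same 5-day spacing.

```python
def moving_average(values, period=20):
    """Simple moving average; None until `period` values are available."""
    out = []
    running = 0.0
    for i, v in enumerate(values):
        running += v
        if i >= period:
            # Drop the value that just fell out of the window.
            running -= values[i - period]
        out.append(running / period if i >= period - 1 else None)
    return out


def make_samples(closes, lookback=100, horizon=10, step=5):
    """Return (start, end, forward_return) tuples.

    The chart for a sample covers closes[start:end] (end is exclusive).
    forward_return is the close-to-close return `horizon` trading days
    after the last day shown on the chart (an assumed target definition).
    """
    samples = []
    last_end = len(closes) - horizon
    for end in range(lookback, last_end + 1, step):
        start = end - lookback
        last_close = closes[end - 1]
        future_close = closes[end - 1 + horizon]
        samples.append((start, end, future_close / last_close - 1.0))
    return samples


if __name__ == "__main__":
    closes = [float(p) for p in range(1, 121)]  # toy price series, 120 days
    ma20 = moving_average(closes, period=20)
    for start, end, fwd in make_samples(closes):
        print(start, end, round(fwd, 4), ma20[end - 1])
```

Each tuple identifies one chart image (days `start` through `end - 1`) and the return the model would be asked to anticipate. Rendering the candlesticks themselves is left out here.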

Through a process of trial and error, while leaning on relevant research papers, I found that determining the proper architecture for the neural network was a unique challenge. My failed models fall into three basic categories. The first is binary classification models that gave the same prediction for every test case. It would be wishful of me to say that I currently grasp exactly why this happened, but I can surmise that the model was unable to learn anything through the training process and therefore just gave the one answer that minimized loss over the entire population of training examples. The second category is multi-categorical classification models whose confidence predictions exactly mirrored the distribution of the labels in the training set. I believe that a similar dynamic is at play here: the model didn’t actually learn much and simply output the label distribution it saw in training. The third category is my numerous attempts to turn the output into a regression that would give a continuous number. This failed miserably, and I could not get any model to produce a distribution of results that was anywhere near useful. The model would essentially pick one numerical return and then make relatively tiny deviations from that value in its outputs. It became obvious that the regression approach was not bearing fruit, so I abandoned it rather quickly.
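The first two failure modes can be caught with a few simple checks before trusting any accuracy number. The sketch below is one way to do that, assuming predictions are available as plain Python lists. The helper names are hypothetical and not part of the original project. The idea is that a model which has learned nothing will score no better than always guessing the most common training label, and its predicted probabilities will barely change from one chart to the next.

```python
from collections import Counter


def majority_baseline_accuracy(train_labels, test_labels):
    """Accuracy of always predicting the most common training label."""
    majority = Counter(train_labels).most_common(1)[0][0]
    return sum(1 for y in test_labels if y == majority) / len(test_labels)


def is_constant_prediction(predicted_labels):
    """True if the model gave the same class for every test case."""
    return len(set(predicted_labels)) <= 1


def max_probability_spread(prob_rows):
    """Largest per-class range (max minus min) of predicted probabilities.

    A value near zero means every input received almost the same
    probability vector, i.e. the model is ignoring the chart.
    """
    n_classes = len(prob_rows[0])
    return max(
        max(row[k] for row in prob_rows) - min(row[k] for row in prob_rows)
        for k in range(n_classes)
    )


if __name__ == "__main__":
    train = [1, 1, 0, 1, 0, 1]
    test = [1, 0, 1, 1]
    preds = [1, 1, 1, 1]                       # collapsed binary model
    probs = [[0.2, 0.5, 0.3]] * 4              # collapsed multi-class model
    print(majority_baseline_accuracy(train, test))
    print(is_constant_prediction(preds))
    print(max_probability_spread(probs))
```

If a model’s test accuracy matches the majority baseline and either of the other two checks fires, the network has most likely fallen back to the label frequencies rather than reading anything from the images.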

I am pleasantly surprised by the usefulness of both types of classifiers and by how they validate the value that might lie in further exploration of these methods. The application of the binary classification results is pretty straightforward and easy to implement: basically, just buy the recommended stocks instead of the broad market. The implications of the multi-categorical classification model are more complicated and nuanced, but they could ultimately be more valuable if implemented by someone with a deep knowledge of options trading. Options are basically leveraged bets on an underlying financial instrument such as a stock. Options allow you to say “I bet that Apple stock will be $5 higher in 10 days” and, instead of making $5, you could make multiples of the initial risk. The reduction of the standard deviation of error allows a trader to take trades with the odds slightly in his favor and, depending on exactly what the prediction is, potentially make large profits on small risks, but only if he knows how to set the trades up correctly. For all of the potential benefits of trading with this kind of model, I cannot stress enough that it would be extremely foolish to risk real money based on these models. They are an interesting proof of concept but have many shortcomings that must be addressed in further exploration. I say that very confidently as a professional options trader with years of experience.