2008.06.11

I was chatting with Jon about the application of neural networks to stock trading, which is basically a perfect example for explaining the science. It went something like a'this:

> Okay. You want to tell me more about this program since I don't read code?

Think if you had three inputs to determine the price of GOOG tomorrow, all doing math by hand. One input is yesterday's price, another is Joe Expert's opinion, and the third is the number of internet subscribers on the planet.

A simple way to create a prediction would be to assign a weight to each, maybe 90% to yesterday's price since the stock probably won't swing much in one day. Then 7% to Joe's prediction since he's pretty good but not perfect. Then 3% to internet subscribers, since more users means more clicks on Google links which means more revenue. (For the moment ignore how the weights compute to actual dollar values).


A simple neural network. The nodes on the left are inputs - that data for any given day. The node on the right is the output, its value is a combination of the lefthand values and their respective weightings. The means by which they are combined is application-dependent.
How do you know your weighting is good? Try applying it to historical data, of course. See if this mix of weights worked on April 5 2008. How about August 12-16 2007? Maybe you find the actual result is coming closer to what Joe Expert predicted at the time, rather than the previous day's price, so you decide to up his weight to 15%.

It's a lot of work, you're talking about divying 100% among three factors, and determining how accurate each combination is for n days of data. It's a tough problem, but buy using fast computers and the beauty of random data sampling, it can be done.

And there's no cure-all formula; Joe Expert isn't 7% right or 15% right all the time. Sometimes you want to value his opinion more. Sometimes less. If the stock has been flat recently, you probably want to value the previous day's number more. If it's jumping around Joe probably knows they're about to release Google _______ and it's better to listen to him.

To handle these complexities, neural networks add a second layer of weighting so that each input is weighted against every other input, then these products are, in turn, weighted to produce the final outcome. It's somewhat magical, but basically you're letting each factor interact with every other factor to determine the best set of weights at any given time. That allows it to sometimes value Joe's opinion and sometimes not, depending on how right he's been in the past, under similar circumstances.


An archetypal three-input neural network. Each line assigns a unique weight to the value on the left and propagates it to the node on the right.
If applying your 90%/7%/3% distribution to historical data seemed like a lot of work, this kind of thing would keep you busy forever. But don't worry, it would keep a computer busy forever too. So human intelligence has to give it a leg up:

A. Tell the network what factors to consider; historical price, number of internet users, MSFT price, etc. It will find the best weight for each; if you were to also feed it the price of tea in China, the corresponding weights would quickly drop to zero.

B. Tell the network how to know it's getting close to the right answer (the correlation between how accurate it is and how you adjust the weights for the next try).

It won't be reliably accurate to a cent or even a dollar, since the problem domain is insanely complex (thus everyone is not a billionaire). Luckily the end result is reduced to buy/sell/hold so it doesn't have to be perfect to be useful.

I wrote a very simple network to predict GOOG share price based on the last year of trading. Instead of three input nodes there are thirty, each corresponding to the stock price on that prior day. After naively jiggling the weights around and keeping track of the best set, I ran the algorithm on the training data.


This is basically a measure of how well the network remembers the information it was trained on. If it trained forever it could exactly reproduce the training data, but that's no fun.

Whether or not the algorithm can be used to predict future share price is depends on a few things. Neural networks are especially good at recognizing patterns amid piles of unevaluated data. So it will be very strong when history repeats - even if very subtly. But a great supplement would be the relations it can find between other data (indexes, competing stocks, oil prices) as well as evaluating market predictions.


Neural network weights. Each value corresponds to the weight given to a particular day prior to the day that is being predicted.
As an aside, here is the worst Scrabble board ever (thanks Jon):