What are Neural Networks?
Neural networks are a kind of computer algorithm loosely modeled on the neurons in the human brain. Much as our neurons are wired together, the nodes of a neural network are connected to one another, and because of that, they can learn to do tasks without needing preset instructions.
Layers of a Neural Network
Neural networks are made up of layers, and layers are made up of nodes, each of which holds a piece of information (the number of layers and nodes can vary). A network has an input layer, one or more “hidden” layers, and an output layer.
The input layer functions the way you would expect: it is where you feed data into the network, much as visual information, for example, is sent to the brain to be processed.
But the real meat (heh, get it?) of the neural network is in the hidden layers.
Hidden layers (so called because they are invisible to the user) do the network’s actual processing. Each node in a hidden layer takes information from the previous layer, multiplies it by a weight, and adds a bias. Weights and biases are both just ordinary numbers: in diagrams, weights are usually drawn on the lines connecting nodes, and each node carries its own bias.
One way to imagine how weights and biases work is by imagining a simple scenario. Imagine you are making a pot of noodles, and you gathered data points about the heat of the pot, how long you cooked it for, and how many times you stirred it.
Of course, the heat of the pot and the length of cooking are really important, so they get a high weight, a large contribution to how the noodles turn out. Stirring matters too, but not as much as the temperature and cooking time, so it gets a lower weight and contributes less to the end product.
As for biases, they are just small numbers added to a node’s output to shift it up or down.
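Here is a minimal sketch of the noodle example in Python. The function is a real node computation (weighted sum plus bias), but all of the numbers are made up for illustration:

```python
def node_output(inputs, weights, bias):
    """One node: weighted sum of its inputs, plus a bias."""
    return sum(x * w for x, w in zip(inputs, weights)) + bias

# Made-up data points, scaled 0-1: heat, cooking time, stir count
inputs = [0.9, 0.8, 0.3]
# Heat and cooking time weigh heavily; stirring weighs less
weights = [0.7, 0.6, 0.2]
# A small constant shift
bias = 0.1

print(node_output(inputs, weights, bias))
```

Notice that changing a weight changes how much its input matters, while the bias nudges the whole result, exactly the two knobs the network will later learn to turn.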
How do we choose weights and biases?
Trick question: we don’t. Instead, the neural network figures this out by itself. How? It measures its error, or how far off its output is from the desired result, then iterates, adjusting itself to lower that error, or its “cost”.
The equation for this is simple: take the difference between the network’s prediction and the desired result and square it, which looks like this: (desiredResult - prediction)².
The neural network trains itself over and over, adjusting the weights and biases to bring that number as close to 0 as possible. That said, you don’t always want it to reach 0: if the network is generating text, for example, a cost of 0 would mean it spits out text verbatim from its dataset. Sometimes you just want a low cost, so the text is coherent but still unique.
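In code, that cost is a one-liner. The prediction and target values below are made up for illustration:

```python
def cost(prediction, desired):
    """Squared error: (desiredResult - prediction)^2."""
    return (desired - prediction) ** 2

# A prediction of 0.8 against a desired result of 1.0
print(cost(0.8, 1.0))
# A perfect prediction gives a cost of exactly 0
print(cost(1.0, 1.0))
```

Squaring does two useful things: it makes the cost positive whether the network overshoots or undershoots, and it punishes big errors much more than small ones.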
You may be wondering exactly how a neural network trains. Well, we use a method called backpropagation (short for backward propagation), the math of which gets complicated quickly.
Instead of trying to explain it, I’ll explain the theory of what it is doing, and if you are interested in the math, I’ll be linking an article explaining it below.
Pretty much, when you run a neural network, you are doing something called forward propagation: information propagates forward through the network, from the input to the output.
You can see where I’m going with this. Backpropagation is the same idea in reverse: the network compares its output to the desired result, then works backward through the layers, adjusting each weight and bias in the direction that reduces the error.
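To make that loop concrete, here is a tiny sketch of a one-node “network” training itself with gradient descent, the idea at the heart of backpropagation. The data, learning rate, and step count are all made up for illustration:

```python
def train(samples, steps=1000, lr=0.1):
    """Fit one weight and one bias by repeatedly lowering the cost."""
    w, b = 0.0, 0.0
    for _ in range(steps):
        for x, desired in samples:
            prediction = w * x + b        # forward propagation
            error = prediction - desired  # how far off we are
            # Gradients of (desired - prediction)^2 with respect to w and b:
            # nudge each in the direction that shrinks the cost
            w -= lr * 2 * error * x
            b -= lr * 2 * error
    return w, b

# Made-up data that secretly follows y = 2x + 1
samples = [(1.0, 3.0), (2.0, 5.0)]
w, b = train(samples)
print(w, b)  # should land very close to 2 and 1
```

Real backpropagation does this for every weight and bias in every layer at once, using the chain rule from calculus to figure out each one’s share of the blame.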
Types of Neural Networks
There are different types of neural networks, which use different arrangements of layers, as well as different types of nodes which handle information differently. There are a lot of them, but I’ll be talking about three of the main ones.
Feed-forward Neural Networks (FNNs)
These neural networks are the basis for pretty much all neural networks, and the others are more or less special versions of these. These neural networks specifically try to approximate a function. The network defines a mapping y = f(x;θ) and figures out what values of the parameters θ result in the best approximation.
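Here is what that mapping looks like as code: a sketch of a two-layer feed-forward pass, where θ is nothing more than the weights and biases listed below (all of the numbers are made up):

```python
def relu(v):
    """A common activation: pass positives through, zero out negatives."""
    return max(0.0, v)

def feed_forward(x, layers):
    """Each layer is a list of (weights, bias) pairs, one per node."""
    values = x
    for layer in layers:
        values = [relu(sum(v * w for v, w in zip(values, weights)) + bias)
                  for weights, bias in layer]
    return values

layers = [
    [([0.5, -0.2], 0.1), ([0.3, 0.8], 0.0)],  # hidden layer: 2 nodes
    [([1.0, 1.0], -0.5)],                     # output layer: 1 node
]
print(feed_forward([1.0, 2.0], layers))
```

Training an FNN means adjusting those tuples of weights and biases until f(x;θ) lands close to the function you want.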
Convolutional Neural Networks (CNNs)
These neural networks are exceptionally good at working with images, especially classifying them. The details run deep, but pretty much, they slide filters over an image (going patch by patch, with filters meant to detect anything from curves and edges to whole objects) and use that information to figure out what the image depicts.
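A toy version of that filtering idea: slide a 3x3 kernel over a tiny grayscale image and record how strongly each patch matches. The kernel below is a simple vertical-edge detector, and the image values are made up:

```python
def convolve(image, kernel):
    """Valid 2D convolution (no padding), in plain Python."""
    ih, iw = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for i in range(ih - kh + 1):
        row = []
        for j in range(iw - kw + 1):
            # Overlay the kernel on the patch starting at (i, j)
            total = 0.0
            for a in range(kh):
                for b in range(kw):
                    total += image[i + a][j + b] * kernel[a][b]
            row.append(total)
        out.append(row)
    return out

image = [        # dark on the left, bright on the right
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
edge_kernel = [  # responds strongly to dark-to-bright transitions
    [-1, 0, 1],
    [-1, 0, 1],
    [-1, 0, 1],
]
print(convolve(image, edge_kernel))
```

In a real CNN the network learns the kernel values itself, and early layers tend to end up with edge-and-curve detectors much like this one.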
Long Short Term Memory Neural Networks (LSTMs)
These neural networks are not built around one kind of data the way a CNN is built around images; instead, they accomplish something special: they can remember information, which is very useful in things like detecting and understanding speech, and anything else where context matters. Essentially, they loop back on themselves, feeding old information into new calculations and discarding information that is no longer relevant.
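Real LSTMs use learned “gates” to decide what to keep and what to forget, which is more math than fits here, but the looping idea itself can be sketched as a plain recurrence: each step blends the new input into a running memory of what came before. The keep rate below is a made-up stand-in for what an LSTM would learn:

```python
def run_sequence(inputs, keep=0.8):
    """Carry a 'memory' across timesteps, fading old information."""
    memory = 0.0
    outputs = []
    for x in inputs:
        # Keep part of the old memory, blend in the new input
        memory = keep * memory + (1 - keep) * x
        outputs.append(memory)
    return outputs

# The early 1.0 still faintly influences the later outputs
print(run_sequence([1.0, 0.0, 0.0, 1.0]))
```

That lingering influence of earlier inputs is the “context” that makes recurrent networks useful for speech and text.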
There is a lot more math in neural networks than I explained here, from calculus to matrix multiplication and the like. Of course, you don’t need to understand that math if you are only concerned with getting into artificial intelligence. But if you really want to understand the math behind a neural network, or are taking on a complicated project, check out this article:
If you enjoyed reading this article or have any suggestions or questions, let me know by leaving a comment below. You can find me on LinkedIn for my latest updates, or check out my latest projects on my website. See what I’m up to on my newsletter. Thanks for reading!