Why deep nets are great
- If you know how multi-layer perceptrons learn via backpropagation of error, you are in pretty good shape to understand deep nets.
- The basic problem with MLPs in the 80s was that you couldn't really go beyond two hidden layers.
- There were two reasons really: not enough data, and not enough
computing power.
- There was some thinking, a few years back when deep nets started to become popular, that the problem had been the learning algorithm. In general, though, backprop still works.
- However, there are a lot of other good algorithms.
- It also helps if you train the system a layer at a time.
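For instance, here is a minimal sketch of greedy layer-wise training: each layer is trained on the outputs of the (frozen) layers below it. The data, layer sizes, and the linear-autoencoder-via-SVD trainer are all assumptions for illustration; in practice the per-layer learner would be an RBM, an autoencoder, or similar.

```python
import numpy as np

rng = np.random.default_rng(0)

def train_layer(data, n_hidden):
    # Stand-in per-layer learner: a linear autoencoder, whose optimal
    # weights are the top principal directions of the data (via SVD).
    # In practice this would be an RBM, an autoencoder, etc.
    centered = data - data.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return vt[:n_hidden].T                      # shape: (n_inputs, n_hidden)

data = rng.random((500, 100))                   # toy data, sizes made up
layer_sizes = [80, 60, 40]

weights, activations = [], data
for n_hidden in layer_sizes:
    W = train_layer(activations, n_hidden)      # train this layer in isolation
    weights.append(W)
    activations = np.tanh(activations @ W)      # freeze it; feed its outputs upward
```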
- What's really good about deep nets is that there are a lot of parameters to be set by a learning mechanism.
- So, if you have 100 neurons in layer 1 and 100 neurons in layer 2, there are 10,000 connections (or parameters) to set.
- Each layer (of the same size) adds 10,000 new parameters.
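As a quick sanity check on that arithmetic, a two-line count of the connection weights in a stack of equal-sized, fully connected layers (biases ignored; the sizes are just the example from above):

```python
# Weights between consecutive fully connected layers of the given sizes.
layer_sizes = [100, 100, 100, 100]
n_params = sum(a * b for a, b in zip(layer_sizes, layer_sizes[1:]))
print(n_params)   # three 100x100 weight matrices -> 30000
```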
- You do not, of course, have to use perceptrons: Restricted Boltzmann Machines are popular, but you can use other things.
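For concreteness, a minimal NumPy sketch of a binary RBM updated with one step of contrastive divergence (CD-1); the layer sizes, learning rate, and random data are chosen arbitrarily for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

n_visible, n_hidden, lr = 784, 100, 0.01        # assumed sizes and learning rate
W = rng.standard_normal((n_visible, n_hidden)) * 0.01
b_v = np.zeros(n_visible)
b_h = np.zeros(n_hidden)

def cd1_step(v0):
    """One contrastive-divergence (CD-1) update for a binary RBM."""
    global W, b_v, b_h
    ph0 = sigmoid(v0 @ W + b_h)                       # hidden probs given the data
    h0 = (rng.random(ph0.shape) < ph0).astype(float)  # sampled hidden states
    pv1 = sigmoid(h0 @ W.T + b_v)                     # reconstruction of the visibles
    ph1 = sigmoid(pv1 @ W + b_h)                      # hidden probs given the reconstruction
    W += lr * (v0.T @ ph0 - pv1.T @ ph1) / len(v0)    # positive minus negative phase
    b_v += lr * (v0 - pv1).mean(axis=0)
    b_h += lr * (ph0 - ph1).mean(axis=0)

batch = (rng.random((32, n_visible)) < 0.5).astype(float)  # fake binary batch
cd1_step(batch)
```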
- A common technique is to make a generative model between the input and the first layer, so that the system is more robust. You can use an auto-associator (trained with something like a Hebbian rule) for this.
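A bare-bones sketch of such an auto-associator: a single hidden layer trained by gradient descent to reconstruct its own input under squared error. The sizes, learning rate, and random data are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n_in, n_hidden, lr = 100, 100, 0.01             # assumed sizes
W1 = rng.standard_normal((n_in, n_hidden)) * 0.1
W2 = rng.standard_normal((n_hidden, n_in)) * 0.1

def autoencoder_step(x):
    """One gradient step on squared reconstruction error (tanh hidden layer)."""
    global W1, W2
    h = np.tanh(x @ W1)
    x_hat = h @ W2                               # linear reconstruction of the input
    err = x_hat - x
    dW2 = h.T @ err
    dh = (err @ W2.T) * (1 - h ** 2)             # backprop through the tanh
    dW1 = x.T @ dh
    W1 -= lr * dW1 / len(x)
    W2 -= lr * dW2 / len(x)
    return float((err ** 2).mean())

x = rng.random((32, n_in))                       # toy input batch
for _ in range(100):
    loss = autoencoder_step(x)
```

Once this first layer is trained, its hidden activations can serve as the input for training the next layer, which is the layer-at-a-time recipe above.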
- There's no restriction on the connectivity between levels, the number of neurons in each level, or even the arrangement of the levels themselves (e.g. level 2 can split into levels 3a and 3b, which later combine into level 4).
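As a toy illustration of that freedom, a forward pass (untrained random weights, made-up sizes) where level 2 splits into 3a and 3b and the two branches are concatenated again at level 4:

```python
import numpy as np

rng = np.random.default_rng(0)

def layer(x, n_out):
    # Hypothetical helper: a fully connected layer with random weights,
    # standing in for whatever trained layer you would actually use.
    W = rng.standard_normal((x.shape[-1], n_out)) * 0.1
    return np.tanh(x @ W)

x = rng.random((1, 100))                                 # level 1 activations (assumed size)
h2 = layer(x, 100)                                       # level 2
h3a = layer(h2, 50)                                      # level 2 splits...
h3b = layer(h2, 50)                                      # ...into levels 3a and 3b
h4 = layer(np.concatenate([h3a, h3b], axis=-1), 100)     # recombined at level 4
```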
- If you are doing vision, you might want to have a 2D input field, with
spatially local connectivity.
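A sketch of what spatially local connectivity looks like in code: each hidden unit sees only a k x k patch of a 2D input rather than every pixel. All sizes are assumed, and the weights are untied (sharing one patch of weights across locations would give a convolutional layer).

```python
import numpy as np

H, W, k = 28, 28, 5                       # assumed input height/width and patch size
rng = np.random.default_rng(0)
image = rng.random((H, W))                # fake 2D input field

out_h, out_w = H - k + 1, W - k + 1
weights = rng.standard_normal((out_h, out_w, k, k)) * 0.01   # one weight patch per unit

hidden = np.zeros((out_h, out_w))
for i in range(out_h):
    for j in range(out_w):
        patch = image[i:i + k, j:j + k]               # local receptive field
        hidden[i, j] = np.tanh(np.sum(weights[i, j] * patch))
```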
- You can use different learning algorithms at different levels or even
different algorithms at different times on a given level.
- It's incredibly flexible, or perhaps incredibly underspecified.
- However, there is a lot of solid (published) mathematical theory
behind it.