- Poster presentation
- Open Access
Multilayer perceptrons, Hopfield’s associative memories, and restricted Boltzmann machines
© Asakawa; licensee BioMed Central Ltd. 2014
- Published: 21 July 2014
- multilayer perceptrons
- Hopfield’s associative memories
- restricted Boltzmann machines
- one algorithm hypothesis
This study was intended to describe multilayer perceptrons (MLP), Hopfield’s associative memories (HAM), and restricted Boltzmann machines (RBM) from a unified point of view. The three models are mutually related; for example, RBMs have been used to construct architectures deeper than shallow MLPs. The energy function in HAM is analogous to the Ising model in statistical mechanics, which connects microscopic physics to thermodynamics. The canonical partition function Z of the Boltzmann distribution is also used in RBMs. Asynchronous updating in HAM and contrastive divergence (CD), which is based on Gibbs sampling, are also related. Therefore, it seems worth considering these three models within a common framework. This attempt might lead to the “one algorithm hypothesis,” which holds that our brains might follow a single but universal rule.
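As an illustration of the energy function, asynchronous updating, and the partition function Z mentioned above, the following is a minimal sketch rather than the author's implementation. It assumes ±1 units, Hebbian weights with a zero diagonal, no bias terms, and temperature 1; the function names are hypothetical.

```python
import itertools
import math
import random

def hopfield_weights(patterns):
    """Hebbian weights from +/-1 patterns: symmetric, zero diagonal (assumed)."""
    n = len(patterns[0])
    w = [[0.0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:
                    w[i][j] += p[i] * p[j] / n
    return w

def energy(w, s):
    """Ising-like energy E(s) = -1/2 * sum_ij w_ij s_i s_j (bias terms omitted)."""
    n = len(s)
    return -0.5 * sum(w[i][j] * s[i] * s[j] for i in range(n) for j in range(n))

def async_update(w, s, sweeps=5, seed=0):
    """Update one unit at a time in random order; record the energy per sweep."""
    rng = random.Random(seed)
    s = list(s)
    energies = [energy(w, s)]
    for _ in range(sweeps):
        order = list(range(len(s)))
        rng.shuffle(order)
        for i in order:
            field = sum(w[i][j] * s[j] for j in range(len(s)))
            s[i] = 1 if field >= 0 else -1
        energies.append(energy(w, s))
    return s, energies

def partition_function(w, n):
    """Z = sum over all 2^n states of exp(-E(s)); feasible only for small n."""
    return sum(math.exp(-energy(w, s))
               for s in itertools.product([-1, 1], repeat=n))

def boltzmann_probability(w, s, z):
    """p(s) = exp(-E(s)) / Z."""
    return math.exp(-energy(w, s)) / z

stored = [1, -1, 1, -1, 1, -1, 1, -1]
w = hopfield_weights([stored])
noisy = list(stored)
noisy[0] = -noisy[0]  # corrupt one unit
recalled, energies = async_update(w, noisy)
z = partition_function(w, len(stored))
```

With symmetric weights and a zero diagonal, each single-unit update can only lower the energy or leave it unchanged, which is why `energies` is non-increasing here. The same E and Z define the Boltzmann distribution from which Gibbs sampling draws in an RBM.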
Table: Summary of MLP, HAM, and RBM (table body not recovered; surviving entries: “Within Layer Connection”, “No (EM Algorithm)”)
HAM and RBM have symmetrically weighted connections, w_ij = w_ji, although generalized Boltzmann machines need not satisfy this constraint. In contrast, an MLP in general has no feedback connections. If we denote the connection weight from the j-th unit to the i-th unit as w_ij, then in an MLP w_ij ∈ R and w_ji = 0. When all weights are merged into a single weight matrix W, the three models can be regarded as identical in form.
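The merged weight matrix can be pictured as one matrix W with a different pattern of permitted entries for each model. The sketch below is only an illustration under assumptions not stated in the text: units are split into a lower and an upper group, self-connections are excluded, HAM is treated as fully connected over all units, and `connection_mask` is a hypothetical helper.

```python
def connection_mask(model, n_lower, n_upper):
    """Return mask[i][j] = 1 where a weight from unit j to unit i is allowed.

    Units 0..n_lower-1 form the lower (input/visible) group and the rest
    form the upper (hidden/output) group.
    """
    n = n_lower + n_upper
    mask = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue  # no self-connections (assumption)
            i_low, j_low = i < n_lower, j < n_lower
            if model == "MLP":
                allowed = j_low and not i_low  # feedforward only: w_ji = 0
            elif model == "RBM":
                allowed = i_low != j_low       # between groups, both directions
            elif model == "HAM":
                allowed = True                 # fully connected
            else:
                raise ValueError("unknown model: " + model)
            mask[i][j] = int(allowed)
    return mask

def is_symmetric(m):
    """True if m[i][j] == m[j][i] for all i, j."""
    return all(m[i][j] == m[j][i] for i in range(len(m)) for j in range(len(m)))

mlp = connection_mask("MLP", 3, 2)
rbm = connection_mask("RBM", 3, 2)
ham = connection_mask("HAM", 3, 2)
```

Under these assumptions the masks are nested: every entry allowed in the MLP is allowed in the RBM, and every entry allowed in the RBM is allowed in the HAM. Only the MLP mask is asymmetric.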
The construction methods adopted by Deep Learning are based upon RBMs. One of the key concepts for successfully constructing multilayer deep architectures is non-linearity, because the units in the hidden layers of RBMs are binary. Non-linearity seems to play an important role in constructing deep architectures. If we abandon CD and binary features, a multilayer architecture might be replaced by a single weight matrix W = W1W2…Wp. We can also consider a thought experiment with only one hidden unit in an RBM. If h = 0, it carries no meaning at all. If h = 1, it must be an identity mapping or, at least, it might extract the eigenvector corresponding to the maximum eigenvalue of the data matrix X. This might be equivalent to the algorithm proposed by Oja (1985). Since Deep Learning network models trained via RBMs have no within-layer connections, we cannot reject the possibility that a hidden unit h_i might be trained to detect exactly the same features as another hidden unit h_j. To avoid such situations, we must prepare a number of binary hidden units larger than the entropy of the input data set. An RBM makes no assumptions about within-layer connections; it might succeed in detecting important features in the data matrix via CD. However, this constraint might be slightly weakened if we introduce the EM algorithm to estimate the states of latent variables, together with an online algorithm for HAM. This might lead us to the idea of “semi-restricted Boltzmann machines.”
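The remark that a multilayer architecture without binary units might collapse to a single matrix W = W1W2…Wp can be checked on a small example. This is a sketch under assumptions: inputs are row vectors (so a layer computes x W), a hard threshold stands in for binary hidden units, and the weights W1 and W2 are hand-picked for illustration rather than learned.

```python
def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def vecmat(x, w):
    """Row vector times matrix: one layer's linear map."""
    return [sum(x[k] * w[k][j] for k in range(len(x))) for j in range(len(w[0]))]

def step(v):
    """Hard threshold, standing in for binary hidden units (assumption)."""
    return [1.0 if t > 0 else 0.0 for t in v]

W1 = [[1.0, -1.0],
      [-1.0, 1.0]]   # input -> hidden
W2 = [[1.0],
      [1.0]]         # hidden -> output

W = matmul(W1, W2)   # collapsed single-layer weights

inputs = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

# Without non-linearity, two layers equal the single matrix W.
linear_deep = [vecmat(vecmat(x, W1), W2) for x in inputs]
linear_flat = [vecmat(x, W) for x in inputs]

# With binary hidden units, the same weights compute XOR,
# which no single linear map can reproduce.
binary_deep = [vecmat(step(vecmat(x, W1)), W2) for x in inputs]
```

Here the collapsed matrix W is all zeros, so the purely linear network outputs zero for every input, while the thresholded version separates the four inputs as XOR does.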
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.