Beyond the Mean Field Approximation for Inference in Multi-Layer Perceptrons
Interest in Multi-Layer Perceptrons (MLPs) has surged in recent years due to their central role in deep learning. An MLP can be seen as a trainable multi-input, multi-output function built by composing linear and non-linear functions organized into layers of nodes. A lesser-known fact is that sigmoid-based MLPs also admit a probabilistic interpretation: under this interpretation, the MLP forward pass approximates the marginalization of hidden-node activations at each layer under the mean field assumption. In this talk we explore this probabilistic view of MLPs and propose a closed-form approximation that goes beyond mean field. This new approximation takes into account the uncertainty of inference at each layer and is therefore closer to the true marginalization. We also show that it yields improved performance in Automatic Speech Recognition experiments when used for feature extraction.
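The mean field view can be sketched numerically. In the sketch below, hidden activations of a tiny two-layer sigmoid network are treated as Bernoulli variables: the ordinary forward pass propagates their expectations (the mean field approximation), while the true marginalization is estimated by Monte Carlo averaging over sampled binary hidden states. The network, its dimensions, and the random weights are illustrative assumptions, not taken from the talk.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Hypothetical tiny sigmoid network: 4 inputs -> 8 Bernoulli hidden units -> 1 output.
W1 = rng.normal(size=(4, 8))
W2 = rng.normal(size=(8, 1))
x = rng.normal(size=4)

# Mean field: propagate expected activations E[h] = sigmoid(x W1).
# This is exactly the ordinary MLP forward pass.
p_hidden = sigmoid(x @ W1)
y_mf = sigmoid(p_hidden @ W2)

# True marginalization over binary hidden states, estimated by Monte Carlo:
# sample h ~ Bernoulli(p_hidden) and average the resulting outputs.
n_samples = 100_000
h = (rng.random((n_samples, 8)) < p_hidden).astype(float)
y_mc = sigmoid(h @ W2).mean(axis=0)

print(float(y_mf), float(y_mc))
```

The gap between `y_mf` and `y_mc` is the error introduced by the mean field assumption; an approximation that also propagates the variance of each layer's activations, as proposed in the talk, narrows this gap in closed form.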