
Cognition and Bayesian Inference

A growing body of research suggests that the brain uses Bayesian inference to perform cognition. Our brain is capable of learning from positive examples alone, unlike the usual approach in machine learning, which needs both positive and negative examples. Consider a parent who says to a child, “Look at that dog!” From only one or two such examples, the child can categorize the dogs it sees in the future. The child's brain is generalizing using some form of Bayesian inference. Welcome to the world of One Shot Learning.

The discovery, which Bayes himself abandoned for unknown reasons, today stands at the forefront of making Artificial Intelligence a reality. Learning from few examples is what we are good at, and what any intelligent machine is expected to do. Thanks to Pierre-Simon Laplace, who rediscovered Bayes' theorem and gave it a mathematical form, cognitive AI research uses Bayesian methods to make machines learn.

P(H | D) = P(D | H) P(H) / P(D)

Fig: Bayes' Theorem, where H is a hypothesis and D is the observed data


The above formula calculates the probability of a hypothesis given new data. Any artificially intelligent system should use this rule to update its beliefs when new evidence arrives.
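To make the update concrete, here is a minimal Python sketch of Bayes' theorem applied to a set of competing hypotheses. The function name `bayes_update` and the dog/cat numbers are made up for illustration and do not come from any of the cited papers.

```python
from fractions import Fraction

def bayes_update(prior, likelihood):
    """Return the posterior P(H | D) for every hypothesis H.

    prior:      dict mapping hypothesis -> P(H)
    likelihood: dict mapping hypothesis -> P(D | H) for the observed data D
    """
    # Numerator of Bayes' theorem: P(D | H) * P(H)
    unnormalized = {h: likelihood[h] * prior[h] for h in prior}
    # Evidence P(D): sum of the numerators over all hypotheses
    evidence = sum(unnormalized.values())
    return {h: p / evidence for h, p in unnormalized.items()}

# Hypothetical numbers: is the animal a dog or a cat, given that it barked?
prior = {"dog": Fraction(1, 2), "cat": Fraction(1, 2)}
likelihood_bark = {"dog": Fraction(9, 10), "cat": Fraction(1, 10)}

# First piece of evidence: one bark.
posterior = bayes_update(prior, likelihood_bark)

# Second bark: the previous posterior now plays the role of the prior.
posterior2 = bayes_update(posterior, likelihood_bark)

print(posterior["dog"], posterior2["dog"])
```

Each new observation reuses the same function, with the previous posterior serving as the new prior. That is all "updating beliefs" means here.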

Erik G. Miller, Nicholas E. Matsakis and Paul A. Viola suggested that a probability density over the set of transforms may be shared by many classes. They demonstrated how, using this density as “prior knowledge”, a classifier based on a single training example for each class can be developed. [1]

Li Fei-Fei, Rob Fergus and Pietro Perona presented a method for learning object categories from just a few images. In their Bayesian framework, object categories are represented by probabilistic models, and “prior” knowledge is represented as a probability density function on the parameters of these models. The “posterior” model for an object category is obtained by updating the prior in the light of one or more examples. They demonstrated this method on four diverse categories (human faces, airplanes, motorcycles, spotted cats). Three categories are first learnt from hundreds of training examples, and a “prior” is estimated from these. Then the model of the fourth category is learnt from 1 to 5 training examples, and is used to detect new exemplars in a set of test images. [2]
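The same "prior from known categories, posterior from a handful of examples" idea can be sketched with a deliberately tiny model. This is not the model used in [2]. It assumes each category is summarized by a single number (its mean feature value), that the prior over that number is Gaussian, and that the observation noise variance is known. All names and numbers below are hypothetical.

```python
import statistics

def estimate_prior(category_means):
    """Fit a Gaussian prior over 'the mean of a category' from known categories."""
    mu0 = statistics.mean(category_means)
    var0 = statistics.variance(category_means)  # sample variance
    return mu0, var0

def posterior_mean(prior, examples, noise_var):
    """Conjugate Normal-Normal update for a category mean, noise variance known."""
    mu0, var0 = prior
    n = len(examples)
    # Precisions (inverse variances) of the prior and the data add up.
    precision = 1.0 / var0 + n / noise_var
    post_var = 1.0 / precision
    # Posterior mean is a precision-weighted blend of prior mean and data.
    post_mu = post_var * (mu0 / var0 + sum(examples) / noise_var)
    return post_mu, post_var

# Three well-studied categories give the prior (toy values).
prior = estimate_prior([4.0, 5.0, 6.0])

# A new, fourth category seen once, then five times.
mu1, var1 = posterior_mean(prior, [9.0], noise_var=1.0)
mu5, var5 = posterior_mean(prior, [9.0] * 5, noise_var=1.0)

print(mu1, var1)
print(mu5, var5)
```

With one example the estimate sits between the prior and the observation. With five examples it moves closer to the data and the uncertainty shrinks, so the prior matters most when examples are scarce.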

Bayesian methods are used in many fields, including animal learning, language processing and acquisition, visual scene perception and many more [3]. Leading universities have dedicated teams that apply Bayesian techniques to cognition. Josh Tenenbaum and his Computational Cognitive Science group at MIT explore the computational basis of many aspects of human cognition, including learning concepts, judging similarity, and learning word meanings and syntactic principles in natural language. They are betting on Bayesian techniques, and their results closely match those of human subjects. The results of their research will have a huge impact on Information Extraction, Virtual Assistants and robotics.

Generative models can be programmed in Church, a probabilistic programming language. Church makes it easy to develop generative models as it includes a novel language construct, the stochastic memoizer, which enables simple description of many complex non-parametric models. [4]
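Church itself is a Scheme-like language, so the snippet below is not Church code. It is a rough Python sketch of the idea behind a stochastic memoizer, assuming the usual Chinese-restaurant-process reading: a call either reuses a value drawn earlier for the same arguments, or draws a fresh one with probability governed by a concentration parameter. The names `dp_mem`, `alpha` and `sampler` are hypothetical.

```python
import random

def dp_mem(alpha, sampler, rng=random):
    """Return a stochastically memoized version of `sampler`.

    alpha = 0 gives ordinary memoization of a random function: the first
    draw for each argument tuple is reused forever. Larger alpha means
    fresh draws happen more often.
    """
    tables = {}  # args -> list of [value, times_returned]

    def memoized(*args):
        seats = tables.setdefault(args, [])
        n = sum(count for _, count in seats)
        # Fresh draw with probability alpha / (n + alpha); always on first call.
        if n == 0 or rng.random() * (n + alpha) < alpha:
            seats.append([sampler(*args), 1])
            return seats[-1][0]
        # Otherwise reuse an earlier value, chosen in proportion to its count.
        r = rng.random() * n
        for seat in seats:
            r -= seat[1]
            if r < 0:
                seat[1] += 1
                return seat[0]
        seats[-1][1] += 1  # guard against floating-point round-off
        return seats[-1][0]

    return memoized

# Usage: a person's eye color is random, but stays fixed once sampled.
rng = random.Random(0)
eye_color = dp_mem(0.0, lambda person: rng.choice(["blue", "brown", "green"]), rng)
print(eye_color("bob") == eye_color("bob"))
```

Because reuse is proportional to how often a value has already been returned, popular values become more popular. This "rich get richer" behaviour is what makes the construct handy for clustering-style non-parametric models.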

Although there is a debate as to whether the brain actually uses Bayesian inference or whether Bayesian techniques merely mimic the brain's cognitive learning closely, applications of AI will use Bayesian methods no matter which side wins.

References

[1] E. G. Miller, N. E. Matsakis, and P. A. Viola, “Learning from One Example through Shared Densities on Transforms,” Proc. IEEE Conf. Computer Vision and Pattern Recognition, vol. 1, pp. 464-471, 2000.

[2] L. Fei-Fei, R. Fergus, and P. Perona, “A Bayesian Approach to Unsupervised One-Shot Learning of Object Categories,” Proc. Ninth Int’l Conf. Computer Vision, pp. 1134-1141, Oct. 2003.

[3] T. L. Griffiths, C. Kemp, and J. B. Tenenbaum, “Bayesian Models of Cognition,” in Ron Sun (ed.), Cambridge Handbook of Computational Cognitive Modeling, Cambridge University Press, 2008.

[4] N. D. Goodman, V. K. Mansinghka, D. M. Roy, K. Bonawitz, and J. B. Tenenbaum, “Church: A Language for Generative Models,” Proc. Uncertainty in Artificial Intelligence (UAI), 2008.
