Skip to main content

Understanding Generative Adverserial Networks - Part 1

This is a two part series on understanding Generative Adversarial Networks (GANs). This part deals with the conceptual understanding of GANs. In the second part we will try to understand the mathematics behind GANs.

Generative networks have been in use for quite a while now. And so have discriminative networks. But only in 2014 did someone get the brilliant idea of using them together. These are the generative adversarial networks. This kind of deep learning model was invented by Ian Goodfellow. When we work with data already labelled, it’s called supervised learning. It’s much easier compared to unsupervised learning, which has no predefined labels, making the task more vague. 

"Generative Adversarial Networks is the most interesting idea in the last ten years in Machine Learning." - Yann LeCun

In this post, we’ll discuss what GANs are and how they work, at a higher , more abstract level. Since 2014, many variations of the traditional GAN have come out, but the underlying concept remains pretty much the same. The applications of GANs are tremendous. From generating realistic images to rendering vast 3D environments, Generative adversarial networks, or GANs, are a type of deep learning model. A ‘model’ comprises of several layers of neurons, hence the term “deep” in deep learning.

First, we’ll dissect the name GAN for a slightly better understanding.

Generative: The technical definition is this:
A generative model is  a statistical model on the joint distribution of X and y.  This can be abstractly represented by: P(X, y) where X is the input and y is the target.

Adversarial: This is basically anything that involves conflict, and thus this word fits perfectly in this context.

Network: This of course refers to the fact that this model comprises of two neural networks, each consisting of several hidden layers.

The two main components of a GAN are the generator and the discriminator. The technical definition of a discriminative model is mentioned below.

Discriminative: This a statistical model which gives a conditional probability of a target y given when given a particular input x from a distribution X, that is: P(y | X = x).
What this basically means is that when the discriminator is given something as an input, it outputs the chance of that particular input being real. 1 means that it’s confidently real while 0 means that the discriminator is completely sure it’s fake.

We’ll now discuss the underlying concept behind GANs with an example. The generator is akin to a currency forger, while the discriminator is like a police officer.

When training of a GAN starts, the ‘forger’ is new to the job and the ‘officer’ (the discriminator) can easily distinguish between authentic and counterfeit currency. The officer also tells the forger what the difference between the two money samples was. Based on this, the forger tries to improve his method of generating fake currency.  

This process continues, with the officer (discriminator) getting better at telling fake money from authentic money and the forger (generator) getting better at fooling the officer.

Ultimately, a time comes when the chance of guessing (the officer can no longer confidently spot a difference) the fake from the real is exactly 0.5 -- a random guess. After training, any money the forger creates is exactly like real currency. This is the end goal of training a GAN. Consider the graphs below:

The dotted distribution is the real money (input data), represented by pdata

The green distribution represents the money generated by the forger (generated data), represented by pg.

Finally, the dotted purple curve is the officer’s guess of how real the currency is,  represented by pd. Likewise, the noise distribution, from which the generator gets input is represented by pz.

Studying the pd curve in the graphs, you can intuitively tell from (a) that the chance of the data being real is very high near pdataas its represents real money. It goes down near pgas the discriminator knows the data is fake or generated.

As we progress through the training, we notice the green distribution (pg) getting closer to the black distribution (pdata).

Ultimately, they coincide, and the discriminator has a 50% chance of guessing correctly. Notice how the purple dotted line (pd) is constant halfway through the curve

       D(G(z)) = 0.5

In a broad sense,  DG*(x), the ideal value of D(G(z)), when z = x , can be represented as shown below:
In figure (d), when both the pg and pdata distributions coincide, we can say that: pg = pdata

Hence, DG*(x) = pdata2 Pdata = 0.5 

The input to the discriminator can be a real image or a fake (generated) image. The generator is passed noise of some distribution like gaussian or even uniform. The generator generates random images that look like the real data from this noise. The discriminator outputs the probability of that input being real.

The generator and discriminator play what is known as a Minimax game in Game theory. The goal of one player is to minimize the final ‘score’ of the game, called the “minimizer”. The goal of the “maximizer” is to maximize the possible score, given that both players play optimally. The Minimax Algorithm is often used in two player games like tic tac toe and chess.

Here, the discriminator is the ‘maximizer’ and is trying to maximize the chance of it guessing correctly. By correctly, what I mean is that it should correctly label an image from the input dataset as a real image and a generated image as a fake one . The generator, on the other hand, is trying to make the discriminator incorrectly guess the generated images as real images.

I feel strongly that unless you understand a concept that is largely based on mathematics, to the last variable, its very difficult to build on that concept. We will cover the mathematical aspect of GANs in the next part.

Aniruddha Karajgi
Research Intern,
Cere Labs


  1. The development of artificial intelligence (AI) has propelled more programming architects, information scientists, and different experts to investigate the plausibility of a vocation in machine learning. Notwithstanding, a few newcomers will in general spotlight a lot on hypothesis and insufficient on commonsense application. machine learning projects for final year In case you will succeed, you have to begin building machine learning projects in the near future.

    Projects assist you with improving your applied ML skills rapidly while allowing you to investigate an intriguing point. Furthermore, you can include projects into your portfolio, making it simpler to get a vocation, discover cool profession openings, and Final Year Project Centers in Chennai even arrange a more significant compensation.

    Data analytics is the study of dissecting crude data so as to make decisions about that data. Data analytics advances and procedures are generally utilized in business ventures to empower associations to settle on progressively Python Training in Chennai educated business choices. In the present worldwide commercial center, it isn't sufficient to assemble data and do the math; you should realize how to apply that data to genuine situations such that will affect conduct. In the program you will initially gain proficiency with the specialized skills, including R and Python dialects most usually utilized in data analytics programming and usage; Python Training in Chennai at that point center around the commonsense application, in view of genuine business issues in a scope of industry segments, for example, wellbeing, promoting and account.

  2. Nice content and interesting blog. join 360digitmg for the Artificial Intelligence Training Course.

  3. I admire this article for the well-researched content and excellent wording. I got so involved in this material that I couldn’t stop reading. I am impressed with your work and skill. Thank you so much.Internship Program for International Students

  4. You have provided valuable data for us. It is great and informative for everyone. Keep posting always. I am very thankful to you.App Development Services Provider in India


Post a Comment

Popular posts from this blog

Anomaly Detection based on Prediction - A Step Closer to General Artificial Intelligence

Anomaly detection refers to the problem of finding patterns that do not conform to expected behavior [1]. In the last article "Understanding Neocortex to Create Intelligence" , we explored how applications based on the workings of neocortex create intelligence. Pattern recognition along with prediction makes human brains the ultimate intelligent machines. Prediction help humans to detect anomalies in the environment. Before every action is taken, neocortex predicts the outcome. If there is a deviation from the expected outcome, neocortex detects anomalies, and will take necessary steps to handle them. A system which claims to be intelligent, should have anomaly detection in place. Recent findings using research on neocortex have made it possible to create applications that does anomaly detection. Numenta’s NuPIC using Hierarchical Temporal Memory (HTM) framework is able to do inference and prediction, and hence anomaly detection. HTM accurately predicts anomalies in real

Implement XOR in Tensorflow

XOR is considered as the 'Hello World' of Neural Networks. It seems like the best problem to try your first TensorFlow program. Tensorflow makes it easy to build a neural network with few tweaks. All you have to do is make a graph and you have a neural network that learns the XOR function. Why XOR? Well, XOR is the reason why backpropogation was invented in the first place. A single layer perceptron although quite successful in learning the AND and OR functions, can't learn XOR (Table 1) as it is just a linear classifier, and XOR is a linearly inseparable pattern (Figure 1). Thus the single layer perceptron goes into a panic mode while learning XOR – it can't just do that.  Deep Propogation algorithm comes for the rescue. It learns an XOR by adding two lines L1 and L2 (Figure 2). This post assumes you know how the backpropogation algorithm works. Following are the steps to implement the ne