XOR is considered the 'Hello World' of neural networks, which makes it a good problem for your first TensorFlow program.
TensorFlow makes it easy to build a neural network with only a few tweaks. All you have to do is define the computation graph, and you have a neural network that learns the XOR function.
Why XOR? Well, XOR is the reason why backpropagation was invented in the first place. A single-layer perceptron, although quite successful at learning the AND and OR functions, can't learn XOR (Table 1), because it is just a linear classifier and XOR is not a linearly separable pattern (Figure 1). The single-layer perceptron simply panics while learning XOR: it just can't do it.
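For reference, here is the XOR truth table (Table 1):

x1  x2 | XOR(x1, x2)
0   0  | 0
0   1  | 1
1   0  | 1
1   1  | 0

There is no single straight line in the (x1, x2) plane that separates the inputs with output 1 from the inputs with output 0, which is what Figure 1 illustrates.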
The backpropagation algorithm comes to the rescue. A network with a hidden layer learns XOR by combining two decision lines, L1 and L2 (Figure 2). This post assumes you know how the backpropagation algorithm works.
Following are the steps to implement the neural network in Figure 3 for XOR in TensorFlow:
1. Import necessary libraries
import tensorflow as tf
import math
import numpy as np
2. Declare the number of input, hidden and output layer nodes, along with the learning rate and the number of training steps.
INPUT_COUNT = 2
OUTPUT_COUNT = 2
HIDDEN_COUNT = 2
LEARNING_RATE = 0.4
MAX_STEPS = 5000
3. Input nodes are created in TensorFlow using placeholders. Placeholders are values that we will input when we ask TensorFlow to run a computation.
Create an inputs placeholder consisting of a 2D tensor of floating point numbers, along with a matching placeholder for the expected labels, which we will feed later during training:

inputs_placeholder = tf.placeholder("float", shape=[None, INPUT_COUNT])
labels_placeholder = tf.placeholder("float", shape=[None, OUTPUT_COUNT])
4. Define weights and biases from the input layer to the hidden layer.
WEIGHT_HIDDEN = tf.Variable(tf.truncated_normal([INPUT_COUNT, HIDDEN_COUNT]))
BIAS_HIDDEN = tf.Variable(tf.zeros([HIDDEN_COUNT]))
A variable is a value that lives in TensorFlow's computation graph and can be modified by the computation.
5. Define an activation function for the hidden layer. Here we are using the sigmoid function, but you can use other activation functions offered by TensorFlow.
AF_HIDDEN = tf.nn.sigmoid(tf.matmul(inputs_placeholder, WEIGHT_HIDDEN) + BIAS_HIDDEN)
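Switching to a different activation function is a one-line change. For example, a tanh hidden layer would look like this (shown only as an illustration; the rest of this post keeps the sigmoid version above):

AF_HIDDEN = tf.nn.tanh(tf.matmul(inputs_placeholder, WEIGHT_HIDDEN) + BIAS_HIDDEN)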
6. Define weights and biases from the hidden layer to the output layer. The biases are initialized with tf.zeros so that they start with zero values.
WEIGHT_OUTPUT = tf.Variable(tf.truncated_normal([HIDDEN_COUNT, OUTPUT_COUNT]))
BIAS_OUTPUT = tf.Variable(tf.zeros([OUTPUT_COUNT]))
7. With one line of code we can calculate the logits tensor that will contain the output of the network:
logits = tf.matmul(AF_HIDDEN, WEIGHT_OUTPUT) + BIAS_OUTPUT
We then compute the softmax probabilities that are assigned to each class:
y = tf.nn.softmax(logits)
8. The tf.nn.softmax_cross_entropy_with_logits op is added to compare the output logits to the expected labels:
cross_entropy = tf.nn.softmax_cross_entropy_with_logits(logits, labels_placeholder)
It then uses tf.reduce_mean to average the cross entropy values across the batch dimension as the total loss:
loss = tf.reduce_mean(cross_entropy)
This gives us the tensor that will contain the loss value.
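Written out, the value being minimized is the cross entropy averaged over the N examples in the batch:

loss = -(1/N) * sum_i sum_c labels[i][c] * log(softmax(logits)[i][c])

where i runs over the examples in the batch and c runs over the two output classes.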
9. Next, we instantiate a tf.train.GradientDescentOptimizer that applies gradients with the requested learning rate. Since TensorFlow has access to the entire computation graph, it can compute the gradients of the loss with respect to all the variables.
train_step = tf.train.GradientDescentOptimizer(LEARNING_RATE).minimize(loss)
This returns the op that, when run, applies one step of gradient descent to the variables.
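Other optimizers in tf.train can be dropped in here if you want to experiment beyond plain gradient descent. For example, Adam (the 0.01 learning rate below is only an illustrative value, not one used in this post):

train_step = tf.train.AdamOptimizer(0.01).minimize(loss)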
10. Next we create a tf.Session() to run the graph:
with tf.Session() as sess:
We initialize all the variables before we use them:
init = tf.initialize_all_variables()
Then we run the initialization op in the session:
sess.run(init)
For every iteration of the training loop we are going to provide the same input and expected output data:
INPUT_TRAIN = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
# One-hot labels: [1, 0] means XOR = 0 and [0, 1] means XOR = 1
OUTPUT_TRAIN = np.array([[1, 0], [0, 1], [0, 1], [1, 0]])
We need to create a Python dictionary object with placeholders as keys and feed tensors as values:
feed_dict = {
    inputs_placeholder: INPUT_TRAIN,
    labels_placeholder: OUTPUT_TRAIN,
}
This dictionary is passed into the sess.run() function's feed_dict parameter to provide the input examples for this step of training. The following code fetches two values, [train_step, loss], in its run call; because there are two values to fetch, sess.run() returns a list with two items. We also print the loss and the network's outputs every 100 steps.
for step in xrange(MAX_STEPS):
    _, loss_val = sess.run([train_step, loss], feed_dict)
    if step % 100 == 0:
        print "Step:", step, "loss:", loss_val
        for input_value in INPUT_TRAIN:
            print input_value, sess.run(y, feed_dict={inputs_placeholder: [input_value]})
11. When you run the program, around the 4900th step you will get output similar to the one shown below:
[0 0] [[ 0.99858057 0.00141946]]
[0 1] [[ 0.00187515 0.9981249]]
[1 0] [[ 0.00128779 0.99871218]]
[1 1] [[ 0.99883229 0.00116773]]
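Putting it all together, here is the complete script assembled from the snippets above (it targets the same older TensorFlow 0.x / Python 2 API used throughout this post, e.g. tf.initialize_all_variables and xrange):

import tensorflow as tf
import numpy as np

# Network size and training hyperparameters (step 2)
INPUT_COUNT = 2
OUTPUT_COUNT = 2
HIDDEN_COUNT = 2
LEARNING_RATE = 0.4
MAX_STEPS = 5000

# Placeholders for the inputs and the expected one-hot labels (step 3)
inputs_placeholder = tf.placeholder("float", shape=[None, INPUT_COUNT])
labels_placeholder = tf.placeholder("float", shape=[None, OUTPUT_COUNT])

# Input-to-hidden weights, biases and sigmoid activation (steps 4-5)
WEIGHT_HIDDEN = tf.Variable(tf.truncated_normal([INPUT_COUNT, HIDDEN_COUNT]))
BIAS_HIDDEN = tf.Variable(tf.zeros([HIDDEN_COUNT]))
AF_HIDDEN = tf.nn.sigmoid(tf.matmul(inputs_placeholder, WEIGHT_HIDDEN) + BIAS_HIDDEN)

# Hidden-to-output weights, biases, logits and softmax (steps 6-7)
WEIGHT_OUTPUT = tf.Variable(tf.truncated_normal([HIDDEN_COUNT, OUTPUT_COUNT]))
BIAS_OUTPUT = tf.Variable(tf.zeros([OUTPUT_COUNT]))
logits = tf.matmul(AF_HIDDEN, WEIGHT_OUTPUT) + BIAS_OUTPUT
y = tf.nn.softmax(logits)

# Cross-entropy loss and gradient descent training op (steps 8-9)
cross_entropy = tf.nn.softmax_cross_entropy_with_logits(logits, labels_placeholder)
loss = tf.reduce_mean(cross_entropy)
train_step = tf.train.GradientDescentOptimizer(LEARNING_RATE).minimize(loss)

# The four XOR examples and their one-hot labels (step 10)
INPUT_TRAIN = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
OUTPUT_TRAIN = np.array([[1, 0], [0, 1], [0, 1], [1, 0]])
feed_dict = {inputs_placeholder: INPUT_TRAIN, labels_placeholder: OUTPUT_TRAIN}

with tf.Session() as sess:
    sess.run(tf.initialize_all_variables())
    for step in xrange(MAX_STEPS):
        _, loss_val = sess.run([train_step, loss], feed_dict)
        if step % 100 == 0:
            print "Step:", step, "loss:", loss_val
            for input_value in INPUT_TRAIN:
                print input_value, sess.run(y, feed_dict={inputs_placeholder: [input_value]})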
12. The following points should be noted:
- You will need to experiment with TensorFlow to tune the network. Play around with HIDDEN_COUNT, LEARNING_RATE and MAX_STEPS.
- You can use a variety of activation functions and increase the number of hidden nodes to make your network learn faster.