====== Simple MNIST Logistic regression ======
{{tag>deep_learning}}
In this post we should (finally) be able to perform logistic regression on the MNIST dataset, now that I have a [[blog:2018:1229_tensorflow_setup|properly setup tensorflow installation]].
====== ======
/*
Using this github repo as reference:
https://github.com/darksigma/Fundamentals-of-Deep-Learning-Book
*/
===== Retrieving the MNIST dataset =====
We can retrieve the MNIST dataset from the Yann LeCun website using [[https://github.com/darksigma/Fundamentals-of-Deep-Learning-Book/blob/master/fdl_examples/datatools/input_data.py|this script]].
On my side I updated the download process a bit, mainly to use a more consistent download location, so I use something like this:

<code python>
from nv.deep_learning import MNIST
from nv.core.utils import *
from nv.core.admin import *

root_path = nvGetRootPath()

logDEBUG("Retrieving MNIST dataset...")
mnist = MNIST.read_data_sets(root_path+"/data/MNIST/", one_hot=True)
logDEBUG("Done retrieving MNIST dataset.")
</code>
===== Building logistic network =====
Again, we use [[https://github.com/darksigma/Fundamentals-of-Deep-Learning-Book/blob/master/fdl_examples/chapter3/logistic_regression_updated.py|this script]] as a template to build our initial logistic network.
In the end I didn't change much on that implementation, basically just removing a few unused imports, so the script I used for my test was:

<code python>
from nv.deep_learning import MNIST
from nv.core.utils import *
from nv.core.admin import *

root_path = nvGetRootPath()

logDEBUG("Retrieving MNIST dataset...")
mnist = MNIST.read_data_sets(root_path+"/data/MNIST/", one_hot=True)
logDEBUG("Done retrieving MNIST dataset.")

import tensorflow as tf
import shutil, os

# Parameters
learning_rate = 0.01
training_epochs = 60
batch_size = 100
display_step = 1

def inference(x):
    init = tf.constant_initializer(value=0)
    W = tf.get_variable("W", [784, 10], initializer=init)
    b = tf.get_variable("b", [10], initializer=init)
    output = tf.nn.softmax(tf.matmul(x, W) + b)

    w_hist = tf.summary.histogram("weights", W)
    b_hist = tf.summary.histogram("biases", b)
    y_hist = tf.summary.histogram("output", output)

    return output

def loss(output, y):
    dot_product = y * tf.log(output)
    # Reduction along axis 0 collapses each column into a single
    # value, whereas reduction along axis 1 collapses each row
    # into a single value. In general, reduction along axis i
    # collapses the ith dimension of a tensor to size 1.
    xentropy = -tf.reduce_sum(dot_product, axis=1)
    loss = tf.reduce_mean(xentropy)
    return loss

def training(cost, global_step):
    tf.summary.scalar("cost", cost)
    optimizer = tf.train.GradientDescentOptimizer(learning_rate)
    train_op = optimizer.minimize(cost, global_step=global_step)
    return train_op

def evaluate(output, y):
    correct_prediction = tf.equal(tf.argmax(output, 1), tf.argmax(y, 1))
    accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
    tf.summary.scalar("validation error", (1.0 - accuracy))
    return accuracy

if __name__ == '__main__':
    if os.path.exists("logistic_logs/"):
        shutil.rmtree("logistic_logs/")

    with tf.Graph().as_default():
        x = tf.placeholder("float", [None, 784])  # mnist data image of shape 28*28=784
        y = tf.placeholder("float", [None, 10])   # 0-9 digits recognition => 10 classes

        output = inference(x)
        cost = loss(output, y)
        global_step = tf.Variable(0, name='global_step', trainable=False)
        train_op = training(cost, global_step)
        eval_op = evaluate(output, y)
        summary_op = tf.summary.merge_all()
        saver = tf.train.Saver()
        sess = tf.Session()

        # Note: passing graph_def here is deprecated (hence the warning in
        # the run output below); graph=sess.graph is the modern equivalent.
        summary_writer = tf.summary.FileWriter("logistic_logs/", graph_def=sess.graph_def)

        init_op = tf.global_variables_initializer()
        sess.run(init_op)

        # Training cycle
        for epoch in range(training_epochs):
            avg_cost = 0.
            total_batch = int(mnist.train.num_examples/batch_size)
            # Loop over all batches
            for i in range(total_batch):
                minibatch_x, minibatch_y = mnist.train.next_batch(batch_size)
                # Fit training using batch data
                sess.run(train_op, feed_dict={x: minibatch_x, y: minibatch_y})
                # Compute average loss
                avg_cost += sess.run(cost, feed_dict={x: minibatch_x, y: minibatch_y})/total_batch
            # Display logs per epoch step
            if epoch % display_step == 0:
                print("Epoch:", '%04d' % (epoch+1), "cost =", "{:.9f}".format(avg_cost))
                accuracy = sess.run(eval_op, feed_dict={x: mnist.validation.images, y: mnist.validation.labels})
                print("Validation Error:", (1 - accuracy))
                summary_str = sess.run(summary_op, feed_dict={x: minibatch_x, y: minibatch_y})
                summary_writer.add_summary(summary_str, sess.run(global_step))
                saver.save(sess, "logistic_logs/model-checkpoint", global_step=global_step)

        print("Optimization Finished!")
        accuracy = sess.run(eval_op, feed_dict={x: mnist.test.images, y: mnist.test.labels})
        print("Test Accuracy:", accuracy)
</code>
===== Observed results =====
As expected, we get a **Test accuracy** of about 92% with this network. And the outputs I got were:

<code>
$ nv_call_python mnist_logistic_regression.py
2018-12-30T13:23:14.077830 [DEBUG] Retrieving MNIST dataset...
Extracting D:/Projects/NervSeed/data/MNIST/train-images-idx3-ubyte.gz
Extracting D:/Projects/NervSeed/data/MNIST/train-labels-idx1-ubyte.gz
Extracting D:/Projects/NervSeed/data/MNIST/t10k-images-idx3-ubyte.gz
Extracting D:/Projects/NervSeed/data/MNIST/t10k-labels-idx1-ubyte.gz
2018-12-30T13:23:14.583517 [DEBUG] Done retrieving MNIST dataset.
2018-12-30 13:23:15.007235: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1432] Found device 0 with properties:
name: GeForce GTX 1080 major: 6 minor: 1 memoryClockRate(GHz): 1.7335
pciBusID: 0000:01:00.0
totalMemory: 8.00GiB freeMemory: 6.60GiB
2018-12-30 13:23:15.007934: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1511] Adding visible gpu devices: 0
2018-12-30 13:27:33.976381: I tensorflow/core/common_runtime/gpu/gpu_device.cc:982] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-12-30 13:27:33.976805: I tensorflow/core/common_runtime/gpu/gpu_device.cc:988] 0
2018-12-30 13:27:33.977068: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1001] 0: N
2018-12-30 13:27:33.977682: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 6363 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1080, pci bus id: 0000:01:00.0, compute capability: 6.1)
WARNING:tensorflow:Passing a `GraphDef` to the SummaryWriter is deprecated. Pass a `Graph` object instead, such as `sess.graph`.
Epoch: 0001 cost = 1.174406669
Validation Error: 0.15079998970031738
(... more epochs here...)
Epoch: 0058 cost = 0.298283045
Validation Error: 0.07740002870559692
Epoch: 0059 cost = 0.297683232
Validation Error: 0.07740002870559692
Epoch: 0060 cost = 0.297116048
Validation Error: 0.07700002193450928
Optimization Finished!
Test Accuracy: 0.9201
</code>
Concerning the tensorflow execution, I don't know if this was due to the fact that this was a "first run" of my network, but the training process seemed **very slow** to start, and then the validation error values all appeared almost instantly => I should be careful about this performance question in the future.

The "slow display" I mentioned above might actually be due to inappropriate flushing of the stdout/stderr pipes, which I tweaked in my current system as far as I remember. To be investigated.
Now that we have some run statistics, we should be able to start **tensorboard** to display them. The **tensorboard** executable is available on Windows in the python 3 **bin/Scripts/** folder. Given my very specific python setup I had to create an additional helper script to start this app:
<code bash>
nv_call_tensorboard()
{
    local pname="$(uname -s)"
    case "${pname}" in
        CYGWIN*)
            local PREVPATH="$PATH"
            local pdir="`nv_get_project_dir`/tools/windows/$__nv_tool_python3/bin"
            export PATH="$pdir:$pdir/Scripts:$PATH"
            $pdir/Scripts/tensorboard.exe "$@"
            export PATH="$PREVPATH"
            ;;
        *)
            local PREVPATH="$PATH"
            local pdir="`nv_get_project_dir`/tools/linux/$__nv_tool_python3/bin"
            export PATH="$pdir:$pdir/Scripts:$PATH"
            $pdir/tensorboard "$@"
            export PATH="$PREVPATH"
            ;;
    esac
}
</code>
So I start tensorboard using our log folder:

<code>
ultim@saturn /cygdrive/d/Projects/NervSeed/python/apps/deep_learning/mnist_logistic
$ nv_call_tensorboard --logdir=logistic_logs
TensorBoard 1.12.1 at http://saturn:6006 (Press CTRL+C to quit)
</code>
Then we can navigate to the page http://localhost:6006, and there we have it! My first tensorboard graph display :-):
{{ blog:2018:1230:first_tensorboard.jpg?800 }}
=> So let's call that a successful experiment and stop here for this post! Next time, we will try to move to the [[blog:2018:1231_mnist_multilayer|multilayer perceptron implementation]] to improve our current accuracy.