====== Simple MNIST Logistic regression ======

{{tag>deep_learning}}

In this post we should (finally) be able to perform logistic regression on the MNIST dataset, now that I have a [[blog:2018:1229_tensorflow_setup|properly set up tensorflow installation]].

/* Using this github repo as reference: https://github.com/darksigma/Fundamentals-of-Deep-Learning-Book */

===== Retrieving the MNIST dataset =====

We can retrieve the MNIST dataset from the Yann LeCun website using [[https://github.com/darksigma/Fundamentals-of-Deep-Learning-Book/blob/master/fdl_examples/datatools/input_data.py|this script]]. On my side I simply updated the download process a bit to use a more consistent download location, so I use something like this:

<code python>
from nv.deep_learning import MNIST
from nv.core.utils import *
from nv.core.admin import *

root_path = nvGetRootPath()

logDEBUG("Retrieving MNIST dataset...")
mnist = MNIST.read_data_sets(root_path+"/data/MNIST/", one_hot=True)
logDEBUG("Done retrieving MNIST dataset.")
</code>
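As a quick sanity check we can also inspect the splits returned by ''read_data_sets()''. The sketch below is just an illustration, assuming the helper exposes the usual ''train''/''validation''/''test'' containers with ''images'' and ''labels'' arrays, like the reference ''input_data.py'' script does (the exact split sizes depend on that script's defaults):

<code python>
# Quick sanity check on the dataset splits (assuming the standard DataSet
# interface from the reference input_data.py script):
logDEBUG("Train images shape: {}".format(mnist.train.images.shape))            # typically (55000, 784)
logDEBUG("Train labels shape: {}".format(mnist.train.labels.shape))            # typically (55000, 10) with one_hot=True
logDEBUG("Validation images shape: {}".format(mnist.validation.images.shape))  # typically (5000, 784)
logDEBUG("Test images shape: {}".format(mnist.test.images.shape))              # typically (10000, 784)
</code>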
===== Building logistic network =====

Again, we use [[https://github.com/darksigma/Fundamentals-of-Deep-Learning-Book/blob/master/fdl_examples/chapter3/logistic_regression_updated.py|this script]] as a template to build our initial logistic network. In the end I didn't change much in that implementation, basically only removing a few unused imports, so the script I used for my test was:

<code python>
from nv.deep_learning import MNIST
from nv.core.utils import *
from nv.core.admin import *

root_path = nvGetRootPath()

logDEBUG("Retrieving MNIST dataset...")
mnist = MNIST.read_data_sets(root_path+"/data/MNIST/", one_hot=True)
logDEBUG("Done retrieving MNIST dataset.")

import tensorflow as tf
import shutil, os

# Parameters
learning_rate = 0.01
training_epochs = 60
batch_size = 100
display_step = 1

def inference(x):
    init = tf.constant_initializer(value=0)
    W = tf.get_variable("W", [784, 10], initializer=init)
    b = tf.get_variable("b", [10], initializer=init)
    output = tf.nn.softmax(tf.matmul(x, W) + b)

    w_hist = tf.summary.histogram("weights", W)
    b_hist = tf.summary.histogram("biases", b)
    y_hist = tf.summary.histogram("output", output)

    return output

def loss(output, y):
    dot_product = y * tf.log(output)

    # Reduction along axis 0 collapses each column into a single
    # value, whereas reduction along axis 1 collapses each row
    # into a single value. In general, reduction along axis i
    # collapses the ith dimension of a tensor to size 1.
    xentropy = -tf.reduce_sum(dot_product, axis=1)

    loss = tf.reduce_mean(xentropy)

    return loss

def training(cost, global_step):
    tf.summary.scalar("cost", cost)
    optimizer = tf.train.GradientDescentOptimizer(learning_rate)
    train_op = optimizer.minimize(cost, global_step=global_step)

    return train_op

def evaluate(output, y):
    correct_prediction = tf.equal(tf.argmax(output, 1), tf.argmax(y, 1))
    accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))

    tf.summary.scalar("validation error", (1.0 - accuracy))

    return accuracy

if __name__ == '__main__':
    if os.path.exists("logistic_logs/"):
        shutil.rmtree("logistic_logs/")

    with tf.Graph().as_default():
        x = tf.placeholder("float", [None, 784])  # mnist data image of shape 28*28=784
        y = tf.placeholder("float", [None, 10])   # 0-9 digits recognition => 10 classes

        output = inference(x)
        cost = loss(output, y)

        global_step = tf.Variable(0, name='global_step', trainable=False)

        train_op = training(cost, global_step)
        eval_op = evaluate(output, y)

        summary_op = tf.summary.merge_all()
        saver = tf.train.Saver()
        sess = tf.Session()

        summary_writer = tf.summary.FileWriter("logistic_logs/", graph_def=sess.graph_def)

        init_op = tf.global_variables_initializer()
        sess.run(init_op)

        # Training cycle
        for epoch in range(training_epochs):
            avg_cost = 0.
            total_batch = int(mnist.train.num_examples/batch_size)

            # Loop over all batches
            for i in range(total_batch):
                minibatch_x, minibatch_y = mnist.train.next_batch(batch_size)
                # Fit training using batch data
                sess.run(train_op, feed_dict={x: minibatch_x, y: minibatch_y})
                # Compute average loss
                avg_cost += sess.run(cost, feed_dict={x: minibatch_x, y: minibatch_y})/total_batch

            # Display logs per epoch step
            if epoch % display_step == 0:
                print("Epoch:", '%04d' % (epoch+1), "cost =", "{:.9f}".format(avg_cost))

                accuracy = sess.run(eval_op, feed_dict={x: mnist.validation.images, y: mnist.validation.labels})
                print("Validation Error:", (1 - accuracy))

                summary_str = sess.run(summary_op, feed_dict={x: minibatch_x, y: minibatch_y})
                summary_writer.add_summary(summary_str, sess.run(global_step))

                saver.save(sess, "logistic_logs/model-checkpoint", global_step=global_step)

        print("Optimization Finished!")

        accuracy = sess.run(eval_op, feed_dict={x: mnist.test.images, y: mnist.test.labels})
        print("Test Accuracy:", accuracy)
</code>
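One detail worth keeping in mind about the ''loss()'' function above: taking ''tf.log()'' of the softmax output works fine here, but it can become numerically unstable when the softmax saturates towards 0. A common alternative (just a sketch below, not what I used in the run above) is to keep the raw logits (''tf.matmul(x, W) + b'', before the softmax) and let ''tf.nn.softmax_cross_entropy_with_logits_v2'' compute the softmax and the cross-entropy in one numerically stable op:

<code python>
import tensorflow as tf

def loss_from_logits(logits, y):
    # Numerically stable alternative: softmax + cross-entropy in a single op,
    # computed directly from the raw logits instead of log(softmax(...)).
    xentropy = tf.nn.softmax_cross_entropy_with_logits_v2(labels=y, logits=logits)
    return tf.reduce_mean(xentropy)
</code>

Note that the ''evaluate()'' function would keep working unchanged on the logits, since ''tf.argmax'' gives the same result before or after the softmax.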
===== Observed results =====

As expected, we get a **Test accuracy** of about 92% with this network. And the outputs I got were:

<code>
$ nv_call_python mnist_logistic_regression.py
2018-12-30T13:23:14.077830 [DEBUG] Retrieving MNIST dataset...
Extracting D:/Projects/NervSeed/data/MNIST/train-images-idx3-ubyte.gz
Extracting D:/Projects/NervSeed/data/MNIST/train-labels-idx1-ubyte.gz
Extracting D:/Projects/NervSeed/data/MNIST/t10k-images-idx3-ubyte.gz
Extracting D:/Projects/NervSeed/data/MNIST/t10k-labels-idx1-ubyte.gz
2018-12-30T13:23:14.583517 [DEBUG] Done retrieving MNIST dataset.
2018-12-30 13:23:15.007235: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1432] Found device 0 with properties:
name: GeForce GTX 1080 major: 6 minor: 1 memoryClockRate(GHz): 1.7335
pciBusID: 0000:01:00.0
totalMemory: 8.00GiB freeMemory: 6.60GiB
2018-12-30 13:23:15.007934: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1511] Adding visible gpu devices: 0
2018-12-30 13:27:33.976381: I tensorflow/core/common_runtime/gpu/gpu_device.cc:982] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-12-30 13:27:33.976805: I tensorflow/core/common_runtime/gpu/gpu_device.cc:988]      0
2018-12-30 13:27:33.977068: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1001] 0:   N
2018-12-30 13:27:33.977682: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 6363 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1080, pci bus id: 0000:01:00.0, compute capability: 6.1)
WARNING:tensorflow:Passing a `GraphDef` to the SummaryWriter is deprecated. Pass a `Graph` object instead, such as `sess.graph`.
Epoch: 0001 cost = 1.174406669
Validation Error: 0.15079998970031738
(... more epochs here...)
Epoch: 0058 cost = 0.298283045
Validation Error: 0.07740002870559692
Epoch: 0059 cost = 0.297683232
Validation Error: 0.07740002870559692
Epoch: 0060 cost = 0.297116048
Validation Error: 0.07700002193450928
Optimization Finished!
Test Accuracy: 0.9201
</code>

Concerning the tensorflow execution, I don't know if this was due to the fact that this was a "first run" of my network, but starting the training process seemed **very slow** to me, and then I got all the validation error values almost instantly => I should be careful about this performance question in the future.

The "slow display" I mentioned above might actually be due to an inappropriate flushing of the stdout/stderr pipes, as I tweaked those in my current system as far as I remember. To be investigated.

Now that we have some run statistics, we should be able to start **tensorboard** to display them. The **tensorboard** executable is available on Windows in the python 3 **bin/Scripts/** folder. Given my very specific python setup, I had to create an additional helper script to start this app:

<code bash>
nv_call_tensorboard()
{
    local pname="$(uname -s)"
    case "${pname}" in
        CYGWIN*)
            local PREVPATH="$PATH"
            local pdir="`nv_get_project_dir`/tools/windows/$__nv_tool_python3/bin"
            export PATH="$pdir:$pdir/Scripts:$PATH"
            $pdir/Scripts/tensorboard.exe "$@"
            export PATH="$PREVPATH"
            ;;
        *)
            local PREVPATH="$PATH"
            local pdir="`nv_get_project_dir`/tools/linux/$__nv_tool_python3/bin"
            export PATH="$pdir:$pdir/Scripts:$PATH"
            $pdir/tensorboard "$@"
            export PATH="$PREVPATH"
            ;;
    esac
}
</code>

So I start tensorboard using our log folder:

<code>
ultim@saturn /cygdrive/d/Projects/NervSeed/python/apps/deep_learning/mnist_logistic
$ nv_call_tensorboard --logdir=logistic_logs
TensorBoard 1.12.1 at http://saturn:6006 (Press CTRL+C to quit)
</code>

Then we can navigate to the page http://localhost:6006, and there we have it! My first tensorboard graph display :-):

{{ blog:2018:1230:first_tensorboard.jpg?800 }}

=> So let's call that a successful experiment and stop here for this post! Next time, we will try to move to the [[blog:2018:1231_mnist_multilayer|multilayer perceptron implementation]] to improve our current accuracy.