TensorFlow models don't fit when defining forward pass in a function



submitted 2 years ago by napsternxg to /r/MachineLearning (self post)

This is a very naive question, but one I am struggling with. I am training a model on MNIST using TensorFlow.

What I am trying to do is wrap the forward pass in a function so that I don't have to write all the computation again for inference.

However, with the forward pass implemented in a function, the model doesn't learn: the validation accuracy stays stuck at about 9%, which is roughly chance level for ten classes. Without the forward-function approach, the model learns normally and I reach a validation accuracy of around 92%.

I tried playing with scoping (tf.name_scope), but it didn't help. What I believe the issue is: on each call of the forward function, the weights and biases get re-initialized, because those variables are defined inside the function. However, a similar approach works in the official example in the TensorFlow repo at:
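To illustrate what I suspect is happening (a toy sketch, not my actual code): every call to a function that builds its layers with tf.Variable registers a brand-new set of variables in the graph, so a graph built by a second call never sees the weights trained through the first:

import tensorflow as tf

def toy_forward(x):
    # tf.Variable creates a NEW variable on every call
    w = tf.Variable(tf.truncated_normal([2, 2]), name="weights")
    return tf.matmul(x, w)

g = tf.Graph()
with g.as_default():
    x = tf.placeholder(tf.float32, shape=(1, 2))
    y_train = toy_forward(x)  # creates 'weights:0'
    y_valid = toy_forward(x)  # creates 'weights_1:0', a separate, untrained variable
    print [v.name for v in tf.all_variables()]
    # -> [u'weights:0', u'weights_1:0']  (two independent variables)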

Can someone suggest what I am doing wrong? I am new to TensorFlow, so I find this a bit confusing.

import numpy as np
import tensorflow as tf

# image_size, num_labels, the train/valid/test datasets and labels, and
# accuracy() are assumed to be defined earlier (standard MNIST setup).

batch_size = 100
n_hidden = 1024
L2_weight = 0.5e-3

def forward(tf_X):
    # NOTE: tf.shape() returns a symbolic tensor; comparing it to an int
    # does not check the shape the way this assert intends.
    assert tf.shape(tf_X)[1] == image_size * image_size, \
        "Training data not of correct shape. Each input should be of shape: %s" % (image_size * image_size)
    with tf.name_scope("hidden1"):
        weights = tf.Variable(tf.truncated_normal([image_size * image_size, n_hidden]), name="weights")
        biases = tf.Variable(tf.zeros([n_hidden]), name="biases")
        z01 = tf.matmul(tf_X, weights) + biases
        hidden1 = tf.nn.relu(z01)
        l2_reg_01 = tf.nn.l2_loss(weights)
    with tf.name_scope("z12"):
        weights = tf.Variable(tf.truncated_normal([n_hidden, num_labels]), name="weights")
        biases = tf.Variable(tf.zeros([num_labels]), name="biases")
        z12 = tf.matmul(hidden1, weights) + biases
        l2_reg_12 = tf.nn.l2_loss(weights)
    return z12, l2_reg_01 + l2_reg_12

# Define loss
def get_loss(z12, l2_loss, tf_Y):
    assert tf.shape(z12)[1] == num_labels, \
        "Logits not of correct shape. got %s require %s" % (tf.shape(z12)[1], num_labels)
    assert tf.shape(tf_Y)[1] == num_labels, \
        "Training labels not of correct shape. got %s require %s" % (tf.shape(tf_Y)[1], num_labels)
    loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(z12, tf_Y))
    total_loss = loss + L2_weight * l2_loss
    return total_loss

# Define the network graph
graph = tf.Graph()
with graph.as_default():
    tf_training_dataset = tf.placeholder(tf.float32, shape=(batch_size, image_size * image_size))
    tf_training_labels = tf.placeholder(tf.float32, shape=(batch_size, num_labels))
    # The shape-less placeholders below override the fixed-shape ones above.
    tf_training_dataset = tf.placeholder(tf.float32)  # should have shape (batch_size, image_size*image_size)
    tf_training_labels = tf.placeholder(tf.float32)   # should have shape (batch_size, num_labels)
    tf_valid_dataset = tf.constant(valid_dataset)
    tf_test_dataset = tf.constant(test_dataset)

    # Define network parameters
    weights_01 = tf.Variable(tf.truncated_normal([image_size * image_size, n_hidden]))
    weights_12 = tf.Variable(tf.truncated_normal([n_hidden, num_labels]))
    biases_01 = tf.Variable(tf.zeros([n_hidden]))
    biases_12 = tf.Variable(tf.zeros([num_labels]))

    # Define network operations
    z01 = tf.matmul(tf_training_dataset, weights_01) + biases_01
    h1 = tf.nn.relu(z01)
    z_12 = tf.matmul(h1, weights_12) + biases_12

    # Optimize the loss
    loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(z_12, tf_training_labels))
    regularized_loss = tf.nn.l2_loss(weights_01) + tf.nn.l2_loss(weights_12)
    total_loss = loss + L2_weight * regularized_loss
    # The function-based version overrides the hand-rolled loss above:
    z12, l2_loss = forward(tf_training_dataset)
    total_loss = get_loss(z12, l2_loss, tf_training_labels)
    optimizer = tf.train.GradientDescentOptimizer(0.5).minimize(total_loss)

    # Predictions using the model:
    train_predictions = tf.nn.softmax(z_12)
    valid_scores = tf.matmul(tf.nn.relu(tf.matmul(tf_valid_dataset, weights_01) + biases_01), weights_12) + biases_12
    test_scores = tf.matmul(tf.nn.relu(tf.matmul(tf_test_dataset, weights_01) + biases_01), weights_12) + biases_12
    valid_predictions = tf.nn.softmax(valid_scores)
    test_predictions = tf.nn.softmax(test_scores)
    # The function-based versions override the predictions above; note that
    # each call to forward() builds a fresh set of weights and biases.
    train_predictions, _l2 = forward(tf_training_dataset)
    valid_predictions, _l2 = forward(tf_valid_dataset)
    test_predictions, _l2 = forward(tf_test_dataset)

# Train the model
num_steps = 3001
batch_size = 100
with tf.Session(graph=graph) as session:
    tf.initialize_all_variables().run()
    print "Initialized, using batch size: %s" % batch_size
    for step in xrange(num_steps):
        idx = np.random.randint(train_dataset.shape[0], size=batch_size)
        offset = (step * batch_size) % (train_labels.shape[0] - batch_size)
        # Generate a minibatch (the offset-based slices below override the
        # random idx-based ones above).
        batch_data = train_dataset[idx]
        batch_labels = train_labels[idx]
        batch_data = train_dataset[offset:(offset + batch_size), :]
        batch_labels = train_labels[offset:(offset + batch_size), :]
        feed_dict = {tf_training_dataset: batch_data,
                     tf_training_labels: batch_labels}
        _, l, predictions = session.run(
            [optimizer, total_loss, train_predictions], feed_dict=feed_dict)
        if step % 500 == 0:
            batch_size += 100
            print "Updated batch size: %s" % batch_size
            print "Minibatch loss at step", step, ":", l
            print "Minibatch accuracy: %.1f%%" % accuracy(predictions, batch_labels)
            print "Validation accuracy: %.1f%%" % accuracy(valid_predictions.eval(), valid_labels)
    print "Test accuracy: %.1f%%" % accuracy(test_predictions.eval(), test_labels)

Initialized, using batch size: 100
Updated batch size: 200
Minibatch loss at step 0 : 351.273
Minibatch accuracy: 5.0%
Validation accuracy: 8.7%
Updated batch size: 300
Minibatch loss at step 500 : 19.3078
Minibatch accuracy: 12.5%
Validation accuracy: 8.7%
Updated batch size: 400
Minibatch loss at step 1000 : 4.44503
Minibatch accuracy: 9.3%
Validation accuracy: 8.7%
Updated batch size: 500
Minibatch loss at step 1500 : 7.71385
Minibatch accuracy: 13.2%
Validation accuracy: 8.7%
Updated batch size: 600
Minibatch loss at step 2000 : 2.21557
Minibatch accuracy: 11.8%
Validation accuracy: 8.7%
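Edit: one difference I notice is that the official example seems to build the graph only once — its inference() function is called a single time, and the different datasets are all fed through the same placeholder — so defining the variables inside the function is harmless there. A sketch of that pattern, reusing the forward() above (names and shapes assumed):

# Build the graph ONCE; swap datasets via feed_dict instead of calling
# forward() again for validation/test.
x = tf.placeholder(tf.float32, shape=(None, image_size * image_size))
logits, l2 = forward(x)              # forward() is called exactly once
predictions = tf.nn.softmax(logits)

with tf.Session() as session:
    tf.initialize_all_variables().run()
    # ... training steps feed minibatches through x ...
    valid_pred = session.run(predictions, feed_dict={x: valid_dataset})
    test_pred = session.run(predictions, feed_dict={x: test_dataset})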


/r/MLQuestions is a better place for questions of this sort 🙂

[S] napsternxg (submitter), 2 years ago:

Thanks. I was able to solve the issue. Example code can be found at:
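The usual fix for this in TensorFlow is to create the weights with tf.get_variable inside a tf.variable_scope and set reuse on later calls, so every call to forward() shares one set of parameters. A minimal sketch of that pattern (not necessarily the code behind the link above):

def forward(tf_X, reuse=False):
    # With reuse=True, tf.get_variable returns the variables created on the
    # first call instead of creating new ones.
    with tf.variable_scope("hidden1", reuse=reuse):
        weights = tf.get_variable("weights", [image_size * image_size, n_hidden],
                                  initializer=tf.truncated_normal_initializer())
        biases = tf.get_variable("biases", [n_hidden],
                                 initializer=tf.constant_initializer(0.0))
        hidden1 = tf.nn.relu(tf.matmul(tf_X, weights) + biases)
    with tf.variable_scope("output", reuse=reuse):
        weights = tf.get_variable("weights", [n_hidden, num_labels],
                                  initializer=tf.truncated_normal_initializer())
        biases = tf.get_variable("biases", [num_labels],
                                 initializer=tf.constant_initializer(0.0))
        return tf.matmul(hidden1, weights) + biases

logits = forward(tf_training_dataset)                  # creates the variables
valid_logits = forward(tf_valid_dataset, reuse=True)   # shares them
test_logits = forward(tf_test_dataset, reuse=True)     # shares them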
