9 - Deep Learning of Neural Network Core Principles and Algorithms - Fully Connected Network GPU Implementation

Fully connected network GPU implementation

  • Adding Dropout
  • Encapsulating the fully connected layer

Previously our fully connected network could only run on the CPU; here we retrofit it into a GPU implementation.

We encapsulate the fully connected layer in a class.

Because we are going to use Theano, importing it is the first step. Doing so printed these warnings:

WARNING (theano.configdefaults): g++ not available, if using conda: `conda install m2w64-toolchain`
D:\softEnvDown\Anaconda2\envs\py3\lib\site-packages\theano\configdefaults.py:560: UserWarning: DeprecationWarning: there is no c++ compiler.This is deprecated and with Theano 0.11 a c++ compiler will be mandatory
  warnings.warn("DeprecationWarning: there is no c++ compiler."
WARNING (theano.configdefaults): g++ not detected ! Theano will be unable to execute optimized C-implementations (for both CPU and GPU) and will default to Python implementations. Performance will be severely degraded. To remove this warning, set Theano flags cxx to an empty string.
WARNING (theano.tensor.blas): Using NumPy C-API based implementation for BLAS functions.

Based on the reported warning, we install the missing toolchain: conda install -c msys2 m2w64-toolchain

Since we are moving the computation to the GPU, we import Theano:

import numpy as np
import theano
import theano.tensor as T
from theano.tensor import shared_randomstreams
from theano.tensor.nnet import sigmoid

theano.tensor provides many math-related functions. shared_randomstreams helps us draw random values. We also use the sigmoid activation function that Theano provides.

# Whether to use the GPU
GPU = True
if GPU:
    print("Trying to run under a GPU.  If this is not desired, then modify "
          "network3.py\nto set the GPU flag to False.")
    try:
        theano.config.device = 'gpu'
    except Exception:
        pass  # it's already set
    theano.config.floatX = 'float32'
else:
    print("Running with a CPU.  If this is not desired, then modify "
          "network3.py to set\nthe GPU flag to True.")

If the GPU is used, two Theano configuration values are set: theano.config.device and theano.config.floatX = 'float32'.

Running this reported an error:

You can find the C code in this temporary file: C:\Users\mtian\AppData\Local\Temp\theano_compilation_error__dohjzex
Traceback (most recent call last):
  File "D:\softEnvDown\Anaconda2\envs\py3\lib\site-packages\theano\gof\lazylinker_c.py", line 75, in <module>
    raise ImportError()


To resolve it, we install mingw and libpython:

conda install mingw libpython

The fully connected layer implementation is encapsulated in a class:

class FullyConnectedLayer(object):
    def __init__(self, n_in, n_out, activation_fn=sigmoid, p_dropout=0.0):
        # Number of neurons in the previous layer
        self.n_in = n_in
        # Number of neurons in this layer
        self.n_out = n_out
        # Activation function
        self.activation_fn = activation_fn
        # Dropout probability
        self.p_dropout = p_dropout
        # Initialize the weights
        self.w = theano.shared(
            np.asarray(
                np.random.normal(
                    loc=0.0, scale=np.sqrt(1.0 / n_out), size=(n_in, n_out)),
                dtype=theano.config.floatX),
            name='w', borrow=True)
        # Initialize the biases
        self.b = theano.shared(
            np.asarray(np.random.normal(loc=0.0, scale=1.0, size=(n_out,)),
                       dtype=theano.config.floatX),
            name='b', borrow=True)
        self.params = [self.w, self.b]

    def set_inpt(self, inpt, inpt_dropout, mini_batch_size):
        # Reshape the input to (mini_batch_size, n_in)
        self.inpt = inpt.reshape((mini_batch_size, self.n_in))
        # Output without dropout; the weighted input is scaled by the kept fraction
        self.output = self.activation_fn(
            (1 - self.p_dropout) * T.dot(self.inpt, self.w) + self.b)
        # Predicted class: index of the largest output
        self.y_out = T.argmax(self.output, axis=1)
        # Reshape the dropout input and apply dropout to it
        self.inpt_dropout = dropout_layer(
            inpt_dropout.reshape((mini_batch_size, self.n_in)), self.p_dropout)
        # Output with dropout
        self.output_dropout = self.activation_fn(
            T.dot(self.inpt_dropout, self.w) + self.b)

    def accuracy(self, y):
        # Fraction of predictions that match the true labels y
        return T.mean(T.eq(y, self.y_out))

n_in is the number of neurons in the previous layer, n_out is the number of neurons in this layer, activation_fn is the activation function (sigmoid by default), and p_dropout is the fraction of neurons to drop.

The constructor saves the incoming parameters as instance attributes.

Initialize the weights self.w

theano.shared places our weight values in a shared variable, which allows the computation to run on the GPU. The weights are drawn from a normal distribution with mean 0 and standard deviation sqrt(1 / n_out), where n_out is the number of neurons in this layer. There are n_in * n_out weights in total, one for each pair of a previous-layer neuron and a neuron in this layer, hence the shape (n_in, n_out). The array's data type is theano.config.floatX, which makes it easy to run on the GPU. The variable is given the name w. Setting borrow=True lets Theano use the NumPy array's memory directly instead of making a copy.

Initialize the bias self.b

The array is initialized from a normal distribution with mean 0 and standard deviation 1. Its size is n_out: there is one bias per neuron in this layer. The data type is again theano.config.floatX, which makes it easy to run on the GPU.

self.params puts the w and b just initialized inside a list.
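
The initialization just described can be sketched in plain NumPy, without Theano. The layer sizes 784 and 100, the fixed seed, and the float32 dtype are assumptions chosen for illustration; in the Theano version the dtype comes from theano.config.floatX.

```python
import numpy as np

# Hypothetical layer sizes, chosen only for illustration.
n_in, n_out = 784, 100
rng = np.random.RandomState(0)  # fixed seed so the sketch is repeatable

# Weights: mean 0, standard deviation sqrt(1 / n_out), shape (n_in, n_out).
w = np.asarray(
    rng.normal(loc=0.0, scale=np.sqrt(1.0 / n_out), size=(n_in, n_out)),
    dtype=np.float32)

# Biases: mean 0, standard deviation 1, one per neuron in this layer.
b = np.asarray(rng.normal(loc=0.0, scale=1.0, size=(n_out,)), dtype=np.float32)

params = [w, b]
print(w.shape, b.shape, w.dtype)
```

Scaling the weight spread by sqrt(1 / n_out) keeps the initial weights small as the layer grows, while the biases keep a fixed spread.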

The set_inpt function

This method sets up the values the layer needs.

Parameters: the output of the previous layer (inpt) and the output of the previous layer after dropout (inpt_dropout), which together serve as the input of this layer, plus the mini-batch size.

First inpt is reshaped: there are mini_batch_size rows of data, each with n_in columns.

The output of this fully connected layer is the linear part wx + b passed through the activation function.

Dropout is taken into account here: the weighted input is multiplied by (1 - p_dropout), the fraction of neurons that are kept.

y_out takes the index of the largest output. If we are still recognizing handwritten digits, the output has 10 classes, 0-9, and the index of the largest value is the most likely class.
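
The non-dropout path of set_inpt can be mirrored in NumPy as a rough sketch. The helper name forward and the tiny hand-picked weights are hypothetical; only the scaling by (1 - p_dropout) and the argmax over axis 1 follow the layer code.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(inpt, w, b, p_dropout=0.0):
    """Output without a dropout mask: the weighted input is scaled by (1 - p_dropout)."""
    output = sigmoid((1 - p_dropout) * np.dot(inpt, w) + b)
    y_out = np.argmax(output, axis=1)  # index of the largest output in each row
    return output, y_out

# Hypothetical mini-batch: 2 examples, 3 inputs, 4 output classes.
inpt = np.array([[1.0, 0.0, 0.0],
                 [0.0, 0.0, 1.0]])
w = np.array([[0.0, 2.0, 0.0, 0.0],
              [0.0, 0.0, 0.0, 0.0],
              [0.0, 0.0, 0.0, 3.0]])
b = np.zeros(4)

output, y_out = forward(inpt, w, b, p_dropout=0.5)
print(output.shape, y_out)
```

Each row of output holds one example's activations, so taking argmax along axis 1 yields one predicted class per example.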

The dropout input of the fully connected layer is also reshaped and then passed through dropout_layer.

The dropout_layer function is implemented as follows:

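def dropout_layer(layer, p_dropout):
    # Random stream seeded as in network3.py
    srng = shared_randomstreams.RandomStreams(
        np.random.RandomState(0).randint(999999))
    # Each entry of the mask is 1 with probability 1 - p_dropout, otherwise 0
    mask = srng.binomial(n=1, p=1 - p_dropout, size=layer.shape)
    return layer * T.cast(mask, theano.config.floatX)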

The mask decides how much of the data is kept: each entry survives with probability 1 - p_dropout.

Compute the output with dropout as well

Take the inner product of the dropout input with w directly and add the bias.

This gives both the plain output of this layer and its output after dropout.
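
The masking step can be tried out in NumPy as a minimal sketch. The function name dropout_layer_np, the seed, and the all-ones input are assumptions for illustration; the binomial mask with p = 1 - p_dropout mirrors the Theano function.

```python
import numpy as np

def dropout_layer_np(layer, p_dropout, rng):
    """Zero each entry with probability p_dropout; keep it unchanged otherwise."""
    mask = rng.binomial(n=1, p=1 - p_dropout, size=layer.shape)
    return layer * mask.astype(layer.dtype)

rng = np.random.RandomState(0)
layer = np.ones((1000, 50), dtype=np.float32)
dropped = dropout_layer_np(layer, p_dropout=0.2, rng=rng)

# Entries are 0 or 1 here, so the mean is the fraction that was kept.
kept_fraction = float(dropped.mean())
print("fraction kept: {0:.3f}".format(kept_fraction))
```

On average a fraction 1 - p_dropout of the entries survives, which is why the non-dropout output scales the weighted input by the same factor.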

Define a function that calculates the accuracy

    def accuracy(self, y):
        return T.mean(T.eq(y, self.y_out))

The true labels passed in are compared with the predicted values, and the mean of the matches is the accuracy.
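
The same calculation in NumPy, as a small sketch with made-up labels and predictions:

```python
import numpy as np

def accuracy(y, y_out):
    """Fraction of predictions that equal the true labels."""
    return np.mean(np.equal(y, y_out))

y = np.array([3, 1, 4, 1, 5])      # hypothetical true labels
y_out = np.array([3, 1, 4, 0, 5])  # hypothetical predictions, one of them wrong
print(accuracy(y, y_out))
```

The comparison gives one True or False per example, and averaging those values gives the share of correct predictions.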

Modifications to our Network class.

class Network(object):
    def __init__(self, layers, mini_batch_size):
        self.layers = layers
        self.mini_batch_size = mini_batch_size
        # Put the parameters of each layer into a list
        self.params = [param for layer in self.layers for param in layer.params]
        # Initialize x, y
        self.x = T.matrix("x")
        self.y = T.ivector("y")
        # Initialize the first layer
        init_layer = self.layers[0]
        init_layer.set_inpt(self.x, self.x, self.mini_batch_size)
        # Initialize each subsequent layer
        for j in range(1, len(self.layers)):
            prev_layer, layer = self.layers[j - 1], self.layers[j]
            layer.set_inpt(
                prev_layer.output, prev_layer.output_dropout, self.mini_batch_size)
        # Output of the final layer
        self.output = self.layers[-1].output
        self.output_dropout = self.layers[-1].output_dropout

The constructor takes the list of layers and the mini-batch size, and puts the parameters of every layer into one list.

The list comprehension first iterates over the layers, then over the parameters inside each layer.
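
The order of the two loops in that comprehension can be checked with a small stand-in. FakeLayer is a hypothetical class whose params hold names instead of Theano shared variables.

```python
class FakeLayer(object):
    """Hypothetical stand-in for a layer; params holds names instead of tensors."""
    def __init__(self, name):
        self.params = [name + ".w", name + ".b"]

layers = [FakeLayer("layer0"), FakeLayer("layer1")]

# Outer loop over the layers, inner loop over each layer's params.
params = [param for layer in layers for param in layer.params]
print(params)
```

The result is a flat list ordered layer by layer, with w before b within each layer.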

x and y are symbolic variables for the training data and its corresponding labels. y is a vector of integer labels 0-9, and x is a matrix holding the 28 x 28 images.

Set the input of the first layer. Since this is the first layer, the plain input and the dropout input are both the original image at this point, so the first and second arguments are the same.

Then initialize each subsequent layer, starting from the second layer and working through to the last.

In each iteration, take the previous layer and the current layer, and set the inputs of the current layer: the output of the previous layer and the dropout output of the previous layer.

Finally we keep the output of the last layer and its output after dropout. Index -1 selects the last layer.
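
The wiring of the layers can be sketched with NumPy arrays in place of symbolic variables. TinyLayer is a hypothetical stand-in for FullyConnectedLayer with p_dropout fixed at 0 (so no mask is applied), and the layer sizes and random input are made up.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

class TinyLayer(object):
    """Hypothetical NumPy stand-in for FullyConnectedLayer, with p_dropout = 0."""
    def __init__(self, n_in, n_out, rng):
        self.n_in, self.n_out = n_in, n_out
        self.w = rng.normal(0.0, np.sqrt(1.0 / n_out), size=(n_in, n_out))
        self.b = rng.normal(0.0, 1.0, size=(n_out,))

    def set_inpt(self, inpt, inpt_dropout, mini_batch_size):
        self.inpt = inpt.reshape((mini_batch_size, self.n_in))
        self.output = sigmoid(np.dot(self.inpt, self.w) + self.b)
        # With p_dropout = 0 the dropout path computes the same thing.
        self.inpt_dropout = inpt_dropout.reshape((mini_batch_size, self.n_in))
        self.output_dropout = sigmoid(np.dot(self.inpt_dropout, self.w) + self.b)

rng = np.random.RandomState(0)
mini_batch_size = 10
layers = [TinyLayer(784, 30, rng), TinyLayer(30, 10, rng)]
x = rng.rand(mini_batch_size, 784)

layers[0].set_inpt(x, x, mini_batch_size)  # first layer: both inputs are x
for j in range(1, len(layers)):
    prev_layer, layer = layers[j - 1], layers[j]
    layer.set_inpt(prev_layer.output, prev_layer.output_dropout, mini_batch_size)

output = layers[-1].output
print(output.shape)
```

Every layer receives two inputs, so the plain path and the dropout path run through the whole network side by side.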

The gradient descent function

    def SGD(self, training_data, epochs, mini_batch_size, eta,
            validation_data, test_data, lmbda=0.0):
        training_x, training_y = training_data
        validation_x, validation_y = validation_data
        test_x, test_y = test_data

        # size(data) is assumed to be a helper returning the number of examples;
        # floor division keeps the batch counts integers so range() accepts them
        num_training_batches = size(training_data) // mini_batch_size
        num_validation_batches = size(validation_data) // mini_batch_size
        num_test_batches = size(test_data) // mini_batch_size

eta is the learning rate. The validation dataset helps us tune the network during training, showing which hyperparameter choices are problematic.

The test dataset lets us check how good the model we end up with is. lmbda is the parameter used for regularization.

Unpack the x and y of the training, validation, and test datasets.

We compute how many mini-batches the training, validation, and test data are each divided into.
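
In Python 3, / between two integers returns a float, which range() does not accept, so a whole-number batch count needs floor division. A minimal sketch with made-up dataset sizes (the helper name num_batches is hypothetical):

```python
def num_batches(num_examples, mini_batch_size):
    """Number of full mini-batches; a trailing partial batch is left out."""
    return num_examples // mini_batch_size

# Hypothetical dataset sizes, chosen only for illustration.
num_training_batches = num_batches(50000, 10)
num_validation_batches = num_batches(10000, 10)
print(num_training_batches, num_validation_batches)

# With 25 examples and batches of 10, the last 5 examples are left out.
print(list(range(num_batches(25, 10))))
```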

L2 regularization

The regularized cost is the original loss function plus one extra term after the plus sign: the L2 regularization term.

That term is built from the cumulative sum of the squares of w.

The cost is therefore the original C0 plus our L2 regularization term.
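
As a rough sketch, the regularized cost can be written as C = C0 + (lmbda / 2n) * (sum of all squared weights). The scaling by n is an assumption taken from the usual textbook form of L2 regularization, and the helper name and all numbers below are made up for illustration.

```python
import numpy as np

def l2_regularized_cost(c0, weights, lmbda, n):
    """C = C0 + (lmbda / (2 * n)) * sum of all squared weights."""
    l2_norm_squared = sum((w ** 2).sum() for w in weights)
    return c0 + 0.5 * lmbda * l2_norm_squared / n

# Hypothetical values: two small weight matrices and an unregularized cost of 0.3.
weights = [np.array([[1.0, -2.0], [0.0, 3.0]]), np.array([[2.0], [-1.0]])]
cost = l2_regularized_cost(c0=0.3, weights=weights, lmbda=0.1, n=5)
print(cost)
```

With lmbda = 0.0 the extra term vanishes and the cost reduces to C0, which matches the default lmbda=0.0 in the SGD signature. Only the weights enter the sum; the biases do not.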

Compute the partial derivatives.

Previously we implemented this by hand; now we directly call the grad method provided by Theano to compute the gradients.

Its arguments are the cost and the parameters w and b of every layer.

Update the parameters of each layer:

w - learning rate * partial derivative.
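
The update rule can be seen in isolation on a toy problem. This sketch writes the gradient by hand for a made-up quadratic cost rather than obtaining it from Theano, so cost, grad, and target are all hypothetical.

```python
import numpy as np

def cost(w, target):
    """Hypothetical toy cost: half the squared distance to a target."""
    return 0.5 * np.sum((w - target) ** 2)

def grad(w, target):
    """Partial derivatives of the toy cost with respect to w."""
    return w - target

eta = 0.1  # learning rate
target = np.array([1.0, -2.0])
w = np.zeros(2)

before = cost(w, target)
for _ in range(50):
    w = w - eta * grad(w, target)  # w -> w - eta * dC/dw
after = cost(w, target)
print(before, after)
```

Each step moves w a fraction eta of the way against the gradient, so the cost shrinks step by step. The same rule is applied to every w and b in the parameter list.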

Set up the update function

The index i is a scalar: the number of the mini-batch.

The first argument is i, the second is the cost, and the third is the parameter update defined above.

The input x is a slice of the training dataset selected by i: for example rows 0-60 for the first mini-batch, 60-120 for the second, and so on.

To measure accuracy on the validation dataset we call the last layer. If the last layer is a fully connected layer, this calls the accuracy function of the fully connected layer we just implemented, passing in the actual labels y that correspond to x.
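
Selecting mini-batch i amounts to slicing rows i * mini_batch_size up to (i + 1) * mini_batch_size. A NumPy sketch, assuming a mini-batch size of 60 and made-up data (the helper name mini_batch is hypothetical):

```python
import numpy as np

def mini_batch(data, i, mini_batch_size):
    """Rows i * mini_batch_size up to, but not including, (i + 1) * mini_batch_size."""
    return data[i * mini_batch_size:(i + 1) * mini_batch_size]

training_x = np.arange(180).reshape(180, 1)  # hypothetical data, one row per example
mini_batch_size = 60

first = mini_batch(training_x, 0, mini_batch_size)   # rows 0 to 59
second = mini_batch(training_x, 1, mini_batch_size)  # rows 60 to 119
print(first[0, 0], first[-1, 0], second[0, 0], second[-1, 0])
```

Because the end of a slice is exclusive, consecutive mini-batches neither overlap nor leave gaps.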

Organize the batches.

Start training.

        # Start training.
        best_validation_accuracy = 0.0
        for epoch in range(epochs):
            for minibatch_index in range(num_training_batches):
                iteration = num_training_batches * epoch + minibatch_index
                if iteration % 1000 == 0:
                    print("Training mini-batch number {0}".format(iteration))
                cost_ij = train_mb(minibatch_index)
                # At the end of each epoch, evaluate on the validation data
                if (iteration + 1) % num_training_batches == 0:
                    validation_accuracy = np.mean(
                        [validate_mb_accuracy(j) for j in range(num_validation_batches)])
                    print("Epoch {0}: validation accuracy {1:.2%}".format(
                        epoch, validation_accuracy))
                    if validation_accuracy >= best_validation_accuracy:
                        print("This is the best validation accuracy to date.")
                        best_validation_accuracy = validation_accuracy
                        best_iteration = iteration
                        if test_data:
                            test_accuracy = np.mean(
                                [test_mb_accuracy(j) for j in range(num_test_batches)])
                            print('The corresponding test accuracy is {0:.2%}'.format(
                                test_accuracy))
        print("Finished training network.")
        print("Best validation accuracy of {0:.2%} obtained at iteration {1}".format(
            best_validation_accuracy, best_iteration))
        print("Corresponding test accuracy of {0:.2%}".format(test_accuracy))

The outer loop runs over the training epochs and the inner loop over the mini-batches in each epoch, taking one mini-batch at a time for training.

Every 1000 iterations we print how many mini-batches have been trained so far. We then call our train_mb function to train on the current mini-batch.

After each epoch, we evaluate on the validation dataset and print this epoch's validation accuracy.

If the validation accuracy is at least as high as the previous best, we update the best validation accuracy.

We also save the iteration at which it was reached.
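
The bookkeeping for the best validation accuracy can be isolated from the training itself. In this sketch the per-epoch accuracies and the number of mini-batches per epoch are made-up values standing in for real training.

```python
# Hypothetical validation accuracies, one per epoch, standing in for real training.
validation_accuracies = [0.91, 0.94, 0.93, 0.95, 0.95]
num_training_batches = 5000  # hypothetical number of mini-batches per epoch

best_validation_accuracy = 0.0
best_iteration = None
for epoch, validation_accuracy in enumerate(validation_accuracies):
    # Iteration index of the last mini-batch in this epoch.
    iteration = num_training_batches * (epoch + 1) - 1
    if validation_accuracy >= best_validation_accuracy:
        best_validation_accuracy = validation_accuracy
        best_iteration = iteration

print("Best validation accuracy of {0:.2%} obtained at iteration {1}".format(
    best_validation_accuracy, best_iteration))
```

Because the comparison uses >=, a later epoch that ties the best accuracy replaces the earlier one as the recorded best iteration.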
