How to calculate the forward propagation of shallow neural networks

Friends, if you repost this article, please credit the source public account: jack bed long

Newcomers, please open "History Articles" in the public account and read from the preface first; otherwise this article may be hard to follow.

Before getting to AI, as usual I'll use the first paragraph to talk about something else, something I want to share that may be useful to you in life. I mentioned Ukraine in a previous post and quite a few of you left comments saying you were interested, so I'll say a bit more about it. I went to Ukraine partly because I had read articles online claiming there are many beautiful women there and that it is easy for Chinese men to find girlfriends. The internet turned out to be only half right: there are indeed plenty of beautiful women, but it is not easy for Chinese visitors to date there. First, the locals are very pro-American, with American flags flying in their bars; at the same time they mostly don't speak English (Russian is their second most spoken language). I think Ukraine is a country that has lost its national pride: many of its pretty girls, including college students, go into prostitution, and the government turns a blind eye to it. It feels as though Ukraine has become America's red-light district, with many retired American men going there to buy the services of Ukrainian college students. Well, that's it on Ukraine for today; if you're still interested, drop me a comment and I'll talk more about it in a later post.

Previously we learned how to compute a single-neuron network and walked step by step through our first AI program. So how do we compute a multi-neuron network? This article shows how to calculate the forward propagation of a shallow (two-layer) neural network.

As shown below, we can first compute a for each neuron in the first layer separately. In the formulas, the superscript indicates the layer and the subscript indicates which neuron of that layer.

The first layer has four neurons, so we need to compute this four times. But what if there were a hundred or even a thousand neurons? Computing them one by one is too inefficient. As you saw in my previous post, we can vectorize the computation.
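A sketch of this neuron-by-neuron computation (the sizes and variable names here are my own illustration, not from the original figure):

```python
import numpy as np

def sigmoid(z):
    # logistic activation, as in the earlier single-neuron articles
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
x = rng.normal(size=3)            # 3 input features
W1 = rng.normal(size=(4, 3))      # row i holds neuron i's 3 weights
b1 = np.zeros(4)                  # one bias per neuron

# compute a for each of the 4 first-layer neurons, one at a time
a1 = np.zeros(4)
for i in range(4):
    z_i = W1[i] @ x + b1[i]       # z_i^[1] = w_i^[1]T x + b_i^[1]
    a1[i] = sigmoid(z_i)
```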

To vectorize, the key is to stack the 4 weight row vectors of the first layer into a matrix, as shown below. w1[1]T is a row vector holding the first layer's first neuron's 3 weights for the 3 inputs x, and w2[1]T holds the 3 weights of the first layer's second neuron. Four such row vectors stacked together form a 4*3 matrix.

From what we learned earlier about matrix multiplication, each weight row vector is multiplied by the feature column vector x, as shown below.

So the 4 sets of equations above are finally vectorized into the set below, which computes a for all neurons in the first layer at once (no matter how many neurons there are). Here W[1] is the 4*3 matrix formed by stacking the 4 weight row vectors, and b[1] is a column vector containing the 4 biases of the 4 neurons.
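In NumPy this vectorized layer-1 computation becomes a single matrix product (shapes and names are assumptions for illustration):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(1)
x = rng.normal(size=(3, 1))       # feature column vector
W1 = rng.normal(size=(4, 3))      # the 4 weight row vectors stacked into a 4x3 matrix
b1 = np.zeros((4, 1))             # column vector of the 4 biases

z1 = W1 @ x + b1                  # z^[1] = W^[1] x + b^[1]
a1 = sigmoid(z1)                  # a^[1]: a for all 4 neurons at once
```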

Similarly, the following set of equations gives a[2] for the second layer. It is actually the same as the equations above, except that x becomes a[1].
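Putting both layers together for one sample (again a sketch with assumed sizes: 3 inputs, 4 hidden neurons, 1 output neuron):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(2)
x = rng.normal(size=(3, 1))                           # one sample's features
W1, b1 = rng.normal(size=(4, 3)), np.zeros((4, 1))    # layer 1: 4 neurons
W2, b2 = rng.normal(size=(1, 4)), np.zeros((1, 1))    # layer 2: 1 output neuron

z1 = W1 @ x + b1
a1 = sigmoid(z1)          # a^[1], shape (4, 1)
z2 = W2 @ a1 + b2         # same form as layer 1, with x replaced by a^[1]
a2 = sigmoid(z2)          # a^[2], the network's output, shape (1, 1)
```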

The equations above compute a single training sample, but training a neural network usually requires a very large number of samples, and the following code computes multiple training samples. Here m denotes the number of training samples, and the for loop traverses each one.
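A sketch of such a per-sample loop (the variable names and sizes are my own illustration):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(3)
m = 5                                  # number of training samples
X = rng.normal(size=(3, m))            # one feature column vector per sample
W1, b1 = rng.normal(size=(4, 3)), np.zeros((4, 1))
W2, b2 = rng.normal(size=(1, 4)), np.zeros((1, 1))

A2 = np.zeros((1, m))
for i in range(m):                     # traverse the m samples one by one
    x_i = X[:, i:i + 1]                # the i-th sample's feature column vector
    a1 = sigmoid(W1 @ x_i + b1)        # layer 1 for this sample
    A2[:, i:i + 1] = sigmoid(W2 @ a1 + b2)  # layer 2 output for this sample
```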

The code above is correct, but too inefficient. We should vectorize it too and remove the for loop. The key is to combine the feature column vectors x of all the samples into a matrix, as shown in the figure below.

In this way, the for-loop code above can be vectorized into the following form.
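The loop-free version is then four lines of matrix arithmetic (a sketch with assumed names; b1 and b2 are added to every column by NumPy broadcasting):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(4)
m = 5
X = rng.normal(size=(3, m))            # the m feature column vectors side by side
W1, b1 = rng.normal(size=(4, 3)), np.zeros((4, 1))
W2, b2 = rng.normal(size=(1, 4)), np.zeros((1, 1))

Z1 = W1 @ X + b1                       # b1 is broadcast across all m columns
A1 = sigmoid(Z1)                       # (4, m): layer-1 a for every sample at once
Z2 = W2 @ A1 + b2
A2 = sigmoid(Z2)                       # (1, m): one output per sample
```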

From the vectorization knowledge presented in the previous article, it follows that the computed results Z and A are also matrices. (If this is unclear, please review my previous post.)

Each column vector in the matrix corresponds to one sample. For example, z[1](1) is the first layer's z for the first sample, and a[1](2) is the first layer's a for the second sample.
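We can check this correspondence directly: column i of Z1 is exactly the z we would get by running sample i through layer 1 on its own (a small illustration with assumed names):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(5)
X = rng.normal(size=(3, 4))            # 4 samples
W1, b1 = rng.normal(size=(4, 3)), np.zeros((4, 1))

Z1 = W1 @ X + b1
A1 = sigmoid(Z1)

z1_sample1 = W1 @ X[:, 0:1] + b1           # z^[1](1): layer-1 z of the first sample
a1_sample2 = sigmoid(W1 @ X[:, 1:2] + b1)  # a^[1](2): layer-1 a of the second sample

assert np.allclose(Z1[:, 0:1], z1_sample1)
assert np.allclose(A1[:, 1:2], a1_sample2)
```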

After getting the last layer's A (that is, A[2]), we can compute the cost with the following equation. It is the same as for the earlier single-neuron network.

J = -np.sum(Y * np.log(A2) + (1 - Y) * np.log(1 - A2)) / m   # A2 is the last layer's A, i.e. A^[2]
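A runnable end-to-end sketch, forward pass plus this cross-entropy cost, on random data (all names and sizes are assumptions for illustration):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(6)
m = 8
X = rng.normal(size=(3, m))            # m samples, 3 features each
Y = rng.integers(0, 2, size=(1, m))    # 0/1 labels
W1, b1 = rng.normal(size=(4, 3)), np.zeros((4, 1))
W2, b2 = rng.normal(size=(1, 4)), np.zeros((1, 1))

A1 = sigmoid(W1 @ X + b1)
A2 = sigmoid(W2 @ A1 + b2)             # the last layer's A

# same cross-entropy cost as in the single-neuron network
J = -np.sum(Y * np.log(A2) + (1 - Y) * np.log(1 - A2)) / m
```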

With the forward propagation of multi-neuron networks taken care of, the next article will show you how to calculate the backward propagation of shallow neural networks.

Please add me on WeChat. I will use it to publish the answers to the test questions and other announcements, to answer common questions in one place, and to recruit people to work on projects together. Please mention "artificial intelligence" when you add me.