一、Multiple linear regression:
In the study of real-world problems, the change in a dependent variable is often influenced by several important factors. In that case we need two or more influencing factors as independent variables to explain the change in the dependent variable; this is called multiple regression (also written as multivariable regression). When the relationship between the independent variables and the dependent variable is linear, the analysis is called multiple linear regression.
Let's first review the univariate linear regression equation:

f_{w,b}(x) = wx + b

For the multiple linear regression equation, we only need to change w and x in it:

f_{\vec{w},b}(\vec{x}) = \vec{w} \cdot \vec{x} + b

where each of \vec{w} and \vec{x} is a vector, and b is a scalar.
二、A deeper understanding:
Within this function, \vec{w} = (w_1, w_2, \ldots, w_n) and \vec{x} = (x_1, x_2, \ldots, x_n), so the standard expanded form is:

f_{\vec{w},b}(\vec{x}) = w_1 x_1 + w_2 x_2 + \cdots + w_n x_n + b

That is, each factor x_j affects the dependent variable through its own weight w_j.
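For instance, with n = 3 and made-up values \vec{w} = (1, 2, 3), \vec{x} = (10, 20, 30), and b = 4 (numbers chosen only for illustration), the expansion gives:

f_{\vec{w},b}(\vec{x}) = 1 \cdot 10 + 2 \cdot 20 + 3 \cdot 30 + 4 = 10 + 40 + 90 + 4 = 144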
We can observe this through an example table:

| x_1 | x_2 | x_3 | y |
| 12 | 31 | 33 | 1222 |
| 16 | 23 | 67 | 3215 |

Each row corresponds to one sample; each x column represents an influencing factor, and the last column holds the corresponding dependent variable y.
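As a minimal sketch of how such a table can be handled with NumPy (the weight and bias values below are made up for illustration, and the last column is assumed to be the target y):

import numpy as np

# each row is one sample: three features followed by the target y
data = np.array([[12, 31, 33, 1222],
                 [16, 23, 67, 3215]], dtype=float)

X = data[:, :3]    # feature matrix, shape (2, 3)
y = data[:, 3]     # dependent variable, shape (2,)

# made-up parameters, only to show the prediction formula
w = np.array([1.0, 2.5, -3.3])
b = 4.0

predictions = X @ w + b    # computes w·x + b for every row at once
print(predictions)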
三、Vectorization:
For this function:

f_{\vec{w},b}(\vec{x}) = w_1 x_1 + w_2 x_2 + \cdots + w_n x_n + b

we transformed it into the vector form:

f_{\vec{w},b}(\vec{x}) = \vec{w} \cdot \vec{x} + b
So how should we implement this in code, and what are the differences between the options?
First, when n is small, we can simply write every term out by hand:
import numpy as np

w = np.array([1.0, 2.5, -3.3])   # one weight per feature
b = 4
x = np.array([10, 20, 30])       # one value per feature

f = w[0]*x[0] + w[1]*x[1] + w[2]*x[2] + b
Next, we can use a for loop:
f = 0
n = w.shape[0]          # number of features
for j in range(n):
    f = f + w[j] * x[j]
f = f + b
Finally, we can use NumPy's dot product:
f = np.dot(w,x) + b
Of these three approaches, the last one is strongly recommended: np.dot can take advantage of the hardware's parallel (vectorized) arithmetic, so when n is large the difference in computation speed becomes very significant.
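To see the gap for yourself, here is a rough timing sketch (the vector length of 1,000,000 and the use of time.perf_counter are illustrative choices, not from the original article):

import numpy as np
import time

n = 1_000_000
w = np.random.rand(n)
x = np.random.rand(n)
b = 4.0

# plain Python loop
start = time.perf_counter()
f_loop = 0.0
for j in range(n):
    f_loop = f_loop + w[j] * x[j]
f_loop = f_loop + b
loop_time = time.perf_counter() - start

# vectorized dot product
start = time.perf_counter()
f_dot = np.dot(w, x) + b
dot_time = time.perf_counter() - start

print(f"loop: {loop_time:.4f} s, np.dot: {dot_time:.6f} s")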
四、Gradient descent in multiple linear regression:
The gradient descent update in multiple linear regression is:

repeat until convergence {
    w_j := w_j - \alpha \frac{\partial J(\vec{w},b)}{\partial w_j} \quad (j = 1, \ldots, n)
    b := b - \alpha \frac{\partial J(\vec{w},b)}{\partial b}
}

If we substitute the cost function J(\vec{w},b) = \frac{1}{2m} \sum_{i=1}^{m} \left( f_{\vec{w},b}(\vec{x}^{(i)}) - y^{(i)} \right)^2, the update for each w_j becomes:

w_j := w_j - \alpha \frac{1}{m} \sum_{i=1}^{m} \left( f_{\vec{w},b}(\vec{x}^{(i)}) - y^{(i)} \right) x_j^{(i)}

and the update for b becomes:

b := b - \alpha \frac{1}{m} \sum_{i=1}^{m} \left( f_{\vec{w},b}(\vec{x}^{(i)}) - y^{(i)} \right)
The points to pay attention to and the way to understand it are the same as for the univariate linear regression equation (for example, choosing a suitable learning rate \alpha and updating all parameters simultaneously).
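A minimal NumPy sketch of these update rules, assuming a feature matrix X of shape (m, n) and a target vector y; the learning rate, iteration count, and example data are made-up illustrative choices:

import numpy as np

def gradient_descent(X, y, alpha=1.0e-4, num_iters=10000):
    # batch gradient descent for multiple linear regression
    m, n = X.shape
    w = np.zeros(n)
    b = 0.0
    for _ in range(num_iters):
        err = X @ w + b - y        # f_wb(x^(i)) - y^(i) for every sample
        dj_dw = (X.T @ err) / m    # partial derivatives with respect to each w_j
        dj_db = err.mean()         # partial derivative with respect to b
        w = w - alpha * dj_dw      # simultaneous update of all parameters
        b = b - alpha * dj_db
    return w, b

# example usage with the tiny table above
X = np.array([[12, 31, 33],
              [16, 23, 67]], dtype=float)
y = np.array([1222, 3215], dtype=float)
w, b = gradient_descent(X, y)
print(w, b)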