Machine Learning - Logistic Regression

news2025/1/4 5:51:36

Contents

1. Activation Function

Why introduce activation functions?

There are several commonly used activation functions:

2. Sigmoid:

3. Logistic Regression Model:

4. Implementation of logistic regression:

5. Decision Boundary:


1. Activation Function

        In logistic regression, the activation function converts the output of the linear model into a probability, namely the probability that a sample belongs to a certain category. Logistic regression is a binary classification algorithm: it computes a linear combination of the input features and passes the result through an activation function to obtain a value between 0 and 1.

Why introduce activation functions?

        ① Converting the output of a linear model into a probability: The goal of logistic regression is to predict the probability that a sample belongs to a certain category, whereas the output of a linear model is an unbounded real number. Applying the activation function maps that output into the interval (0, 1), so it can be read as a probability.
        ② Introducing non-linearity: The activation function makes the model's output a non-linear function of its input. Without it, the model is just linear regression, whose unbounded output cannot be interpreted as a class probability. Note that the decision boundary of plain logistic regression is still linear in the features; fitting a non-linear boundary requires non-linear features such as polynomial terms.
        ③ Supporting gradient computation: The derivative of the activation function is needed by gradient descent. Through it, the gradient of the loss with respect to the model parameters can be computed and the parameters updated.

There are several commonly used activation functions:

        Sigmoid, Tanh, ReLU (Rectified Linear Unit), Leaky ReLU, ELU (Exponential Linear Unit), and Softmax.
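For reference, the functions listed above can be sketched in a few lines of plain Python. The formulas are the standard textbook definitions; the default values alpha=0.01 for Leaky ReLU and alpha=1.0 for ELU are common conventions assumed here, not values taken from this article.

```python
import math

def sigmoid(z):
    # Maps any real number into (0, 1). Simple form; it can overflow
    # for very large negative z.
    return 1.0 / (1.0 + math.exp(-z))

def tanh(z):
    # Maps any real number into (-1, 1).
    return math.tanh(z)

def relu(z):
    # Passes positive inputs through and clips negatives to 0.
    return max(0.0, z)

def leaky_relu(z, alpha=0.01):
    # Like ReLU, but keeps a small slope (alpha, assumed) for negative inputs.
    return z if z > 0 else alpha * z

def elu(z, alpha=1.0):
    # Smooth alternative to ReLU for negative inputs (alpha is assumed).
    return z if z > 0 else alpha * (math.exp(z) - 1.0)

def softmax(zs):
    # Turns a list of scores into probabilities that sum to 1.
    # Subtracting the maximum avoids overflow in exp.
    m = max(zs)
    exps = [math.exp(z - m) for z in zs]
    total = sum(exps)
    return [e / total for e in exps]
```

Of these, only the sigmoid is used in the rest of this article; softmax plays the same role when there are more than two classes.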

2. Sigmoid:

        The sigmoid function is the activation function commonly used in logistic regression. The following characteristics make it suitable:
        ① Output in (0, 1): the sigmoid maps any real input to a value strictly between 0 and 1, so the result of the linear combination can be turned into the probability that the sample belongs to a certain category. This is exactly what the classification task requires.
        ② Differentiability: the sigmoid is differentiable over its entire domain, which is crucial for parameter updates with optimization algorithms such as gradient descent. Taking the derivative gives the gradient of the loss function with respect to the parameters, which is then used to update them.
        ③ Monotonicity: the sigmoid is monotonically increasing, so a larger input always gives a larger output. This is helpful for learning and optimizing the model.
        ④ Smoothness: the sigmoid is smooth over its entire domain, with no jumps or discontinuities. This helps gradient-based optimization remain stable and converge.

The complete formula is:

g(z) = \frac{1}{1 + e^{-z}}

[Figure: graph of the sigmoid function, an S-shaped curve that rises from 0 toward 1 and passes through g(0) = 0.5]
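The properties listed above can be checked numerically. The following is a minimal sketch in plain Python; the two-branch form is one common way to avoid overflow for large negative inputs, and the identity g'(z) = g(z)(1 - g(z)) is the standard derivative of the sigmoid, stated here as background rather than taken from this article.

```python
import math

def sigmoid(z):
    # Numerically stable sigmoid: never calls exp on a large positive number.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def sigmoid_derivative(z):
    # Standard identity: g'(z) = g(z) * (1 - g(z)).
    s = sigmoid(z)
    return s * (1.0 - s)

# Print a few values to see the S shape: outputs stay inside (0, 1),
# increase with z, and the slope is largest at z = 0.
for z in (-5.0, -1.0, 0.0, 1.0, 5.0):
    print(f"z = {z:+.1f}  g(z) = {sigmoid(z):.4f}  g'(z) = {sigmoid_derivative(z):.4f}")
```

The slope shrinks toward zero for inputs far from 0, which is why very large or very small values of z change the predicted probability only slightly.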

3. Logistic Regression Model:

        Logistic regression is a linear classifier (linear model) primarily used for binary classification problems. There are only two types of classification results: 1 and 0.

4. Implementation of logistic regression:

        Suppose we want to determine whether a tumor is malignant or benign based on its size, using the following dataset:

        Let 1 denote malignant and 0 denote benign. Using linear regression, we fit a straight line to the data and split the predictions at 0.5, the midpoint of the interval [0, 1] on the y-axis.

        We can then say that when the value of the fitted line z = \vec{w} \cdot \vec{x} + b is greater than 0.5 the tumor is malignant, and otherwise it is benign. This works when the dataset is relatively balanced. What if there is an outlier?
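The effect of an outlier can be sketched with a small experiment. The tumor sizes and labels below are made-up illustrative values, not the dataset from this article, and fit_line is a hypothetical helper that performs ordinary least squares for a single feature.

```python
def fit_line(xs, ys):
    # Ordinary least squares for one feature: returns slope w and intercept b.
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    w = sxy / sxx
    b = mean_y - w * mean_x
    return w, b

def threshold_at_half(w, b):
    # Size at which the fitted line crosses 0.5.
    return (0.5 - b) / w

def classify(size, w, b):
    # 1 = malignant, 0 = benign, using the 0.5 cutoff on the fitted line.
    return 1 if w * size + b > 0.5 else 0

# Hypothetical data: sizes 1-3 are benign (0), sizes 4-6 are malignant (1).
sizes = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
labels = [0, 0, 0, 1, 1, 1]

w_clean, b_clean = fit_line(sizes, labels)
threshold_clean = threshold_at_half(w_clean, b_clean)

# Add one very large malignant tumor as an outlier and refit.
w_out, b_out = fit_line(sizes + [20.0], labels + [1])
threshold_outlier = threshold_at_half(w_out, b_out)

print("threshold without outlier:", round(threshold_clean, 2))
print("threshold with outlier:   ", round(threshold_outlier, 2))
print("size 4 without outlier ->", classify(4.0, w_clean, b_clean))
print("size 4 with outlier    ->", classify(4.0, w_out, b_out))
```

With these values the cutoff moves from 3.5 to roughly 4.3, so the tumor of size 4, which is labeled malignant, ends up on the benign side once the outlier is included, even though the outlier itself is an easy, unambiguous case.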

        Obviously, the results have become less reasonable, so linear regression alone is not suitable for classification. At this point we need the activation function commonly used in logistic regression.
        First, we want to confine the output y to values between 0 and 1, so that it is easier to decide between 0 and 1 when making a discrete prediction. The sigmoid function meets this requirement: it is an S-shaped function that maps any real number into the interval (0, 1).
        Substituting the linear regression function into the sigmoid function gives the following equations:

z = \vec{w} \cdot \vec{x} + b

g(z) = \frac{1}{1 + e^{-z}}

f_{\vec{w},b}(\vec{x}) = g(\vec{w} \cdot \vec{x} + b) = \frac{1}{1 + e^{-(\vec{w} \cdot \vec{x} + b)}}
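The model equation translates almost directly into code. This is a minimal sketch in plain Python; the function name predict_proba and the parameter values w = [1.5], b = -6.0 are hypothetical, chosen only so that the output crosses 0.5 at a tumor size of 4.

```python
import math

def sigmoid(z):
    # Numerically stable form of g(z) = 1 / (1 + e^{-z}).
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def predict_proba(w, x, b):
    # f_{w,b}(x) = g(w . x + b): probability that the label is 1.
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return sigmoid(z)

# Hypothetical parameters for a single feature (tumor size).
w = [1.5]
b = -6.0

for size in (2.0, 4.0, 6.0):
    p = predict_proba(w, [size], b)
    print(f"size = {size:.1f}  P(malignant) = {p:.3f}")
```

Because x is passed as a list, the same function works unchanged for any number of features, as long as w has the same length.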

        Now we can make predictions using the above equation. Next, we look at what a decision boundary is.

5. Decision Boundary:

        The decision boundary of logistic regression is a hyperplane that divides the feature space into two regions, one for each category. In a binary classification problem with two features, the decision boundary is a straight line, or a curve if non-linear features are used, separating the positive class from the negative class. In multi-class problems, the decision boundary can be a combination of multiple hyperplanes.
        The position of the decision boundary depends on the parameters of the logistic regression model. The model learns these parameters from the relationship between features and labels in the training data, by maximizing the likelihood function or minimizing the loss function.
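As an illustration of how the parameters, and with them the decision boundary, can be learned, here is a minimal sketch of batch gradient descent on the log loss (the negative average log-likelihood) for a single feature. The article does not specify a training procedure, so the choice of loss, the learning rate, the step count, and the dataset are all assumptions made for this example.

```python
import math

def sigmoid(z):
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def log_loss(w, b, xs, ys):
    # Average negative log-likelihood; probabilities are clipped to avoid log(0).
    eps = 1e-12
    total = 0.0
    for x, y in zip(xs, ys):
        p = min(max(sigmoid(w * x + b), eps), 1.0 - eps)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(xs)

def train(xs, ys, lr=0.2, steps=20000):
    # Batch gradient descent; lr and steps are assumed values for this sketch.
    w, b = 0.0, 0.0
    m = len(xs)
    for _ in range(steps):
        errors = [sigmoid(w * x + b) - y for x, y in zip(xs, ys)]
        grad_w = sum(e * x for e, x in zip(errors, xs)) / m
        grad_b = sum(errors) / m
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# Hypothetical data: tumor sizes with labels 0 (benign) and 1 (malignant).
sizes = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
labels = [0, 0, 0, 1, 1, 1]

w_fit, b_fit = train(sizes, labels)
print("loss before training:", round(log_loss(0.0, 0.0, sizes, labels), 4))
print("loss after training: ", round(log_loss(w_fit, b_fit, sizes, labels), 4))
print("decision boundary at size:", round(-b_fit / w_fit, 2))
```

The learned boundary is the size at which w * x + b = 0, which for this dataset falls between the largest benign and the smallest malignant example.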
        After applying the sigmoid function, we can specify the output rule as follows:

f_{\vec{w},b}(\vec{x}) \geq \frac{1}{2} \Rightarrow y = 1

f_{\vec{w},b}(\vec{x}) < \frac{1}{2} \Rightarrow y = 0

        From this we can easily find the corresponding condition on the linear function z:

f \geq \frac{1}{2} \Rightarrow -z = -(\vec{w} \cdot \vec{x} + b) \leq 0 \Rightarrow z \geq 0

f < \frac{1}{2} \Rightarrow -z = -(\vec{w} \cdot \vec{x} + b) > 0 \Rightarrow z < 0

        The straight line on which z = \vec{w} \cdot \vec{x} + b = 0 is what we call the decision boundary.
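The rule can be sketched in code as follows. The parameters w = [1.0, 1.0] and b = -3.0 are hypothetical values for a model with two features, which puts the decision boundary on the line x1 + x2 = 3; the function names are likewise illustrative.

```python
import math

def sigmoid(z):
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def linear_score(w, x, b):
    # z = w . x + b
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def predict_label(w, x, b):
    # Predict 1 when z >= 0, which is the same as f(x) >= 0.5.
    return 1 if linear_score(w, x, b) >= 0 else 0

# Hypothetical parameters: the boundary is the line x1 + x2 = 3.
w = [1.0, 1.0]
b = -3.0

for point in ([1.0, 1.0], [1.0, 2.0], [2.0, 2.0]):
    z = linear_score(w, point, b)
    print(point, "z =", z, "f =", round(sigmoid(z), 3), "label =", predict_label(w, point, b))
```

Checking the sign of z gives the same label as comparing the sigmoid output with 0.5, so the exponential does not need to be evaluated when only the class is required.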

        Of course, not all decision boundaries are linear; there are also many non-linear decision boundaries. We can change the shape of the decision boundary by adding polynomial terms of the features to z.
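As a sketch of a non-linear boundary, the example below uses the squared features x1^2 and x2^2 with hypothetical parameters w = [1.0, 1.0] and b = -1.0, so that z = x1^2 + x2^2 - 1 and the decision boundary z = 0 is the unit circle. The model is still linear in the new features; only the mapping from the original inputs is non-linear.

```python
def polynomial_features(x1, x2):
    # Hypothetical feature map: replace the raw inputs with their squares.
    return [x1 ** 2, x2 ** 2]

def predict_label(w, features, b):
    # Same rule as before: predict 1 when z = w . features + b >= 0.
    z = sum(wi * fi for wi, fi in zip(w, features)) + b
    return 1 if z >= 0 else 0

# Hypothetical parameters: z = x1^2 + x2^2 - 1, boundary is the unit circle.
w = [1.0, 1.0]
b = -1.0

for x1, x2 in ((0.0, 0.0), (0.5, 0.5), (1.0, 0.0), (2.0, 0.0)):
    label = predict_label(w, polynomial_features(x1, x2), b)
    print((x1, x2), "->", label)
```

Points inside the circle are assigned 0 and points on or outside it are assigned 1; other polynomial terms, such as x1 * x2 or higher powers, would give other boundary shapes.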
