Understanding the Overheads of Launching CUDA Kernels


  • Understanding the Overheads of Launching CUDA Kernels
  • 1. INTRODUCTION
  • 2. MICRO-BENCHMARKS USED IN OUR STUDY
  • 3. OVERHEAD OF LAUNCHING KERNELS
    • 3.1. Experimental Environment
    • 3.2. Launch Overhead in Small Kernels
    • 3.3. Launch Overhead in Large Kernels
    • 3.4. Other Launch Overheads
    • 3.5. Conclusion
  • 4. REFERENCES
  • Understanding the Overheads of Launching CUDA Kernels
  • 1. Motivation
  • 2. Background
  • 3. Micro-benchmark
  • 4. Launch Overhead in Small Kernels
  • 5. Launch Overhead in Large Kernels
  • 6. Other Overheads
  • 7. Conclusion
  • 8. References
  • References

Understanding the Overheads of Launching CUDA Kernels

https://www.hpcs.cs.tsukuba.ac.jp/icpp2019/data/posters/Poster17-abst.pdf

Lingqi Zhang $^1$, Mohamed Wahib $^2$, Satoshi Matsuoka $^{1,3}$
zhang.l.ai@m.titech.ac.jp
$^1$ Tokyo Institute of Technology, Dept. of Mathematical and Computing Science, Tokyo, Japan
$^2$ AIST-Tokyo Tech Real World Big-Data Computation Open Innovation Laboratory
$^3$ RIKEN Center for Computational Science, Hyogo, Japan

AIST-Tokyo Tech Real World Big-Data Computation Open Innovation Laboratory is abbreviated RWBC-OIL.
Hyogo is a prefecture of Japan.

1. INTRODUCTION

GPU computing is becoming increasingly important in general-purpose computing, and many scientific fields rely on the performance of GPUs. Several classes of algorithms require device-wide synchronization through the use of barriers. However, thousands of threads running on independent SMs (Streaming Multiprocessors) make this difficult. Previous research [3] proposed two kinds of device-wide barriers: software barriers and implicit barriers. More recently, Nvidia introduced new methods for device-wide barriers, namely grid synchronization and multi-grid synchronization [1]. Because high performance can be achieved at lower occupancy [2], we envision using a single kernel with several barriers instead of using multiple kernels as an implicit barrier. To do so, we need to understand the penalty of each kind of barrier, i.e. the new explicit barrier functions and the implicit barrier.

impede [ɪmˈpiːd]: v. to hinder, to obstruct
envision [ɪn'vɪʒ(ə)n]: v. to imagine, to look ahead to

Additionally, Nvidia has introduced new launch functions (e.g. cooperative launch and cooperative multi-device launch). These functions support grid synchronization and multi-grid synchronization [1], i.e. the new explicit barrier functions. To use the new features, programmers must switch to the new launch functions. However, no prior research has studied the penalty of switching to these new launch functions.

cooperative [kəʊˈɒp(ə)rətɪv]: adj. collaborative, working together, helpful; n. a cooperative enterprise or organization
explicit [ɪk'splɪsɪt]: adj. clear, easy to understand, definite, outspoken

In this research, we use micro-benchmarks to understand the overheads hidden in launch functions, and we try to identify the cases in which launching additional kernels is not profitable. We also aim to better understand how the different launch functions in CUDA differ.

2. MICRO-BENCHMARKS USED IN OUR STUDY

Throughout this abstract, we use the following terminology:

  • Kernel Latency: The total latency of running kernels, measured from when the CPU thread starts launching until the CPU thread observes that the kernel has finished.
  • Kernel Overhead: Latency that is not related to kernel execution.
  • Additional Latency: Given that the CPU thread has just called a kernel launch function, the additional latency is the extra latency incurred by launching one more kernel.
  • CPU Launch Overhead: The latency of the CPU calling a launch function.
  • Small Kernel: A kernel whose execution time is not the main cause of the additional latency.
  • Large Kernel: A kernel whose execution time is the main cause of the additional latency.

Currently, researchers tend to use either the execution time of an empty kernel or the execution time of the CPU-side kernel launch function as the overhead of launching a kernel. Although these methods may be adequate for a single GPU kernel, they are not sufficient when multiple kernels are launched. We therefore focus mainly on the overhead of launching an additional kernel.

We use the sleep instruction to control the kernel latency. The sleep instruction is available only on the Volta architecture (compute capability 7.0) and newer. The instruction is very lightweight: according to our experiments, the overhead of the kernel remains unaffected no matter how many times the instruction is repeated.

We compose a wait unit from several sleep instructions. Varying the number of wait units inside a single kernel gives a controlled kernel execution latency.

This micro-benchmark has two variables:

  • The number of times a kernel is launched
  • The number of wait units inside a single kernel. Within a single experiment, the wait unit itself is fixed.

settle ['set(ə)l]: v. to settle down, to settle (an account), to stay, to fix or determine; n. a high-backed bench
distinctive [dɪ'stɪŋktɪv]: adj. unique, special, characteristic

To test the overhead of small kernels, we use a null kernel (no code inside) as an example of a small kernel. In this case, the overhead can be computed with formula 1.
Note: a small kernel is essentially a kernel with a short execution time. In this test a small kernel is represented by a wait-unit loop count of 0, which is equivalent to a kernel that does nothing.

$$O = \frac{Latency_{i0} - Latency_{j0}}{i - j} \tag{1}$$

$O$ is the overhead; $i$ and $j$ are the numbers of times the launch function is called; the subscript $0$ means $0$ wait units inside a kernel.
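
The arithmetic of formula 1 can be sketched as follows. This is only an illustration, not the authors' benchmark code: the function name `per_launch_overhead` and the latency values are hypothetical and were chosen to make the calculation easy to follow.

```python
def per_launch_overhead(latency_i: float, latency_j: float, i: int, j: int) -> float:
    """Formula 1: average extra latency per additional null-kernel launch.

    latency_i and latency_j are total latencies (same time unit) measured
    when the launch function is called i and j times with 0 wait units.
    """
    if i == j:
        raise ValueError("i and j must differ")
    return (latency_i - latency_j) / (i - j)


# Hypothetical measurements in microseconds (not taken from the poster),
# keyed by the number of launches.
latency_by_launch_count = {1: 10.0, 5: 30.0}

overhead_us = per_launch_overhead(
    latency_by_launch_count[5], latency_by_launch_count[1], 5, 1
)
print(overhead_us)  # (30.0 - 10.0) / (5 - 1) = 5.0
```

Taking the difference between two launch counts removes any fixed cost that is paid once regardless of how many kernels are launched, so only the per-launch cost remains.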

To test the overhead of a large kernel, we use kernel fusion to unveil the overhead hidden in the kernel latency. The details of this method are shown in Figure 1. In this case, the overhead can be computed with formula 2.
Note: a large kernel is a kernel whose execution time is the dominant cost.

$$O = \frac{Latency_{ij} - Latency_{ji}}{i - j} \tag{2}$$

$O$ is the overhead. The two subscripts of $Latency$ carry different information: in $Latency_{ij}$ (the left term), $i$ is the number of times the launch function is called, and $j$ is the number of wait units repeated inside each launched kernel.
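
One way to see why formula 2 isolates a per-launch cost: both measurements execute i × j wait units in total, so the sleep time cancels and only the difference in the number of launches remains. The sketch below checks this on a simple linear latency model. The model, its constants, and the names `fused_overhead` and `modelled_latency` are assumptions made for illustration; they are not measurements or code from the poster.

```python
def fused_overhead(latency: dict, i: int, j: int) -> float:
    """Formula 2, with latency keyed by (launch_count, wait_units_per_kernel)."""
    if i == j:
        raise ValueError("i and j must differ")
    return (latency[(i, j)] - latency[(j, i)]) / (i - j)


def modelled_latency(launches: int, wait_units: int) -> float:
    """Hypothetical linear model, in microseconds.

    8.0 us fixed cost, 2.0 us per launch, 1.0 us per executed wait unit.
    """
    return 8.0 + 2.0 * launches + 1.0 * launches * wait_units


i, j = 5, 1
latency = {
    (i, j): modelled_latency(i, j),  # 5 launches, 1 wait unit each -> 23.0
    (j, i): modelled_latency(j, i),  # 1 launch, 5 wait units       -> 15.0
}
print(fused_overhead(latency, i, j))  # (23.0 - 15.0) / (5 - 1) = 2.0
```

Under this model the result equals the assumed per-launch cost (2.0 us), independent of the fixed cost and of the sleep time per wait unit.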

Figure 1: Using kernel fusion to test the execution overhead

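Note: a wait unit is the sleep function inside the kernel; each call sleeps for 1 us (the code is shown in the micro-benchmark figure). Sleep is implemented with the PTX instruction nanosleep.u32 provided by the Volta architecture. According to the tests, the cost of this instruction is extremely low, and the sleep duration (i.e. the kernel execution time) does not affect the other overheads.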

Table 1: Environment Information

Figure 2: Comparison of null kernel overhead for different launch functions

3. OVERHEAD OF LAUNCHING KERNELS

3.1. Experimental Environment

Because our analysis of launch overhead relies on the sleep instruction, which CUDA exposes only from the Volta platform onward, we conducted the experiments only on the V100 GPU. Table 1 shows the environment information. Each reported result is the average of 100 experiments.

3.2. Launch Overhead in Small Kernels

We found the CPU Launch Overhead to be nearly equal to the additional latency of one more kernel. We therefore also plot the latency of the launch function in Figure 2.

Allowing for measurement error, it is relatively safe to assume that, for small kernels, the time the CPU spends launching a kernel is the main source of latency among all the steps of a kernel launch.

3.3. Launch Overhead in Large Kernels

On a single node, we use 5 wait units (sleep 5000 ns). Figure 3 shows that the additional latency is larger than the CPU launch overhead, which means that the CPU launch overhead is not what determines the additional latency here. Using the kernel fusion method, we found that an execution overhead does exist.

In this work we only show that this kind of overhead exists. How the execution overhead relates to the complexity of the kernel and to the launch parameters is left as future work. In real-world workloads, the actual execution overhead may be larger than what we report here.

3.4. Other Launch Overheads

We observe that, apart from the CPU launch overhead and the GPU execution overhead, other overheads remain.

We use formula 3 to compute these overheads.

$$O_{Other} = O_{Total} - (O_{CPU Launch Kernel} + O_{Execution}) \tag{3}$$

$O$ is the overhead.
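
Formula 3 is a simple residual, which the following sketch makes explicit. The function name `other_overhead` and the numbers are hypothetical, used only to show the calculation; they are not results from the study.

```python
def other_overhead(total: float, cpu_launch: float, execution: float) -> float:
    """Formula 3: overhead left over after removing the two identified parts.

    All arguments are overheads in the same time unit.
    """
    return total - (cpu_launch + execution)


# Hypothetical overheads in microseconds (not taken from the poster).
print(other_overhead(total=8.0, cpu_launch=3.0, execution=1.5))  # 8.0 - 4.5 = 3.5
```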

The result is shown in Figure 4. Although these overheads seem large, they do not play an important role when a large number of kernels is launched.

Figure 3: Large kernel launch overhead of different launch methods

Figure 4: Comparison of different overheads in different launch functions

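Note: when a single kernel is launched once, rather than a series of kernels launched back to back, there is a further cost besides the execution overhead and the launch overhead. This "other" overhead is even larger than the launch overhead. When kernels are launched frequently, it does not need to be considered.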

3.5. Conclusion

In this work, we use micro-benchmarks to analyze the launch overhead behaviour of different launch functions, for both small kernels and large kernels. The results reveal two different kinds of kernel overhead, plus an unexplained overhead that is noticeable only when a single kernel is launched. The CPU launch overhead mainly matters for small kernels, while the execution overhead mainly matters for large kernels. We conclude that launching a new kernel is profitable only when the performance improvement surpasses the overhead of the new kernel. Additionally, we observed that Cooperative Multi-Device Launch is slightly slower than Cooperative Launch, and Cooperative Launch is slightly slower than Traditional Launch. This additional latency is trivial compared with the benefit of using grid synchronization. This research focuses mainly on the V100 GPUs in a DGX-1, but we also observed similar behaviour on the P100 platform.
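
The profitability rule in the conclusion can be written as a one-line check. This is a sketch under stated assumptions: the function name `extra_kernel_is_profitable`, its parameters, and the numbers are hypothetical, and the per-launch overhead would have to come from measurements such as those described above.

```python
def extra_kernel_is_profitable(
    time_saved: float, overhead_per_launch: float, extra_launches: int = 1
) -> bool:
    """Return True when splitting work into extra kernels pays off.

    time_saved: performance improvement gained by using the extra kernels.
    overhead_per_launch: measured overhead of one additional kernel launch.
    Both values must use the same time unit.
    """
    return time_saved > overhead_per_launch * extra_launches


# Hypothetical values in microseconds.
print(extra_kernel_is_profitable(12.0, 5.0))     # True:  12.0 > 5.0
print(extra_kernel_is_profitable(12.0, 5.0, 3))  # False: 12.0 < 15.0
```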

4. REFERENCES

[1] CUDA C++ Programming Guide, https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html
[2] Better Performance at Lower Occupancy, https://www.nvidia.com/content/gtc-2010/pdfs/2238_gtc2010.pdf
[3] Inter-block GPU communication via fast barrier synchronization, https://ieeexplore.ieee.org/document/5470477

Understanding the Overheads of Launching CUDA Kernels

https://www.hpcs.cs.tsukuba.ac.jp/icpp2019/data/posters/Poster17-moc.pdf

Lingqi Zhang $^1$, Mohamed Wahib $^2$, Satoshi Matsuoka $^{1,3}$
zhang.l.ai@m.titech.ac.jp, mohamed.attia@aist.go.jp, matsu@is.titech.ac.jp
$^1$ Tokyo Institute of Technology, Dept. of Mathematical and Computing Science, Tokyo, Japan
$^2$ AIST-Tokyo Tech Real World Big-Data Computation Open Innovation Laboratory
$^3$ RIKEN Center for Computational Science, Hyogo, Japan

1. Motivation

  • Nvidia GPUs can run tens of thousands of threads on independent SMs (Streaming Multiprocessors)
    • Not ideal for device-wide barriers
  • Methods for device-wide barriers on GPUs
    • Software barriers (example in [1])
    • Implicit barriers: launching separate kernels (impacts performance)
  • Alternative ways to achieve the same goal
    • Grid synchronization or multi-grid synchronization [2]
    • Higher performance might come from lower occupancy [3]
  • Implicit barrier (additional kernels) vs. single kernel
  • Questions:
    • When not to launch an additional kernel?
    • What is the penalty of using different kinds of barriers in CUDA?

impact [ɪm'pækt]: v. to strike, to collide with, to have an effect; n. a collision, force of impact, a strong effect

2. Background

  • Different kinds of kernel launch methods
    • Traditional Launch
    • Cooperative Launch (CUDA 9): introduced to support grid synchronization
    • Cooperative Multi-Device Launch (CUDA 9): introduced to support multi-grid synchronization
  • Sleep instruction: waits for a specified number of nanoseconds inside a GPU kernel

3. Micro-benchmark

  • Definitions
    • Kernel Latency: The total latency of running kernels, measured from when the CPU thread starts launching until the CPU thread observes that the kernel has finished.
    • Kernel Overhead: Latency that is not related to kernel execution.
    • Additional Latency: Given that the CPU thread has just called a kernel launch function, the additional latency is the extra latency incurred by launching one more kernel.
    • CPU Launch Overhead: The latency of the CPU calling a launch function.
    • Small Kernel: A kernel whose execution time is not the main cause of the additional latency.
    • Large Kernel: A kernel whose execution time is the main cause of the additional latency.

Figure 1: Sample code of the micro-benchmark, which calls the launch function 5 times and repeats a wait unit (sleep 1000 ns) 10 times.

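Note: to keep the test simple, the kernel used in this experiment takes no parameters. Since kernels in real applications may take several parameters, the launch overhead in practice should be larger than the values measured in this experiment.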

  • An additional wait unit (sleep 1000 ns) does not increase the kernel overhead (allowing for measurement error)

Figure 2: Gradient of latency per wait unit (sleep 1000 ns) in a single kernel

  • Testing overhead in small kernels
    • Method: use a null kernel (no code inside) to represent a small kernel. In the small-kernel test the wait-unit loop count is 0, i.e. the kernel performs no work.
  • Testing overhead in large kernels
    • Method: use kernel fusion to unveil the overhead.

unveil [ʌn'veɪl]: v. to launch, to reveal, to remove a covering from, to draw back a curtain from

Figure 3: Using kernel fusion to test overhead hidden in kernel execution

4. Launch Overhead in Small Kernels

Figure 4: Comparison of null kernel overhead using three different launch functions that employ different types of barriers (left); Cooperative Multi-Device Launch among different devices (right).

CPU Launch Overhead is the main overhead in Small Kernel.

5. Launch Overhead in Large Kernels

Figure 5: Comparison of large kernel overhead among different launch functions (left); Cooperative Multi-Device Launch among different devices (right).

  • The CPU launch overhead is recorded to show that it is not the distinctive overhead here (the result is not as precise as the one in the "Small Kernel" section).
  • A GPU execution overhead does exist.

6. Other Overheads

An empty kernel lasts about 8 us, which is still longer than the overheads we reported.

Figure 6: Comparison of different overheads in different launch functions

Other Overhead is distinctive in the single-kernel case (it is larger than the two kinds of overhead we reported).

distinctive [dɪ'stɪŋktɪv]: adj. unique, special, characteristic

7. Conclusion

  • Main overheads:
    • Small kernels: CPU Launch Overhead
    • Large kernels: GPU Execution Overhead
    • Single kernel: Other Overhead
  • Overhead of different launch functions:
    • Cooperative Multi-Device Launch > Cooperative Launch > Traditional Launch
  • Launch a new kernel only when the performance improvement surpasses the overhead of a new kernel.

8. References

[1] Inter-block GPU communication via fast barrier synchronization, https://ieeexplore.ieee.org/document/5470477
[2] CUDA C++ Programming Guide, https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html
[3] Better Performance at Lower Occupancy, https://www.nvidia.com/content/gtc-2010/pdfs/2238_gtc2010.pdf

References

[1] Yongqiang Cheng, https://yongqiang.blog.csdn.net/
[2] Understanding the Overheads of Launching CUDA Kernels, https://www.hpcs.cs.tsukuba.ac.jp/icpp2019/data/posters/Poster17-abst.pdf
[3] Understanding the Overheads of Launching CUDA Kernels, https://www.hpcs.cs.tsukuba.ac.jp/icpp2019/data/posters/Poster17-moc.pdf
[4] CUDA Runtime API, https://docs.nvidia.com/cuda/cuda-runtime-api/index.html
