图像处理算法:白平衡、除法器、乘法器~笔记

news2024/12/28 5:10:22

参考:

基于FPGA的自动白平衡算法的实现        

白平衡初探 (qq.com)         

FPGA自动白平衡实现步骤详解-CSDN博客

xilinx 除法ip核(divider) 不同模式结果和资源对比(VHDL&ISE)_ise除法器ip核-CSDN博客 

 数字信号处理-04- FPGA常用运算模块-除法器(二)-阿里云开发者社区 (aliyun.com)

 【FPGA】:ip核--Divider(除法器)_除法器ip核-CSDN博客

 数字信号处理-04- FPGA常用运算模块-除法器_tlast-CSDN博客

目的:还原出真实的白色

色温的概念和示例:

涉及的资源:除法器、乘法器

除法器基本介绍

LUTMult
LUTMult . A simple lookup estimate of the reciprocal of the divisor followed by a
multiplier. Only remainder output type is supported because of the bias required in the
reciprocal estimate. This bias would introduce an offset (error) if used to create a
fractional output. This is recommended for operand widths less than or equal to 12 bits.
This implementation uses DSP slices, block RAM and a small amount of FPGA logic
primitives (registers and LUTs). For operand widths where either Radix2 or the LUTMult
options are possible, the LUTMult solution offers a solution using fewer FPGA logic
resources because of the use of DSP and block RAM primitives.
除数的倒数和乘数的简单查找估计。由于倒数估计中需要偏差,因此仅支持余数输出类型。如果用于创建小数输出,此偏差会引入偏移(误差)。对于小于或等于 12 位的操作数宽度,建议这样做。该实现使用 DSP Slice、Block RAM 和少量 FPGA 逻辑原语(寄存器和 LUT)。对于可以使用 Radix2 或 LUTMult 选项的操作数宽度,LUTMult 解决方案提供了一种使用更少 FPGA 逻辑资源的解决方案,因为使用了 DSP 和 Block RAM 原语。
Radix-2.
Radix-2. Radix-2 non-restoring integer division using integer operands, allowing either
a fractional or an integer remainder to be generated. This is recommended for operand
widths less than around 16 bits or for applications requiring high throughput. The
Send Feedback  implementation uses FPGA logic primitives (registers and LUTs). The Radix2 solution does not use DSP or block RAM primitives, so this implementation is recommended
when these primitives are needed elsewhere.
基数-2。使用整数操作数的 Radix-2 非恢复整数除法,允许生成小数或整数余数。对于小于 16 位左右的操作数宽度或需要高吞吐量的应用,建议这样做。发送反馈实现使用 FPGA 逻辑原语(寄存器和 LUT)。 Radix2 解决方案不使用 DSP 或块 RAM 原语,因此当其他地方需要这些原语时,建议使用此实现。
High Radix.
High Radix . High Radix division with prescaling. This is recommended for operand
widths greater than around 16 bits. This implementation uses DSP slices and block
RAMs.
带预缩放的高基数除法。对于大于 16 位左右的操作数宽度,建议这样做。该实现使用 DSP Slice 和 Block RAM。

延迟测算

The latency of the divider core is a function of the AXI4-Stream configuration parameters
and the latency of the algorithm selected. Latency is only a constant when the AXI4-Stream
mode is set to Non-Blocking and when the core algorithm and throughput are set such that
one sample is input per clock cycle. If the core is set to accept data only one in N cycles,
then data is only accepted on cycles N, 2N, 3N, …. It is not the case that data is accepted
immediately as long as N cycles have passed since the previous input. Hence, latency
appears to be increased if data is presented before the core is able to accept it. Another
effect which can cause latency to vary and increase is if full AXI4-Stream behavior is
selected. This is because a FIFO is used to manage data for this mode and the depth of the
FIFO adds to the latency. However, it should be noted that the intention of selecting
AXI4-Stream is to replace the need to balance latency with a handshake which manages
data flow at runtime, so latency should be less of a consideration. Because latency can vary
due to these effects, only minimum latency can be determined as a constant for any given
configuration of the core. In the following sections the latency of the algorithm alone is
discussed.
分频器核心的延迟是 AXI4-Stream 配置参数和所选算法延迟的函数。仅当 AXI4-Stream 模式设置为非阻塞且核心算法和吞吐量设置为每个时钟周期输入一个样本时,延迟才为常数。如果内核设置为仅在 N 个周期中接受一个数据,则仅在第 N、2N、3N、... 个周期上接受数据。只要自上一次输入以来经过≥N个周期,数据就不会被立即接受。因此,如果数据在核心能够接受之前呈现,则延迟似乎会增加。另一个可能导致延迟变化和增加的影响是选择完整的 AXI4-Stream 行为。这是因为 FIFO 用于管理此模式的数据,而 FIFO 的深度会增加延迟。然而,应该注意的是,选择 AXI4-Stream 的目的是用在运行时管理数据流的握手来取代平衡延迟的需要,因此延迟应该不被考虑。由于延迟可能会因这些影响而变化,因此对于任何给定的内核配置,只能将最小延迟确定为常数。在以下部分中,仅讨论算法的延迟。

LUTMult
The latency of the fully pipelined LUTMult is 8.
Radix-2
The latency (number of enabled clock cycles required before the core generates the first
valid output) for a fully pipelined divider is a function of the bit width of the dividend. If
fractional output is required, the fully pipelined latency is also a function of the fractional
bit width. In general:

Radix-2 全流水线除法器的延迟(内核生成第一个有效输出之前所需的启用时钟周期数)是被除数位宽的函数。如果需要小数输出,则完全流水线延迟也是小数位宽度的函数。一般来说:

• Fully pipelined latency is of the order M for integer remainder dividers, where M is the
width of the Quotient
• Fully pipelined latency is of the order M + F for fractional remainder dividers where F is
the width of the Fractional output
• 对于整数余数除法器,完全流水线延迟的数量级为 M,其中 M 是商的宽度
• 对于小数余数除法器,完全流水线延迟的数量级为 M + F,其中 F 是小数输出的宽度
Table 2-1 provides a list of the fully pipelined latency formula for divider selections. With
full pipelining, maximum possible performance is achieved. When clocks per division is 1,
latency can be set manually to a figure between 0 and the value shown in Table 2-1 . This
allows the latency of the core to be reduced at the expense of reducing the maximum clock
frequency at which the core can be clocked. Reducing the latency reduces the number of
registers used, but the LUT count remains approximately the same
表 2-1 提供了用于分频器选择的完全流水线延迟公式的列表。通过完整的流水线,可以实现最大可能的性能。当每分频时钟数为1时,延迟可以手动设置为0到表2-1所示值之间的数字。这允许减少内核的延迟,但代价是降低内核可以计时的最大时钟频率。减少延迟会减少使用的寄存器数量,但 LUT 数量保持大致相同
1. M = Dividend and Quotient Width, F = Fractional Width, A = total Latency of AXI interfaces.
1. M = 被除数和商宽度,F = 小数宽度,A = AXI 接口的总延迟。
High Radix Solution
Tables 2-2 and 2-3 show latency for the High Radix solution. To this, add 0 for NonBlocking
mode, 1 for Blocking mode with no output tready and 3 for Blocking mode with output
tready .
表 2-2 和 2-3 显示了高基数解决方案的延迟。为此,添加 0 表示非阻塞模式,添加 1 表示不带输出的阻塞模式,添加 3 表示带输出的阻塞模式。

除法器配置

Common Options

Describes parameters common to both implementations and allows the selection of the divider implementation.

描述两种实现方式共有的参数,并允许选择分频器实现方式。

• Algorithm Type: This selects between Radix-2, LUTMult and High Radix division solutions.

Radix-2 Options计算频次

Clocks Per Division: Determines the throughput of the Radix-2 solution (interval in clocks between inputs (or outputs)). A low value for this parameter results in high throughput, but also in greater resource use.

每分频时钟数:确定 Radix-2 解决方案的吞吐量(输入(或输出)之间的时钟间隔)。此参数的值较低会导致吞吐量较高,但也会导致资源使用量增加。

High Radix and LUTMult Options

Number of iterations (High Radix only): Read-only text field that reports the number of iterations performed by the High Radix engine for each divide. This sets the maximum throughput of the divider. To achieve this throughput, the operands must be supplied as soon as requested by the core s_axis_dividend_tready and s_axis_divisor_tready outputs.

迭代次数(仅限高基数):只读文本字段,报告高基数引擎为每次划分执行的迭代次数。这设置了分频器的最大吞吐量。为了实现此吞吐量,必须在核心 s_axis_dividend_tready 和 s_axis_divisor_tready 输出请求时立即提供操作数。

Throughput (High Radix only): Read-only text field that reports the maximum throughput that can be sustained by the divider when operands are supplied at a constant rate. In AXI blocking modes, throughput might be slightly higher due to Send Feedback Divider Generator v5.1 26 PG151 February 4, 2021 www.xilinx.com Chapter 4: Design Flow Steps buffering. This rate applies when FlowControl is set to NonBlocking and the output channel DOUT has no tready.

吞吐量(仅限高基数):只读文本字段,报告以恒定速率提供操作数时分频器可以维持的最大吞吐量。在 AXI 阻塞模式下,由于发送反馈分频器生成器设计流程步骤缓冲,吞吐量可能会稍高。当FlowControl 设置为NonBlocking 并且输出通道DOUT 没有tready 时,适用此速率。

Common Options

Detect Divide-by-Zero: Check box. Determines if the core has a DIVIDE_BY_ZERO field in the output tuser port (m_axis_dout_tuser) to signal when a division by zero has been performed

检测除零:复选框。确定内核的输出 tuser 端口 (m_axis_dout_tuser) 中是否有 DIVIDE_BY_ZERO 字段,以在执行除以零时发出信号

AXI4-Stream Options

Flow Control: Blocking or NonBlocking. This is more fully explained in AXI4-Stream Considerations in Chapter 3. NonBlocking mode provides an easier migration path from the previous version of the Divider Generator core. Blocking mode eases data flow management to/from other AXI4-Stream blocking mode cores at the expense of some additional resource and latency

流量控制:阻塞或非阻塞。第 3 章中的 AXI4-Stream 注意事项对此进行了更全面的解释。NonBlocking 模式提供了从先前版本的 Divider Generator 核心的更简单的迁移路径。阻塞模式简化了进出其他 AXI4-Stream 阻塞模式内核的数据流管理,但会带来一些额外的资源和延迟

Optimize Goal: This applies only to blocking mode. When ACLKEN is selected and Optimize Goal is set to Resources, performance might be reduced. See Resource Utilization in Chapter 2.

优化目标:这仅适用于阻塞模式。当选择 ACLKEN 并将优化目标设置为资源时,性能可能会降低。请参阅第 2 章中的资源利用。

• Output has TREADY: Selects whether the output channel has a tready signal. This is required to allow back pressure from downstream, for example, if connected to another AXI4-Stream Blocking core. Without tready, downstream circuitry cannot halt dataflow from the divider, but some resource is saved

• 输出有TREADY:选择输出通道是否有TREADY 信号。这是允许来自下游的背压所必需的,例如,如果连接到另一个 AXI4-Stream Blocking 核心。如果没有tready,下游电路无法停止来自分频器的数据流,但可以节省一些资源

Output TLAST Behavior: Selects the source of the output channel tlast signal. When neither or only one input channel has a tlast then the output tlast is not present or derives from the input tlast appropriately. When both input channels have tlast, the output channel tlast can derive from either alone, the logical OR of both inputs, or the logical AND of both inputs.

输出 TLAST 行为:选择输出通道 tlast 信号的源。当没有或只有一个输入通道具有 tlast 时,则输出 tlast 不存在或从输入 tlast 适当导出。当两个输入通道都有 tlast 时,输出通道 tlast 可以单独源自两个输入的逻辑“或”或两个输入的逻辑“与”。

Latency Options配置流水

Latency Configuration: Automatic (fully pipelined) or manual (determined by following field). Latency configuration for Radix2 solution is configurable only when clocks per division is set to 1. This is due to iterative feedback and hence non-optional registers when clocks per division is greater than 1.

延迟配置:自动(完全流水线)或手动(由以下字段确定)。仅当每分频时钟设置为 1 时,Radix2 解决方案的延迟配置才可配置。这是由于迭代反馈,因此当每分频时钟大于 1 时,寄存器是非可选的。

Latency: When Latency Configuration is set to Automatic, this field provides the latency from input to output in terms of clock enabled clock cycles. When Manual, this field is used to specify the latency required. When high performance (clock frequency) is not required, a lower value in this field can save resources

延迟:当延迟配置设置为自动时,此字段以时钟启用的时钟周期提供从输入到输出的延迟。当手动时,该字段用于指定所需的延迟。当不需要高性能(时钟频率)时,该字段的值较低可以节省资源

Control Signals

ACLKEN: Determines if the core has a clock enable input (ACLKEN).

ARESETn: Determines if the core has an active-Low synchronous clear input (ARESETn).

Note:

a. The signal ARESETn always takes priority over ACLKEN, that is, ARESETn takes effect regardless of the state of ACLKEN.

b. The signal ARESETn is active-Low.

c. The signal ARESETn should be held active for at least two clock cycles. This is because, for performance, ARESETn is internally registered before being fed to the reset port of primitives

除法器输出:

Output Channel

• Remainder Type: This selects between remainder types Fractional and Remainder presented on the FRACTIONAL field of the output tdata port (m_axis_dout_tdata). Fractional remainder type is the only option for High Radix.

• 余数类型:在输出tdata 端口(m_axis_dout_tdata) 的FRACTIONAL 字段上显示的余数类型Fractional 和Remainder 之间进行选择。小数余数类型是高基数的唯一选择。

• Fractional Width: If Fractional remainder type is selected, this determines the number of bits provided on the FRACTIONAL field of the output channel (m_axis_dout_tdata). When High Radix is selected, the total output width (quotient part plus fractional part) is limited to 82. The width of the quotient is equal to the width of the dividend and is set in the Dividend channel section. The width of the tuser port is the sum of the present input channel tuser fields plus one if divide_by_zero detect is active. See AXI4-Stream Considerations in Chapter 3 for the internal structure of the tuser port. This channel also has a tlast port if either of the input channels has a tlast port

• 小数类型:

如果选择小数类型,则这个选择将确定输出通道的小数字段上提供的位数(m_axis_dout_tdata)。

当选择 High Radix 时,总输出宽度(商部分加小数部分)限制为 82。商的宽度等于被除数的宽度,并在 Dividend Channel 部分中设置。如果divide_by_zero 检测处于活动状态,tuser 端口的宽度是当前输入通道tuser 字段的总和加一。有关 tuser 端口的内部结构,请参阅第 3 章中的 AXI4-Stream 注意事项。

如果任一输入通道有 tlast 端口,则该通道也有 tlast 端口

TDATA Structure for Output (DOUT) Channel The structure of m_axis_dout_tdata is more complex. This port contains both quotient and, if present, remainder or fractional outputs. When the remainder type is set to remainder, the two outputs are considered separate and so are byte-oriented before being concatenated to make the m_axis_dout_tdata signal. When remainder type is fractional, the fractional part is considered an extension of the quotient so these two fields are concatenated before being padded to the next byte boundary.

输出 (DOUT) 通道的 TDATA 结构 m_axis_dout_tdata 的结构更为复杂。该端口包含商以及余数或小数输出(如果存在)。

当余数类型设置为余数时,两个输出被视为独立的,因此在连接形成 m_axis_dout_tdata 信号之前是面向字节的。

当余数类型为小数时,小数部分被视为商的扩展,因此这两个字段在填充到下一个字节边界之前会被连接起来。

WITH REMAINDER:PAD+QUOTIENT+REMAINDER+PAD

WITH FRACTIONAL PART:PAD+QUOTIENT+FRACTIONAL

本文来自互联网用户投稿,该文观点仅代表作者本人,不代表本站立场。本站仅提供信息存储空间服务,不拥有所有权,不承担相关法律责任。如若转载,请注明出处:http://www.coloradmin.cn/o/1406271.html

如若内容造成侵权/违法违规/事实不符,请联系多彩编程网进行投诉反馈,一经查实,立即删除!

相关文章

关于C#中的HashSet<T>与List<T>

HashSet<T> 表示值的集合。这个集合的元素是无须列表&#xff0c;同时元素不能重复。由于这个集合基于散列值&#xff0c;不能通过数组下标访问。 List<T> 表示可通过索引访问的对象的强类型列表。内部是用数组保存数据&#xff0c;不是链表。元素可重复&#xf…

Istio-gateway

一. gateway 在 Kubernetes 环境中&#xff0c;Kubernetes Ingress用于配置需要在集群外部公开的服务。但是在 Istio 服务网格中&#xff0c;更好的方法是使用新的配置模型&#xff0c;即 Istio Gateway&#xff0c;Gateway 允许将 Istio 流量管理的功能应用于进入集群的流量&…

C#基础:用ClosedXML实现Excel写入

直接在控制台输出&#xff0c;确保安装了该第三方库 using ClosedXML.Excel;class DataSource {public int id { get; set; }public string name { get; set; } "";public string classes { get; set; } "";public int score { get; set; } } class Te…

深入浅出hdfs-hadoop基本介绍

一、Hadoop基本介绍 hadoop最开始是起源于Apache Nutch项目&#xff0c;这个是由Doug Cutting开发的开源网络搜索引擎&#xff0c;这个项目刚开始的目标是为了更好的做搜索引擎&#xff0c;后来Google 发表了三篇未来持续影响大数据领域的三架马车论文&#xff1a; Google Fil…

spring Cloud Stream 实战应用深度讲解

springCloudStream 简介 Spring Cloud Stream是一个框架&#xff0c;用于构建与共享消息传递系统连接的高度可扩展的事件驱动微服务。 该框架提供了一个灵活的编程模型&#xff0c;该模型建立在已经建立和熟悉的 Spring 习惯用语和最佳实践之上&#xff0c;包括对持久发布/订…

【Maven】-- 打包添加时间戳的两种方法

一、需求 在执行 mvn clean package -Dmaven.test.skiptrue 后&#xff0c;生成的 jar 包带有自定义系统时间。 二、实现 方法一&#xff1a;使用自带属性&#xff08;不推荐&#xff09; 使用系统时间戳&#xff0c;但有一个问题&#xff0c;就是默认使用 UTC0 的时区。举例…

什么叫特征分解?

特征分解&#xff08;Eigenvalue Decomposition&#xff09;是将一个方阵分解为特征向量和特征值的过程。对于一个 nn 的方阵A&#xff0c;其特征向量&#xff08;Eigenvector&#xff09;v 和特征值&#xff08;Eigenvalue&#xff09; λ 满足以下关系&#xff1a; 这可以写…

Python批量自动处理文件夹

假如在你的电脑硬盘某文件夹里有这样一堆不同格式的文件&#xff0c;看起来非常混乱&#xff0c;详情如图所示&#xff1a; 为了方便自己快速检索到文件&#xff0c;我们需要将这些文件按照不同格式分类整理到不同的文件夹中&#xff0c;应该怎么做呢&#xff1f;源码如下&…

v39.for循环while循环

1.循环引入&#xff08;loops&#xff09; 2.while loop 当括号中表达式expression返回真值时&#xff0c;代码块执行 。之后将再次检查expression &#xff0c;如果仍然返回真值&#xff0c;继续执行......直到expression返回假值。 3.for loop 有初始化、条件、递增/减 步骤。…

Zookeeper集群 + Kafka集群,Filebeat+Kafka+ELK

目录 什么是Zookeeper&#xff1f; Zookeeper 工作机制 Zookeeper 特点 Zookeeper 数据结构 Zookeeper 选举机制 实验 部署 Zookeeper 集群 1.安装前准备 安装 JDK 下载安装包 2.安装 Zookeeper 修改配置文件 拷贝配置好的 Zookeeper 配置文件到其他机器上 在每个节…

uni-app (安卓、微信小程序)接口封装 token失效自动获取新的token

一、文件路径截图 1、新建一个文件app.js存放接口 //这里存放你需要的接口import {request} from /utils/request.js //这个文件是请求逻辑处理 module.exports {// 登录 -- 注册perssonRegister: (data) > { // 供应商注册 return request({url: manageWx/Login/perssonR…

npm i 报一堆版本问题

1&#xff0c;先npm cache clean --force 再下载 插件后缀加上 --legacy-peer-deps 2&#xff0c; npm ERR! code CERT_HAS_EXPIRED npm ERR! errno CERT_HAS_EXPIRED npm ERR! request to https://registry.npm.taobao.org/yorkie/download/yorkie-2.0.0.tgz failed, reason…

【echarts图表】提示暂无数据

【echarts图表】提示暂无数据 背景&#xff1a;echarts图表数据有时候为空数组[ ]&#xff0c;这时候渲染图表异常&#xff0c;需要提示暂无数据 // 提示 暂无数据 : noDataBox 样式 this.$nextTick(() > {const dom document.getElementById("echartsId");dom…

JAVA的面试题四

1.电商行业特点 &#xff08;1&#xff09;分布式&#xff1a; ①垂直拆分:根据功能模块进行拆分 ②水平拆分:根据业务层级进行拆分 &#xff08;2&#xff09;高并发&#xff1a; 用户单位时间内访问服务器数量,是电商行业中面临的主要问题 &#xff08;3&#xff09;集群&…

响应式Web开发项目教程(HTML5+CSS3+Bootstrap)第2版 例4-9 HTML5 表单验证

代码 <!doctype html> <html> <head> <meta charset"utf-8"> <title>HTML5 表单验证</title> </head><body> <form action"#" method"get">请输入您的邮箱:<input type"email&q…

如何基于 ESP32 芯片测试 WiFi 连接距离、获取连接的 AP 信号强度(RSSI)以及 WiFi吞吐测试

测试说明&#xff1a; 测试 WiFi 连接距离&#xff0c;是将 ESP32 作为 WiFi Station 模式来连接路由器&#xff0c;通过在开阔环境下进行拉距来测试。另外&#xff0c;可以通过增大 WiFi TX Power 来增大连接距离。 获取连接的 AP 信号强度&#xff0c;一般可以通过 WiFi 扫描…

matlab appdesigner系列-常用18-表格

表格&#xff0c;常用来导入外部表格数据 示例&#xff1a; 导入外界excel数据&#xff1a;data.xlsx 姓名年龄城市王一18长沙王二21上海王三56武汉王四47北京王五88成都王六23长春 操作步骤如下&#xff1a; 1&#xff09;将表格拖拽到画布上 2&#xff09;对app1右键进行…

【深度学习:集中偏差】减少计算机视觉数据集中偏差的 5 种方法

【深度学习&#xff1a;集中偏差】减少计算机视觉数据集中偏差的 5 种方法 有偏差的计算机视觉数据集会导致哪些问题&#xff1f;如何减少计算机视觉数据集中偏差的示例观察并监控带注释样本的类别分布确保数据集代表模型适用的人群明确定义对象分类、标记和注释的流程为标签质…

【书生·浦语】大模型实战营——第六课笔记

视频链接&#xff1a;https://www.bilibili.com/video/BV1Gg4y1U7uc/?vd_source5d94ee72ede352cb2dfc19e4694f7622 教程文档&#xff1a;https://github.com/InternLM/tutorial/blob/main/opencompass/opencompass_tutorial.md 仓库&#xff1a;https://github.com/open-compa…

(学习日记)2024.01.23:结构体、位操作和枚举类型

写在前面&#xff1a; 由于时间的不足与学习的碎片化&#xff0c;写博客变得有些奢侈。 但是对于记录学习&#xff08;忘了以后能快速复习&#xff09;的渴望一天天变得强烈。 既然如此 不如以天为单位&#xff0c;以时间为顺序&#xff0c;仅仅将博客当做一个知识学习的目录&a…