✅ [Literature Read-Through] Object Counting Papers

2024/9/21 11:01:22

A treasure of a blogger worth following: Tags - 郑之杰的个人网站 (0809zheng.github.io)

Object Counting (目标计数) - 郑之杰的个人网站 (0809zheng.github.io)

Contents

1.《CountGD: Multi-Modal Open-World Counting》

2. (2024 CVPR)《DAVE – A Detect-and-Verify Paradigm for Low-Shot Counting》

✔️3. (2024 AAAI) Vision Transformer Off-the-Shelf: A Surprising Baseline for Few-Shot Class-Agnostic Counting

✔️4.《Learning Spatial Similarity Distribution for Few-shot Object Counting》

⭐️ 5.LOCA

✔️6.《Semantic Generative Augmentations for Few-Shot Counting》

✔️7.CounTR: Transformer-based Generalised Visual Counting

8.《Semantic Generative Augmentations for Few-Shot Counting》

9.《Scale-Prior Deformable Convolution for Exemplar-Guided Class-Agnostic Counting》

10.《Few-shot Object Counting with Similarity-Aware Feature Enhancement》


1.《CountGD: Multi-Modal Open-World Counting》

paper: 2407.04619v1 (arxiv.org)

code: CountGD: Multi-Modal Open-World Counting (ox.ac.uk)

niki-amini-naieni/CountGD: Includes the code for training and testing the CountGD model from the paper CountGD: Multi-Modal Open-World Counting. (github.com)

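The goal of this paper is to improve the generality and accuracy of open-vocabulary object counting in images. To improve generality, we repurpose an open-vocabulary detection foundation model (GroundingDINO) for the counting task and extend it with modules that allow the target object to be specified by visual exemplars. In turn, these new capabilities, namely being able to specify the target object through multiple modalities (text and exemplars), lead to an improvement in counting accuracy. We make three contributions: first, we introduce COUNTGD, the first open-world counting model in which the prompt can be specified by a text description, by visual exemplars, or by both; second, we show that the model significantly improves the state of the art on multiple counting benchmarks: when using text only, COUNTGD is comparable to or better than all previous text-only work, and when using both text and visual exemplars it outperforms all previous models; third, we carry out a preliminary study of the different interactions between text and visual exemplar prompts, including cases where they reinforce each other and cases where one restricts the other. The code and an app for testing the model are available.

Figure 1: COUNTGD produces highly accurate object counts when given both visual exemplars and a text prompt (a), and it also seamlessly supports counting with a text query alone or with visual exemplars alone (b). Combining visual exemplars with text queries adds flexibility to the open-world counting task, for example by using a short phrase (c), or by adding an extra constraint (the words "left" or "right") to select a subset of the objects (d). These examples are taken from the FSC-147 [39] and CountBench [36] test sets. Visual exemplars are shown as yellow boxes. (d) shows the model's predicted confidence map, where higher color intensity indicates higher confidence.

In summary, we make the following three contributions: First, we introduce COUNTGD, the first open-world object counting model that accepts either text or visual exemplars or both simultaneously, in a single-stage architecture; Second, we evaluate the model on multiple standard counting benchmarks, including FSC-147 [39], CARPK [18] and CountBench [36], and show that COUNTGD significantly improves on the state-of-the-art performance by specifying the target object using both exemplars and text. It also meets or improves on the state-of-the-art for text-only approaches when trained and evaluated using text-only; Third, we investigate how the text can be used to refine the visual information provided by the exemplar, for example by filtering on color or relative position in the image, to specify a sub-set of the objects to count. In addition we make two minor improvements to the inference stage: one that addresses the problem of double counting due to self-similarity, and the other to handle the problem of a very high count.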

2. (2024 CVPR)《DAVE – A Detect-and-Verify Paradigm for Low-Shot Counting》

paper: 2404.16622v1 (arxiv.org)

code: jerpelhan/DAVE (github.com)

Commentary:

DAVE -- A Detect-and-Verify Paradigm for Low-Shot Counting - 郑之杰的个人网站 (0809zheng.github.io)

Abstract

✔️3. (2024 AAAI) Vision Transformer Off-the-Shelf: A Surprising Baseline for Few-Shot Class-Agnostic Counting

paper: 2305.04440v2 (arxiv.org)

code: Xu3XiWang/CACViT-AAAI24: Official implementation of Vision Transformer Off-the-Shelf: A Surprising Baseline for Few-Shot Class-Agnostic Counting (github.com)

Commentary:

Vision Transformer Off-the-Shelf: A Surprising Baseline for Few-Shot Class-Agnostic Counting - 郑之杰的个人网站 (0809zheng.github.io)

Class-agnostic counting (CAC) aims to count objects of interest from a query image given few exemplars. This task is typically addressed by extracting the features of query image and exemplars respectively and then matching their feature similarity, leading to an extract-then-match paradigm. In this work, we show that CAC can be simplified in an extract-and-match manner, particularly using a vision transformer (ViT) where feature extraction and similarity matching are executed simultaneously within the self-attention. We reveal the rationale of such simplification from a decoupled view of the self-attention. The resulting model, termed CACViT, simplifies the CAC pipeline into a single pretrained plain ViT. Further, to compensate for the loss of the scale and the order-of-magnitude information due to resizing and normalization in plain ViT, we present two effective strategies for scale and magnitude embedding. Extensive experiments on the FSC147 and the CARPK datasets show that CACViT significantly outperforms state-of-the-art CAC approaches in both effectiveness (23.60% error reduction) and generalization, which suggests CACViT provides a concise and strong baseline for CAC. Code will be available.

In a nutshell, our contributions are three-fold:

• A novel extract-and-match paradigm: we show that simultaneous feature extraction and matching can be made possible in CAC;

• CACViT: a simple and strong ViT-based baseline that sets the new state-of-the-art on the FSC-147 benchmark;

• We introduce two effective strategies to embed scale, aspect ratio, and order of magnitude information tailored to CACViT.
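The "decoupled view" of self-attention mentioned in the abstract can be pictured with a toy example. The sketch below is not CACViT's implementation: it assumes a single attention head, random projection weights, and made-up token counts, and the names `joint_attention`, `w_q`, and `w_k` are hypothetical. It only shows that when query-image tokens and exemplar tokens are concatenated, one attention matrix contains a query-to-query block (feature extraction) and a query-to-exemplar block (matching) at the same time.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def joint_attention(query_tokens, exemplar_tokens, w_q, w_k):
    """Single-head attention over concatenated tokens, split into four blocks.

    query_tokens:    (N, D) tokens from the query image
    exemplar_tokens: (M, D) tokens from the exemplar crops
    w_q, w_k:        (D, D) projection weights (random here, an assumption)
    """
    tokens = np.concatenate([query_tokens, exemplar_tokens], axis=0)
    q = tokens @ w_q
    k = tokens @ w_k
    attn = softmax(q @ k.T / np.sqrt(q.shape[-1]), axis=-1)  # (N+M, N+M)
    n = query_tokens.shape[0]
    return {
        "query_to_query": attn[:n, :n],        # self-attention: feature extraction
        "query_to_exemplar": attn[:n, n:],     # cross-attention: similarity matching
        "exemplar_to_query": attn[n:, :n],
        "exemplar_to_exemplar": attn[n:, n:],
    }


rng = np.random.default_rng(0)
dim = 8
query_tokens = rng.normal(size=(16, dim))    # e.g. a 4x4 grid of image patches
exemplar_tokens = rng.normal(size=(4, dim))  # e.g. a 2x2 grid of exemplar patches
w_q = rng.normal(size=(dim, dim))
w_k = rng.normal(size=(dim, dim))

blocks = joint_attention(query_tokens, exemplar_tokens, w_q, w_k)
for name, block in blocks.items():
    print(name, block.shape)
```

Each row of the full attention matrix sums to one, so the attention a query patch spends on itself and on the exemplars is shared out of a single softmax.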

✔️4.《Learning Spatial Similarity Distribution for Few-shot Object Counting》

paper: 2405.11770v1 (arxiv.org)

code: CBalance/SSD: SSD: Learning Spatial Similarity Distribution for Few-shot Object Counting (github.com)

Commentary:

Learning Spatial Similarity Distribution for Few-shot Object Counting - 郑之杰的个人网站 (0809zheng.github.io)

Few-shot object counting aims to count the number of objects in a query image that belong to the same class as the given exemplar images. Existing methods compute the similarity between the query image and exemplars in the 2D spatial domain and perform regression to obtain the counting number. However, these methods overlook the rich information about the spatial distribution of similarity on the exemplar images, leading to significant impact on matching accuracy. To address this issue, we propose a network learning Spatial Similarity Distribution (SSD) for few-shot object counting, which preserves the spatial structure of exemplar features and calculates a 4D similarity pyramid point-to-point between the query features and exemplar features, capturing the complete distribution information for each point in the 4D similarity space. We propose a Similarity Learning Module (SLM) which applies the efficient center-pivot 4D convolutions on the similarity pyramid to map different similarity distributions to distinct predicted density values, thereby obtaining accurate count. Furthermore, we also introduce a Feature Cross Enhancement (FCE) module that enhances query and exemplar features mutually to improve the accuracy of feature matching. Our approach outperforms state-of-the-art methods on multiple datasets, including FSC-147 and CARPK. Code is available at https://github.com/CBalance/SSD.

Our contributions can be summarized as follows:

• We design a model based on learning the 4D spatial similarity distribution between query and exemplar features in Similarity Learning Module (SLM). This model is capable of obtaining accurate counting results after comprehensive integration of similarity distribution information among point pairs and their surroundings.

• Before calculating the similarity between query and exemplar features, we introduce a Feature Cross Enhancement (FCE) module, which enhances the interaction between them, reducing the distance between the target objects and exemplar features to achieve better matching performance.

• Extensive experiments on large-scale counting benchmarks, such as FSC-147 and CARPK, are conducted and the results demonstrate that our method outperforms the state-of-the-art approaches.
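To make the "4D similarity" idea concrete, here is a minimal sketch of a point-to-point similarity tensor between a query feature map and one exemplar feature map. It is an illustration only: cosine similarity, the feature shapes, and the function name `similarity_4d` are assumptions, and only a single pyramid level is shown, with none of the 4D convolutions from the paper.

```python
import numpy as np


def similarity_4d(query_feat, exemplar_feat, eps=1e-8):
    """Point-to-point cosine similarity that keeps the exemplar's spatial layout.

    query_feat:    (C, H, W) feature map of the query image
    exemplar_feat: (C, h, w) feature map of one exemplar crop
    returns:       (H, W, h, w); entry [i, j, k, l] compares query location
                   (i, j) with exemplar location (k, l)
    """
    q = query_feat / (np.linalg.norm(query_feat, axis=0, keepdims=True) + eps)
    e = exemplar_feat / (np.linalg.norm(exemplar_feat, axis=0, keepdims=True) + eps)
    return np.einsum("cij,ckl->ijkl", q, e)


rng = np.random.default_rng(0)
query_feat = rng.normal(size=(32, 12, 10))
exemplar_feat = rng.normal(size=(32, 3, 3))

sim = similarity_4d(query_feat, exemplar_feat)
print(sim.shape)  # (12, 10, 3, 3)

# For contrast, pooling the exemplar to a single vector first gives only a
# 2D map and discards where on the exemplar each match came from.
pooled = exemplar_feat.mean(axis=(1, 2))
sim_2d = np.einsum("cij,c->ij", query_feat, pooled)
print(sim_2d.shape)  # (12, 10)
```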

⭐️ 5.LOCA

paper: https://arxiv.org/pdf/2211.08217v2.pdf

code: djukicn/loca: LOCA - A Low-Shot Object Counting Network With Iterative Prototype Adaptation (ICCV 2023) (github.com)

Commentary: A Low-Shot Object Counting Network With Iterative Prototype Adaptation - 郑之杰的个人网站 (0809zheng.github.io)

✅ [Paper Reading] (23ICCV) LOCA - CSDN blog

✔️6.《Semantic Generative Augmentations for Few-Shot Counting》

paper: 2311.16122v1 (arxiv.org)

code: perladoubinsky/SemAug: [WACV 2024] Official implementation of the paper: Semantic Generative Augmentations for Few-shot Counting (github.com)

With the availability of powerful text-to-image diffusion models, recent works have explored the use of synthetic data to improve image classification performances. These works show that it can effectively augment or even replace real data. In this work, we investigate how synthetic data can benefit few-shot class-agnostic counting. This requires generating images that correspond to a given input number of objects. However, text-to-image models struggle to grasp the notion of count. We propose to rely on a double conditioning of Stable Diffusion with both a prompt and a density map in order to augment a training dataset for few-shot counting. Due to the small dataset size, the fine-tuned model tends to generate images close to the training images. We propose to enhance the diversity of synthesized images by exchanging captions between images thus creating unseen configurations of object types and spatial layout. Our experiments show that our diversified generation strategy significantly improves the counting accuracy of two recent and performing few-shot counting models on FSC147 and CARPK.

To tackle few-shot counting, we propose to synthesize unseen data with Stable Diffusion conditioned by both a textual prompt and a density map. We thus build an augmented FSC dataset that is used to train a deep counting network. The double conditioning, implemented with ControlNet [42], allows us to generate novel synthetic images with a precise control, preserving the ground truth for the counting task. It deals well with large numbers of objects, while current methods fail in such cases [19, 27]. To increase the diversity of the augmented training set, we swap image descriptions between the n available training samples, leading to n(n−1)/2 novel couples, each being the source of several possible synthetic images. However, we show that some combinations do not make sense and lead to poor quality samples. Therefore, we only select plausible pairs, resulting in improved augmentation quality. We evaluate our approach on two class-agnostic counting networks, namely SAFECount [41] and CounTR [6]. We show that it significantly improves the performances on the benchmark dataset FSC147 [28] and allows for a better generalization on the CARPK dataset [14].
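The caption-swapping step can be sketched as enumerating unordered pairs of training samples, which is where the n(n−1)/2 figure comes from. The code below is a rough illustration, not the authors' pipeline: the sample dictionaries, the `caption_swap_pairs` helper, and especially the plausibility rule (object counts within a factor of two) are hypothetical stand-ins for whatever criterion the paper actually uses.

```python
from itertools import combinations


def caption_swap_pairs(samples, is_plausible=None):
    """Enumerate unordered sample pairs whose captions could be exchanged.

    With n samples there are n * (n - 1) / 2 unordered pairs. The optional
    is_plausible callback (hypothetical) filters out pairs that would not
    make sense together.
    """
    pairs = list(combinations(range(len(samples)), 2))
    if is_plausible is not None:
        pairs = [(i, j) for i, j in pairs if is_plausible(samples[i], samples[j])]
    return pairs


def similar_count(a, b, max_ratio=2.0):
    """Hypothetical plausibility rule: the object counts differ by at most 2x."""
    low, high = sorted((a["count"], b["count"]))
    return high <= max_ratio * low


# Toy training set; captions and counts are made up for illustration.
samples = [
    {"caption": "a tray of apples", "count": 10},
    {"caption": "eggs in a carton", "count": 12},
    {"caption": "a flock of birds in the sky", "count": 200},
    {"caption": "bottles on a shelf", "count": 15},
    {"caption": "a crowd of sheep in a field", "count": 180},
]

all_pairs = caption_swap_pairs(samples)
kept_pairs = caption_swap_pairs(samples, is_plausible=similar_count)

n = len(samples)
print(len(all_pairs), n * (n - 1) // 2)  # both are 10
print(kept_pairs)
```

For each kept pair, the caption of one sample would be combined with the density map of the other as conditioning, so the ground-truth count carried by the density map stays valid for the generated image.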

✔️7.CounTR: Transformer-based Generalised Visual Counting

paper: 2208.13721v3 (arxiv.org)

code: Verg-Avesta/CounTR: CounTR: Transformer-based Generalised Visual Counting (github.com)

Commentary: CounTR: Transformer-based Generalised Visual Counting - 郑之杰的个人网站 (0809zheng.github.io)

To summarise, in this paper, we make four contributions: First, we introduce an architecture for generalised visual object counting based on transformer, termed as CounTR (pronounced as counter). It exploits the attention mechanisms to explicitly capture the similarity between image patches, or with the few-shot instance “exemplars” provided by the end user; Second, we adopt a two-stage training regime (self-supervised pre-training, followed by supervised fine-tuning) and show its effectiveness for the task of visual counting; Third, we propose a simple yet scalable pipeline for synthesizing training images with a large number of instances, and demonstrate that it can significantly improve the performance on images containing a large number of object instances; Fourth, we conduct thorough ablation studies on the large-scale counting benchmark, e.g. FSC-147 [24], and demonstrate state-of-the-art performance on both zero-shot and few-shot settings, improving the previous best approach by over 18.3% on the mean absolute error of the test set.
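Since the results above are quoted in terms of mean absolute error, a small worked example of the metric may help. This is a generic sketch with made-up predicted and ground-truth counts; the helper name `mae_rmse` is hypothetical, and RMSE is included only because counting papers commonly report it next to MAE.

```python
import numpy as np


def mae_rmse(pred_counts, gt_counts):
    """Mean absolute error and root mean squared error over per-image counts."""
    pred = np.asarray(pred_counts, dtype=float)
    gt = np.asarray(gt_counts, dtype=float)
    err = pred - gt
    mae = float(np.mean(np.abs(err)))
    rmse = float(np.sqrt(np.mean(err ** 2)))
    return mae, rmse


# Made-up predictions for three test images.
pred_counts = [10.0, 52.0, 7.0]
gt_counts = [12, 50, 7]

mae, rmse = mae_rmse(pred_counts, gt_counts)
print(f"MAE={mae:.3f} RMSE={rmse:.3f}")
```

Lower is better for both numbers, so an "improvement" on MAE means the value goes down.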

8.《Semantic Generative Augmentations for Few-Shot Counting》

paper: https://arxiv.org/pdf/2311.16122v1

code: perladoubinsky/SemAug: [WACV 2024] Official implementation of the paper: Semantic Generative Augmentations for Few-shot Counting (github.com)

Commentary: this entry is the same paper as entry 6 above; its abstract and contribution summary are given there.

9.《Scale-Prior Deformable Convolution for Exemplar-Guided Class-Agnostic Counting》

paper: 0313.pdf (mpg.de)

code: Elin24/SPDCN-CAC: BMVC-2022 paper "Scale-Prior Deformable Convolution for Class-Agnostic Counting"(https://bmvc2022.mpi-inf.mpg.de/313) (github.com)

Commentary:

Scale-Prior Deformable Convolution for Exemplar-Guided Class-Agnostic Counting - 郑之杰的个人网站 (0809zheng.github.io)

Class-agnostic counting has recently emerged as a more practical counting task, which aims to predict the number and distribution of any exemplar objects, instead of counting specific categories like pedestrians or cars. However, recent methods are developed by designing suitable similarity matching rules between exemplars and query images, but ignoring the robustness of extracted features. To address this issue, we propose a scale-prior deformable convolution by integrating exemplars' information, e.g., scale, into the counting network backbone. As a result, the proposed counting network can extract semantic features of objects similar to the given exemplars and effectively filter irrelevant backgrounds. Besides, we find that traditional L2 and generalized loss are not suitable for class-agnostic counting due to the variety of object scales in different samples. Here we propose a scale-sensitive generalized loss to tackle this problem. It can adjust the cost function formulation according to the given exemplars, making the difference between prediction and ground truth more prominent. Extensive experiments show that our model obtains remarkable improvement and achieves state-of-the-art performance on a public class-agnostic counting benchmark. The source code is available at https://github.com/Elin24/SPDCN-CAC.

To summarize, the key contributions of this paper are:

• To address class-agnostic counting, we propose a scale-prior deformable network to better extract exemplar-related features, followed by a segmentation-then-counting stage to count objects.

• We propose a scale-sensitive generalized loss to make the model training adaptive to objects of different sizes, boosting the performance and generalization of trained models.

• Extensive experiments and visualizations demonstrate these two designs work well, and outstanding performance is obtained when our model is tested on benchmarks.

10.《Few-shot Object Counting with Similarity-Aware Feature Enhancement》

paper: 2201.08959v5 (arxiv.org)

code: zhiyuanyou/SAFECount: [WACV 2023] Few-shot Object Counting with Similarity-Aware Feature Enhancement (github.com)

Commentary:

Few-shot Object Counting with Similarity-Aware Feature Enhancement - 郑之杰的个人网站 (0809zheng.github.io)

This work studies the problem of few-shot object counting, which counts the number of exemplar objects (i.e., described by one or several support images) occurring in the query image. The major challenge lies in that the target objects can be densely packed in the query image, making it hard to recognize every single one. To tackle the obstacle, we propose a novel learning block, equipped with a similarity comparison module and a feature enhancement module. Concretely, given a support image and a query image, we first derive a score map by comparing their projected features at every spatial position. The score maps regarding all support images are collected together and normalized across both the exemplar dimension and the spatial dimensions, producing a reliable similarity map. We then enhance the query feature with the support features by employing the developed point-wise similarities as the weighting coefficients. Such a design encourages the model to inspect the query image by focusing more on the regions akin to the support images, leading to much clearer boundaries between different objects. Extensive experiments on various benchmarks and training setups suggest that we surpass the state-of-the-art methods by a sufficiently large margin. For instance, on a recent large-scale FSC-147 dataset, we surpass the state-of-the-art method by improving the mean absolute error from 22.08 to 14.32 (35%↑). Code has been released.
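The "35%" quoted above is a relative reduction in mean absolute error, so the error goes down from 22.08 to 14.32. A quick check of the arithmetic, using a hypothetical helper name:

```python
def relative_error_reduction(old_mae, new_mae):
    """Fraction by which the error dropped relative to the old value."""
    return (old_mae - new_mae) / old_mae


reduction = relative_error_reduction(22.08, 14.32)
print(f"{reduction:.1%}")  # 35.1%
```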

In this work, we propose a Similarity-Aware Feature Enhancement block for object Counting (SAFECount). As discussed above, feature is more informative while similarity better captures the support-query relationship. Our novel block adequately integrates both of the advantages by exploiting similarity as a guidance to enhance the features for regression. Intuitively, the enhanced feature not only carries the rich semantics extracted from the image, but also gets aware of which regions within the query image are similar to the exemplar object. Specifically, we come up with a similarity comparison module (SCM) and a feature enhancement module (FEM), as illustrated in Fig. 2c. On one hand, different from the naive feature comparison in Fig. 2b, our SCM learns a feature projection, then performs a comparison on the projected features to derive a score map. This design helps select from features the information that is most appropriate for object counting. After the comparison, we derive a reliable similarity map by collecting the score maps with respect to all support images (i.e., few-shot) and normalizing them along both the exemplar dimension and the spatial dimensions. On the other hand, the FEM takes the point-wise similarities as the weighting coefficients, and fuses the support features into the query feature. Such a fusion is able to make the enhanced query feature focus more on the regions akin to the exemplar object defined by support images, facilitating more precise counting.
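The two modules described above can be approximated in a few lines to show the data flow. This is a heavily simplified sketch and not the SAFECount code: it assumes each support image is pooled to a single feature vector, uses softmax for both normalizations, multiplies the two normalized maps together, and fuses with a plain residual sum. The names `similarity_map`, `enhance_query`, and `proj` are hypothetical.

```python
import numpy as np


def softmax(x, axis):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def similarity_map(query_feat, support_vecs, proj):
    """SCM-like step: project features, compare, then normalize the scores.

    query_feat:   (C, H, W) query feature map
    support_vecs: (K, C) one pooled feature vector per support image (assumption)
    proj:         (C, D) projection matrix (learned in practice, random here)
    returns:      (K, H, W) similarity map
    """
    q = np.einsum("chw,cd->dhw", query_feat, proj)          # (D, H, W)
    s = support_vecs @ proj                                 # (K, D)
    scores = np.einsum("kd,dhw->khw", s, q) / np.sqrt(proj.shape[1])
    k, h, w = scores.shape
    exemplar_norm = softmax(scores, axis=0)                 # across support images
    spatial_norm = softmax(scores.reshape(k, -1), axis=1).reshape(k, h, w)
    return exemplar_norm * spatial_norm


def enhance_query(query_feat, support_vecs, sim):
    """FEM-like step: use similarities as weights to fuse support features."""
    fused = np.einsum("khw,kc->chw", sim, support_vecs)     # (C, H, W)
    return query_feat + fused                               # residual fusion (assumption)


rng = np.random.default_rng(0)
query_feat = rng.normal(size=(16, 8, 8))
support_vecs = rng.normal(size=(3, 16))   # three support images
proj = rng.normal(size=(16, 4))

sim = similarity_map(query_feat, support_vecs, proj)
enhanced = enhance_query(query_feat, support_vecs, sim)
print(sim.shape, enhanced.shape)  # (3, 8, 8) (16, 8, 8)
```

Locations with high similarity receive a larger share of the support features, which is the sense in which the enhanced query feature "focuses" on exemplar-like regions.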

Experimental results on a very recent large-scale FSC dataset, FSC-147 [24], and a car counting dataset, CARPK [10], demonstrate our substantial improvement over state-of-the-art methods. Through visualizing the intermediate similarity map and the final predicted density map, we find that our SAFECount substantially benefits from the clear boundaries learned between objects, even when they are densely packed in the query image.
