CV Computer Vision Daily Open-Source Code: Paper with Code Roundup - 2023.11.8

news 2026/2/14 16:29:05

The papers have been bundled for download: click through to the download page

Click to join the CV Computer Vision discussion group

1. [Backbone Architecture] (WACV 2024) SBCFormer: Lightweight Network Capable of Full-size ImageNet Classification at 1 FPS on Single Board Computers

Paper: https://arxiv.org/pdf/2311.03747
Code: GitHub - xyongLu/SBCFormer

2. [Backbone Architecture: Transformer] FLORA: Fine-grained Low-Rank Architecture Search for Vision Transformer

Paper: https://arxiv.org/pdf/2311.03912
Code: GitHub - shadowpa0327/FLORA

3. [Backbone Architecture: Transformer] Mini but Mighty: Finetuning ViTs with Mini Adapters

Paper: https://arxiv.org/pdf/2311.03873
Code (coming soon): GitHub - IemProg/MiMi

4. [Image Classification] A Simple Interpretable Transformer for Fine-Grained Image Classification and Analysis

Paper: https://arxiv.org/pdf/2311.04157
Code: GitHub - Imageomics/INTR (official implementation of INTR: Interpretable Transformer for Fine-grained Image Classification)

5. [Anomaly Detection] Image-Pointcloud Fusion based Anomaly Detection using PD-REAL Dataset

Paper: https://arxiv.org/pdf/2311.04095
Code (coming soon): GitHub - Andy-cs008/PD-REAL

6. [Lane Detection] Augmenting Lane Perception and Topology Understanding with Standard Definition Navigation Maps

Paper: https://arxiv.org/pdf/2311.04079
Code (coming soon): https://github.com/NVlabs/SMERF

7. [3D Object Detection] mmFUSION: Multimodal Fusion for 3D Objects Detection

Paper: https://arxiv.org/pdf/2311.04058
Code: coming soon

8. [Person Re-identification] Multi-view Information Integration and Propagation for Occluded Person Re-identification

Paper: https://arxiv.org/pdf/2311.03828
Code: GitHub - nengdong96/MVIIP

9. [Multimodal] OtterHD: A High-Resolution Multi-modality Model

Paper: https://arxiv.org/pdf/2311.04219
Code: GitHub - Luodian/Otter: 🦦 Otter, a multi-modal model based on OpenFlamingo (open-sourced version of DeepMind's Flamingo), trained on MIMIC-IT and showcasing improved instruction-following and in-context learning ability.

10. [Multimodal] (WACV 2024) Enhancing Multimodal Compositional Reasoning of Visual Language Models with Generative Negative Mining

Paper: https://arxiv.org/pdf/2311.03964
Project page: Enhancing Multimodal Compositional Reasoning of Visual Language Models with Generative Negative Mining
Code: GitHub - ugorsahin/Generative-Negative-Mining

11. [Multi-task Learning] Data exploitation: multi-task learning of object detection and semantic segmentation on partially annotated data

Paper: https://arxiv.org/pdf/2311.04040
Code (coming soon): https://github.com/lhoangan/multas

12. [Diffusion] I2VGen-XL: High-Quality Image-to-Video Synthesis via Cascaded Diffusion Models

Paper: https://arxiv.org/pdf/2311.04145
Project page: I2VGen-XL
Code: GitHub - damo-vilab/i2vgen-xl (official repo)

13. [NeRF] UP-NeRF: Unconstrained Pose-Prior-Free Neural Radiance Fields

Paper: https://arxiv.org/pdf/2311.03784
Code (coming soon): https://github.com/mlvlab/UP-NeRF

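CV Computer Vision Discussion Group

The group brings together experienced practitioners from many areas, including object detection, image segmentation, object tracking, Transformer, multimodal learning, NeRF, GAN, defect detection, salient object detection, keypoint detection, super-resolution reconstruction, SLAM, face analysis, OCR, biomedical imaging, 3D reconstruction, pose estimation, autonomous driving perception, depth estimation, video understanding, action recognition, image dehazing, image deraining, image inpainting, image retrieval, lane detection, point cloud object detection, point cloud segmentation, image compression, motion prediction, neural network quantization, and network deployment. Members share technical knowledge, interview tips, and referral and job openings from time to time.

To join the group, add the administrator on WeChat: PingShanHai666. When sending the friend request, include a note in the form: school/company + research direction + nickname.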

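Recommended reading:

CV Computer Vision Daily Open-Source Code: Paper with Code Roundup - 2023.11.7

CV Computer Vision Daily Open-Source Code: Paper with Code Roundup - 2023.11.6

CV Computer Vision Daily Open-Source Code: Paper with Code Roundup - 2023.11.3

CV Computer Vision Daily Open-Source Code: Paper with Code Roundup - 2023.11.2

CV Computer Vision Daily Open-Source Code: Paper with Code Roundup - 2023.11.1

CV Computer Vision Daily Open-Source Code: Paper with Code Roundup - 2023.10.31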