1. [Backbone Architecture] (WACV 2024) SBCFormer: Lightweight Network Capable of Full-size ImageNet Classification at 1 FPS on Single Board Computers
- Paper: https://arxiv.org/pdf/2311.03747
- Code: GitHub - xyongLu/SBCFormer
2. [Backbone Architecture: Transformer] FLORA: Fine-grained Low-Rank Architecture Search for Vision Transformer
- Paper: https://arxiv.org/pdf/2311.03912
- Code: GitHub - shadowpa0327/FLORA
3. [Backbone Architecture: Transformer] Mini but Mighty: Finetuning ViTs with Mini Adapters
- Paper: https://arxiv.org/pdf/2311.03873
- Code (coming soon): GitHub - IemProg/MiMi
4. [Image Classification] A Simple Interpretable Transformer for Fine-Grained Image Classification and Analysis
- Paper: https://arxiv.org/pdf/2311.04157
- Code: GitHub - Imageomics/INTR (official implementation of INTR: Interpretable Transformer for Fine-grained Image Classification)
5. [Anomaly Detection] Image-Pointcloud Fusion based Anomaly Detection using PD-REAL Dataset
- Paper: https://arxiv.org/pdf/2311.04095
- Code (coming soon): GitHub - Andy-cs008/PD-REAL
6. [Lane Detection] Augmenting Lane Perception and Topology Understanding with Standard Definition Navigation Maps
- Paper: https://arxiv.org/pdf/2311.04079
- Code (coming soon): https://github.com/NVlabs/SMERF
7. [3D Object Detection] mmFUSION: Multimodal Fusion for 3D Objects Detection
- Paper: https://arxiv.org/pdf/2311.04058
- Code: to be released
8. [Person Re-identification] Multi-view Information Integration and Propagation for Occluded Person Re-identification
- Paper: https://arxiv.org/pdf/2311.03828
- Code: GitHub - nengdong96/MVIIP
9. [Multimodal] OtterHD: A High-Resolution Multi-modality Model
- Paper: https://arxiv.org/pdf/2311.04219
- Code: GitHub - Luodian/Otter (Otter, a multi-modal model based on OpenFlamingo, the open-sourced version of DeepMind's Flamingo, trained on MIMIC-IT)
10. [Multimodal] (WACV 2024) Enhancing Multimodal Compositional Reasoning of Visual Language Models with Generative Negative Mining
- Paper: https://arxiv.org/pdf/2311.03964
- Project page: Enhancing Multimodal Compositional Reasoning of Visual Language Models with Generative Negative Mining
- Code: GitHub - ugorsahin/Generative-Negative-Mining
11. [Multi-task Learning] Data exploitation: multi-task learning of object detection and semantic segmentation on partially annotated data
- Paper: https://arxiv.org/pdf/2311.04040
- Code (coming soon): https://github.com/lhoangan/multas
12. [Diffusion] I2VGen-XL: High-Quality Image-to-Video Synthesis via Cascaded Diffusion Models
- Paper: https://arxiv.org/pdf/2311.04145
- Project page: I2VGen-XL
- Code: GitHub - damo-vilab/i2vgen-xl (official repo)
13. [NeRF] UP-NeRF: Unconstrained Pose-Prior-Free Neural Radiance Fields
- Paper: https://arxiv.org/pdf/2311.03784
- Code (coming soon): https://github.com/mlvlab/UP-NeRF