Deploy DeepSeek's Janus Pro locally in half an hour for image analysis and text-to-image generation
- Download the Janus Pro source code
- Download the model files
- Create a Python virtual environment
- Install dependencies
- Janus Pro testing
  - Run the program
  - Image analysis test
  - Text-to-image test
    - Using Chinese prompts
    - Using English prompts
Test impressions:
- The model is fairly small overall, so individuals can deploy and use it.
- Image recognition works well: it handles not only ordinary photos but also diagram-style content.
- VRAM usage is modest; 24 GB of VRAM is enough for both image recognition and image generation.
- Deployment is relatively simple; not counting model download time, a basic test can be running within half an hour.
- For text images with complex layouts (such as exam papers), OCR-style recognition has fairly serious problems.
- Image generation only works with English prompts.
- When the prompt is too simple, generated objects can come out incomplete.
Overall, it is a very solid offering among open-source multimodal large models.
Download the Janus Pro source code
Project page:
https://github.com/deepseek-ai/Janus
Clone the source code:
git clone https://github.com/deepseek-ai/Janus.git
Example session:
(base) ubuntu@ubuntu-server:~/study$ git clone https://github.com/deepseek-ai/Janus.git
Cloning into 'Janus'...
remote: Enumerating objects: 121, done.
remote: Counting objects: 100% (74/74), done.
remote: Compressing objects: 100% (38/38), done.
remote: Total 121 (delta 51), reused 36 (delta 36), pack-reused 47 (from 2)
Receiving objects: 100% (121/121), 7.19 MiB | 3.20 MiB/s, done.
Resolving deltas: 100% (57/57), done.
(base) ubuntu@ubuntu-server:~/study$
Download the model files
Model download pages:
https://hf-mirror.com/deepseek-ai/Janus-Pro-7B
https://modelscope.cn/models/deepseek-ai/Janus-Pro-7B
Download all of the model files to a local directory.
You can also clone the model with git clone.
After changing into the source code directory, run the clone command below.
git clone https://hf-mirror.com/deepseek-ai/Janus-Pro-7B
This is simple, but it also pulls down the .git directory, so it can take longer. Note that this route relies on Git LFS being set up (git lfs install); without it, the large .bin weight files arrive as small pointer files.
Notes:
- The model can be downloaded to any local directory; the code later specifies the model path. Downloading it into the source tree lets you use a relative path.
- It is done this way here mainly because it is simple; manually downloading each file is the fastest approach.
- You can also download the model with the ModelScope client (a client has to be installed first); this is the better option.
For example, to download the model via ModelScope:
pip install modelscope
modelscope download --model deepseek-ai/Janus-Pro-7B --local_dir ./model_dir
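For reference, the same download can also be scripted with the ModelScope Python API. This is only a rough sketch: the local_dir argument mirrors the --local_dir CLI flag above and may differ across modelscope versions.
# Download Janus-Pro-7B via the ModelScope Python API instead of the CLI.
# snapshot_download returns the directory that ends up containing the model files.
from modelscope import snapshot_download

model_dir = snapshot_download("deepseek-ai/Janus-Pro-7B", local_dir="./model_dir")
print(model_dir)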
The model download session (via git clone) looks like this:
(base) ubuntu@ubuntu-server:~/study$ cd Janus/
(base) ubuntu@ubuntu-server:~/study/Janus$ pwd
/home/ubuntu/study/Janus
(base) ubuntu@ubuntu-server:~/study/Janus$ git clone https://hf-mirror.com/deepseek-ai/Janus-Pro-7B
Cloning into 'Janus-Pro-7B'...
remote: Enumerating objects: 24, done.
remote: Counting objects: 100% (20/20), done.
remote: Compressing objects: 100% (20/20), done.
remote: Total 24 (delta 4), reused 0 (delta 0), pack-reused 4 (from 1)
Unpacking objects: 100% (24/24), 1.95 MiB | 1.24 MiB/s, done.
Filtering content: 100% (2/2), 1.82 GiB | 652.00 KiB/s, done.
Encountered 2 file(s) that may not have been copied correctly on Windows:
pytorch_model-00002-of-00002.bin
pytorch_model-00001-of-00002.bin
See: `git lfs help smudge` for more details.
(base) ubuntu@ubuntu-server:~/study/Janus$
Directory structure after the download completes:
(base) ubuntu@ubuntu-server:~/study/Janus$ tree
.
├── demo
│   ├── app_janusflow.py
│   ├── app_januspro.py
│   ├── app.py
│   ├── fastapi_app.py
│   ├── fastapi_client.py
│   └── Janus_colab_demo.ipynb
├── generation_inference.py
├── images
│   ├── badge.svg
│   ├── doge.png
│   ├── equation.png
│   ├── logo.png
│   ├── logo.svg
│   ├── pie_chart.png
│   ├── teaser_janusflow.png
│   ├── teaser_januspro.png
│   ├── teaser.png
│   └── ve.png
├── inference.py
├── interactivechat.py
├── janus
│   ├── __init__.py
│   ├── janusflow
│   │   ├── __init__.py
│   │   └── models
│   │       ├── clip_encoder.py
│   │       ├── image_processing_vlm.py
│   │       ├── __init__.py
│   │       ├── modeling_vlm.py
│   │       ├── processing_vlm.py
│   │       ├── siglip_vit.py
│   │       └── uvit.py
│   ├── models
│   │   ├── clip_encoder.py
│   │   ├── image_processing_vlm.py
│   │   ├── __init__.py
│   │   ├── modeling_vlm.py
│   │   ├── processing_vlm.py
│   │   ├── projector.py
│   │   ├── siglip_vit.py
│   │   └── vq_model.py
│   └── utils
│       ├── conversation.py
│       ├── __init__.py
│       └── io.py
├── Janus-Pro-7B
│   ├── config.json
│   ├── janus_pro_teaser1.png
│   ├── janus_pro_teaser2.png
│   ├── preprocessor_config.json
│   ├── processor_config.json
│   ├── pytorch_model-00001-of-00002.bin
│   ├── pytorch_model-00002-of-00002.bin
│   ├── pytorch_model.bin.index.json
│   ├── README.md
│   ├── special_tokens_map.json
│   ├── tokenizer_config.json
│   └── tokenizer.json
├── janus_pro_tech_report.pdf
├── LICENSE-CODE
├── LICENSE-MODEL
├── Makefile
├── pyproject.toml
├── README.md
└── requirements.txt
Create a Python virtual environment
conda create -n janus_pro python=3.10
conda activate janus_pro
The process:
(base) ubuntu@ubuntu-server:~/study/Janus$ conda create -n janus_pro python=3.10
Retrieving notices: ...working... done
Collecting package metadata (current_repodata.json): done
Solving environment: done
==> WARNING: A newer version of conda exists. <==
current version: 22.11.1
latest version: 25.1.1
Please update conda by running
$ conda update -n base -c defaults conda
Or to minimize the number of packages updated during conda update use
conda install conda=25.1.1
## Package Plan ##
environment location: /home/ubuntu/miniconda3/envs/janus_pro
added / updated specs:
- python=3.10
The following NEW packages will be INSTALLED:
_libgcc_mutex pkgs/main/linux-64::_libgcc_mutex-0.1-main
_openmp_mutex pkgs/main/linux-64::_openmp_mutex-5.1-1_gnu
bzip2 pkgs/main/linux-64::bzip2-1.0.8-h5eee18b_6
ca-certificates pkgs/main/linux-64::ca-certificates-2024.12.31-h06a4308_0
ld_impl_linux-64 pkgs/main/linux-64::ld_impl_linux-64-2.40-h12ee557_0
libffi pkgs/main/linux-64::libffi-3.4.4-h6a678d5_1
libgcc-ng pkgs/main/linux-64::libgcc-ng-11.2.0-h1234567_1
libgomp pkgs/main/linux-64::libgomp-11.2.0-h1234567_1
libstdcxx-ng pkgs/main/linux-64::libstdcxx-ng-11.2.0-h1234567_1
libuuid pkgs/main/linux-64::libuuid-1.41.5-h5eee18b_0
ncurses pkgs/main/linux-64::ncurses-6.4-h6a678d5_0
openssl pkgs/main/linux-64::openssl-3.0.15-h5eee18b_0
pip pkgs/main/linux-64::pip-25.0-py310h06a4308_0
python pkgs/main/linux-64::python-3.10.16-he870216_1
readline pkgs/main/linux-64::readline-8.2-h5eee18b_0
setuptools pkgs/main/linux-64::setuptools-75.8.0-py310h06a4308_0
sqlite pkgs/main/linux-64::sqlite-3.45.3-h5eee18b_0
tk pkgs/main/linux-64::tk-8.6.14-h39e8969_0
tzdata pkgs/main/noarch::tzdata-2025a-h04d1e81_0
wheel pkgs/main/linux-64::wheel-0.45.1-py310h06a4308_0
xz pkgs/main/linux-64::xz-5.4.6-h5eee18b_1
zlib pkgs/main/linux-64::zlib-1.2.13-h5eee18b_1
Proceed ([y]/n)? y
Downloading and Extracting Packages
Preparing transaction: done
Verifying transaction: done
Executing transaction: done
#
# To activate this environment, use
#
# $ conda activate janus_pro
#
# To deactivate an active environment, use
#
# $ conda deactivate
(base) ubuntu@ubuntu-server:~/study/Janus$ conda activate janus_pro
(janus_pro) ubuntu@ubuntu-server:~/study/Janus$
Install dependencies
pip install -r requirements.txt
Installed packages after setup:
(janus_pro) ubuntu@ubuntu-server:~/study/Janus$ pip list
Package Version
------------------------- -----------
accelerate 1.3.0
aiofiles 23.2.1
altair 5.5.0
annotated-types 0.7.0
anyio 4.8.0
attrdict 2.0.1
attrs 25.1.0
certifi 2025.1.31
charset-normalizer 3.4.1
click 8.1.8
cmake 3.31.4
colorama 0.4.5
contourpy 1.3.1
cycler 0.12.1
einops 0.8.1
exceptiongroup 1.2.2
fastapi 0.115.8
ffmpy 0.5.0
filelock 3.17.0
fonttools 4.56.0
fsspec 2025.2.0
gradio 3.48.0
gradio_client 0.6.1
h11 0.14.0
httpcore 1.0.7
httpx 0.28.1
huggingface-hub 0.28.1
idna 3.10
importlib_resources 6.5.2
Jinja2 3.1.5
jsonschema 4.23.0
jsonschema-specifications 2024.10.1
kiwisolver 1.4.8
latex2mathml 3.77.0
lit 18.1.8
Markdown 3.4.1
MarkupSafe 2.1.5
matplotlib 3.10.0
mdtex2html 1.3.0
mpmath 1.3.0
narwhals 1.26.0
networkx 3.4.2
numpy 1.26.4
nvidia-cublas-cu11 11.10.3.66
nvidia-cuda-cupti-cu11 11.7.101
nvidia-cuda-nvrtc-cu11 11.7.99
nvidia-cuda-runtime-cu11 11.7.99
nvidia-cudnn-cu11 8.5.0.96
nvidia-cufft-cu11 10.9.0.58
nvidia-curand-cu11 10.2.10.91
nvidia-cusolver-cu11 11.4.0.1
nvidia-cusparse-cu11 11.7.4.91
nvidia-nccl-cu11 2.14.3
nvidia-nvtx-cu11 11.7.91
orjson 3.10.15
packaging 24.2
pandas 2.2.3
pillow 10.4.0
pip 25.0
psutil 6.1.1
pydantic 2.10.6
pydantic_core 2.27.2
pydub 0.25.1
Pygments 2.12.0
pyparsing 3.2.1
pypinyin 0.50.0
python-dateutil 2.9.0.post0
python-multipart 0.0.20
pytz 2025.1
PyYAML 6.0.2
referencing 0.36.2
regex 2024.11.6
requests 2.32.3
rpds-py 0.22.3
safetensors 0.5.2
semantic-version 2.10.0
sentencepiece 0.1.96
setuptools 75.8.0
six 1.17.0
sniffio 1.3.1
starlette 0.45.3
sympy 1.13.3
tiktoken 0.5.2
timm 1.0.14
tokenizers 0.21.0
torch 2.0.1
torchvision 0.15.2
tqdm 4.64.0
transformers 4.48.3
triton 2.0.0
typing_extensions 4.12.2
tzdata 2025.1
urllib3 2.3.0
uvicorn 0.34.0
websockets 11.0.3
wheel 0.45.1
(janus_pro) ubuntu@ubuntu-server:~/study/Janus$
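Optionally, a quick sanity check (not part of the original steps) to confirm the environment can see the GPU before loading the 7B model:
# Run inside the janus_pro environment: confirm torch and CUDA are usable.
import torch

print(torch.__version__)          # 2.0.1 at this point; it gets upgraded later in this post
print(torch.cuda.is_available())  # should print True with a working NVIDIA driver
print(torch.cuda.get_device_name(0) if torch.cuda.is_available() else "no GPU found")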
Note: with the environment installed this way, image analysis will fail later with an error (the torch fix is covered below).
Janus Pro testing
Run the program
(janus_pro) ubuntu@ubuntu-server:~/study/Janus/demo$ pwd
/home/ubuntu/study/Janus/demo
(janus_pro) ubuntu@ubuntu-server:~/study/Janus/demo$ cat app_januspro.py | grep model_path
model_path = "deepseek-ai/Janus-Pro-7B"
config = AutoConfig.from_pretrained(model_path)
vl_gpt = AutoModelForCausalLM.from_pretrained(model_path,
vl_chat_processor = VLChatProcessor.from_pretrained(model_path)
(janus_pro) ubuntu@ubuntu-server:~/study/Janus/demo$
Modify the model path to point at the locally downloaded weights (see the sketch below).
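A minimal sketch of the change, assuming the model was cloned into the repo root as Janus-Pro-7B (adjust the path if you downloaded it elsewhere):
# demo/app_januspro.py: replace the Hugging Face repo id with the local model directory.
# The relative path is resolved against the directory the script is run from (the repo root here);
# the AutoConfig / AutoModelForCausalLM / VLChatProcessor calls below it stay unchanged.
model_path = "./Janus-Pro-7B"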
Running it directly at this point throws an error:
(janus_pro) ubuntu@ubuntu-server:~/study/Janus/demo$ python app_januspro.py
Traceback (most recent call last):
File "/home/ubuntu/study/Janus/demo/app_januspro.py", line 4, in <module>
from janus.models import MultiModalityCausalLM, VLChatProcessor
ModuleNotFoundError: No module named 'janus'
(janus_pro) ubuntu@ubuntu-server:~/study/Janus/demo$
Fix it by installing the repo itself in editable mode from the repo root, which puts the janus package on the import path:
pip install -e .
Example session:
(janus_pro) ubuntu@ubuntu-server:~/study/Janus/demo$ cd ..
(janus_pro) ubuntu@ubuntu-server:~/study/Janus$ pip install -e .
Looking in indexes: https://mirrors.tuna.tsinghua.edu.cn/pypi/web/simple
Obtaining file:///home/ubuntu/study/Janus
Installing build dependencies ... done
Checking if build backend supports build_editable ... done
Getting requirements to build editable ... done
Preparing editable metadata (pyproject.toml) ... done
Requirement already satisfied: torch>=2.0.1 in /home/ubuntu/miniconda3/envs/janus_pro/lib/python3.10/site-packages (from janus==1.0.0) (2.0.1)
Requirement already satisfied: transformers>=4.38.2 in /home/ubuntu/miniconda3/envs/janus_pro/lib/python3.10/site-packages (from janus==1.0.0) (4.48.3)
......
Requirement already satisfied: pillow!=8.3.*,>=5.3.0 in /home/ubuntu/miniconda3/envs/janus_pro/lib/python3.10/site-packages (from torchvision->timm>=0.9.16->janus==1.0.0) (10.4.0)
Building wheels for collected packages: janus
Building editable for janus (pyproject.toml) ... done
Created wheel for janus: filename=janus-1.0.0-0.editable-py3-none-any.whl size=15963 sha256=0c33ddac3d9b813d27f9c12d27d1397f5b19aefd3f8b91ed3727361dd996759b
Stored in directory: /tmp/pip-ephem-wheel-cache-g69rs9_z/wheels/e8/bc/98/51f151a24aecfd261ac28237115419ab4b5e2f876035dd9a46
Successfully built janus
Installing collected packages: janus
Successfully installed janus-1.0.0
(janus_pro) ubuntu@ubuntu-server:~/study/Janus$
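A quick way to confirm the editable install worked; this is exactly the import that failed above:
# Should print the two classes instead of raising ModuleNotFoundError.
from janus.models import MultiModalityCausalLM, VLChatProcessor
print(MultiModalityCausalLM, VLChatProcessor)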
Run the program from the /home/ubuntu/study/Janus directory (the working directory matters here, e.g. because a relative model path is resolved against it):
(janus_pro) ubuntu@ubuntu-server:~/study/Janus$ pwd
/home/ubuntu/study/Janus
(janus_pro) ubuntu@ubuntu-server:~/study/Janus$ python demo/app_januspro.py
Python version is above 3.10, patching the collections module.
/home/ubuntu/miniconda3/envs/janus_pro/lib/python3.10/site-packages/torchvision/datapoints/__init__.py:12: UserWarning: The torchvision.datapoints and torchvision.transforms.v2 namespaces are still Beta. While we do not expect major breaking changes, some APIs may still change according to user feedback. Please submit any feedback you may have in this issue: https://github.com/pytorch/vision/issues/6753, and you can also check out https://github.com/pytorch/vision/issues/7319 to learn more about the APIs that we suspect might involve future changes. You can silence this warning by calling torchvision.disable_beta_transforms_warning().
warnings.warn(_BETA_TRANSFORMS_WARNING)
/home/ubuntu/miniconda3/envs/janus_pro/lib/python3.10/site-packages/torchvision/transforms/v2/__init__.py:54: UserWarning: The torchvision.datapoints and torchvision.transforms.v2 namespaces are still Beta. While we do not expect major breaking changes, some APIs may still change according to user feedback. Please submit any feedback you may have in this issue: https://github.com/pytorch/vision/issues/6753, and you can also check out https://github.com/pytorch/vision/issues/7319 to learn more about the APIs that we suspect might involve future changes. You can silence this warning by calling torchvision.disable_beta_transforms_warning().
warnings.warn(_BETA_TRANSFORMS_WARNING)
/home/ubuntu/miniconda3/envs/janus_pro/lib/python3.10/site-packages/transformers/models/auto/image_processing_auto.py:590: FutureWarning: The image_processor_class argument is deprecated and will be removed in v4.42. Please use `slow_image_processor_class`, or `fast_image_processor_class` instead
warnings.warn(
Loading checkpoint shards: 0%| | 0/2 [00:00<?, ?it/s]/home/ubuntu/miniconda3/envs/janus_pro/lib/python3.10/site-packages/torch/_utils.py:776: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly. To access UntypedStorage directly, use tensor.untyped_storage() instead of tensor.storage()
return self.fget.__get__(instance, owner)()
Loading checkpoint shards: 100%|████████████████████████████████████████████████████████| 2/2 [00:09<00:00, 4.89s/it]
Using a slow image processor as `use_fast` is unset and a slow processor was saved with this model. `use_fast=True` will be the default behavior in v4.48, even if the model was saved with a slow processor. This will result in minor differences in outputs. You'll still be able to use a slow processor with `use_fast=False`.
You are using the default legacy behaviour of the <class 'transformers.models.llama.tokenization_llama_fast.LlamaTokenizerFast'>. This is expected, and simply means that the `legacy` (previous) behavior will be used so nothing changes for you. If you want to use the new behaviour, set `legacy=False`. This should only be set if you understand what it means, and thoroughly read the reason why this was added as explained in https://github.com/huggingface/transformers/pull/24565 - if you loaded a llama tokenizer from a GGUF file you can ignore this message.
Some kwargs in processor config are unused and will not have any effect: add_special_token, mask_prompt, image_tag, sft_format, num_image_tokens, ignore_id.
Running on local URL: http://127.0.0.1:7860
IMPORTANT: You are using gradio version 3.48.0, however version 4.44.1 is available, please upgrade.
--------
Running on public URL: https://8737eac9b03202f1c7.gradio.live
This share link expires in 72 hours. For free permanent hosting and GPU upgrades, run `gradio deploy` from Terminal to deploy to Spaces (https://huggingface.co/spaces)
Since the server listens on http://127.0.0.1:7860 by default, the UI can only be accessed from the server itself.
Modify the launch() call in the code to add server_name="0.0.0.0", then run it again.
(base) ubuntu@ubuntu-server:~/study/Janus/demo$ tail app_januspro.py
generation_button.click(
fn=generate_image,
inputs=[prompt_input, seed_input, cfg_weight_input, t2i_temperature],
outputs=image_output
)
# demo.launch(share=True)
demo.launch(server_name="0.0.0.0", share=True)
# demo.queue(concurrency_count=1, max_size=10).launch(server_name="0.0.0.0", server_port=37906, root_path="/path")
(base) ubuntu@ubuntu-server:~/study/Janus/demo$
Now it can be accessed from other machines, e.g. at http://<server-ip>:7860.
Image analysis test
Upload an image and run a test.
Runtime error
The program throws RuntimeError: "triu_tril_cuda_template" not implemented for 'BFloat16'.
This can be fixed by upgrading torch; the version installed by default is 2.0.1.
(janus_pro) ubuntu@ubuntu-server:~/study/Janus$ pip list | grep torch
torch 2.0.1
torchvision 0.15.2
(janus_pro) ubuntu@ubuntu-server:~/study/Janus$
Upgrade to 2.2.2:
pip install -U torch==2.2.2
(janus_pro) ubuntu@ubuntu-server:~/study/Janus$ pip list | grep torch
torch 2.2.2
torchvision 0.15.2
(janus_pro) ubuntu@ubuntu-server:~/study/Janus$
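Two side notes, both assumptions rather than steps taken in this walkthrough: torchvision here is still 0.15.2, which was built against torch 2.0.x, so if torchvision import errors appear after the upgrade, install a matching build (0.17.2 pairs with torch 2.2.2). And if upgrading torch is not an option, a commonly suggested workaround is to switch the demo's dtype from bfloat16 to float16, roughly:
# Untested assumption: in demo/app_januspro.py, cast the model to float16 instead of bfloat16
# to avoid the missing bfloat16 triu CUDA kernel in torch 2.0.1.
# Output quality and stability with float16 are not verified here.
vl_gpt = vl_gpt.to(torch.float16).cuda()   # the stock demo uses torch.bfloat16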
Image analysis now works correctly.
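If you prefer to sanity-check image analysis without the Gradio UI, the repo's inference.py follows roughly the pattern below. Treat this as a sketch: the model path, test image, question, and generation settings are placeholders, and inference.py in the repo is the authoritative version.
# Minimal scripted image-analysis check, modeled on the repo's inference.py.
import torch
from transformers import AutoModelForCausalLM
from janus.models import MultiModalityCausalLM, VLChatProcessor  # also makes the Janus classes known to transformers
from janus.utils.io import load_pil_images

model_path = "./Janus-Pro-7B"                      # local model directory (see above)
vl_chat_processor = VLChatProcessor.from_pretrained(model_path)
tokenizer = vl_chat_processor.tokenizer

vl_gpt = AutoModelForCausalLM.from_pretrained(model_path, trust_remote_code=True)
vl_gpt = vl_gpt.to(torch.bfloat16).cuda().eval()   # bfloat16 needs the torch upgrade above

# One conversation turn with a single attached image (image path and question are examples).
conversation = [
    {
        "role": "<|User|>",
        "content": "<image_placeholder>\nDescribe this image in detail.",
        "images": ["images/doge.png"],
    },
    {"role": "<|Assistant|>", "content": ""},
]

pil_images = load_pil_images(conversation)
prepare_inputs = vl_chat_processor(
    conversations=conversation, images=pil_images, force_batchify=True
).to(vl_gpt.device)

# Encode the image, then let the language model answer.
inputs_embeds = vl_gpt.prepare_inputs_embeds(**prepare_inputs)
outputs = vl_gpt.language_model.generate(
    inputs_embeds=inputs_embeds,
    attention_mask=prepare_inputs.attention_mask,
    pad_token_id=tokenizer.eos_token_id,
    bos_token_id=tokenizer.bos_token_id,
    eos_token_id=tokenizer.eos_token_id,
    max_new_tokens=512,
    do_sample=False,
    use_cache=True,
)
print(tokenizer.decode(outputs[0].cpu().tolist(), skip_special_tokens=True))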
Now for a harder image: a circuit diagram.
Pretty impressive. I could not make sense of the diagram myself, but the analysis at least looks plausible.
Let's try another one.
Next, a primary-school exam paper.
This one is not great: the recognized content contains a large number of errors.
VRAM usage during image analysis:
(base) ubuntu@ubuntu-server:~$ nvidia-smi
Tue Feb 11 09:45:08 2025
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.120 Driver Version: 550.120 CUDA Version: 12.4 |
|-----------------------------------------+------------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+========================+======================|
| 0 NVIDIA GeForce RTX 4090 Off | 00000000:01:00.0 On | Off |
| 0% 38C P8 14W / 450W | 16271MiB / 24564MiB | 0% Default |
| | | N/A |
+-----------------------------------------+------------------------+----------------------+
| 1 NVIDIA GeForce RTX 4090 Off | 00000000:05:00.0 Off | Off |
| 0% 33C P8 15W / 450W | 926MiB / 24564MiB | 0% Default |
| | | N/A |
+-----------------------------------------+------------------------+----------------------+
+-----------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=========================================================================================|
| 0 N/A N/A 4378 G /usr/lib/xorg/Xorg 205MiB |
| 0 N/A N/A 951857 C python 15320MiB |
| 0 N/A N/A 1261524 C ...erProcess --variations-seed-version 728MiB |
| 1 N/A N/A 4378 G /usr/lib/xorg/Xorg 4MiB |
| 1 N/A N/A 1261524 C ...erProcess --variations-seed-version 908MiB |
+-----------------------------------------------------------------------------------------+
(base) ubuntu@ubuntu-server:~$
Text-to-image test
Using Chinese prompts
With Chinese prompts, the generated content is completely wrong.
Using English prompts
Multiple images can be generated in a single run.
The generated images are 768 × 768.
More detailed prompts produce higher-quality images (for example, specifying the subject, setting, lighting, and style rather than a single noun).
VRAM usage during generation:
Tue Feb 11 09:59:55 2025
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.120 Driver Version: 550.120 CUDA Version: 12.4 |
|-----------------------------------------+------------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+========================+======================|
| 0 NVIDIA GeForce RTX 4090 Off | 00000000:01:00.0 On | Off |
| 0% 38C P8 13W / 450W | 22191MiB / 24564MiB | 0% Default |
| | | N/A |
+-----------------------------------------+------------------------+----------------------+
| 1 NVIDIA GeForce RTX 4090 Off | 00000000:05:00.0 Off | Off |
| 0% 32C P8 15W / 450W | 926MiB / 24564MiB | 0% Default |
| | | N/A |
+-----------------------------------------+------------------------+----------------------+
+-----------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=========================================================================================|
| 0 N/A N/A 4378 G /usr/lib/xorg/Xorg 205MiB |
| 0 N/A N/A 951857 C python 21240MiB |
| 0 N/A N/A 1261524 C ...erProcess --variations-seed-version 728MiB |
| 1 N/A N/A 4378 G /usr/lib/xorg/Xorg 4MiB |
| 1 N/A N/A 1261524 C ...erProcess --variations-seed-version 908MiB |
+-----------------------------------------------------------------------------------------+
(base) ubuntu@ubuntu-server:~$
As for specific applications, go ahead and test them yourselves.