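Quick summary
1. HuixiangDou (茴香豆) makes it quick to build a Dify Q&A assistant. It comes with refusal of out-of-scope questions, reranking, chunk-length analysis, and threshold tuning, and its answers can include images. Excellent 👍
2. HuixiangDou git repository: https://github.com/internlm/huixiangdou
If you run into problems, look there for more detailed explanations.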
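Environment setup
Follow the tutorial to set up the environment. It is best to run the CLI commands one at a time, because some of them prompt you to type "YES" by hand.
(Notes: 1. The tutorial writes BCEmbedding0.15, which is wrong; it should be BCEmbedding 0.1.5.
2. The requirements.txt in the huixiangdou repo lists a great many dependencies; the main thing to remember is that transformers must be >= 4.38.)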
cd /root
git clone https://github.com/internlm/huixiangdou && cd huixiangdou
git checkout 79fa810
apt update
apt install python-dev libxml2-dev libxslt1-dev antiword unrtf poppler-utils pstotext tesseract-ocr flac ffmpeg lame libmad0 libsox-fmt-mp3 sox libjpeg-dev swig libpulse-dev
# python requirements
pip install BCEmbedding==0.1.5 cmake==3.30.2 lit==18.1.8 sentencepiece==0.2.0 protobuf==5.27.3 accelerate==0.33.0
pip install -r requirements.txt
# On Python 3.8, install faiss-gpu instead of faiss
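After installation, a quick look at the two versions called out in the notes can save a failed run later. The following is a minimal sketch and not part of HuixiangDou: the helper names `parse_version`, `version_at_least` and `installed_version` are made up for illustration, and the comparison only looks at the leading numeric components of a version string.

```python
from importlib import metadata


def parse_version(text):
    """Turn '4.38.0' or '4.38.0.dev0' into a tuple of its leading integers."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if ch.isdigit():
                digits += ch
            else:
                break
        if not digits:
            break  # stop at the first non-numeric component such as 'dev0'
        parts.append(int(digits))
    return tuple(parts)


def version_at_least(installed, required):
    """True if `installed` >= `required`, comparing numeric components only."""
    a, b = parse_version(installed), parse_version(required)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def installed_version(package):
    """Return the installed version string, or None if the package is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    print("BCEmbedding:", installed_version("BCEmbedding"), "(the notes say 0.1.5)")
    tf = installed_version("transformers")
    ok = tf is not None and version_at_least(tf, "4.38")
    print("transformers:", tf, "OK" if ok else "needs >= 4.38")
```

Run it inside the same Python environment that will run huixiangdou, otherwise it reports the versions of a different interpreter.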
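Creating the knowledge base
Use the Dify repo as the knowledge-base documents :) Note that the commands below clone huixiangdou and mmpose into repodir, whereas the log that follows comes from a run over a Dify checkout under repodir2/dify, so substitute your own document source and path accordingly.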
cd /root/huixiangdou && mkdir repodir
git clone https://github.com/internlm/huixiangdou --depth=1 repodir/huixiangdou
git clone https://github.com/open-mmlab/mmpose --depth=1 repodir/mmpose
# Save the features of repodir to workdir, and update the positive and negative example thresholds into `config.ini`
mkdir workdir
python3 -m huixiangdou.service.feature_store
/root/.conda/envs/xtuner/lib/python3.10/runpy.py:126: RuntimeWarning: 'huixiangdou.service.feature_store' found in sys.modules after import of package 'huixiangdou.service', but prior to execution of 'huixiangdou.service.feature_store'; this may result in unpredictable behaviour
warn(RuntimeWarning(msg))
2024-10-06 19:47:30.986 | INFO | huixiangdou.service.retriever:__init__:262 - loading test2vec and rerank models
/root/.conda/envs/xtuner/lib/python3.10/site-packages/torch/_utils.py:831: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly. To access UntypedStorage directly, use tensor.untyped_storage() instead of tensor.storage()
return self.fget.__get__(instance, owner)()
10/06/2024 19:48:24 - [INFO] -BCEmbedding.models.RerankerModel->>> Loading from `/root/models/bce-reranker-base_v1`.
10/06/2024 19:48:24 - [INFO] -BCEmbedding.models.RerankerModel->>> Execute device: cuda; gpu num: 1; use fp16: True
2024-10-06 19:48:25.063 | DEBUG | __main__:__init__:67 - loading text2vec model..
2024-10-06 19:48:25.064 | INFO | __main__:__init__:83 - init dense retrieval database with chunk_size 900
2024-10-06 19:48:50.761 | INFO | __main__:initialize:250 - initialize response and reject feature store, you only need call this once.
2024-10-06 19:48:55.795 | INFO | __main__:read_and_save:31 - reading repodir2/dify/api/templates/invite_member_mail_template_en-US.html, would save to workdir/preprocess/0059f8d4.text
2024-10-06 19:48:55.800 | INFO | __main__:read_and_save:31 - reading repodir2/dify/api/templates/invite_member_mail_template_zh-CN.html, would save to workdir/preprocess/1bfc6b42.text
2024-10-06 19:48:55.807 | INFO | __main__:read_and_save:31 - reading repodir2/dify/api/templates/reset_password_mail_template_en-US.html, would save to workdir/preprocess/56080a6a.text
2024-10-06 19:48:55.815 | INFO | __main__:read_and_save:31 - reading repodir2/dify/api/templates/reset_password_mail_template_zh-CN.html, would save to workdir/preprocess/04929e9e.text
2024-10-06 19:48:56.277 | DEBUG | __main__:preprocess:230 - waiting for file preprocess finish..
Token indices sequence length is longer than the specified maximum sequence length for this model (523 > 512). Running this sequence through the model will result in indexing errors
2024-10-06 19:48:57.267 | INFO | __main__:analyze:170 - text histogram, length count 1186, avg 664.15, median 870
0-94 5.73%
94-188 7.67%
188-282 6.24%
282-376 4.38%
376-470 3.71%
470-564 3.63%
564-658 3.79%
658-752 4.72%
752-846 7.34%
846-940 52.02%
940-1034 0.76%
2024-10-06 19:48:57.268 | INFO | __main__:analyze:171 - token histogram, length count 1186, avg 283.89, median 315
0-52 9.44%
52-104 11.21%
104-156 7.76%
156-208 6.49%
208-260 7.84%
260-312 7.17%
312-364 4.22%
364-416 8.01%
416-468 36.34%
468-520 1.43%
520-572 0.08%
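The two histograms just printed summarise how long the chunks are, first in characters and then in tokens. A minimal sketch of how such an equal-width histogram can be computed is shown here. It is not HuixiangDou's implementation: the function name `length_histogram`, the default of 11 bins, and the rule of spreading the bins over 0..max are assumptions made for illustration, and the sample lengths are invented.

```python
from statistics import mean, median


def length_histogram(lengths, bins=11):
    """Return (summary, rows) for equal-width bins covering 0..max(lengths)."""
    if not lengths:
        raise ValueError("lengths must not be empty")
    top = max(lengths)
    width = max(1, -(-top // bins))  # ceiling division, at least 1
    counts = [0] * bins
    for n in lengths:
        index = min(n // width, bins - 1)  # the maximum falls into the last bin
        counts[index] += 1
    total = len(lengths)
    rows = [
        (i * width, (i + 1) * width, 100.0 * c / total)
        for i, c in enumerate(counts)
    ]
    summary = {"count": total, "avg": mean(lengths), "median": median(lengths)}
    return summary, rows


if __name__ == "__main__":
    # Toy chunk lengths in characters; real values come from the split documents.
    sample = [40, 120, 300, 610, 870, 880, 890, 900, 900, 1030]
    summary, rows = length_histogram(sample)
    print("length count {count}, avg {avg:.2f}, median {median}".format(**summary))
    for low, high, pct in rows:
        print(f"{low}-{high} {pct:.2f}%")
```

A distribution that piles up just below the configured chunk size, as in the log, suggests most chunks were cut at the size limit rather than at a natural boundary.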
10/06/2024 19:48:57 - [INFO] -faiss.loader->>> Loading faiss with AVX2 support.
10/06/2024 19:48:57 - [INFO] -faiss.loader->>> Could not load library with AVX2 support due to:
ModuleNotFoundError("No module named 'faiss.swigfaiss_avx2'")
10/06/2024 19:48:57 - [INFO] -faiss.loader->>> Loading faiss.
10/06/2024 19:48:57 - [INFO] -faiss.loader->>> Successfully loaded faiss.
100%|██████████| 1186/1186 [00:09<00:00, 120.58it/s]
2024-10-06 19:49:07.886 | INFO | huixiangdou.primitive.file_operation:summarize:143 - 累计181文件,成功57个,跳过0个,异常124个
2024-10-06 19:49:08.694 | INFO | huixiangdou.service.retriever:update_throttle:82 - The optimal threshold is: 0.3854840287267922, saved it to config.ini
2024-10-06 19:49:09.274 | WARNING | __main__:test_reject:320 - process query: dify是什么
2024-10-06 19:49:09.298 | WARNING | __main__:test_reject:320 - process query: dify怎么用
2024-10-06 19:49:09.326 | WARNING | __main__:test_reject:320 - process query: dify是免费的吗
2024-10-06 19:49:09.342 | WARNING | __main__:test_reject:320 - process query: 介绍一下dify
2024-10-06 19:49:09.357 | WARNING | __main__:test_reject:320 - process query: dify是哪家公司的
2024-10-06 19:49:09.371 | WARNING | __main__:test_reject:320 - process query: dify是开源的嘛
2024-10-06 19:49:09.392 | WARNING | __main__:test_reject:320 - process query: 怎么搭建大模型workflow
2024-10-06 19:49:09.408 | WARNING | __main__:test_reject:320 - process query: 怎么搭建大模型工作流
2024-10-06 19:49:09.423 | WARNING | __main__:test_reject:320 - process query: dify搭建工作流要怎么做
2024-10-06 19:49:09.446 | WARNING | __main__:test_reject:320 - process query: 不懂代码可以做大模型应用嘛
2024-10-06 19:49:09.467 | WARNING | __main__:test_reject:320 - process query: dify怎么搭建workflow
2024-10-06 19:49:09.482 | WARNING | __main__:test_reject:320 - process query: what is dify
2024-10-06 19:49:09.497 | WARNING | __main__:test_reject:320 - process query: 如何在本地部署Dify?
2024-10-06 19:49:09.514 | WARNING | __main__:test_reject:320 - process query: 什么是RAG引擎,Dify是如何使用它的?
2024-10-06 19:49:09.529 | WARNING | __main__:test_reject:320 - process query: Dify的API和应用程序导向有什么区别
2024-10-06 19:49:09.543 | WARNING | __main__:test_reject:320 - process query: Dify的价格是多少,如何获取商业许可
2024-10-06 19:49:09.557 | WARNING | __main__:test_reject:320 - process query: Dify的Agent是什么,它有什么作用
Calculate scores: 0%| | 0/1 [00:00<?, ?it/s]
Calculate scores: 100%|██████████| 1/1 [00:00<00:00, 10.86it/s]
2024-10-06 19:49:09.687 | INFO | huixiangdou.service.retriever:rerank_fuse:152 - target repodir2/dify/README.md content length 12059
2024-10-06 19:49:09.693 | INFO | huixiangdou.service.retriever:rerank_fuse:152 - target repodir2/dify/README_JA.md content length 9476
2024-10-06 19:49:09.698 | INFO | huixiangdou.service.retriever:rerank_fuse:152 - target repodir2/dify/README_VI.md content length 12467
2024-10-06 19:49:09.709 | INFO | huixiangdou.service.retriever:rerank_fuse:152 - target repodir2/dify/README_CN.md content length 9396
2024-10-06 19:49:09.712 | DEBUG | huixiangdou.service.retriever:rerank_fuse:181 - query:text='dify是什么' files:['repodir2/dify/README.md', 'repodir2/dify/README_JA.md', 'repodir2/dify/README_VI.md', 'repodir2/dify/README_CN.md']
Calculate scores: 0%| | 0/1 [00:00<?, ?it/s]
Calculate scores: 100%|██████████| 1/1 [00:00<00:00, 11.63it/s]
2024-10-06 19:49:09.890 | INFO | huixiangdou.service.retriever:rerank_fuse:152 - target repodir2/dify/README_JA.md content length 9476
2024-10-06 19:49:09.907 | INFO | huixiangdou.service.retriever:rerank_fuse:152 - target repodir2/dify/CONTRIBUTING_CN.md content length 4916
2024-10-06 19:49:09.913 | INFO | huixiangdou.service.retriever:rerank_fuse:152 - target repodir2/dify/CONTRIBUTING_VI.md content length 9213
2024-10-06 19:49:09.917 | INFO | huixiangdou.service.retriever:rerank_fuse:152 - target repodir2/dify/CONTRIBUTING_VI.md content length 9213
2024-10-06 19:49:09.947 | DEBUG | huixiangdou.service.retriever:rerank_fuse:181 - query:text='dify怎么用' files:['repodir2/dify/README_JA.md', 'repodir2/dify/CONTRIBUTING_CN.md', 'repodir2/dify/CONTRIBUTING_VI.md']
Calculate scores: 0%| | 0/1 [00:00<?, ?it/s]
Calculate scores: 100%|██████████| 1/1 [00:00<00:00, 7.19it/s]
Calculate scores: 100%|██████████| 1/1 [00:00<00:00, 3.92it/s]
2024-10-06 19:49:10.558 | INFO | huixiangdou.service.retriever:rerank_fuse:152 - target repodir2/dify/README_CN.md content length 9396
2024-10-06 19:49:10.722 | INFO | huixiangdou.service.retriever:rerank_fuse:152 - target repodir2/dify/README_KL.md content length 12243
2024-10-06 19:49:10.768 | INFO | huixiangdou.service.retriever:rerank_fuse:152 - target repodir2/dify/README_CN.md content length 9396
2024-10-06 19:49:10.843 | INFO | huixiangdou.service.retriever:rerank_fuse:152 - target repodir2/dify/README_CN.md content length 9396
2024-10-06 19:49:10.852 | DEBUG | huixiangdou.service.retriever:rerank_fuse:181 - query:text='dify是免费的吗' files:['repodir2/dify/README_CN.md', 'repodir2/dify/README_KL.md']
Calculate scores: 0%| | 0/1 [00:00<?, ?it/s]
Calculate scores: 100%|██████████| 1/1 [00:00<00:00, 28.58it/s]
2024-10-06 19:49:10.917 | INFO | huixiangdou.service.retriever:rerank_fuse:152 - target repodir2/dify/README.md content length 12059
2024-10-06 19:49:10.927 | INFO | huixiangdou.service.retriever:rerank_fuse:152 - target repodir2/dify/README_JA.md content length 9476
2024-10-06 19:49:10.951 | INFO | huixiangdou.service.retriever:rerank_fuse:152 - target repodir2/dify/README_VI.md content length 12467
2024-10-06 19:49:10.965 | INFO | huixiangdou.service.retriever:rerank_fuse:152 - target repodir2/dify/README_ES.md content length 13387
2024-10-06 19:49:10.972 | DEBUG | huixiangdou.service.retriever:rerank_fuse:181 - query:text='介绍一下dify' files:['repodir2/dify/README.md', 'repodir2/dify/README_JA.md', 'repodir2/dify/README_VI.md', 'repodir2/dify/README_ES.md']
Calculate scores: 0%| | 0/1 [00:00<?, ?it/s]
Calculate scores: 100%|██████████| 1/1 [00:00<00:00, 46.00it/s]
2024-10-06 19:49:11.049 | INFO | huixiangdou.service.retriever:rerank_fuse:152 - target repodir2/dify/README_JA.md content length 9476
2024-10-06 19:49:11.051 | INFO | huixiangdou.service.retriever:rerank_fuse:152 - target repodir2/dify/README_KL.md content length 12243
2024-10-06 19:49:11.064 | INFO | huixiangdou.service.retriever:rerank_fuse:152 - target repodir2/dify/README_CN.md content length 9396
2024-10-06 19:49:11.070 | INFO | huixiangdou.service.retriever:rerank_fuse:152 - target repodir2/dify/api/templates/invite_member_mail_template_zh-CN.html content length 977
2024-10-06 19:49:11.074 | DEBUG | huixiangdou.service.retriever:rerank_fuse:181 - query:text='dify是哪家公司的' files:['repodir2/dify/README_JA.md', 'repodir2/dify/README_KL.md', 'repodir2/dify/README_CN.md', 'repodir2/dify/api/templates/invite_member_mail_template_zh-CN.html']
Calculate scores: 0%| | 0/1 [00:00<?, ?it/s]
Calculate scores: 100%|██████████| 1/1 [00:00<00:00, 21.91it/s]
2024-10-06 19:49:11.154 | INFO | huixiangdou.service.retriever:rerank_fuse:152 - target repodir2/dify/README.md content length 12059
2024-10-06 19:49:11.160 | INFO | huixiangdou.service.retriever:rerank_fuse:152 - target repodir2/dify/README_JA.md content length 9476
2024-10-06 19:49:11.165 | INFO | huixiangdou.service.retriever:rerank_fuse:152 - target repodir2/dify/README_CN.md content length 9396
2024-10-06 19:49:11.175 | INFO | huixiangdou.service.retriever:rerank_fuse:152 - target repodir2/dify/README_KL.md content length 12243
2024-10-06 19:49:11.184 | DEBUG | huixiangdou.service.retriever:rerank_fuse:181 - query:text='dify是开源的嘛' files:['repodir2/dify/README.md', 'repodir2/dify/README_JA.md', 'repodir2/dify/README_CN.md', 'repodir2/dify/README_KL.md']
Calculate scores: 0%| | 0/1 [00:00<?, ?it/s]
Calculate scores: 100%|██████████| 1/1 [00:00<00:00, 8.23it/s]
Calculate scores: 100%|██████████| 1/1 [00:00<00:00, 8.22it/s]
2024-10-06 19:49:11.337 | INFO | huixiangdou.service.retriever:rerank_fuse:152 - target repodir2/dify/api/core/model_runtime/docs/zh_Hans/schema.md content length 5974
2024-10-06 19:49:11.344 | INFO | huixiangdou.service.retriever:rerank_fuse:152 - target repodir2/dify/api/core/model_runtime/docs/zh_Hans/customizable_model_scale_out.md content length 6964
2024-10-06 19:49:11.353 | INFO | huixiangdou.service.retriever:rerank_fuse:152 - target repodir2/dify/api/core/model_runtime/docs/zh_Hans/provider_scale_out.md content length 4964
2024-10-06 19:49:11.361 | INFO | huixiangdou.service.retriever:rerank_fuse:152 - target repodir2/dify/api/core/model_runtime/docs/zh_Hans/provider_scale_out.md content length 4964
2024-10-06 19:49:11.369 | DEBUG | huixiangdou.service.retriever:rerank_fuse:181 - query:text='怎么搭建大模型workflow' files:['repodir2/dify/api/core/model_runtime/docs/zh_Hans/schema.md', 'repodir2/dify/api/core/model_runtime/docs/zh_Hans/customizable_model_scale_out.md', 'repodir2/dify/api/core/model_runtime/docs/zh_Hans/provider_scale_out.md']
Calculate scores: 0%| | 0/1 [00:00<?, ?it/s]
Calculate scores: 100%|██████████| 1/1 [00:00<00:00, 58.18it/s]
2024-10-06 19:49:11.436 | INFO | huixiangdou.service.retriever:rerank_fuse:152 - target repodir2/dify/api/core/model_runtime/docs/zh_Hans/schema.md content length 5974
2024-10-06 19:49:11.448 | INFO | huixiangdou.service.retriever:rerank_fuse:152 - target repodir2/dify/api/core/model_runtime/docs/zh_Hans/customizable_model_scale_out.md content length 6964
2024-10-06 19:49:11.464 | INFO | huixiangdou.service.retriever:rerank_fuse:152 - target repodir2/dify/api/core/model_runtime/README_CN.md content length 2805
2024-10-06 19:49:11.472 | INFO | huixiangdou.service.retriever:rerank_fuse:152 - target repodir2/dify/api/core/model_runtime/docs/zh_Hans/provider_scale_out.md content length 4964
2024-10-06 19:49:11.480 | DEBUG | huixiangdou.service.retriever:rerank_fuse:181 - query:text='怎么搭建大模型工作流' files:['repodir2/dify/api/core/model_runtime/docs/zh_Hans/schema.md', 'repodir2/dify/api/core/model_runtime/docs/zh_Hans/customizable_model_scale_out.md', 'repodir2/dify/api/core/model_runtime/README_CN.md', 'repodir2/dify/api/core/model_runtime/docs/zh_Hans/provider_scale_out.md']
Calculate scores: 0%| | 0/1 [00:00<?, ?it/s]
Calculate scores: 100%|██████████| 1/1 [00:00<00:00, 36.34it/s]
2024-10-06 19:49:11.571 | INFO | huixiangdou.service.retriever:rerank_fuse:152 - target repodir2/dify/README_JA.md content length 9476
2024-10-06 19:49:11.577 | INFO | huixiangdou.service.retriever:rerank_fuse:152 - target repodir2/dify/README_VI.md content length 12467
2024-10-06 19:49:11.587 | INFO | huixiangdou.service.retriever:rerank_fuse:152 - target repodir2/dify/README_KL.md content length 12243
2024-10-06 19:49:11.597 | INFO | huixiangdou.service.retriever:rerank_fuse:152 - target repodir2/dify/README_CN.md content length 9396
2024-10-06 19:49:11.605 | DEBUG | huixiangdou.service.retriever:rerank_fuse:181 - query:text='dify搭建工作流要怎么做' files:['repodir2/dify/README_JA.md', 'repodir2/dify/README_VI.md', 'repodir2/dify/README_KL.md', 'repodir2/dify/README_CN.md']
Calculate scores: 0%| | 0/1 [00:00<?, ?it/s]
Calculate scores: 100%|██████████| 1/1 [00:00<00:00, 117.14it/s]
2024-10-06 19:49:11.638 | INFO | huixiangdou.service.retriever:rerank_fuse:152 - target repodir2/dify/api/core/model_runtime/docs/zh_Hans/schema.md content length 5974
2024-10-06 19:49:11.642 | DEBUG | huixiangdou.service.retriever:rerank_fuse:181 - query:text='不懂代码可以做大模型应用嘛' files:['repodir2/dify/api/core/model_runtime/docs/zh_Hans/schema.md']
Calculate scores: 0%| | 0/1 [00:00<?, ?it/s]
Calculate scores: 100%|██████████| 1/1 [00:00<00:00, 12.62it/s]
2024-10-06 19:49:11.866 | INFO | huixiangdou.service.retriever:rerank_fuse:152 - target repodir2/dify/api/core/tools/README_JP.md content length 847
2024-10-06 19:49:11.868 | INFO | huixiangdou.service.retriever:rerank_fuse:152 - target repodir2/dify/README_JA.md content length 9476
2024-10-06 19:49:11.877 | INFO | huixiangdou.service.retriever:rerank_fuse:152 - target repodir2/dify/README.md content length 12059
2024-10-06 19:49:12.155 | INFO | huixiangdou.service.retriever:rerank_fuse:152 - target repodir2/dify/CONTRIBUTING_JA.md content length 6133
2024-10-06 19:49:12.155 | DEBUG | huixiangdou.service.retriever:rerank_fuse:181 - query:text='dify怎么搭建workflow' files:['repodir2/dify/api/core/tools/README_JP.md', 'repodir2/dify/README_JA.md', 'repodir2/dify/README.md', 'repodir2/dify/CONTRIBUTING_JA.md']
Calculate scores: 0%| | 0/1 [00:00<?, ?it/s]
Calculate scores: 100%|██████████| 1/1 [00:00<00:00, 28.98it/s]
2024-10-06 19:49:12.627 | INFO | huixiangdou.service.retriever:rerank_fuse:152 - target repodir2/dify/README.md content length 12059
2024-10-06 19:49:12.734 | INFO | huixiangdou.service.retriever:rerank_fuse:152 - target repodir2/dify/README_JA.md content length 9476
2024-10-06 19:49:12.891 | INFO | huixiangdou.service.retriever:rerank_fuse:152 - target repodir2/dify/README_VI.md content length 12467
2024-10-06 19:49:13.000 | INFO | huixiangdou.service.retriever:rerank_fuse:152 - target repodir2/dify/README_KL.md content length 12243
2024-10-06 19:49:13.017 | DEBUG | huixiangdou.service.retriever:rerank_fuse:181 - query:text='what is dify' files:['repodir2/dify/README.md', 'repodir2/dify/README_JA.md', 'repodir2/dify/README_VI.md', 'repodir2/dify/README_KL.md']
Calculate scores: 0%| | 0/1 [00:00<?, ?it/s]
Calculate scores: 100%|██████████| 1/1 [00:00<00:00, 15.81it/s]
2024-10-06 19:49:13.122 | INFO | huixiangdou.service.retriever:rerank_fuse:152 - target repodir2/dify/README_KR.md content length 9643
2024-10-06 19:49:13.124 | INFO | huixiangdou.service.retriever:rerank_fuse:152 - target repodir2/dify/CONTRIBUTING_VI.md content length 9213
2024-10-06 19:49:13.136 | INFO | huixiangdou.service.retriever:rerank_fuse:152 - target repodir2/dify/README_VI.md content length 12467
2024-10-06 19:49:13.145 | INFO | huixiangdou.service.retriever:rerank_fuse:152 - target repodir2/dify/CONTRIBUTING_CN.md content length 4916
2024-10-06 19:49:13.154 | DEBUG | huixiangdou.service.retriever:rerank_fuse:181 - query:text='如何在本地部署Dify?' files:['repodir2/dify/README_KR.md', 'repodir2/dify/CONTRIBUTING_VI.md', 'repodir2/dify/README_VI.md', 'repodir2/dify/CONTRIBUTING_CN.md']
Calculate scores: 0%| | 0/1 [00:00<?, ?it/s]
Calculate scores: 100%|██████████| 1/1 [00:00<00:00, 114.30it/s]
2024-10-06 19:49:13.184 | INFO | huixiangdou.service.retriever:rerank_fuse:152 - target repodir2/dify/README_CN.md content length 9396
2024-10-06 19:49:13.212 | DEBUG | huixiangdou.service.retriever:rerank_fuse:181 - query:text='什么是RAG引擎,Dify是如何使用它的?' files:['repodir2/dify/README_CN.md']
Calculate scores: 0%| | 0/1 [00:00<?, ?it/s]
Calculate scores: 100%|██████████| 1/1 [00:00<00:00, 39.16it/s]
2024-10-06 19:49:13.332 | INFO | huixiangdou.service.retriever:rerank_fuse:152 - target repodir2/dify/README_CN.md content length 9396
2024-10-06 19:49:13.383 | INFO | huixiangdou.service.retriever:rerank_fuse:152 - target repodir2/dify/README_JA.md content length 9476
2024-10-06 19:49:13.404 | INFO | huixiangdou.service.retriever:rerank_fuse:152 - target repodir2/dify/README_KL.md content length 12243
2024-10-06 19:49:13.462 | INFO | huixiangdou.service.retriever:rerank_fuse:152 - target repodir2/dify/README_CN.md content length 9396
2024-10-06 19:49:13.556 | DEBUG | huixiangdou.service.retriever:rerank_fuse:181 - query:text='Dify的API和应用程序导向有什么区别' files:['repodir2/dify/README_CN.md', 'repodir2/dify/README_JA.md', 'repodir2/dify/README_KL.md']
Calculate scores: 0%| | 0/1 [00:00<?, ?it/s]
Calculate scores: 100%|██████████| 1/1 [00:00<00:00, 20.95it/s]
2024-10-06 19:49:13.934 | INFO | huixiangdou.service.retriever:rerank_fuse:152 - target repodir2/dify/README_JA.md content length 9476
2024-10-06 19:49:14.147 | INFO | huixiangdou.service.retriever:rerank_fuse:152 - target repodir2/dify/README_AR.md content length 11486
2024-10-06 19:49:14.231 | INFO | huixiangdou.service.retriever:rerank_fuse:152 - target repodir2/dify/README_CN.md content length 9396
2024-10-06 19:49:14.304 | INFO | huixiangdou.service.retriever:rerank_fuse:152 - target repodir2/dify/README_KR.md content length 9643
2024-10-06 19:49:14.308 | DEBUG | huixiangdou.service.retriever:rerank_fuse:181 - query:text='Dify的价格是多少,如何获取商业许可' files:['repodir2/dify/README_JA.md', 'repodir2/dify/README_AR.md', 'repodir2/dify/README_CN.md', 'repodir2/dify/README_KR.md']
Calculate scores: 0%| | 0/1 [00:00<?, ?it/s]
Calculate scores: 100%|██████████| 1/1 [00:00<00:00, 29.01it/s]
2024-10-06 19:49:14.396 | INFO | huixiangdou.service.retriever:rerank_fuse:152 - target repodir2/dify/README_JA.md content length 9476
2024-10-06 19:49:14.402 | INFO | huixiangdou.service.retriever:rerank_fuse:152 - target repodir2/dify/README_KL.md content length 12243
2024-10-06 19:49:14.412 | INFO | huixiangdou.service.retriever:rerank_fuse:152 - target repodir2/dify/README_VI.md content length 12467
2024-10-06 19:49:14.425 | INFO | huixiangdou.service.retriever:rerank_fuse:152 - target repodir2/dify/README_CN.md content length 9396
2024-10-06 19:49:14.434 | DEBUG | huixiangdou.service.retriever:rerank_fuse:181 - query:text='Dify的Agent是什么,它有什么作用' files:['repodir2/dify/README_JA.md', 'repodir2/dify/README_KL.md', 'repodir2/dify/README_VI.md', 'repodir2/dify/README_CN.md']
2024-10-06 19:49:14.447 | INFO | __main__:test_query:360 -
+----------------------+----------+----------------------+---------------------+
| Query | State | Part of Chunks | References |
+======================+==========+======================+=====================+
| dify是什么 | Accepted | </p> | README.md,README_JA |
| | | dify is an open- | .md,README_VI.md,RE |
| | | source llm app | ADME_CN.md |
| | | development | |
| | | platform. its | |
| | | intuitive interface | |
| | | combines ai wor.. | |
+----------------------+----------+----------------------+---------------------+
| dify怎么用 | Accepted | Difyの使用方法 - **クラウド | README_JA.md,CONTRI |
| | | </br>** | BUTING_CN.md,CONTRI |
| | | [こちら](https://dify.a | BUTING_VI.md |
| | | i)のdify cloudサービスを利用 | |
| | | して、セットアップ不要で試すことができま | |
| | | す。サンドボックスプラン.. | |
+----------------------+----------+----------------------+---------------------+
| dify是免费的吗 | Accepted | 使用 Dify - **云 | README_CN.md,README |
| | | </br>** | _KL.md |
| | | 我们提供[ dify 云服务](http | |
| | | s://dify.ai),任何人都可以零 | |
| | | 设置尝试。它提供了自部署版本的所有功能, | |
| | | 并在沙盒计划中包含 200 次免费.. | |
+----------------------+----------+----------------------+---------------------+
| 介绍一下dify | Accepted | </p> | README.md,README_JA |
| | | dify is an open- | .md,README_VI.md,RE |
| | | source llm app | ADME_ES.md |
| | | development | |
| | | platform. its | |
| | | intuitive interface | |
| | | combines ai wor.. | |
+----------------------+----------+----------------------+---------------------+
| dify是哪家公司的 | Accepted | <p align="center"> | README_JA.md,README |
| | | <a href="https://tre | _KL.md,README_CN.md |
| | | ndshift.io/repositor | ,invite_member_mail |
| | | ies/2152" | _template_zh- |
| | | target="_blank"><img | CN.html |
| | | src="http.. | |
+----------------------+----------+----------------------+---------------------+
| dify是开源的嘛 | Accepted | </p> | README.md,README_JA |
| | | dify is an open- | .md,README_CN.md,RE |
| | | source llm app | ADME_KL.md |
| | | development | |
| | | platform. its | |
| | | intuitive interface | |
| | | combines ai wor.. | |
+----------------------+----------+----------------------+---------------------+
| 怎么搭建大模型workflow | Accepted | 配置规则 ModelFeature - | schema.md,customiza |
| | | `agent-thought` | ble_model_scale_out |
| | | agent 推理,一般超过 70b | .md,provider_scale_ |
| | | 有思维链能力。 | out.md |
| | | - `vision` | |
| | | 视觉,即:图像理解。 | |
| | | - `tool-call` 工具.. | |
+----------------------+----------+----------------------+---------------------+
| 怎么搭建大模型工作流 | Accepted | 配置规则 ModelFeature - | schema.md,customiza |
| | | `agent-thought` | ble_model_scale_out |
| | | agent 推理,一般超过 70b | .md,README_CN.md,pr |
| | | 有思维链能力。 | ovider_scale_out.md |
| | | - `vision` | |
| | | 视觉,即:图像理解。 | |
| | | - `tool-call` 工具.. | |
+----------------------+----------+----------------------+---------------------+
| dify搭建工作流要怎么做 | Accepted | クイックスタート > difyをインス | README_JA.md,README |
| | | トールする前に、お使いのマシンが以下の最 | _VI.md,README_KL.md |
| | | 小システム要件を満たしていることを確認し | ,README_CN.md |
| | | てください: | |
| | | > | |
| | | >- cpu >= 2コア | |
| | | >- ram >= 4gb | |
| | | <.. | |
+----------------------+----------+----------------------+---------------------+
| 不懂代码可以做大模型应用嘛 | Accepted | 配置规则 ModelFeature - | schema.md |
| | | `agent-thought` | |
| | | agent 推理,一般超过 70b | |
| | | 有思维链能力。 | |
| | | - `vision` | |
| | | 视觉,即:图像理解。 | |
| | | - `tool-call` 工具.. | |
+----------------------+----------+----------------------+---------------------+
| dify怎么搭建workflow | Accepted | Tools このモジュールは、difyの | README_JP.md,README |
| | | エージェントアシスタントやワークフローで | _JA.md,README.md,CO |
| | | 使用される組み込みツールを実装しています | NTRIBUTING_JA.md |
| | | 。このモジュールでは、フロントエンドのロ | |
| | | ジックを変更することなく、独自のツールを | |
| | | .. | |
+----------------------+----------+----------------------+---------------------+
| what is dify | Accepted | </p> | README.md,README_JA |
| | | dify is an open- | .md,README_VI.md,RE |
| | | source llm app | ADME_KL.md |
| | | development | |
| | | platform. its | |
| | | intuitive interface | |
| | | combines ai wor.. | |
+----------------------+----------+----------------------+---------------------+
| 如何在本地部署Dify? | Accepted | 빠른 시작 >dify를 설치하기 | README_KR.md,CONTRI |
| | | 전에 컴퓨터가 다음과 같은 최소 | BUTING_VI.md,README |
| | | 시스템 요구 사항을 충족하는지 | _VI.md,CONTRIBUTING |
| | | 확인하세요 : | _CN.md |
| | | >- cpu >= 2 core | |
| | | >- ram >= 4gb | |
| | | </br>.. | |
+----------------------+----------+----------------------+---------------------+
| 什么是RAG引擎,Dify是如何使用它的 | Accepted | <div | README_CN.md |
| ? | | align="center"> | |
| | | <a href="https://tre | |
| | | ndshift.io/repositor | |
| | | ies/2152" | |
| | | target="_blank"><img | |
| | | src="ht.. | |
+----------------------+----------+----------------------+---------------------+
| Dify的API和应用程序导向有什么区别 | Accepted | 功能比较 <table | README_CN.md,README |
| | | style="width: | _JA.md,README_KL.md |
| | | 100%;"> | |
| | | <tr> | |
| | | <th align="center">功 | |
| | | 能</th> | |
| | | <th align="center">d | |
| | | ify.ai</th> | |
| | | <.. | |
+----------------------+----------+----------------------+---------------------+
| Dify的价格是多少,如何获取商业许可 | Accepted | Difyの使用方法 - **クラウド | README_JA.md,README |
| | | </br>** | _AR.md,README_CN.md |
| | | [こちら](https://dify.a | ,README_KR.md |
| | | i)のdify cloudサービスを利用 | |
| | | して、セットアップ不要で試すことができま | |
| | | す。サンドボックスプラン.. | |
+----------------------+----------+----------------------+---------------------+
| Dify的Agent是什么,它有什么作用 | Accepted | ![providers- | README_JA.md,README |
| | | v5](https://github.c | _KL.md,README_VI.md |
| | | om/langgenius/dify/a | ,README_CN.md |
| | | ssets/13230914/5a17b | |
| | | dbe-097a-4100-8363- | |
| | | 40255b70.. | |
+----------------------+----------+----------------------+---------------------+
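The log above reports an optimal threshold of about 0.385 saved to config.ini, and every test query in the table ends up Accepted. One plausible way to picture that step is sketched here. It is not HuixiangDou's code: it assumes the threshold is chosen to maximise F1 over example questions labelled as answerable or not, the scores and labels are invented, and the function names (`f1_at`, `best_threshold`, `decide`) as well as the "Rejected" label are hypothetical.

```python
def f1_at(threshold, scores, labels):
    """F1 when every query scoring >= threshold is accepted."""
    tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y)
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and not y)
    fn = sum(1 for s, y in zip(scores, labels) if s < threshold and y)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def best_threshold(scores, labels):
    """Try each observed score as a cut-off and keep the one with the highest F1."""
    candidates = sorted(set(scores))
    return max(candidates, key=lambda t: f1_at(t, scores, labels))


def decide(score, threshold):
    """Map a similarity score to a state; 'Rejected' is an assumed label."""
    return "Accepted" if score >= threshold else "Rejected"


if __name__ == "__main__":
    # Invented similarity scores: True = should be answered, False = should be refused.
    scores = [0.82, 0.64, 0.51, 0.44, 0.35, 0.30, 0.22, 0.10]
    labels = [True, True, True, True, False, True, False, False]
    t = best_threshold(scores, labels)
    print("threshold:", t)
    for s in (0.6, 0.2):
        print(s, decide(s, t))
```

Raising the threshold makes the assistant refuse more borderline questions; lowering it makes it answer more of them, at the cost of occasionally answering from weakly related chunks.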
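Testing Q&A quality
Without RAG: how InternLM2.5-7b-chat answers
The model clearly had not learned about Dify, this handy tool, during training, so it cannot answer. To both questions below ("how do I use dify?" and "is dify open source?") it replies that it cannot find any clear information about "dify" and asks for more context.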
(xtuner) root@intern-studio-50006073:~# lmdeploy chat /share/new_models/Shanghai_AI_Laboratory/internlm2_5-7b-chat --model-format hf
chat_template_config:
ChatTemplateConfig(model_name='internlm2', system=None, meta_instruction=None, eosys=None, user=None, eoh=None, assistant=None, eoa=None, separator=None, capability='chat', stop_words=None)
engine_cfg:
TurbomindEngineConfig(model_name='/share/new_models/Shanghai_AI_Laboratory/internlm2_5-7b-chat', model_format='hf', tp=1, session_len=32768, max_batch_size=1, cache_max_entry_count=0.8, cache_block_seq_len=64, enable_prefix_caching=False, quant_policy=0, rope_scaling_factor=0.0, use_logn_attn=False, download_dir=None, revision=None, max_prefill_token_num=8192, num_tokens_per_iter=0, max_prefill_iters=1)
[WARNING] gemm_config.in is not found; using default GEMM algo
double enter to end input >>> 请告诉我dify怎么用?
<|im_start|>system
You are an AI assistant whose name is InternLM (书生·浦语).
- InternLM (书生·浦语) is a conversational language model that is developed by Shanghai AI Laboratory (上海人工智能实验室). It is designed to be helpful, honest, and harmless.
- InternLM (书生·浦语) can understand and communicate fluently in the language chosen by the user such as English and 中文.
<|im_end|>
<|im_start|>user
请告诉我dify怎么用?<|im_end|>
<|im_start|>assistant
对不起,我无法找到与“dify”相关的明确信息。如果您指的是某个特定的软件、应用程序或服务,请提供更多的上下文信息,以便我能提供更准确的帮助。如果您有关于编程、语言学习、科技新闻等方面的问题,我也非常乐意为您提供帮助。
double enter to end input >>> dify是开源的嘛
<|im_start|>user
dify是开源的嘛<|im_end|>
<|im_start|>assistant
对不起,我无法找到与“dify”相关的明确信息。如果您指的是某个特定的软件、应用程序或服务,请提供更多的上下文信息,以便我能提供更准确的帮助。如果您有关于编程、语言学习、科技新闻等方面的问题,我也非常乐意为您提供帮助。
With RAG: how InternLM2.5-7b-chat answers
With RAG in place, it gets the answers right with ease!
# CLI test
# python3 -m huixiangdou.main --standalone
# Gradio page test
python3 -m huixiangdou.gradio
# ssh -CNg -L 7860:127.0.0.1:7860 root@ssh.intern-ai.org.cn -p <port>
# http://127.0.0.1:7860/
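Before opening the page in a browser, it can help to confirm that something is actually listening on the forwarded port. The following is a small sketch using only the Python standard library; `port_is_open` is a hypothetical helper name, and 7860 is simply the local port from the ssh command above.

```python
import socket


def port_is_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # 7860 is the local end of the ssh -L port forward.
    state = "reachable" if port_is_open("127.0.0.1", 7860) else "not reachable"
    print("127.0.0.1:7860 is", state)
```

If the port is not reachable, check that the ssh tunnel is still running and that the Gradio process has finished loading its models.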
As you can see, the images in the knowledge base have been processed as well, and answers can bring those images along. Very impressive.