2. The requirements.txt in the huixiangdou repo pulls in a long list of dependencies; the main thing to remember is that transformers must be >= 4.38.
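A quick way to confirm the transformers requirement after installation is to compare the installed version against 4.38. The sketch below uses only the standard library; the helper names `parse_version` and `meets_minimum` are made up for illustration and are not part of huixiangdou.

```python
import re
from importlib import metadata


def parse_version(version):
    """Return the leading numeric components of a version string as a tuple."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return tuple(int(part) for part in match.group().split("."))


def meets_minimum(version, minimum=(4, 38)):
    """True if `version` is at least `minimum` (default taken from the note above)."""
    return parse_version(version) >= minimum


if __name__ == "__main__":
    try:
        installed = metadata.version("transformers")
    except metadata.PackageNotFoundError:
        print("transformers is not installed")
    else:
        status = "OK" if meets_minimum(installed) else "too old, need >= 4.38"
        print(f"transformers {installed}: {status}")
```

Pre-release suffixes such as `.dev0` are ignored by this comparison, which is enough for a sanity check but not a full version parser.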

cd /root
git clone https://github.com/internlm/huixiangdou && cd huixiangdou
git checkout 79fa810

apt update
apt install python-dev libxml2-dev libxslt1-dev antiword unrtf poppler-utils pstotext tesseract-ocr flac ffmpeg lame libmad0 libsox-fmt-mp3 sox libjpeg-dev swig libpulse-dev
# python requirements
pip install BCEmbedding==0.1.5 cmake==3.30.2 lit==18.1.8 sentencepiece==0.2.0 protobuf==5.27.3 accelerate==0.33.0
pip install -r requirements.txt
# On python3.8, install faiss-gpu instead of faiss



cd /root/huixiangdou && mkdir repodir

git clone https://github.com/internlm/huixiangdou --depth=1 repodir/huixiangdou
git clone https://github.com/open-mmlab/mmpose    --depth=1 repodir/mmpose
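Before building the feature store, it can help to see which file types the cloned repositories actually contain, since not every file can be turned into text. The following is a rough sketch, not part of huixiangdou: `count_suffixes` is a hypothetical helper that tallies files under `repodir` by extension.

```python
from collections import Counter
from pathlib import Path


def count_suffixes(root):
    """Count files under `root` by lowercase suffix, skipping anything inside .git."""
    counts = Counter()
    for path in Path(root).rglob("*"):
        if path.is_file() and ".git" not in path.parts:
            counts[path.suffix.lower() or "(none)"] += 1
    return counts


if __name__ == "__main__":
    # Print the ten most common file types in the knowledge-base directory.
    for suffix, n in count_suffixes("repodir").most_common(10):
        print(f"{suffix:10s} {n}")
```

A directory dominated by source files or images rather than documents is one plausible reason for a large share of files being reported as failed during preprocessing.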

# Save the features of repodir to workdir, and update the positive and negative example thresholds into `config.ini`
mkdir workdir
python3 -m huixiangdou.service.feature_store
/root/.conda/envs/xtuner/lib/python3.10/runpy.py:126: RuntimeWarning: 'huixiangdou.service.feature_store' found in sys.modules after import of package 'huixiangdou.service', but prior to execution of 'huixiangdou.service.feature_store'; this may result in unpredictable behaviour
2024-10-06 19:47:30.986 | INFO     | huixiangdou.service.retriever:__init__:262 - loading test2vec and rerank models
/root/.conda/envs/xtuner/lib/python3.10/site-packages/torch/_utils.py:831: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly.  To access UntypedStorage directly, use tensor.untyped_storage() instead of tensor.storage()
  return self.fget.__get__(instance, owner)()
10/06/2024 19:48:24 - [INFO] -BCEmbedding.models.RerankerModel->>>    Loading from `/root/models/bce-reranker-base_v1`.
10/06/2024 19:48:24 - [INFO] -BCEmbedding.models.RerankerModel->>>    Execute device: cuda;	 gpu num: 1;	 use fp16: True
2024-10-06 19:48:25.063 | DEBUG    | __main__:__init__:67 - loading text2vec model..
2024-10-06 19:48:25.064 | INFO     | __main__:__init__:83 - init dense retrieval database with chunk_size 900
2024-10-06 19:48:50.761 | INFO     | __main__:initialize:250 - initialize response and reject feature store, you only need call this once.
2024-10-06 19:48:55.795 | INFO     | __main__:read_and_save:31 - reading repodir2/dify/api/templates/invite_member_mail_template_en-US.html, would save to workdir/preprocess/0059f8d4.text
2024-10-06 19:48:55.800 | INFO     | __main__:read_and_save:31 - reading repodir2/dify/api/templates/invite_member_mail_template_zh-CN.html, would save to workdir/preprocess/1bfc6b42.text
2024-10-06 19:48:55.807 | INFO     | __main__:read_and_save:31 - reading repodir2/dify/api/templates/reset_password_mail_template_en-US.html, would save to workdir/preprocess/56080a6a.text
2024-10-06 19:48:55.815 | INFO     | __main__:read_and_save:31 - reading repodir2/dify/api/templates/reset_password_mail_template_zh-CN.html, would save to workdir/preprocess/04929e9e.text
2024-10-06 19:48:56.277 | DEBUG    | __main__:preprocess:230 - waiting for file preprocess finish..
Token indices sequence length is longer than the specified maximum sequence length for this model (523 > 512). Running this sequence through the model will result in indexing errors
2024-10-06 19:48:57.267 | INFO     | __main__:analyze:170 - text histogram, length count 1186, avg 664.15, median 870
0-94  5.73%
94-188  7.67%
188-282  6.24%
282-376  4.38%
376-470  3.71%
470-564  3.63%
564-658  3.79%
658-752  4.72%
752-846  7.34%
846-940  52.02%
940-1034  0.76%

2024-10-06 19:48:57.268 | INFO     | __main__:analyze:171 - token histogram, length count 1186, avg 283.89, median 315
0-52  9.44%
52-104  11.21%
104-156  7.76%
156-208  6.49%
208-260  7.84%
260-312  7.17%
312-364  4.22%
364-416  8.01%
416-468  36.34%
468-520  1.43%
520-572  0.08%

10/06/2024 19:48:57 - [INFO] -faiss.loader->>>    Loading faiss with AVX2 support.
10/06/2024 19:48:57 - [INFO] -faiss.loader->>>    Could not load library with AVX2 support due to:
ModuleNotFoundError("No module named 'faiss.swigfaiss_avx2'")
10/06/2024 19:48:57 - [INFO] -faiss.loader->>>    Loading faiss.
10/06/2024 19:48:57 - [INFO] -faiss.loader->>>    Successfully loaded faiss.

2024-10-06 19:49:07.886 | INFO     | huixiangdou.primitive.file_operation:summarize:143 - 累计181文件,成功57个,跳过0个,异常124个
2024-10-06 19:49:08.694 | INFO     | huixiangdou.service.retriever:update_throttle:82 - The optimal threshold is: 0.3854840287267922, saved it to config.ini
2024-10-06 19:49:09.274 | WARNING  | __main__:test_reject:320 - process query: dify是什么
2024-10-06 19:49:09.298 | WARNING  | __main__:test_reject:320 - process query: dify怎么用
2024-10-06 19:49:09.326 | WARNING  | __main__:test_reject:320 - process query: dify是免费的吗
2024-10-06 19:49:09.342 | WARNING  | __main__:test_reject:320 - process query: 介绍一下dify
2024-10-06 19:49:09.357 | WARNING  | __main__:test_reject:320 - process query: dify是哪家公司的
2024-10-06 19:49:09.371 | WARNING  | __main__:test_reject:320 - process query: dify是开源的嘛
2024-10-06 19:49:09.392 | WARNING  | __main__:test_reject:320 - process query: 怎么搭建大模型workflow
2024-10-06 19:49:09.408 | WARNING  | __main__:test_reject:320 - process query: 怎么搭建大模型工作流
2024-10-06 19:49:09.423 | WARNING  | __main__:test_reject:320 - process query: dify搭建工作流要怎么做
2024-10-06 19:49:09.446 | WARNING  | __main__:test_reject:320 - process query: 不懂代码可以做大模型应用嘛
2024-10-06 19:49:09.467 | WARNING  | __main__:test_reject:320 - process query: dify怎么搭建workflow
2024-10-06 19:49:09.482 | WARNING  | __main__:test_reject:320 - process query: what is dify
2024-10-06 19:49:09.497 | WARNING  | __main__:test_reject:320 - process query: 如何在本地部署Dify?
2024-10-06 19:49:09.514 | WARNING  | __main__:test_reject:320 - process query: 什么是RAG引擎,Dify是如何使用它的?
2024-10-06 19:49:09.529 | WARNING  | __main__:test_reject:320 - process query: Dify的API和应用程序导向有什么区别
2024-10-06 19:49:09.543 | WARNING  | __main__:test_reject:320 - process query: Dify的价格是多少,如何获取商业许可
2024-10-06 19:49:09.557 | WARNING  | __main__:test_reject:320 - process query: Dify的Agent是什么,它有什么作用

2024-10-06 19:49:14.447 | INFO     | __main__:test_query:360 - 
|        Query         |  State   |    Part of Chunks    |     References      |
| dify是什么           | Accepted |  </p>                | README.md,README_JA |
|                      |          | dify is an open-     | .md,README_VI.md,RE |
|                      |          | source llm app       | ADME_CN.md          |
|                      |          | development          |                     |
|                      |          | platform. its        |                     |
|                      |          | intuitive interface  |                     |
|                      |          | combines ai wor..    |                     |
| dify怎么用           | Accepted |  Difyの使用方法 - **クラウド | README_JA.md,CONTRI |
|                      |          | </br>**              | BUTING_CN.md,CONTRI |
|                      |          | [こちら](https://dify.a | BUTING_VI.md        |
|                      |          | i)のdify cloudサービスを利用 |                     |
|                      |          | して、セットアップ不要で試すことができま |                     |
|                      |          | す。サンドボックスプラン.. |                     |
| dify是免费的吗       | Accepted |  使用 Dify - **云    | README_CN.md,README |
|                      |          | </br>**              | _KL.md              |
|                      |          | 我们提供[ dify 云服务](http |                     |
|                      |          | s://dify.ai),任何人都可以零 |                     |
|                      |          | 设置尝试。它提供了自部署版本的所有功能, |                     |
|                      |          | 并在沙盒计划中包含 200 次免费.. |                     |
| 介绍一下dify         | Accepted |  </p>                | README.md,README_JA |
|                      |          | dify is an open-     | .md,README_VI.md,RE |
|                      |          | source llm app       | ADME_ES.md          |
|                      |          | development          |                     |
|                      |          | platform. its        |                     |
|                      |          | intuitive interface  |                     |
|                      |          | combines ai wor..    |                     |
| dify是哪家公司的     | Accepted |  <p align="center">  | README_JA.md,README |
|                      |          | <a href="https://tre | _KL.md,README_CN.md |
|                      |          | ndshift.io/repositor | ,invite_member_mail |
|                      |          | ies/2152"            | _template_zh-       |
|                      |          | target="_blank"><img | CN.html             |
|                      |          | src="http..          |                     |
| dify是开源的嘛       | Accepted |  </p>                | README.md,README_JA |
|                      |          | dify is an open-     | .md,README_CN.md,RE |
|                      |          | source llm app       | ADME_KL.md          |
|                      |          | development          |                     |
|                      |          | platform. its        |                     |
|                      |          | intuitive interface  |                     |
|                      |          | combines ai wor..    |                     |
| 怎么搭建大模型workflow | Accepted | 配置规则 ModelFeature - | schema.md,customiza |
|                      |          | `agent-thought`      | ble_model_scale_out |
|                      |          | agent 推理,一般超过 70b | .md,provider_scale_ |
|                      |          | 有思维链能力。       | out.md              |
|                      |          | - `vision`           |                     |
|                      |          | 视觉,即:图像理解。 |                     |
|                      |          | - `tool-call` 工具.. |                     |
| 怎么搭建大模型工作流 | Accepted | 配置规则 ModelFeature - | schema.md,customiza |
|                      |          | `agent-thought`      | ble_model_scale_out |
|                      |          | agent 推理,一般超过 70b | .md,README_CN.md,pr |
|                      |          | 有思维链能力。       | ovider_scale_out.md |
|                      |          | - `vision`           |                     |
|                      |          | 视觉,即:图像理解。 |                     |
|                      |          | - `tool-call` 工具.. |                     |
| dify搭建工作流要怎么做 | Accepted |  クイックスタート > difyをインス | README_JA.md,README |
|                      |          | トールする前に、お使いのマシンが以下の最 | _VI.md,README_KL.md |
|                      |          | 小システム要件を満たしていることを確認し | ,README_CN.md       |
|                      |          | てください:         |                     |
|                      |          | >                    |                     |
|                      |          | >- cpu >= 2コア      |                     |
|                      |          | >- ram >= 4gb        |                     |
|                      |          | <..                  |                     |
| 不懂代码可以做大模型应用嘛 | Accepted | 配置规则 ModelFeature - | schema.md           |
|                      |          | `agent-thought`      |                     |
|                      |          | agent 推理,一般超过 70b |                     |
|                      |          | 有思维链能力。       |                     |
|                      |          | - `vision`           |                     |
|                      |          | 视觉,即:图像理解。 |                     |
|                      |          | - `tool-call` 工具.. |                     |
| dify怎么搭建workflow | Accepted | Tools このモジュールは、difyの | README_JP.md,README |
|                      |          | エージェントアシスタントやワークフローで | _JA.md,README.md,CO |
|                      |          | 使用される組み込みツールを実装しています | NTRIBUTING_JA.md    |
|                      |          | 。このモジュールでは、フロントエンドのロ |                     |
|                      |          | ジックを変更することなく、独自のツールを |                     |
|                      |          | ..                   |                     |
| what is dify         | Accepted |  </p>                | README.md,README_JA |
|                      |          | dify is an open-     | .md,README_VI.md,RE |
|                      |          | source llm app       | ADME_KL.md          |
|                      |          | development          |                     |
|                      |          | platform. its        |                     |
|                      |          | intuitive interface  |                     |
|                      |          | combines ai wor..    |                     |
| 如何在本地部署Dify? | Accepted |  빠른 시작 >dify를 설치하기 | README_KR.md,CONTRI |
|                      |          | 전에 컴퓨터가 다음과 같은 최소 | BUTING_VI.md,README |
|                      |          | 시스템 요구 사항을 충족하는지 | _VI.md,CONTRIBUTING |
|                      |          | 확인하세요 :         | _CN.md              |
|                      |          | >- cpu >= 2 core     |                     |
|                      |          | >- ram >= 4gb        |                     |
|                      |          | </br>..              |                     |
| 什么是RAG引擎,Dify是如何使用它的 | Accepted |  <div                | README_CN.md        |
| ?                   |          | align="center">      |                     |
|                      |          | <a href="https://tre |                     |
|                      |          | ndshift.io/repositor |                     |
|                      |          | ies/2152"            |                     |
|                      |          | target="_blank"><img |                     |
|                      |          | src="ht..            |                     |
| Dify的API和应用程序导向有什么区别 | Accepted |  功能比较 <table     | README_CN.md,README |
|                      |          | style="width:        | _JA.md,README_KL.md |
|                      |          | 100%;">              |                     |
|                      |          | <tr>                 |                     |
|                      |          | <th align="center">|                     |
|                      |          |</th>              |                     |
|                      |          | <th align="center">d |                     |
|                      |          | ify.ai</th>          |                     |
|                      |          | <..                  |                     |
| Dify的价格是多少,如何获取商业许可 | Accepted |  Difyの使用方法 - **クラウド | README_JA.md,README |
|                      |          | </br>**              | _AR.md,README_CN.md |
|                      |          | [こちら](https://dify.a | ,README_KR.md       |
|                      |          | i)のdify cloudサービスを利用 |                     |
|                      |          | して、セットアップ不要で試すことができま |                     |
|                      |          | す。サンドボックスプラン.. |                     |
| Dify的Agent是什么,它有什么作用 | Accepted |  ![providers-        | README_JA.md,README |
|                      |          | v5](https://github.c | _KL.md,README_VI.md |
|                      |          | om/langgenius/dify/a | ,README_CN.md       |
|                      |          | ssets/13230914/5a17b |                     |
|                      |          | dbe-097a-4100-8363-  |                     |
|                      |          | 40255b70..           |                     |
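The log above reports that the optimal threshold was written to `config.ini`. To read it back without opening the file by hand, a small standard-library sketch like the one below may be enough. It assumes the value is stored on a line of the form `reject_throttle = <float>`; that key name does not appear in the log, so check it against your own `config.ini`.

```python
import re
from pathlib import Path

# Assumption: the threshold is stored as a `reject_throttle = <float>` line.
# The key name is not shown in the log above, so verify it in your config.ini.
THRESHOLD_LINE = re.compile(
    r"^\s*reject_throttle\s*=\s*([-+]?\d*\.?\d+(?:[eE][-+]?\d+)?)",
    re.MULTILINE,
)


def read_threshold(config_text):
    """Return the first threshold value found in the config text, or None."""
    match = THRESHOLD_LINE.search(config_text)
    return float(match.group(1)) if match else None


if __name__ == "__main__":
    config = Path("config.ini")
    if config.exists():
        print("threshold:", read_threshold(config.read_text(encoding="utf-8")))
    else:
        print("config.ini not found in the current directory")
```

A plain regular expression is used here instead of a config parser so the sketch does not depend on which parsing library the project uses.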




(xtuner) root@intern-studio-50006073:~# lmdeploy chat /share/new_models/Shanghai_AI_Laboratory/internlm2_5-7b-chat --model-format hf
ChatTemplateConfig(model_name='internlm2', system=None, meta_instruction=None, eosys=None, user=None, eoh=None, assistant=None, eoa=None, separator=None, capability='chat', stop_words=None)
TurbomindEngineConfig(model_name='/share/new_models/Shanghai_AI_Laboratory/internlm2_5-7b-chat', model_format='hf', tp=1, session_len=32768, max_batch_size=1, cache_max_entry_count=0.8, cache_block_seq_len=64, enable_prefix_caching=False, quant_policy=0, rope_scaling_factor=0.0, use_logn_attn=False, download_dir=None, revision=None, max_prefill_token_num=8192, num_tokens_per_iter=0, max_prefill_iters=1)
[WARNING] gemm_config.in is not found; using default GEMM algo                                                                                     

double enter to end input >>> 请告诉我dify怎么用?

You are an AI assistant whose name is InternLM (书生·浦语).
- InternLM (书生·浦语) is a conversational language model that is developed by Shanghai AI Laboratory (上海人工智能实验室). It is designed to be helpful, honest, and harmless.
- InternLM (书生·浦语) can understand and communicate fluently in the language chosen by the user such as English and 中文.

double enter to end input >>> dify是开源的嘛




# CLI test
# python3 -m huixiangdou.main --standalone

# Gradio web UI test
python3 -m huixiangdou.gradio
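# Run on the local machine to forward the Gradio port, then open http://127.0.0.1:7860 in a browser:
# ssh -CNg -L 7860:127.0.0.1:7860 root@ssh.intern-ai.org.cn -p <port>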
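If the page does not load, a first check is whether anything is listening on port 7860 at all, either on the server (is the Gradio process up?) or on the local machine (is the ssh tunnel up?). A minimal sketch using the standard library follows; `port_is_open` is a hypothetical helper name.

```python
import socket


def port_is_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # 7860 is the port used in the ssh forwarding command above.
    print("port 7860 open:", port_is_open("127.0.0.1", 7860))
```

This only confirms that the port accepts TCP connections; it does not verify that the service behind it is the Gradio app.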
