Deploying and Running Dify Locally from Source

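Background

Dify is an open-source platform for developing LLM applications. Its intuitive interface combines AI workflows, RAG pipelines, agent capabilities, model management, observability features and more, so you can move quickly from prototype to production.

Dify offers an online trial, so you can try its features directly in the browser. It also supports Docker deployment, deployment from source, and other options. Deploying from source lets you inspect Dify's implementation details and customize it. This post records the problems I ran into during a source deployment and how I solved them.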

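Prerequisites

Because this is a source deployment and I also intend to modify Dify, I deployed on Windows. This deployment uses Windows 11.

The Dify documentation recommends running the source on Linux, so WSL2 must be installed on Windows to provide a Linux subsystem. I used Ubuntu 20.04.6 under WSL2.

Docker Desktop must also be installed on Windows; download it from the official Docker website.

The download link may not open from within China; if it does not, you will need to find a Docker Desktop installer yourself. I used Docker Desktop 4.31.1. After downloading, install Docker Desktop normally, then register an account and sign in. The images are pulled through Docker Desktop later.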

Deployment steps

1. Clone the source code

On Windows, simply clone the repository:

git clone https://github.com/langgenius/dify.git

2. Pull the required images

First, open the cloned Dify source, go to the docker folder, and open docker-compose.middleware.yaml to see which images and versions it defines.
They include:

image: postgres:15-alpine
image: redis:6-alpine
image: semitechnologies/weaviate (note: the version pinned in the official file could not be pulled through Docker Desktop, so I dropped the tag and pulled the latest image)
image: langgenius/dify-sandbox:0.2.1
image: ubuntu/squid:latest

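Search for each of the images above in Docker Desktop and click Pull to fetch it (do not start the images from Docker Desktop), as shown below:

[Screenshot: pulling an image in Docker Desktop]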
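
If searching in the Docker Desktop UI is inconvenient, the same images can probably be pulled from a terminal. The following is a minimal sketch, not something the original walkthrough ran: the image names come from the list above, while the DRY_RUN switch and the pull_images function are hypothetical helpers added for illustration.

```shell
# Image names are copied from the list above; weaviate is left untagged, as in the article.
IMAGES="postgres:15-alpine redis:6-alpine semitechnologies/weaviate langgenius/dify-sandbox:0.2.1 ubuntu/squid:latest"

# Dry-run by default: set DRY_RUN to an empty value (DRY_RUN= sh script.sh) to really pull.
DRY_RUN="${DRY_RUN-1}"

pull_images() {
  for image in ${IMAGES}; do
    # ${DRY_RUN:+echo} expands to "echo" in dry-run mode and to nothing otherwise.
    ${DRY_RUN:+echo} docker pull "${image}"
  done
}

pull_images
```

In dry-run mode the script only prints the five docker pull commands, which makes it easy to review them before pulling anything.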

3. Start the containers

From WSL2, go to the cloned Dify source folder, enter the docker folder, and start the containers with the following command:

docker compose -f docker-compose.middleware.yaml up -d

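Note: once Docker Desktop is installed on Windows, the docker command is also available inside WSL2.

I hit a pitfall here. Following the official instructions, the command above should start the Docker containers normally, but in practice the postgres container failed to start with this error: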

initdb: error: could not change permissions of directory "/var/lib/postgresql/data/pgdata"

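The container has no permission to operate on /var/lib/postgresql/data/pgdata. The relevant definition in docker-compose.middleware.yaml is shown below:
[Screenshot: the postgres service definition in docker-compose.middleware.yaml]
This path is set in both PGDATA and volumes.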

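I solved this with a workaround. Since I am starting Dify from source in order to study its code, it matters little whether the data is mounted out of the container, so I chose not to mount this path. I also stopped setting the PGDATA environment variable and let it fall back to its default value.

To do this, I started the postgres image from Docker Desktop instead of with docker compose.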

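First, stop the postgres container that docker compose started in WSL and then remove it with rm, because it keeps failing and restarting and cannot be used. Then, in Docker Desktop on Windows, start the postgres image and set the run parameters according to docker-compose.middleware.yaml (except volumes and PGDATA), as shown below:
[Screenshot: run settings for the postgres image in Docker Desktop]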
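
For readers who prefer the command line, the Docker Desktop settings described above roughly correspond to a docker run call. This is only a sketch under assumptions: the container name dify-postgres is hypothetical, the password is a placeholder to be replaced with the POSTGRES_PASSWORD value from docker-compose.middleware.yaml, and the database name dify and port 5432 are taken from the connection URL that appears in the worker log later in this post.

```shell
# Hypothetical container name; the password is a placeholder to replace with
# the POSTGRES_PASSWORD value from docker-compose.middleware.yaml.
PG_CONTAINER="dify-postgres"
PG_PASSWORD="${PG_PASSWORD:-change-me}"

# Dry-run by default: set DRY_RUN to an empty value to really start the container.
DRY_RUN="${DRY_RUN-1}"

start_postgres() {
  # No -v mount and no PGDATA variable: data stays inside the container.
  ${DRY_RUN:+echo} docker run -d \
    --name "${PG_CONTAINER}" \
    -e POSTGRES_PASSWORD="${PG_PASSWORD}" \
    -e POSTGRES_DB=dify \
    -p 5432:5432 \
    postgres:15-alpine
}

start_postgres
```

Because nothing is mounted, removing the container also removes the database contents, which matches the trade-off accepted above for a study setup.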

The postgres container then starts normally and can also be used from WSL2.
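
One way to confirm from WSL2 that the middleware is reachable before moving on is a simple TCP port check. The sketch below assumes postgres listens on 5432 and redis on 6379 (the ports that appear in the worker log later in this post); port_open is a hypothetical helper, and it relies on bash's /dev/tcp feature and the coreutils timeout command.

```shell
# Return 0 if a TCP connection to host:port succeeds within 2 seconds.
# /dev/tcp is a bash feature, so bash is invoked explicitly.
port_open() {
  timeout 2 bash -c "exec 3<>/dev/tcp/$1/$2" 2>/dev/null
}

for port in 5432 6379; do
  if port_open localhost "${port}"; then
    echo "port ${port}: reachable"
  else
    echo "port ${port}: NOT reachable"
  fi
done
```

A reachable port only shows that something is listening; it does not prove that the credentials in .env are correct.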

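4. Start the backend services

The backend consists of two services, and both need to be started: an API service and a Worker service that consumes the asynchronous task queue.

First, install Anaconda in WSL2; the installation is not covered here.
Then create a virtual environment with the following command: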

conda create --name dify python=3.10

Then switch to this virtual environment:

conda activate dify

Then do the following:

In WSL2, go to the api folder of the Dify source and create the .env file:

cp .env.example .env

Generate a SECRET_KEY:

openssl rand -base64 42

Copy the generated key into the .env file as the value of SECRET_KEY.
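
The copy step can also be scripted. The following sketch assumes that .env already contains a line starting with SECRET_KEY= (as the copied .env.example is expected to) and that GNU sed is available, which is the case on Ubuntu under WSL2; set_secret_key is a hypothetical helper name.

```shell
# Replace the SECRET_KEY line of an env file with a freshly generated key.
# Base64 output never contains "|" or "&", so "|" is a safe sed delimiter here.
set_secret_key() {
  env_file="$1"
  key="$(openssl rand -base64 42)"
  # -i edits the file in place (GNU sed syntax).
  sed -i "s|^SECRET_KEY=.*|SECRET_KEY=${key}|" "${env_file}"
}

# Usage (from the api folder, after cp .env.example .env):
#   set_secret_key .env
```

If the file has no SECRET_KEY= line, the command leaves it unchanged, so it is worth checking the result with grep SECRET_KEY .env.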
Install the Python dependencies required by the API service:

pip install -r requirements.txt

Initialize the postgres tables:

flask db upgrade

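Note: this step failed for me. I did not record the exact error, but it was roughly a missing Python package. Install the package named in the error message with pip install, then rerun the command and it should succeed.

Start the API service: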

flask run --host 0.0.0.0 --port=5001 --debug

Output like the following indicates success:

Debug mode: on
INFO:werkzeug:WARNING: This is a development server. Do not use it in a production deployment. Use a production WSGI server instead.
 * Running on all addresses (0.0.0.0)
 * Running on http://127.0.0.1:5001
INFO:werkzeug:Press CTRL+C to quit
INFO:werkzeug: * Restarting with stat
WARNING:werkzeug: * Debugger is active!
INFO:werkzeug: * Debugger PIN: 695-801-919

Start the Worker service

Open a new WSL2 terminal, switch to the dify virtual environment, cd to the api folder of the Dify source, and run the following command:

celery -A app.celery worker -P solo --without-gossip --without-mingle -Q dataset,generation,mail --loglevel INFO
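
To make the worker command easier to read, here is the same invocation with each flag annotated. The flag meanings are standard Celery worker options; the start_worker function and the DRY_RUN switch are hypothetical wrappers added only for this sketch.

```shell
# Dry-run by default: set DRY_RUN to an empty value to really start the worker.
DRY_RUN="${DRY_RUN-1}"

# Run from the api folder with the dify conda environment active.
#   -A app.celery      load the Celery application from app.celery
#   -P solo            solo pool: tasks run one at a time in the worker process
#   --without-gossip   do not subscribe to other workers' events
#   --without-mingle   do not synchronize with other workers at startup
#   -Q ...             consume only the dataset, generation and mail queues
#   --loglevel INFO    log at INFO level
start_worker() {
  ${DRY_RUN:+echo} celery -A app.celery worker \
    -P solo \
    --without-gossip \
    --without-mingle \
    -Q dataset,generation,mail \
    --loglevel INFO
}

start_worker
```

In dry-run mode this only prints the command, so the flags can be adjusted before the worker is actually started.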

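Output like the following indicates a successful start (this sample shows a macOS host and the gevent pool, so the platform and concurrency lines will differ under WSL2 with -P solo):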

 -------------- celery@TAKATOST.lan v5.2.7 (dawn-chorus)
--- ***** -----
-- ******* ---- macOS-10.16-x86_64-i386-64bit 2023-07-31 12:58:08
- *** --- * ---
- ** ---------- [config]
- ** ---------- .> app:         app:0x7fb568572a10
- ** ---------- .> transport:   redis://:**@localhost:6379/1
- ** ---------- .> results:     postgresql://postgres:**@localhost:5432/dify
- *** --- * --- .> concurrency: 1 (gevent)
-- ******* ---- .> task events: OFF (enable -E to monitor tasks in this worker)
--- ***** -----
 -------------- [queues]
                .> dataset          exchange=dataset(direct) key=dataset
                .> generation       exchange=generation(direct) key=generation
                .> mail             exchange=mail(direct) key=mail

[tasks]
  . tasks.add_document_to_index_task.add_document_to_index_task
  . tasks.clean_dataset_task.clean_dataset_task
  . tasks.clean_document_task.clean_document_task
  . tasks.clean_notion_document_task.clean_notion_document_task
  . tasks.create_segment_to_index_task.create_segment_to_index_task
  . tasks.deal_dataset_vector_index_task.deal_dataset_vector_index_task
  . tasks.document_indexing_sync_task.document_indexing_sync_task
  . tasks.document_indexing_task.document_indexing_task
  . tasks.document_indexing_update_task.document_indexing_update_task
  . tasks.enable_segment_to_index_task.enable_segment_to_index_task
  . tasks.generate_conversation_summary_task.generate_conversation_summary_task
  . tasks.mail_invite_member_task.send_invite_member_mail_task
  . tasks.remove_document_from_index_task.remove_document_from_index_task
  . tasks.remove_segment_from_index_task.remove_segment_from_index_task
  . tasks.update_segment_index_task.update_segment_index_task
  . tasks.update_segment_keyword_index_task.update_segment_keyword_index_task

[2023-07-31 12:58:08,831: INFO/MainProcess] Connected to redis://:@localhost:6379/1
[2023-07-31 12:58:08,840: INFO/MainProcess] mingle: searching for neighbors
[2023-07-31 12:58:09,873: INFO/MainProcess] mingle: all alone
[2023-07-31 12:58:09,886: INFO/MainProcess] pidbox: Connected to redis://:@localhost:6379/1.
[2023-07-31 12:58:09,890: INFO/MainProcess] celery@TAKATOST.lan ready.

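5. Start the frontend service

Open a new WSL2 terminal, go to the web folder of the Dify source, and do the following:

Install Node.js and npm.
Node.js v18.x (LTS) or later and npm 8.x.x or later are required.
Install them as follows: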

# installs nvm (Node Version Manager)
curl -o- https://raw.githubusercontent.com/nvm-sh/nvm/v0.39.7/install.sh | bash

# download and install Node.js (you may need to restart the terminal)
nvm install 18

# verifies the right Node.js version is in the environment
node -v # should print `v18.20.3`

# verifies the right NPM version is in the environment
npm -v # should print `10.7.0`
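
Since the exact patch versions printed by node -v and npm -v will vary over time, it may be more useful to check only the minimum major versions stated above (Node.js 18, npm 8). The helpers major_of and meets_minimum below are hypothetical names used for this sketch.

```shell
# Extract the major version from a string such as "v18.20.3" or "10.7.0".
major_of() {
  echo "$1" | sed -E 's/^v?([0-9]+).*/\1/'
}

# Succeed only if the version's major number is at least the given minimum.
meets_minimum() {
  [ "$(major_of "$1")" -ge "$2" ]
}

if command -v node >/dev/null 2>&1 && command -v npm >/dev/null 2>&1; then
  if meets_minimum "$(node -v)" 18; then echo "node OK"; else echo "node too old"; fi
  if meets_minimum "$(npm -v)" 8; then echo "npm OK"; else echo "npm too old"; fi
else
  echo "node or npm not found on PATH"
fi
```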

Install the frontend dependencies:

npm install

Copy the .env.example file in the web folder and name the copy .env.local. Its contents do not need to be changed.
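
The copy can be done from the terminal as well. This small sketch wraps it in a hypothetical make_env_local helper that refuses to overwrite an existing .env.local, so rerunning it does not discard local edits.

```shell
# Copy .env.example to .env.local in the given folder (default: current folder),
# but only if .env.local does not exist yet.
make_env_local() {
  dir="${1:-.}"
  if [ ! -f "${dir}/.env.example" ]; then
    echo "no .env.example in ${dir}" >&2
    return 1
  fi
  [ -e "${dir}/.env.local" ] || cp "${dir}/.env.example" "${dir}/.env.local"
}

# Usage (from the web folder):
#   make_env_local .
```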
Build the frontend:

npm run build

Start the frontend service:

npm run start

Output like the following indicates a successful start:

ready - started server on 0.0.0.0:3000, url: http://localhost:3000
warn - You have enabled experimental feature (appDir) in next.config.js.
warn - Experimental features are not covered by semver, and may cause unexpected or broken application behavior. Use at your own risk.
info - Thank you for testing appDir please leave your feedback at https://nextjs.link/app-feedback
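
As a final sanity check, both services can be probed over HTTP from another terminal. The sketch below assumes the API is on port 5001 and the frontend on port 3000, as in the logs above; http_status is a hypothetical helper. Any status other than 000 means something answered on that port; it does not by itself confirm that Dify is fully functional.

```shell
# Print the HTTP status code for a URL; curl prints 000 when it cannot connect.
http_status() {
  curl -s -o /dev/null -m 5 -w '%{http_code}' "$1" || true
}

for url in http://localhost:5001 http://localhost:3000; do
  echo "${url} -> HTTP $(http_status "${url}")"
done
```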