DeepSeek 3FS Cluster Deployment: Working Notes
- I. 3FS Cluster Deployment
- 1. Environment
- 2. Package installation
- 3. Build
- 4. Deployment
- 4.1 Deploy monitor_collector_main
- Step 2: Admin client
- Step 3: Mgmtd service
- Step 4: Meta service
- Step 5: Storage service
- Step 6: Create admin user, storage targets and chain table
- Step 7: FUSE client
- 5. Issues encountered
- 6. Notes on selected commands
- 6.1 First command
- Parameter breakdown
- 6.2 Second command
- Parameter breakdown
- First command (`data_placement.py`)
- Second command (`gen_chain_table.py`)
I. 3FS Cluster Deployment
Official documentation and issues:
https://github.com/deepseek-ai/3FS/issues
https://github.com/deepseek-ai/3FS/blob/main/deploy/README.md
1. Environment
Node | Management IP | 25G IP | OS | Services | Notes
---|---|---|---|---|---
3fs-node-meta001 | 172.20.99.94 | 11.12.63.55 | Ubuntu 22.04 | mgmtd_main.service, meta_main.service, hf3fs_fuse_main.service, foundationdb, clickhouse-server | admin_cli must be configured
3fs-node-storage001 | 172.20.99.96 | 11.12.63.54 | Ubuntu 22.04 | storage_main.service | admin_cli must be configured
3fs-node-storage002 | 172.20.99.121 | 11.12.63.57 | Ubuntu 22.04 | storage_main.service | admin_cli must be configured
3fs-node-storage003 | 172.20.99.122 | 11.12.63.58 | Ubuntu 22.04 | storage_main.service | admin_cli must be configured
RDMA Configuration
Assign IP addresses to RDMA NICs. Multiple RDMA NICs (InfiniBand or RoCE) are supported on each node.
Check RDMA connectivity between nodes using ib_write_bw.
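For reference, a minimal ib_write_bw check between two of the nodes above (a sketch; start the server side first, then point the client at the server's 25G IP):
# on 3fs-node-storage001 (server side)
ib_write_bw --report_gbits
# on 3fs-node-meta001 (client side, targeting storage001's 25G IP)
ib_write_bw --report_gbits 11.12.63.54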
Deployment notes:
- Port conflict: I deploy the mgmtd service and clickhouse-server on the same node, which causes a conflict on port 9000 and prevents mgmtd from starting.
- Fix: change port 9000 in the clickhouse-server config files; here I moved it to 6000.
root@3fs-node-meta01:~# netstat -antulp | grep 6000 | head
tcp 0 0 172.20.99.94:50876 172.20.99.94:6000 ESTABLISHED 154445/monitor_coll
tcp 0 0 172.20.99.94:50852 172.20.99.94:6000 ESTABLISHED 154445/monitor_coll
tcp 0 0 172.20.99.94:50904 172.20.99.94:6000 ESTABLISHED 154445/monitor_coll
tcp 0 0 172.20.99.94:50926 172.20.99.94:6000 ESTABLISHED 154445/monitor_coll
tcp 0 0 172.20.99.94:50966 172.20.99.94:6000 ESTABLISHED 154445/monitor_coll
root@3fs-node-meta01:/etc/clickhouse-server# ls
config.d config.xml users.d users.xml
root@3fs-node-meta01:/etc/clickhouse-server# pwd
/etc/clickhouse-server
root@3fs-node-meta01:/etc/clickhouse-server# grep -rn 6000 *
config.xml:104: <tcp_port>6000</tcp_port>
config.xml:721: <port>6000</port>
config.xml:732: <port>6000</port>
config.xml:736: <port>6000</port>
config.xml:740: <port>6000</port>
config.xml:747: <port>6000</port>
config.xml:751: <port>6000</port>
config.xml:755: <port>6000</port>
config.xml:763: <port>6000</port>
config.xml:769: <port>6000</port>
config.xml:777: <port>6000</port>
config.xml:783: <port>6000</port>
config.xml:792: <port>6000</port>
config.xml:799: <port>6000</port>
config.xml:816: <port>6000</port>
# After changing the port, restart clickhouse-server and check the service status and listening ports. I have already made the change and restarted the service here.
root@3fs-node-meta01:/etc/clickhouse-server# systemctl status clickhouse-server.service
● clickhouse-server.service - ClickHouse Server (analytic DBMS for big data)
Loaded: loaded (/lib/systemd/system/clickhouse-server.service; disabled; vendor preset: enabled)
Active: active (running) since Sun 2025-03-16 15:01:54 CST; 3h 57min ago
Main PID: 143832 (clckhouse-watch)
Tasks: 249 (limit: 154032)
Memory: 3.1G
CPU: 2h 55min 31.779s
CGroup: /system.slice/clickhouse-server.service
├─143832 clickhouse-watchdog "" "" "" "" "" "" "" --config=/etc/clickhouse-server/config.xml --pid-file=/run/clickhouse-ser>
└─143833 /usr/bin/clickhouse-server --config=/etc/clickhouse-server/config.xml --pid-file=/run/clickhouse-server/clickhouse>
Mar 16 15:01:54 3fs-node-meta01 systemd[1]: Started ClickHouse Server (analytic DBMS for big data).
Mar 16 15:01:54 3fs-node-meta01 clickhouse-server[143832]: Processing configuration file '/etc/clickhouse-server/config.xml'.
Mar 16 15:01:54 3fs-node-meta01 clickhouse-server[143832]: Logging trace to /var/log/clickhouse-server/clickhouse-server.log
Mar 16 15:01:54 3fs-node-meta01 clickhouse-server[143832]: Logging errors to /var/log/clickhouse-server/clickhouse-server.err.log
Mar 16 15:01:55 3fs-node-meta01 clickhouse-server[143833]: Processing configuration file '/etc/clickhouse-server/config.xml'.
Mar 16 15:01:55 3fs-node-meta01 clickhouse-server[143833]: Saved preprocessed configuration to '/var/lib/clickhouse/preprocessed_configs>
Mar 16 15:01:55 3fs-node-meta01 clickhouse-server[143833]: Processing configuration file '/etc/clickhouse-server/users.xml'.
Mar 16 15:01:55 3fs-node-meta01 clickhouse-server[143833]: Saved preprocessed configuration to '/var/lib/clickhouse/preprocessed_configs>
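For reference, the listening-port change itself can be scripted (a sketch, assuming the stock Debian package paths; the <port> entries in the remote_servers section may also need updating, as the grep above shows):
sed -i 's|<tcp_port>9000</tcp_port>|<tcp_port>6000</tcp_port>|' /etc/clickhouse-server/config.xml
systemctl restart clickhouse-server.service
ss -lntp | grep 6000   # confirm clickhouse-server now listens on 6000
clickhouse-client --port 6000 -q 'SELECT 1'   # client tools still default to 9000, so pass --port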
- Network configuration: my environment has both a management network and a 25G network, but after reading the 3FS source I found it cannot recognize bond devices, so installation kept failing with network errors. I also tried editing the config file to specify the device explicitly, but that did not help either.
root@3fs-node-meta01:/opt/3fs/etc# cat monitor_collector_main.toml | grep filter_list -C 5
[server.base.groups.io_worker.transport_pool]
max_connections = 1
[server.base.groups.listener]
filter_list = ['eno4'] # empty by default; I tried bond0 and bond1 (filter_list = ['bond0'], filter_list = ['bond1']) but neither was recognized
listen_port = 10000
listen_queue_depth = 4096
rdma_listen_ethernet = true
reuse_port = false
# The error log shows the code does not parse bond devices:
[2025-03-14T22:18:34.450205338+08:00 monitor_collect:416868 Listener.cc:87 ERROR] No available address for listener with network type: TCP, filter list:
[2025-03-14T22:18:34.450235037+08:00 monitor_collect:416868 ServiceGroup.cc:26 ERROR] error: RPC::ListenFailed(2011)
[2025-03-14T22:18:34.450243611+08:00 monitor_collect:416868 Server.cc:27 ERROR] Setup group (MonitorCollector) failed: RPC::ListenFailed(2011)
[2025-03-14T22:18:34.450250294+08:00 monitor_collect:416868 Server.cc:31 ERROR] Server::setup failed: RPC::ListenFailed(2011)
[2025-03-14T22:18:34.450259443+08:00 monitor_collect:416868 OnePhaseApplication.h:101 FATAL] Setup server failed: RPC::ListenFailed
# The relevant source code:
case Address::TCP:
return nic.starts_with("en") || nic.starts_with("eth");
case Address::IPoIB:
return nic.starts_with("ib");
case Address::RDMA:
return nic.starts_with("en") || nic.starts_with("eth");
Config:
[server.base.groups.listener]
filter_list = ["bond0", "bond1"] # 或者 "ens4f0np0"
listen_port = 10000
Or patch the source and rebuild:
vim /home/3fs/src/common/net/Listener.cc
static bool checkNicType(std::string_view nic, Address::Type type) {
switch (type) {
case Address::TCP:
return nic.starts_with("en") || nic.starts_with("eth") || nic.starts_with("bond");
case Address::IPoIB:
return nic.starts_with("ib");
case Address::RDMA:
return nic.starts_with("en") || nic.starts_with("eth") || nic.starts_with("bond");
case Address::LOCAL:
return nic.starts_with("lo");
default:
return false;
}
}
Even with bond configured it still fails. Do getNetworkInterfaces() and checkNicType() simply not support bond* devices? This still needs to be confirmed.
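To see which local interfaces would pass the prefix filter above, a quick check (a sketch, assuming the filtering really is by name prefix as in the Listener.cc snippet):
ip -o link show | awk -F': ' '{print $2}' | while read -r nic; do
  case "$nic" in
    en*|eth*) echo "$nic: accepted for TCP/RDMA" ;;
    ib*)      echo "$nic: accepted for IPoIB" ;;
    lo)       echo "$nic: accepted for LOCAL" ;;
    *)        echo "$nic: filtered out (e.g. bond*, dummy*)" ;;
  esac
done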
2. Package installation
foundationDB: https://apple.github.io/foundationdb/administration.html
It can also be run in a container:
docker run -d --net=host --name fdb-server foundationdb/foundationdb:7.1.42
clickhouse: https://clickhouse.com/docs/install
rust: simply run: curl https://sh.rustup.rs -sSf | sh
Installing these two services is straightforward, so no detailed walkthrough here. One thing to note: if FoundationDB breaks and you want to reinstall it, purge it first with dpkg -P foundationdb-server.
- Note: pay attention to the FoundationDB version.
FoundationDB
Ensure that the version of FoundationDB client matches the server version, or copy the corresponding version of libfdb_c.so to maintain compatibility.
Find the fdb.cluster file and libfdb_c.so at /etc/foundationdb/fdb.cluster, /usr/lib/libfdb_c.so on nodes with FoundationDB installed.
libfuse 3.16.1 or newer version
FoundationDB 7.1 or newer version
Rust toolchain: minimal 1.75.0, recommended 1.85.0 or newer version (latest stable version)
# Installing Rust via the USTC mirror is much faster
export RUSTUP_DIST_SERVER=https://mirrors.ustc.edu.cn/rust-static
export RUSTUP_UPDATE_ROOT=https://mirrors.ustc.edu.cn/rust-static/rustup
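With those variables exported, the standard rustup one-liner completes the install (a sketch):
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- -y
source "$HOME/.cargo/env"
rustc --version   # expect >= 1.75.0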
These are the corresponding installation packages:
root@3fs-node-meta01:/home/3fs_sft# ls
clickhouse-client_22.6.2.12_all.deb foundationdb-clients_7.3.35-1_amd64.deb fuse-3.16.1.tar.gz
clickhouse-common-static_22.6.2.12_amd64.deb foundationdb-server_7.3.35-1_amd64.deb
clickhouse-server_22.6.2.12_all.deb fuse-3.16.1
root@3fs-node-meta01:/home/3fs_sft#
# Install the related software
1. Install OFED
wget https://www.mellanox.com/downloads/DOCA/DOCA_v2.10.0/host/doca-host_2.10.0-093000-25.01-ubuntu2204_amd64.deb
dpkg -i doca-host_2.10.0-093000-25.01-ubuntu2204_amd64.deb
apt-get update
apt-get -y install doca-ofed
2. Verify OFED
Run ibdev2netdev; output like the following means the installation succeeded (hardware support is also required):
root@3fs-node-meta01:/home/3fs_sft# ibdev2netdev
mlx5_bond_0 port 1 ==> bond1 (Up)
# libfuse
wget https://github.com/libfuse/libfuse/releases/download/fuse-3.16.1/fuse-3.16.1.tar.gz
tar vzxf fuse-3.16.1.tar.gz
cd fuse-3.16.1/
mkdir build && cd build
apt install -y meson
meson setup ..
ninja && ninja install
3. Build
- Bare-metal build:
# Install dependencies
# for Ubuntu 20.04.
apt install cmake libuv1-dev liblz4-dev liblzma-dev libdouble-conversion-dev libdwarf-dev libunwind-dev \
libaio-dev libgflags-dev libgoogle-glog-dev libgtest-dev libgmock-dev clang-format-14 clang-14 clang-tidy-14 lld-14 \
libgoogle-perftools-dev google-perftools libssl-dev libclang-rt-14-dev gcc-10 g++-10 libboost1.71-all-dev
# for Ubuntu 22.04.
apt install cmake libuv1-dev liblz4-dev liblzma-dev libdouble-conversion-dev libdwarf-dev libunwind-dev \
libaio-dev libgflags-dev libgoogle-glog-dev libgtest-dev libgmock-dev clang-format-14 clang-14 clang-tidy-14 lld-14 \
libgoogle-perftools-dev google-perftools libssl-dev gcc-12 g++-12 libboost-all-dev
# for openEuler 2403sp1
yum install cmake libuv-devel lz4-devel xz-devel double-conversion-devel libdwarf-devel libunwind-devel \
libaio-devel gflags-devel glog-devel gtest-devel gmock-devel clang-tools-extra clang lld \
gperftools-devel gperftools openssl-devel gcc gcc-c++ boost-devel
Build 3FS in build folder:
cmake -S . -B build -DCMAKE_CXX_COMPILER=clang++-14 -DCMAKE_C_COMPILER=clang-14 -DCMAKE_BUILD_TYPE=RelWithDebInfo -DCMAKE_EXPORT_COMPILE_COMMANDS=ON
cmake --build build -j 32
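After the build finishes, the binaries referenced later in these notes should exist under build/bin (a sketch; adjust the source path, since /root/3fs, /home/3fs and ~/3fs are all used below):
ls ~/3fs/build/bin | grep -E 'mgmtd_main|meta_main|storage_main|hf3fs_fuse_main|admin_cli|monitor_collector_main'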
- Containerized build:
root@3fs-meta:/home/3fs# cat Dockerfile
# Author: mmwei3
# Email: mmwei3@iflytek.com
# date: 20250313
# -------------------------------------------
# Stage 1: Build 3FS from source
# -------------------------------------------
FROM ubuntu:22.04 AS builder
# Non-interactive install (avoid tzdata/timezone prompts)
ENV DEBIAN_FRONTEND=noninteractive
RUN apt-get update && apt-get install -y --no-install-recommends \
git cmake make clang-14 clang++-14 libfuse3-dev libssl-dev pkg-config \
curl ca-certificates wget unzip \
# FoundationDB client library headers (optional; they can also be downloaded manually)
&& apt-get clean && rm -rf /var/lib/apt/lists/*
# Install the Rust toolchain (3FS needs Rust >= 1.75)
ENV RUSTUP_DIST_SERVER=https://mirrors.ustc.edu.cn/rust-static
ENV RUSTUP_UPDATE_ROOT=https://mirrors.ustc.edu.cn/rust-static/rustup
RUN curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- -y
ENV PATH="/root/.cargo/bin:${PATH}"
WORKDIR /build
# Clone the 3FS source
#RUN git clone https://github.com/deepseek-ai/3FS.git /build
# -------------------------------------------
# Stage 2: Create minimal runtime image
# -------------------------------------------
FROM ubuntu:22.04
ENV DEBIAN_FRONTEND=noninteractive
RUN apt-get update && apt-get install -y --no-install-recommends \
libfuse3-3 libssl3 ca-certificates \
cmake libuv1-dev liblz4-dev liblzma-dev libdouble-conversion-dev libdwarf-dev libunwind-dev \
libaio-dev libgflags-dev libgoogle-glog-dev libgtest-dev libgmock-dev clang-format-14 clang-14 clang-tidy-14 lld-14 \
libgoogle-perftools-dev google-perftools libssl-dev gcc-12 g++-12 libboost-all-dev \
cmake make clang-14 clang++-14 libfuse3-dev libssl-dev pkg-config \
curl ca-certificates wget unzip \
# FoundationDB client library (fdb.cluster must be mounted in by the user)
&& apt-get clean && rm -rf /var/lib/apt/lists/*
# Create the directory layout
RUN mkdir -p /opt/3fs/bin /opt/3fs/etc /var/log/3fs
# Copy the 3FS binaries from the builder stage
#COPY --from=builder /build/build/bin/* /opt/3fs/bin/
COPY bin/* /opt/3fs/bin/
# entrypoint script that starts a different role depending on the argument
COPY entrypoint.sh /entrypoint.sh
RUN chmod +x /entrypoint.sh
WORKDIR /opt/3fs
ENTRYPOINT ["/entrypoint.sh"]
CMD ["help"]
root@3fs-meta:/home/3fs# docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
mmwei3.images.build/deepseek-3fs 20250314-5 80be8c8bda15 2 days ago 4.12GB
docker run -d -it --net=host --name 3fs-mgmtd --privileged -v /mycluster/config:/opt/3fs/etc -v /mycluster/logs:/var/log/3fs --device=/dev/infiniband:/dev/infiniband -v /usr/lib:/usr/lib --cap-add=NET_RAW --cap-add=IPC_LOCK --cap-add=CAP_NET_ADMIN mmwei3.images.build/deepseek-3fs:20250314-6 mgmtd
# The corresponding /entrypoint.sh
root@3fs-meta:/home/3fs# cat entrypoint.sh
#!/usr/bin/env bash
# Author: mmwei3
# Email: mmwei3@iflytek.com
# Date: 2025-03-13
set -e
ROLE="$1"
CFG_DIR="/opt/3fs/etc"
BIN_DIR="/opt/3fs/bin"
LOG_DIR="/var/log/3fs"
# Make sure the log directory exists
mkdir -p "$LOG_DIR"
# Show help if no role argument was given
if [[ -z "$ROLE" || "$ROLE" == "help" ]]; then
echo "Usage: docker run <options> mmwei3.images.build/deepseek-3fs:20250313 <role>"
echo "Available roles:"
echo " mgmtd - Start management daemon"
echo " meta - Start metadata service"
echo " storage - Start storage service"
echo " client - Start FUSE client"
echo " admin-cli - Start interactive admin CLI"
echo "Example:"
echo " docker run --rm deepseek-3fs:latest mgmtd"
exit 1
fi
# Dispatch by role
case "$ROLE" in
mgmtd)
exec "$BIN_DIR/mgmtd_main" -cfg "$CFG_DIR/mgmtd_main.toml"
;;
meta)
exec "$BIN_DIR/meta_main" -cfg "$CFG_DIR/meta_main.toml"
;;
storage)
exec "$BIN_DIR/storage_main" -cfg "$CFG_DIR/storage_main.toml"
;;
client)
exec "$BIN_DIR/hf3fs_fuse_main" -cfg "$CFG_DIR/hf3fs_fuse_main.toml"
;;
admin-cli)
exec "$BIN_DIR/admin_cli" -cfg "$CFG_DIR/admin_cli.toml"
;;
*)
echo "Error: Unknown role '$ROLE'"
echo "Run 'docker run ... deepseek-3fs:latest help' for usage instructions."
exit 1
;;
esac
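For reference, the other roles can be launched the same way; a sketch for a storage node (the host paths are assumptions based on the mgmtd example above, plus the NVMe data mounts from Step 5):
docker run -d -it --net=host --privileged --name 3fs-storage \
  -v /mycluster/config:/opt/3fs/etc -v /mycluster/logs:/var/log/3fs \
  -v /storage:/storage -v /usr/lib:/usr/lib \
  --device=/dev/infiniband:/dev/infiniband \
  --cap-add=NET_RAW --cap-add=IPC_LOCK --cap-add=CAP_NET_ADMIN \
  mmwei3.images.build/deepseek-3fs:20250314-6 storage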
4. Deployment
On the meta node:
Since my NICs are bonded, I first create a dummy interface with an address:
ip link add eno3 type dummy
ip addr add 11.12.63.54/24 dev eno3
ip link set eno3 up
4.1 Deploy monitor_collector_main
Install the monitor_collector service on the meta node.
- Copy monitor_collector_main to /opt/3fs/bin and config files to /opt/3fs/etc, and create the log directory /var/log/3fs:
  mkdir -p /opt/3fs/{bin,etc}
  mkdir -p /var/log/3fs
  cp /root/3fs/build/bin/monitor_collector_main /opt/3fs/bin
  cp /root/3fs/configs/monitor_collector_main.toml /opt/3fs/etc
- Update monitor_collector_main.toml to add a ClickHouse connection:
  [server.monitor_collector.reporter]
  type = 'clickhouse'
  [server.monitor_collector.reporter.clickhouse]
  db = '3fs'
  host = '172.20.99.94'
  passwd = '7cmwyBmw'
  port = '6000'
  user = 'default'
- Start the monitor service:
  cp /root/3fs/deploy/systemd/monitor_collector_main.service /usr/lib/systemd/system
  systemctl start monitor_collector_main
Note that
- Multiple instances of monitor services can be deployed behind a virtual IP address to share the traffic.
- Other services communicate with the monitor service over a TCP connection.
root@3fs-node-meta01:/opt/3fs/etc# clickhouse-client -n < /home/3fs/deploy/sql/3fs-monitor.sql
root@3fs-node-meta01:/opt/3fs/etc# cp /home/3fs/build/bin/monitor_collector_main /opt/3fs/bin
root@3fs-node-meta01:/opt/3fs/etc# cp /home/3fs/configs/monitor_collector_main.toml /opt/3fs/etc
root@3fs-node-meta01:/opt/3fs/etc# cat monitor_collector_main.toml
[common]
cluster_id = 'stage'
[common.ib_devices]
allow_unknown_zone = true
default_network_zone = 'UNKNOWN'
device_filter = []
subnets = []
[[common.log.categories]]
categories = [ '.' ]
handlers = [ 'normal', 'err', 'fatal' ]
inherit = true
level = 'INFO'
propagate = 'NONE'
[[common.log.handlers]]
async = true
file_path = '/var/log/3fs/monitor_collector_main.log'
max_file_size = '100MB'
max_files = 10
name = 'normal'
rotate = true
rotate_on_open = false
start_level = 'NONE'
stream_type = 'STDERR'
writer_type = 'FILE'
[[common.log.handlers]]
async = false
file_path = '/var/log/3fs/monitor_collector_main-err.log'
max_file_size = '100MB'
max_files = 10
name = 'err'
rotate = true
rotate_on_open = false
start_level = 'ERR'
stream_type = 'STDERR'
writer_type = 'FILE'
[[common.log.handlers]]
async = false
file_path = '/var/log/3fs/monitor_collector_main-fatal.log'
max_file_size = '100MB'
max_files = 10
name = 'fatal'
rotate = true
rotate_on_open = false
start_level = 'FATAL'
stream_type = 'STDERR'
writer_type = 'STREAM'
[server.base.independent_thread_pool]
bg_thread_pool_stratetry = 'SHARED_QUEUE'
collect_stats = false
enable_work_stealing = false
io_thread_pool_stratetry = 'SHARED_QUEUE'
num_bg_threads = 2
num_connect_threads = 2
num_io_threads = 2
num_proc_threads = 2
proc_thread_pool_stratetry = 'SHARED_QUEUE'
[server.base.thread_pool]
bg_thread_pool_stratetry = 'SHARED_QUEUE'
collect_stats = false
enable_work_stealing = false
io_thread_pool_stratetry = 'SHARED_QUEUE'
num_bg_threads = 2
num_connect_threads = 2
num_io_threads = 2
num_proc_threads = 2
proc_thread_pool_stratetry = 'SHARED_QUEUE'
[[server.base.groups]]
#default_timeout = '1s'
#drop_connections_interval = '1h'
network_type = 'TCP'
services = [ 'MonitorCollector' ]
use_independent_thread_pool = false
[server.base.groups.io_worker]
num_event_loop = 1
rdma_connect_timeout = '5s'
read_write_rdma_in_event_thread = false
read_write_tcp_in_event_thread = false
tcp_connect_timeout = '1s'
wait_to_retry_send = '100ms'
[server.base.groups.io_worker.ibsocket]
buf_ack_batch = 8
buf_signal_batch = 8
buf_size = 16384
drop_connections = 0
event_ack_batch = 128
#gid_index = 0
max_rd_atomic = 16
max_rdma_wr = 128
max_rdma_wr_per_post = 32
max_sge = 16
min_rnr_timer = 1
pkey_index = 0
record_bytes_per_peer = false
record_latency_per_peer = false
retry_cnt = 7
rnr_retry = 0
send_buf_cnt = 32
sl = 0
start_psn = 0
timeout = 14
traffic_class = 0
[server.base.groups.io_worker.transport_pool]
max_connections = 1
[server.base.groups.listener]
filter_list = ['eno4']
listen_port = 10000
listen_queue_depth = 4096
rdma_listen_ethernet = true
reuse_port = false
[server.base.groups.processor]
enable_coroutines_pool = true
max_coroutines_num = 256
max_processing_requests_num = 4096
[server.monitor_collector]
batch_commit_size = 4096
conn_threads = 32
queue_capacity = 204800
[server.monitor_collector.reporter]
type = 'clickhouse'
[server.monitor_collector.reporter.clickhouse]
db = '3fs'
host = '172.20.99.94'
passwd = 'thinkbig1'
port = '6000'
user = 'default'
root@3fs-node-meta01:/opt/3fs/etc# systemctl status monitor_collector_main.service
● monitor_collector_main.service - monitor_collector_main Server
Loaded: loaded (/lib/systemd/system/monitor_collector_main.service; disabled; vendor preset: enabled)
Active: active (running) since Sun 2025-03-16 17:00:20 CST; 2h 34min ago
Main PID: 154445 (monitor_collect)
Tasks: 59 (limit: 154032)
Memory: 291.8M
CPU: 11.241s
CGroup: /system.slice/monitor_collector_main.service
└─154445 /opt/3fs/bin/monitor_collector_main --cfg /opt/3fs/etc/monitor_collector_main.toml
Mar 16 17:00:20 3fs-node-meta01 monitor_collector_main[154445]: [2025-03-16T17:00:20.690640766+08:00 monitor_collect:154445 LogConfig.cc>
Mar 16 17:00:20 3fs-node-meta01 monitor_collector_main[154445]: [2025-03-16T17:00:20.690640766+08:00 monitor_collect:154445 LogConfig.cc>
Mar 16 17:00:20 3fs-node-meta01 monitor_collector_main[154445]: [2025-03-16T17:00:20.690640766+08:00 monitor_collect:154445 LogConfig.cc>
Mar 16 17:00:20 3fs-node-meta01 monitor_collector_main[154445]: [2025-03-16T17:00:20.690640766+08:00 monitor_collect:154445 LogConfig.cc>
Mar 16 17:00:20 3fs-node-meta01 monitor_collector_main[154445]: [2025-03-16T17:00:20.690640766+08:00 monitor_collect:154445 LogConfig.cc>
Mar 16 17:00:20 3fs-node-meta01 monitor_collector_main[154445]: [2025-03-16T17:00:20.690640766+08:00 monitor_collect:154445 LogConfig.cc>
Mar 16 17:00:20 3fs-node-meta01 monitor_collector_main[154445]: [2025-03-16T17:00:20.690640766+08:00 monitor_collect:154445 LogConfig.cc>
Mar 16 17:00:20 3fs-node-meta01 monitor_collector_main[154445]: [2025-03-16T17:00:20.690640766+08:00 monitor_collect:154445 LogConfig.cc>
Mar 16 17:00:20 3fs-node-meta01 monitor_collector_main[154445]: [2025-03-16T17:00:20.690640766+08:00 monitor_collect:154445 LogConfig.cc>
Mar 16 17:00:20 3fs-node-meta01 monitor_collector_main[154445]: [2025-03-16T17:00:20.690669262+08:00 monitor_collect:154445 OnePhaseAppl>
Step 2: Admin client
Install admin_cli on all nodes.
- Copy admin_cli to /opt/3fs/bin and config files to /opt/3fs/etc:
  mkdir -p /opt/3fs/{bin,etc}
  rsync -avz meta:~/3fs/build/bin/admin_cli /opt/3fs/bin
  rsync -avz meta:~/3fs/configs/admin_cli.toml /opt/3fs/etc
  rsync -avz meta:/etc/foundationdb/fdb.cluster /opt/3fs/etc
  # single-node variant
  cp /root/3fs/build/bin/admin_cli /opt/3fs/bin
  cp /root/3fs/configs/admin_cli.toml /opt/3fs/etc
  cp /etc/foundationdb/fdb.cluster /opt/3fs/etc
- Update admin_cli.toml to set cluster_id and clusterFile:
  cluster_id = "stage"
  [fdb]
  clusterFile = '/opt/3fs/etc/fdb.cluster'
The full help documentation for admin_cli can be displayed by running the following command:
root@3fs-node-meta01:/opt/3fs/etc# /opt/3fs/bin/admin_cli -cfg /opt/3fs/etc/admin_cli.toml help
bench Usage: bench [--rank VAR] [--timeout VAR] [--coroutines VAR] [--seconds VAR] [--remove] path
cd Usage: cd [-L] [--inode] path
checksum Usage: checksum [--list] [--batch VAR] [--md5] [--fillZero] [--output VAR] path
create Usage: create [--perm VAR] [--chain-table-id VAR] [--chain-table-ver VAR] [--chain-list VAR] [--chunk-size VAR] [--stripe-size VAR] path
create-range Usage: create-range [--concurrency VAR] prefix inclusive_start exclusive_end
create-target Usage: create-target --node-id VAR --disk-index VAR --target-id VAR --chain-id VAR [--add-chunk-size] [--chunk-size VAR...] [--use-new-chunk-engine]
create-targets Usage: create-targets --node-id VAR [--disk-index VAR...] [--allow-existing-target] [--add-chunk-size] [--use-new-chunk-engine]
current-user Usage: current-user
decode-user-token Usage: decode-user-token token
drop-user-cache Usage: drop-user-cache [--uid VAR] [--all]
Step 3: Mgmtd service
Install the mgmtd service on the meta node.
- Copy mgmtd_main to /opt/3fs/bin and config files to /opt/3fs/etc:
  cp /root/3fs/build/bin/mgmtd_main /opt/3fs/bin
  cp /root/3fs/configs/{mgmtd_main.toml,mgmtd_main_launcher.toml,mgmtd_main_app.toml} /opt/3fs/etc
- Update config files:
  - Set mgmtd node_id = 1 in mgmtd_main_app.toml.
  - Edit mgmtd_main_launcher.toml to set the cluster_id and clusterFile:
    cluster_id = "stage"
    [fdb]
    clusterFile = '/opt/3fs/etc/fdb.cluster'
  - Set the monitor address in mgmtd_main.toml:
    [common.monitor.reporters.monitor_collector]
    remote_ip = "192.168.1.1:10000" # replace with your own monitor node's TCP address; it must not be the RDMA address, otherwise this fails
- Initialize the cluster:
  /opt/3fs/bin/admin_cli -cfg /opt/3fs/etc/admin_cli.toml "init-cluster --mgmtd /opt/3fs/etc/mgmtd_main.toml 1 1048576 16"
  Init filesystem, root directory layout: chain table ChainTableId(1), chunksize 1048576, stripesize 16
  Init config for MGMTD version 1
  The parameters of admin_cli:
  - 1: the chain table ID
  - 1048576: the chunk size in bytes
  - 16: the file stripe size
  Run help init-cluster for full documentation.
- Start the mgmtd service:
  cp /root/3fs/deploy/systemd/mgmtd_main.service /usr/lib/systemd/system
  systemctl start mgmtd_main
root@3fs-node-meta01:/opt/3fs/etc# systemctl status mgmtd_main.service
● mgmtd_main.service - mgmtd_main Server
Loaded: loaded (/lib/systemd/system/mgmtd_main.service; disabled; vendor preset: enabled)
Active: active (running) since Sun 2025-03-16 17:01:48 CST; 2h 34min ago
Main PID: 154612 (mgmtd_main)
Tasks: 37 (limit: 154032)
Memory: 267.5M
CPU: 37.587s
CGroup: /system.slice/mgmtd_main.service
└─154612 /opt/3fs/bin/mgmtd_main --launcher_cfg /opt/3fs/etc/mgmtd_main_launcher.toml --app-cfg /opt/3fs/etc/mgmtd_main_app>
Mar 16 17:01:48 3fs-node-meta01 mgmtd_main[154612]: [2025-03-16T17:01:48.410318530+08:00 mgmtd_main:154612 LogConfig.cc:96 INFO] "fa>
- Run the list-nodes command to check if the cluster has been successfully initialized:
  /opt/3fs/bin/admin_cli -cfg /opt/3fs/etc/admin_cli.toml --config.mgmtd_client.mgmtd_server_addresses '["RDMA://192.168.1.1:8000"]' "list-nodes"
If multiple instances of the mgmtd service are deployed, one of them is elected as the primary; the others are secondaries. Automatic failover occurs when the primary fails.
Step 4: Meta service
Install the meta service on the meta node.
- Copy meta_main to /opt/3fs/bin and config files to /opt/3fs/etc:
  cp ~/3fs/build/bin/meta_main /opt/3fs/bin
  cp ~/3fs/configs/{meta_main_launcher.toml,meta_main.toml,meta_main_app.toml} /opt/3fs/etc
- Update config files:
  - Set meta node_id = 100 in meta_main_app.toml.
  - Set cluster_id, clusterFile and the mgmtd address in meta_main_launcher.toml:
    cluster_id = "stage"
    [mgmtd_client]
    mgmtd_server_addresses = ["RDMA://192.168.1.1:8000"]
  - Set the mgmtd and monitor addresses in meta_main.toml:
    [server.mgmtd_client]
    mgmtd_server_addresses = ["RDMA://192.168.1.1:8000"]
    [common.monitor.reporters.monitor_collector]
    remote_ip = "192.168.1.1:10000"
    [server.fdb]
    clusterFile = '/opt/3fs/etc/fdb.cluster'
- The config file of the meta service is managed by the mgmtd service. Use admin_cli to upload the config file to mgmtd:
  /opt/3fs/bin/admin_cli -cfg /opt/3fs/etc/admin_cli.toml --config.mgmtd_client.mgmtd_server_addresses '["RDMA://192.168.1.1:8000"]' "set-config --type META --file /opt/3fs/etc/meta_main.toml"
- Start the meta service:
  cp ~/3fs/deploy/systemd/meta_main.service /usr/lib/systemd/system
  systemctl start meta_main
- Run the list-nodes command to check if the meta service has joined the cluster:
  /opt/3fs/bin/admin_cli -cfg /opt/3fs/etc/admin_cli.toml --config.mgmtd_client.mgmtd_server_addresses '["RDMA://192.168.1.1:8000"]' "list-nodes"
If multiple instances of the meta service are deployed, meta requests will be evenly distributed to all instances.
Step 5: Storage service
Install the storage service on each storage node.
- Format the attached 16 SSDs as XFS and mount them at /storage/data{1..16}, then create the data directories /storage/data{1..16}/3fs and the log directory /var/log/3fs:
  mkdir -p /storage/data{1..16}
  mkdir -p /var/log/3fs
  for i in {1..16}; do mkfs.xfs -L data${i} /dev/nvme${i}n1; mount -o noatime,nodiratime -L data${i} /storage/data${i}; done
  mkdir -p /storage/data{1..16}/3fs
- Increase the max number of asynchronous aio requests (see the pre-flight sketch after this list for making it persistent):
  sysctl -w fs.aio-max-nr=67108864
- Copy storage_main to /opt/3fs/bin and config files to /opt/3fs/etc:
  rsync -avz meta:~/3fs/build/bin/storage_main /opt/3fs/bin
  rsync -avz meta:~/3fs/configs/{storage_main_launcher.toml,storage_main.toml,storage_main_app.toml} /opt/3fs/etc
- Update config files:
  - Set node_id in storage_main_app.toml. Each storage service is assigned a unique id between 10001 and 10005.
  - Set cluster_id and the mgmtd address in storage_main_launcher.toml:
    cluster_id = "stage"
    [mgmtd_client]
    mgmtd_server_addresses = ["RDMA://192.168.1.1:8000"]
  - Add target paths in storage_main.toml:
    [server.mgmtd]
    mgmtd_server_address = ["RDMA://192.168.1.1:8000"]
    [common.monitor.reporters.monitor_collector]
    remote_ip = "192.168.1.1:10000"
    [server.targets]
    target_paths = ["/storage/data1/3fs","/storage/data2/3fs","/storage/data3/3fs","/storage/data4/3fs","/storage/data5/3fs","/storage/data6/3fs","/storage/data7/3fs","/storage/data8/3fs","/storage/data9/3fs","/storage/data10/3fs","/storage/data11/3fs","/storage/data12/3fs","/storage/data13/3fs","/storage/data14/3fs","/storage/data15/3fs","/storage/data16/3fs"]
- The config file of the storage service is managed by the mgmtd service. Use admin_cli to upload the config file to mgmtd:
  /opt/3fs/bin/admin_cli -cfg /opt/3fs/etc/admin_cli.toml --config.mgmtd_client.mgmtd_server_addresses '["RDMA://192.168.1.1:8000"]' "set-config --type STORAGE --file /opt/3fs/etc/storage_main.toml"
- Start the storage service:
  rsync -avz meta:~/3fs/deploy/systemd/storage_main.service /usr/lib/systemd/system
  systemctl start storage_main
- Run the list-nodes command to check if the storage service has joined the cluster:
  /opt/3fs/bin/admin_cli -cfg /opt/3fs/etc/admin_cli.toml --config.mgmtd_client.mgmtd_server_addresses '["RDMA://192.168.1.1:8000"]' "list-nodes"
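A quick pre-flight check on a storage node before starting storage_main, plus a way to persist the aio limit (a sketch; the sysctl.d filename is an assumption):
mount -t xfs | grep -c '/storage/data'        # expect 16
ls -d /storage/data{1..16}/3fs                # data directories exist
echo 'fs.aio-max-nr = 67108864' > /etc/sysctl.d/99-3fs-aio.conf
sysctl --system && sysctl fs.aio-max-nr       # expect 67108864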
Step 6: Create admin user, storage targets and chain table
- Create an admin user:
  /opt/3fs/bin/admin_cli -cfg /opt/3fs/etc/admin_cli.toml --config.mgmtd_client.mgmtd_server_addresses '["RDMA://192.168.1.1:8000"]' "user-add --root --admin 0 root"
  The admin token is printed to the console; save it to /opt/3fs/etc/token.txt.
- Generate admin_cli commands to create storage targets on the 5 storage nodes (16 SSDs per node, 6 targets per SSD). Follow the instructions in the 3FS repository to install the required Python packages:
  pip install -r ~/3fs/deploy/data_placement/requirements.txt
  python ~/3fs/deploy/data_placement/src/model/data_placement.py \
    -ql -relax -type CR --num_nodes 5 --replication_factor 3 --min_targets_per_disk 6
  python ~/3fs/deploy/data_placement/src/setup/gen_chain_table.py \
    --chain_table_type CR --node_id_begin 10001 --node_id_end 10005 \
    --num_disks_per_node 16 --num_targets_per_disk 6 \
    --target_id_prefix 1 --chain_id_prefix 9 \
    --incidence_matrix_path output/DataPlacementModel-v_5-b_10-r_6-k_3-λ_2-lb_1-ub_1/incidence_matrix.pickle
  The following 3 files will be generated in the output directory: create_target_cmd.txt, generated_chains.csv, and generated_chain_table.csv.
. - Create storage targets:
/opt/3fs/bin/admin_cli --cfg /opt/3fs/etc/admin_cli.toml --config.mgmtd_client.mgmtd_server_addresses '["RDMA://192.168.1.1:8000"]' --config.user_info.token $(<"/opt/3fs/etc/token.txt") < output/create_target_cmd.txt
- Upload chains to mgmtd service:
/opt/3fs/bin/admin_cli --cfg /opt/3fs/etc/admin_cli.toml --config.mgmtd_client.mgmtd_server_addresses '["RDMA://192.168.1.1:8000"]' --config.user_info.token $(<"/opt/3fs/etc/token.txt") "upload-chains output/generated_chains.csv"
- Upload chain table to mgmtd service:
/opt/3fs/bin/admin_cli --cfg /opt/3fs/etc/admin_cli.toml --config.mgmtd_client.mgmtd_server_addresses '["RDMA://192.168.1.1:8000"]' --config.user_info.token $(<"/opt/3fs/etc/token.txt") "upload-chain-table --desc stage 1 output/generated_chain_table.csv"
- List chains and chain tables to check if they have been correctly uploaded:
/opt/3fs/bin/admin_cli -cfg /opt/3fs/etc/admin_cli.toml --config.mgmtd_client.mgmtd_server_addresses '["RDMA://192.168.1.1:8000"]' "list-chains" /opt/3fs/bin/admin_cli -cfg /opt/3fs/etc/admin_cli.toml --config.mgmtd_client.mgmtd_server_addresses '["RDMA://192.168.1.1:8000"]' "list-chain-tables"
Step 7: FUSE client
For simplicity, the FUSE client is deployed on the meta node in this guide. However, we strongly advise against deploying clients on service nodes in a production environment.
- Copy hf3fs_fuse_main to /opt/3fs/bin and config files to /opt/3fs/etc:
  cp ~/3fs/build/bin/hf3fs_fuse_main /opt/3fs/bin
  cp ~/3fs/configs/{hf3fs_fuse_main_launcher.toml,hf3fs_fuse_main.toml,hf3fs_fuse_main_app.toml} /opt/3fs/etc
- Create the mount point:
mkdir -p /3fs/stage
- Set the cluster ID, mountpoint, token file and mgmtd address in hf3fs_fuse_main_launcher.toml:
  cluster_id = "stage"
  mountpoint = '/3fs/stage'
  token_file = '/opt/3fs/etc/token.txt'
  [mgmtd_client]
  mgmtd_server_addresses = ["RDMA://192.168.1.1:8000"]
- Set the mgmtd and monitor addresses in hf3fs_fuse_main.toml:
  [mgmtd]
  mgmtd_server_addresses = ["RDMA://192.168.1.1:8000"]
  [common.monitor.reporters.monitor_collector]
  remote_ip = "192.168.1.1:10000"
- The config file of the FUSE client is also managed by the mgmtd service. Use admin_cli to upload the config file to mgmtd:
  /opt/3fs/bin/admin_cli -cfg /opt/3fs/etc/admin_cli.toml --config.mgmtd_client.mgmtd_server_addresses '["RDMA://192.168.1.1:8000"]' "set-config --type FUSE --file /opt/3fs/etc/hf3fs_fuse_main.toml"
- Start the FUSE client:
  cp ~/3fs/deploy/systemd/hf3fs_fuse_main.service /usr/lib/systemd/system
  systemctl start hf3fs_fuse_main
- Check if 3FS has been mounted at /3fs/stage:
  mount | grep '/3fs/stage'
root@3fs-node-meta01:/opt/3fs/etc# df -Th
Filesystem Type Size Used Avail Use% Mounted on
tmpfs tmpfs 13G 2.6M 13G 1% /run
/dev/sdy3 ext4 437G 59G 356G 15% /
tmpfs tmpfs 63G 16K 63G 1% /dev/shm
tmpfs tmpfs 5.0M 0 5.0M 0% /run/lock
/dev/sdy2 ext4 974M 234M 673M 26% /boot
/dev/sdy1 vfat 1.1G 6.1M 1.1G 1% /boot/efi
tmpfs tmpfs 13G 4.0K 13G 1% /run/user/0
hf3fs.stage fuse.hf3fs 81T 611G 81T 1% /3fs/stage
root@3fs-node-meta01:/opt/3fs/etc# df -Th | grep ^C
root@3fs-node-meta01:/opt/3fs/etc# mount | grep 3fs
hf3fs.stage on /3fs/stage type fuse.hf3fs (rw,nosuid,nodev,relatime,user_id=0,group_id=0,default_permissions,allow_other,max_read=1048576)
root@3fs-node-meta01:/opt/3fs/etc#
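A simple smoke test of the new mount (a sketch; writes and removes a 1 GiB scratch file):
dd if=/dev/zero of=/3fs/stage/ddtest bs=1M count=1024
ls -lh /3fs/stage/ddtest
rm /3fs/stage/ddtest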
5. Issues encountered
1. Some library files need to be copied over:
root@3fs-node-meta01:/opt/3fs/etc# cp /home/3fs/third_party/jemalloc/lib/libjemalloc.so.2 /usr/lib/
2. When a service misbehaves, check the corresponding log files; they are quite clear. For example, list-nodes shows the cluster state:
root@3fs-node-meta01:/opt/3fs/etc# /opt/3fs/bin/admin_cli -cfg /opt/3fs/etc/admin_cli.toml --config.mgmtd_client.mgmtd_server_addresses '["RDMA://11.12.63.55:8000"]' "list-nodes"
Id Type Status Hostname Pid Tags LastHeartbeatTime ConfigVersion ReleaseVersion
1 MGMTD PRIMARY_MGMTD 3fs-node-meta01 154612 [] N/A 1(UPTODATE) 250228-dev-1-999999-c450ee0c
100 META HEARTBEAT_CONNECTED 3fs-node-meta01 159886 [] 2025-03-16 19:41:06 1(UPTODATE) 250228-dev-1-999999-c450ee0c
10001 STORAGE HEARTBEAT_CONNECTED 3fs-node-storage001 928385 [] 2025-03-16 19:41:15 3(UPTODATE) 250228-dev-1-999999-c450ee0c
10002 STORAGE HEARTBEAT_CONNECTED 3fs-node-storage002 676814 [] 2025-03-16 19:41:15 3(UPTODATE) 250228-dev-1-999999-c450ee0c
10003 STORAGE HEARTBEAT_CONNECTED 3fs-node-storage003 676104 [] 2025-03-16 19:41:16 3(UPTODATE) 250228-dev-1-999999-c450ee0c
root@3fs-node-meta01:/opt/3fs/etc#
3. Python dependencies:
root@3fs-node-meta01:/opt/3fs/etc# cd deploy/data_placement
root@3fs-node-meta01:/opt/3fs/etc# pip install -r requirements.txt
4. Service address issues: the monitor address must be a TCP address; if an RDMA address is used, mgmtd reports errors. NIC issue: bond devices are not recognized.
5. Still to investigate: scale-out/scale-in, disk replacement, and similar operational procedures.
6. For convenience when working with ClickHouse, I wrote a small tool:
root@3fs-node-meta01:/home/mmwei3# ls
click_tool fdb_tool
root@3fs-node-meta01:/home/mmwei3# cd click_tool/
root@3fs-node-meta01:/home/mmwei3/click_tool# ls
bak_cli.py build clickhouse_tool.egg-info clickhouse_tool.py README.md setup.py
root@3fs-node-meta01:/home/mmwei3/click_tool# cat clickhouse_tool.py
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
ClickHouse CLI Tool
Developer: mmwei3
Date: 2025-03-12
"""
from clickhouse_driver import Client
import logging
import argparse
import json
# Version information
TOOL_VERSION = "1.0"
DEVELOPER_INFO = "mmwei3 for 20250312"
# Configure logging
logging.basicConfig(level=logging.INFO, format='%(asctime)s - %(levelname)s - %(message)s')
class ClickHouseTool:
def __init__(self, host='172.20.99.94', port=9000, user='default', password='thinkbig1', database='default'):
"""Initialize ClickHouse connection (default connects to 172.20.99.94)"""
try:
self.client = Client(host=host, port=port, user=user, password=password, database=database)
logging.info(f"Successfully connected to ClickHouse server: {host}:{port}, database: {database}")
except Exception as e:
logging.error(f"ClickHouse connection failed: {e}")
raise
def execute_query(self, query, params=None):
"""Execute SQL query"""
try:
result = self.client.execute(query, params)
return result
except Exception as e:
logging.error(f"SQL execution error: {e}")
return None
def insert_data(self, table, data):
"""Insert data"""
if not data:
logging.warning("Data is empty, insert operation skipped")
return
keys = data[0].keys()
values = [tuple(item.values()) for item in data]
query = f"INSERT INTO {table} ({', '.join(keys)}) VALUES"
try:
self.client.execute(query, values, types_check=True)
logging.info("Data inserted successfully")
except Exception as e:
logging.error(f"Insertion failed: {e}")
def select_data(self, query, params=None):
"""Select data"""
result = self.execute_query(query, params)
if result is not None:
logging.info("Query successful")
return result
return []
def update_data(self, query, params=None):
"""Update data"""
self.execute_query(query, params)
logging.info("Update successful")
def delete_data(self, query, params=None):
"""Delete data"""
self.execute_query(query, params)
logging.info("Delete successful")
def close(self):
"""Close connection"""
self.client.disconnect()
logging.info("ClickHouse connection closed")
# CLI Tool
def cli():
parser = argparse.ArgumentParser(
description="ClickHouse CRUD Tool (mmwei3 for 20250312)\n\n"
"Usage examples:\n"
"1. Insert data:\n"
" clickhouse_tool --action insert --table my_table --data '[{\"id\": 1, \"name\": \"Alice\"}]'\n"
"2. Select data:\n"
" clickhouse_tool --action select --query 'SELECT * FROM my_table'\n"
"3. Update data:\n"
" clickhouse_tool --action update --query 'ALTER TABLE my_table UPDATE name=\"Charlie\" WHERE id=1'\n"
"4. Delete data:\n"
" clickhouse_tool --action delete --query 'ALTER TABLE my_table DELETE WHERE id=1'\n"
)
parser.add_argument('--host', type=str, default='172.20.99.94', help="ClickHouse server address (default: 172.20.99.94)")
parser.add_argument('--port', type=int, default=9000, help="ClickHouse port (default: 9000)")
parser.add_argument('--user', type=str, default='default', help="Username (default: default)")
parser.add_argument('--password', type=str, default='thinkbig1', help="Password (default: thinkbig1)")
parser.add_argument('--database', type=str, default='default', help="Database name (default: default)")
parser.add_argument('--action', type=str, choices=['insert', 'select', 'update', 'delete'], required=True,
help="Operation type: insert/select/update/delete")
parser.add_argument('--table', type=str, help="Target table name (only for insert operation)")
parser.add_argument('--query', type=str, help="SQL query (required for select/update/delete)")
parser.add_argument('--data', type=str, help="Data to insert (in JSON format)")
parser.add_argument('--version', action='version', version=f"ClickHouse Tool v{TOOL_VERSION} ({DEVELOPER_INFO})")
args = parser.parse_args()
tool = ClickHouseTool(args.host, args.port, args.user, args.password, args.database)
if args.action == 'insert':
if not args.table or not args.data:
logging.error("Insert operation requires --table and --data")
else:
try:
data = json.loads(args.data)
if isinstance(data, dict):
data = [data]
tool.insert_data(args.table, data)
except json.JSONDecodeError:
logging.error("Data format error, please use JSON format")
elif args.action == 'select':
if not args.query:
logging.error("Select operation requires --query")
else:
result = tool.select_data(args.query)
print("Query result:", result)
elif args.action == 'update':
if not args.query:
logging.error("Update operation requires --query")
else:
tool.update_data(args.query)
elif args.action == 'delete':
if not args.query:
logging.error("Delete operation requires --query")
else:
tool.delete_data(args.query)
tool.close()
if __name__ == '__main__':
cli()
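Example invocation against this cluster's ClickHouse (a sketch; host and port follow the config above, and SHOW TABLES is just a convenient first query):
python3 clickhouse_tool.py --host 172.20.99.94 --port 6000 --database 3fs \
  --action select --query 'SHOW TABLES'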
6. Notes on selected commands
These two commands run the data placement model (`data_placement.py`) and the chain table generation (`gen_chain_table.py`). The parameters are explained in detail below.
6.1 First command
python /home/3fs/deploy/data_placement/src/model/data_placement.py \
-ql -relax -type CR --num_nodes 3 --replication_factor 3 --min_targets_per_disk 6
Purpose:
Runs the data placement model (`data_placement.py`), which decides how data blocks are placed across the storage cluster.
Parameter breakdown
- -ql
  - Possibly "quick launch" or "query log"; the data_placement.py source needs to be checked to confirm.
- -relax
  - Probably "relax constraints": loosens the optimization model to allow more freedom.
- -type CR
  - Sets the placement type to CR (likely Chain Replication).
- --num_nodes 3
  - The cluster has 3 storage nodes (data is distributed across 3 machines).
- --replication_factor 3
  - Replication factor of 3: each piece of data is stored as 3 replicas.
- --min_targets_per_disk 6
  - Each disk must hold at least 6 storage targets (file chunks / data shards).
6.2 Second command
python /home/3fs/deploy/data_placement/src/setup/gen_chain_table.py \
--chain_table_type CR --node_id_begin 10001 --node_id_end 10005 \
--num_disks_per_node 16 --num_targets_per_disk 6 \
--target_id_prefix 1 --chain_id_prefix 9 \
--incidence_matrix_path output/DataPlacementModel-v_5-b_10-r_6-k_3-λ_2-lb_1-ub_1/incidence_matrix.pickle
Purpose:
Generates the chain table configuration, presumably used for data replication and routing between storage nodes.
Parameter breakdown
- --chain_table_type CR
  - Chain table type CR (probably Chain Replication).
- --node_id_begin 10001 --node_id_end 10005
  - Node IDs of the generated chain table run from 10001 to 10005 (5 storage nodes).
- --num_disks_per_node 16
  - Each storage node has 16 disks.
- --num_targets_per_disk 6
  - Each disk holds 6 storage targets (shards / chunks).
- --target_id_prefix 1
  - Target IDs are prefixed with 1 (used to keep target IDs unique).
- --chain_id_prefix 9
  - Generated chain IDs are prefixed with 9 (used to distinguish chain groups).
- --incidence_matrix_path output/DataPlacementModel-v_5-b_10-r_6-k_3-λ_2-lb_1-ub_1/incidence_matrix.pickle
  - Points at the incidence_matrix.pickle produced by the placement model; it encodes the mapping between data blocks and disks.
First command (`data_placement.py`)
- Computes how to place data blocks across 3 storage nodes such that every block has 3 replicas and every disk holds at least 6 targets.
- -ql and -relax likely tune the placement optimization.
Second command (`gen_chain_table.py`)
- Generates the chain table for 5 storage nodes (10001 to 10005), each with 16 disks and 6 targets per disk.
- Presumably used for data replication between storage nodes, query routing, and recovery.
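A quick sanity check of the sizes these flags imply (a sketch; it assumes each chain groups replication_factor targets, which matches the generated_chains.csv layout described above):
echo $(( 5 * 16 * 6 ))      # 480 storage targets in total
echo $(( 5 * 16 * 6 / 3 ))  # 160 chains of 3 targets each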