Deploying a HugeGraph Cluster
I recently needed to deploy a multi-replica HugeGraph cluster at work: one master node and two worker nodes. Since the HugeGraph website does not provide a complete cluster setup example, this article documents my own setup process for my own reference and for anyone else who needs it.
Note: this article uses HugeGraph 1.2.0; other versions should be similar.
1. Environment and Requirements
Machines: three machines running CentOS 7 (VMware VMs or cloud servers both work)
JDK version: 8 or above
Packages:
- HugeGraph-Server: apache-hugegraph-incubating-1.2.0.tar.gz
- Hubble web UI: apache-hugegraph-toolchain-incubating-1.2.0.tar.gz
2. Install the JDK
The official site recommends JDK 11, but JDK 8 is still the more common choice in production, and the deployment works fine on JDK 8 as well.
For the JDK installation steps, see my previous article on deploying a single-node HugeGraph instance, which covers the process in detail; I won't repeat it here.
I already have JDK 11 installed, as shown below:
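The JDK check can be sketched as a quick shell snippet. The version line below is a stand-in sample; on a real machine you would capture the actual output of `java -version`:

```shell
# Sketch: extract the major version from a `java -version` line to confirm 8+.
# VER_LINE is a sample; on a real machine use: VER_LINE=$(java -version 2>&1 | head -n 1)
VER_LINE='openjdk version "11.0.21" 2023-10-17'
# JDK 8 reports "1.8.0_x", JDK 9+ reports "11.0.x" etc.; strip the legacy "1." prefix
MAJOR=$(echo "$VER_LINE" | sed -E 's/.*"(1\.)?([0-9]+).*/\2/')
echo "JDK major version: $MAJOR"   # → JDK major version: 11
```

The same expression handles both naming schemes, so it reports 8 for a `"1.8.0_x"` line and 11 for a `"11.0.x"` line.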
3. Install HugeGraph
- Extract the HugeGraph archive on each of the three machines:

```shell
tar -zxvf apache-hugegraph-incubating-1.2.0.tar.gz
```
- Edit the hugegraph.properties configuration file. The full file is shown below (the settings that need to be changed per machine are marked with comments):

```properties
# gremlin entrance to create graph
# auth config: org.apache.hugegraph.auth.HugeFactoryAuthProxy
gremlin.graph=org.apache.hugegraph.HugeFactory

# cache config
#schema.cache_capacity=100000
# vertex-cache default is 1000w, 10min expired
vertex.cache_type=l2
#vertex.cache_capacity=10000000
#vertex.cache_expire=600
# edge-cache default is 100w, 10min expired
edge.cache_type=l2
#edge.cache_capacity=1000000
#edge.cache_expire=600

# schema illegal name template
#schema.illegal_name_regex=\s+|~.*

#vertex.default_label=vertex

backend=rocksdb
serializer=binary

store=hugegraph

# enable raft mode
raft.mode=true
raft.safe_read=false
raft.use_snapshot=false
# this node's endpoint, used for RPC log replication and heartbeats;
# change it to the matching IP and port on each of the other machines
raft.endpoint=192.168.75.111:8091
# endpoints of all nodes in the raft group, including this one
raft.group_peers=192.168.75.111:8091,192.168.75.112:8091,192.168.75.113:8091
raft.path=./raft-log
raft.use_replicator_pipeline=true
raft.election_timeout=10000
raft.snapshot_interval=3600
raft.backend_threads=48
raft.read_index_threads=8
raft.queue_size=16384
raft.queue_publish_timeout=60
raft.apply_batch=1
raft.rpc_threads=80
raft.rpc_connect_timeout=5000
raft.rpc_timeout=60000

search.text_analyzer=jieba
search.text_analyzer_mode=INDEX

# rocksdb backend config
# rocksdb storage paths
rocksdb.data_path=./hugegraph-data
rocksdb.wal_path=./hugegraph-data

# cassandra backend config
cassandra.host=localhost
cassandra.port=9042
cassandra.username=
cassandra.password=
#cassandra.connect_timeout=5
#cassandra.read_timeout=20
#cassandra.keyspace.strategy=SimpleStrategy
#cassandra.keyspace.replication=3

# hbase backend config
#hbase.hosts=localhost
#hbase.port=2181
#hbase.znode_parent=/hbase
#hbase.threads_max=64
# IMPORTANT: recommend to modify the HBase partition number
# by the actual/env data amount & RS amount before init store
# It will influence the load speed a lot
#hbase.enable_partition=true
#hbase.vertex_partitions=10
#hbase.edge_partitions=30

# mysql backend config
#jdbc.driver=com.mysql.jdbc.Driver
#jdbc.url=jdbc:mysql://127.0.0.1:3306
#jdbc.username=root
#jdbc.password=
#jdbc.reconnect_max_times=3
#jdbc.reconnect_interval=3
#jdbc.ssl_mode=false

# postgresql & cockroachdb backend config
#jdbc.driver=org.postgresql.Driver
#jdbc.url=jdbc:postgresql://localhost:5432/
#jdbc.username=postgres
#jdbc.password=
#jdbc.postgresql.connect_database=template1

# palo backend config
#palo.host=127.0.0.1
#palo.poll_interval=10
#palo.temp_dir=./palo-data
#palo.file_limit_size=32
```
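Across the three machines, only `raft.endpoint` differs in this file; `raft.group_peers` stays identical everywhere. A minimal sketch of assembling the per-node values, using this article's IPs:

```shell
# Sketch: raft.endpoint is per-node; raft.group_peers is the same on all three.
RAFT_PORT=8091
NODE_IP="192.168.75.112"   # set to .111 / .112 / .113 on the respective machine
PEERS="192.168.75.111:${RAFT_PORT},192.168.75.112:${RAFT_PORT},192.168.75.113:${RAFT_PORT}"

echo "raft.endpoint=${NODE_IP}:${RAFT_PORT}"
echo "raft.group_peers=${PEERS}"
```

Keeping the peer list identical on every node avoids the most common raft misconfiguration: a node whose own endpoint is missing from, or inconsistent with, the shared group list.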
- Edit the rest-server.properties configuration file. The full file is shown below (the settings that need to be changed per machine are marked with comments):

```properties
# bind url
restserver.url=http://0.0.0.0:8080
# gremlin server url, need to be consistent with host and port in gremlin-server.yaml
#gremlinserver.url=http://127.0.0.1:8182

graphs=./conf/graphs

# The maximum thread ratio for batch writing, only take effect if the batch.max_write_threads is 0
batch.max_write_ratio=80
batch.max_write_threads=0

# configuration of arthas
arthas.telnet_port=8562
arthas.http_port=8561
arthas.ip=0.0.0.0
arthas.disabled_commands=jad

# authentication configs
# choose 'org.apache.hugegraph.auth.StandardAuthenticator' or
# 'org.apache.hugegraph.auth.ConfigAuthenticator'
#auth.authenticator=
# for StandardAuthenticator mode
#auth.graph_store=hugegraph
# auth client config
#auth.remote_url=127.0.0.1:8899,127.0.0.1:8898,127.0.0.1:8897
# for ConfigAuthenticator mode
#auth.admin_token=
#auth.user_tokens=[]

# rpc server configs for multi graph-servers or raft-servers
# change to this machine's IP address
rpc.server_host=192.168.75.111
rpc.server_port=8091
#rpc.server_timeout=30

# rpc client configs (like enable to keep cache consistency)
#rpc.remote_url=127.0.0.1:8091,127.0.0.1:8092,127.0.0.1:8093
#rpc.client_connect_timeout=20
#rpc.client_reconnect_period=10
#rpc.client_read_timeout=40
#rpc.client_retries=3
#rpc.client_load_balancer=consistentHash

# raft group initial peers
# RPC endpoints of all machines, including this one
raft.group_peers=192.168.75.111:8091,192.168.75.112:8091,192.168.75.113:8091

# lightweight load balancing (beta)
# server.id must be unique per machine; use server-2 and server-3 on the other two
server.id=server-1
# only one master is needed; configure the other two machines as worker
server.role=master

# slow query log
log.slow_query_threshold=1000
```
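Three settings in rest-server.properties vary per machine: `rpc.server_host`, `server.id`, and `server.role`. The mapping used in this article can be sketched as:

```shell
# Sketch: per-node values for this article's 3-node plan (.111 is the master).
NODE_IP="192.168.75.113"   # set per machine
case "$NODE_IP" in
  192.168.75.111) SERVER_ID=server-1; ROLE=master ;;
  192.168.75.112) SERVER_ID=server-2; ROLE=worker ;;
  192.168.75.113) SERVER_ID=server-3; ROLE=worker ;;
esac
printf 'rpc.server_host=%s\nserver.id=%s\nserver.role=%s\n' "$NODE_IP" "$SERVER_ID" "$ROLE"
```

Everything else in the file, including `raft.group_peers`, is identical on all three machines.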
- Initialize the data store:

```shell
# run the following from the bin directory
bash init-store.sh
```
- Start the servers: start the master first, then the two workers:

```shell
# run the following from the bin directory
bash start-hugegraph.sh
```
If you can see logs like the following, the deployment succeeded; check hugegraph-server.log under the logs directory.
4. Install the Hubble Web UI
The Hubble web UI only needs to be deployed on one machine.
- Extract the Hubble archive:

```shell
tar -zxvf apache-hugegraph-toolchain-incubating-1.2.0.tar.gz
```
- Go to the /apache-hugegraph-toolchain-incubating-1.2.0/apache-hugegraph-hubble-incubating-1.2.0/bin directory and run start-hubble:

```shell
bash start-hubble.sh
```
- Open IP:8088 in a browser. If you see the following page, the deployment succeeded.
- Try creating a graph; the screenshot below shows a successful example.
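For reference, the connection form in Hubble maps back to the servers deployed above. A sketch of typical values (the field labels may differ slightly between Hubble versions):

```
Graph Name: hugegraph        # the "store" value in hugegraph.properties
Host:       192.168.75.111   # a HugeGraph-Server node, e.g. the master
Port:       8080             # the restserver.url port in rest-server.properties
```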
That completes the HugeGraph cluster deployment.