Deploying Hadoop 2.7.2 on CentOS 7
Prerequisite: passwordless SSH login is already configured
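If passwordless login is not set up yet, a minimal sketch for a single node named hadoop01 (the hostname and root user are assumptions, adjust to your environment):
# Generate a key pair (press Enter at every prompt)
ssh-keygen -t rsa
# Copy the public key to the node (repeat for every node in a multi-node cluster)
ssh-copy-id root@hadoop01
# Verify: this should print the hostname without asking for a password
ssh hadoop01 hostname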
1. Install JDK 1.8
yum -y install java-1.8.0-openjdk*
# Verify
java -version
2. Download and extract the Hadoop 2.7.2 package (it can also be downloaded in advance and uploaded directly)
mkdir /opt/server
mkdir /opt/software
cd /opt/software
wget http://mirror.bit.edu.cn/apache/hadoop/common/hadoop-2.7.2/hadoop-2.7.2.tar.gz
tar -xzvf hadoop-2.7.2.tar.gz
mv hadoop-2.7.2 hadoop
mv hadoop /opt/server
cd /opt/server/hadoop
# Create the required directories
mkdir tmp
mkdir -p hdfs/data
mkdir -p hdfs/name
3. Set the JAVA_HOME environment variable
# Edit /etc/profile
vim /etc/profile
# Configure the JAVA_HOME environment variable
export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.412.b08-1.el7_9.x86_64/jre
export PATH=$PATH:$JAVA_HOME/bin
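The exact JDK path differs between OpenJDK builds; a quick way to confirm it and apply the change (readlink prints .../jre/bin/java, so JAVA_HOME is that path without the trailing /bin/java):
# Locate the real JDK path behind the java command
readlink -f $(which java)
# Reload the profile and verify
source /etc/profile
echo $JAVA_HOME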
4. Edit the Hadoop configuration files
# Enter the configuration directory
cd /opt/server/hadoop/etc/hadoop
4.1 core-site.xml
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://hadoop01:9000</value>
</property>
<property>
<name>io.file.buffer.size</name>
<value>131072</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>file:/opt/server/hadoop/tmp</value>
<description>A base for other temporary directories.</description>
</property>
<property>
<name>hadoop.proxyuser.spark.hosts</name>
<value>*</value>
</property>
<property>
<name>hadoop.proxyuser.spark.groups</name>
<value>*</value>
</property>
</configuration>
4.2 hdfs-site.xml
<configuration>
<property>
<name>dfs.namenode.secondary.http-address</name>
<value>hadoop01:9001</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:/opt/server/hadoop/hdfs/name</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:/opt/server/hadoop/hdfs/data</value>
</property>
<property>
<name>dfs.replication</name>
<value>3</value>
</property>
<property>
<name>dfs.webhdfs.enabled</name>
<value>true</value>
</property>
</configuration>
4.3 mapred-site.xml (copy it from the template first: cp mapred-site.xml.template mapred-site.xml)
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
<property>
<name>mapreduce.jobhistory.address</name>
<value>hadoop01:10020</value>
</property>
<property>
<name>mapreduce.jobhistory.webapp.address</name>
<value>hadoop01:19888</value>
</property>
</configuration>
4.4 yarn-site.xml
<configuration>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
<property>
<name>yarn.resourcemanager.address</name>
<value>hadoop01:8032</value>
</property>
<property>
<name>yarn.resourcemanager.scheduler.address</name>
<value>hadoop01:8030</value>
</property>
<property>
<name>yarn.resourcemanager.resource-tracker.address</name>
<value>hadoop01:8035</value>
</property>
<property>
<name>yarn.resourcemanager.admin.address</name>
<value>hadoop01:8033</value>
</property>
<property>
<name>yarn.resourcemanager.webapp.address</name>
<value>hadoop01:8088</value>
</property>
</configuration>
4.5 Add the JAVA_HOME path to hadoop-env.sh and yarn-env.sh
export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.412.b08-1.el7_9.x86_64/jre
5. Configure the slaves file
cd /opt/server/hadoop/etc/hadoop
vim slaves
hadoop01
6. Start Hadoop on the master server
Enter the /opt/server/hadoop directory
(1) Format the NameNode: ./bin/hdfs namenode -format
(2) Start everything: sbin/start-all.sh; the components can also be started separately: sbin/start-dfs.sh, sbin/start-yarn.sh, sbin/mr-jobhistory-daemon.sh start historyserver
(3) To stop, run sbin/stop-all.sh
(4) Run jps to check the running processes
7. Web access: open the required ports first, or simply disable the firewall
(1) Run systemctl stop firewalld.service
(2) Open http://hadoop01:8088/ in a browser
(3) Open http://hadoop01:50070/ in a browser
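If you would rather keep firewalld running than stop it, opening only the two web UI ports should be enough; a minimal sketch, assuming firewalld and the default ports used above:
# Open the YARN ResourceManager web UI port
firewall-cmd --permanent --add-port=8088/tcp
# Open the HDFS NameNode web UI port
firewall-cmd --permanent --add-port=50070/tcp
# Apply the changes
firewall-cmd --reload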
Deploying Hive 2.3.3 on CentOS 7
1. Download
Download Hive 2.3.3 from:
https://archive.apache.org/dist/hive/hive-2.3.3/
After downloading, upload it to /opt/server
# Create the directory
mkdir -p /opt/server/
cd /opt/server/
# Extract
tar -zxvf apache-hive-2.3.3-bin.tar.gz
mv apache-hive-2.3.3-bin hive
# Configure environment variables
vim /etc/profile
Append at the end:
export HIVE_HOME=/opt/server/hive
export PATH=$PATH:$HIVE_HOME/bin
Reload the environment variables for the changes to take effect
source /etc/profile
2. Edit the Hive configuration files
2.1 Edit hive-env.sh
cd /opt/server/hive/conf
cp hive-env.sh.template hive-env.sh
vim hive-env.sh
# HADOOP_HOME=${bin}/../../hadoop
Uncomment and change to: HADOOP_HOME=/opt/server/hadoop
# export HIVE_CONF_DIR=
Uncomment and change to: export HIVE_CONF_DIR=/opt/server/hive/conf
2.2 Edit hive-log4j2.properties
Change the Hive log directory to /opt/server/hive/logs
cp hive-log4j2.properties.template hive-log4j2.properties
vim hive-log4j2.properties
Find: property.hive.log.dir = ${sys:java.io.tmpdir}/${sys:user.name}
Change it to: property.hive.log.dir = /opt/server/hive/logs
3. Configure MySQL as the metastore
By default, Hive keeps its metadata in the embedded Derby database, but production environments generally use MySQL to store the Hive metadata.
3.1 Install MySQL (version 5.5 or later is required)
(omitted)
After installing MySQL, copy the MySQL JDBC driver jar (mysql-connector-java-5.1.9.jar here) into $HIVE_HOME/lib:
cp mysql-connector-java-5.1.9.jar /opt/server/hive/lib/
3.2 Edit the configuration file
cp hive-default.xml.template hive-site.xml
vim hive-site.xml
To clear the template body: with the cursor on the line below <configuration>, type :.,$-1d (deletes from the current line through the second-to-last line), press Enter, then fill the file in as follows
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<!-- HDFS root scratch directory for Hive jobs -->
<property>
<name>hive.exec.scratchdir</name>
<value>/user/hive/tmp</value>
</property>
<!-- Write permission for the Hive scratch directory -->
<property>
<name>hive.scratch.dir.permission</name>
<value>733</value>
</property>
<!-- HDFS location of the Hive warehouse (table data) -->
<property>
<name>hive.metastore.warehouse.dir</name>
<value>/user/hive/warehouse</value>
</property>
<!-- Metastore database connection URL and database name -->
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:mysql://hadoop01:3306/hive?createDatabaseIfNotExist=true</value>
</property>
<!-- Metastore database JDBC driver -->
<property>
<name>javax.jdo.option.ConnectionDriverName</name>
<value>com.mysql.jdbc.Driver</value>
</property>
<!-- Metastore database username -->
<property>
<name>javax.jdo.option.ConnectionUserName</name>
<value>hive</value>
</property>
<!-- Metastore database password -->
<property>
<name>javax.jdo.option.ConnectionPassword</name>
<value>hive</value>
</property>
<!-- Show column headers for query results in the CLI -->
<property>
<name>hive.cli.print.header</name>
<value>true</value>
</property>
<!-- Show the current database name in the CLI -->
<property>
<name>hive.cli.print.current.db</name>
<value>true</value>
</property>
</configuration>
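The HDFS paths referenced above (hive.exec.scratchdir and hive.metastore.warehouse.dir) are not necessarily created automatically; a sketch of creating them up front, assuming HDFS is already running:
hdfs dfs -mkdir -p /user/hive/tmp
hdfs dfs -mkdir -p /user/hive/warehouse
# Match the scratch-dir permission configured above and make the warehouse group-writable
hdfs dfs -chmod 733 /user/hive/tmp
hdfs dfs -chmod g+w /user/hive/warehouse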
3.3 Create the hive user and password in MySQL
# The syntax differs between MySQL versions; this example uses MySQL 5.7.22
mysql> CREATE DATABASE hive;
mysql> USE hive;
mysql> CREATE USER 'hive'@'localhost' IDENTIFIED BY 'hive';
mysql> GRANT ALL ON hive.* TO 'hive'@'localhost' IDENTIFIED BY 'hive';
mysql> GRANT ALL ON hive.* TO 'hive'@'%' IDENTIFIED BY 'hive';
mysql> FLUSH PRIVILEGES;
mysql> quit;
4. Run Hive
4.1 Initialize the metastore database
Starting with Hive 2.1, the schematool command below must be run as an initialization step; here "mysql" is used as the db type.
schematool -dbType mysql -initSchema
The terminal output looks something like this:
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/module/hive-2.3.3/lib/log4j-slf4j-impl-2.6.2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/module/hadoop-2.7.6/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Metastore connection URL: jdbc:mysql://node21/hive?createDatabaseIfNotExist=true
Metastore Connection Driver : com.mysql.jdbc.Driver
Metastore connection User: hive
Starting metastore schema initialization to 2.3.0
Initialization script hive-schema-2.3.0.mysql.sql
Initialization script completed
schemaTool completed
4.2 Start the Hive CLI
Run on the command line:
hive
Deploying Spark 2.4.5 on CentOS 7
1. Download
Download Spark 2.4.5 from:
https://archive.apache.org/dist/spark/spark-2.4.5/
After downloading, upload it to /opt/server
# Create the directory
mkdir -p /opt/server/
cd /opt/server/
# Extract
tar -zxvf spark-2.4.5-bin-hadoop2.7.tgz
mv spark-2.4.5-bin-hadoop2.7 spark
2. Configure Spark
1) spark-env.sh
cd /opt/server/spark/conf
cp spark-env.sh.template spark-env.sh
vim spark-env.sh
HADOOP_CONF_DIR=/opt/server/hadoop/etc/hadoop
YARN_CONF_DIR=/opt/server/hadoop/etc/hadoop
JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.412.b08-1.el7_9.x86_64/jre
SPARK_MASTER_HOST=hadoop01
SPARK_MASTER_PORT=7077
SPARK_MASTER_WEBUI_PORT=8080
SPARK_WORKER_CORES=1
SPARK_WORKER_MEMORY=1g
SPARK_WORKER_PORT=7078
SPARK_WORKER_WEBUI_PORT=8081
SPARK_HISTORY_OPTS="-Dspark.history.fs.logDirectory=hdfs://hadoop01:9000/spark/eventLogs/ -Dspark.history.fs.cleaner.enabled=true"
2) slaves
cp slaves.template slaves
vi slaves
Delete localhost and add:
hadoop01
3) spark-defaults.conf
cp spark-defaults.conf.template spark-defaults.conf
vim spark-defaults.conf
spark.eventLog.enabled true
spark.eventLog.dir hdfs://hadoop01:9000/spark/eventLogs/
spark.eventLog.compress true
spark.yarn.historyServer.address hadoop01:18080
spark.history.ui.port 18080
spark.history.fs.logDirectory hdfs://hadoop01:9000/spark/eventLogs/
4) log4j.properties
cp log4j.properties.template log4j.properties
vim log4j.properties
# Only change this line
log4j.rootCategory=WARN, console
3. Configure the Spark environment variables
vi /etc/profile
export SPARK_HOME=/opt/server/spark
export PATH=$PATH:$SPARK_HOME/bin:$SPARK_HOME/sbin
source /etc/profile
4. Create the Spark log directory on HDFS and upload the Spark jars
hdfs dfs -mkdir -p /spark/eventLogs/
hdfs dfs -mkdir -p /spark/apps/jars/ # directory for the jar packages
hdfs dfs -put /opt/server/spark/jars/* /spark/apps/jars/
hdfs dfs -put /opt/server/spark/examples/jars/* /spark/apps/jars/
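To make Spark on YARN use the jars uploaded above instead of shipping them with every job, they are usually referenced from spark-defaults.conf; a hedged example (spark.yarn.jars is a standard Spark 2.x property, and the path matches the upload above):
spark.yarn.jars hdfs://hadoop01:9000/spark/apps/jars/*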
5. Start the services
1) Restart Hadoop
cd /opt/server/hadoop/sbin
./stop-all.sh
./start-all.sh
Also start the job history service:
./mr-jobhistory-daemon.sh start historyserver
2) Start Spark
cd /opt/server/spark/sbin
./start-all.sh
Also start the Spark history service:
./start-history-server.sh
# Check the services
jps
[root@hadoop01 spark]# jps
2164 SecondaryNameNode
2325 ResourceManager
8149 Master
8229 Worker
1864 NameNode
2792 JobHistoryServer
1994 DataNode
13357 Jps
2607 NodeManager
6. Test
1) Test Spark
/opt/server/spark/bin/spark-submit --master yarn --class org.apache.spark.examples.SparkPi /opt/server/spark/examples/jars/spark-examples_2.11-2.4.5.jar 10
# If no errors occur and output such as "Pi is roughly 3.1423111423111423" appears, the job ran successfully.
Fixing the missing pip for the default Python 2.7 on CentOS 7
1. Install with yum
yum install python-pip
If yum cannot find the package, install the EPEL repository first and then install pip
yum -y install epel-release
yum install python-pip
2. Upgrade the installed pip
pip install --upgrade pip
If the upgrade fails because of network problems, try a mirror source:
pip install --upgrade pip -i http://pypi.douban.com/simple
Verify the installation
pip -V
If the above still does not solve the problem, upgrade Python to a 3.x version
Upgrading the default Python 2.7 on CentOS 7 to Python 3.x
1. Run a yum update
yum update
2. Download and install Python 3
Create a new directory
sudo mkdir /usr/local/python3
Install the build dependencies
yum install zlib-devel bzip2-devel openssl-devel ncurses-devel sqlite-devel readline-devel tk-devel gcc make libffi-devel
Download the source package
cd /usr/local/python3
wget --no-check-certificate https://www.python.org/ftp/python/3.9.0/Python-3.9.0.tgz
Extract the package
# Extract the archive
tar -zxvf Python-3.9.0.tgz
Compile and install
# Enter the source directory
cd Python-3.9.0
# Configure the install location
./configure --prefix=/usr/local/python3
# Build and install
make && make install
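A quick sanity check of the build before touching any system symlinks:
/usr/local/python3/bin/python3.9 -V
/usr/local/python3/bin/pip3.9 -V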
3. Change the system's default Python interpreter
Remove the default Python symlink and create new ones
cd /usr/local/python3/Python-3.9.0
[root@centos-moxc bin]# rm -rf /usr/bin/python
[root@centos-moxc bin]# ln -s /usr/local/python3/bin/python3.9 /usr/bin/python
[root@centos-moxc bin]# ln -s /usr/local/python3/bin/pip3.9 /usr/bin/pip3
[root@centos-moxc bin]# python3 -V
Python 3.9.0
[root@centos-moxc bin]# pip3 -V
pip 20.2.3 from /usr/local/python3/lib/python3.9/site-packages/pip (python 3.9)
# Check where the symlinks point
[root@centos-moxc bin]# ll /usr/bin/ |grep python
-rwxr-xr-x 1 root root 11240 Apr 2 2020 abrt-action-analyze-python
lrwxrwxrwx 1 root root 29 Nov 14 01:04 pip3 -> /usr/local/python3/bin/pip3.9
lrwxrwxrwx 1 root root 7 Sep 3 11:48 python -> python2
lrwxrwxrwx 1 root root 9 Sep 3 11:48 python2 -> python2.7
-rwxr-xr-x 1 root root 7144 Apr 2 2020 python2.7
lrwxrwxrwx 1 root root 32 Nov 14 01:04 python3 -> /usr/local/python3/bin/python3.9
[root@centos-moxc bin]# ll /usr/bin/ |grep pip
-rwxr-xr-x. 1 root root 2291 Jul 31 2015 lesspipe.sh
lrwxrwxrwx 1 root root 29 Nov 14 01:04 pip3 -> /usr/local/python3/bin/pip3.9
4. Upgrade pip
python -m pip install --upgrade pip
5. Edit .bashrc to add the PATH environment variable
vim ~/.bashrc
Append at the end of the file:
export PATH=/usr/local/python3/bin/:$PATH
Apply the environment variable changes:
source ~/.bashrc
6. Check the pip version
pip -V
7. Fix yum errors
vim /usr/bin/yum
Change the first line of the yum script from #!/usr/bin/python to:
#!/usr/bin/python2
If yum update still reports an error, also change the first line of /usr/libexec/urlgrabber-ext-down to #!/usr/bin/python2.7
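The same shebang edits can be done non-interactively with sed; a minimal sketch that only rewrites the first line of the two files mentioned above:
sed -i '1s|python$|python2|' /usr/bin/yum
sed -i '1s|python$|python2.7|' /usr/libexec/urlgrabber-ext-down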
Startup commands:
hadoop:
sbin/start-all.sh
sbin/mr-jobhistory-daemon.sh start historyserver
spark:
sbin/start-all.sh
Also start the Spark history service:
sbin/start-history-server.sh
Deploying DataSphere Studio & Linkis with the single-machine one-click installer
1. Verify Hadoop, Hive, and Spark
hdfs dfs -ls /
hive -e "show databases"
spark-sql -e "show databases"
2. If you want PySpark to support plotting, the plotting module must also be installed on every node:
python3 -m pip install matplotlib
3. Prepare the installation package (DSS Release-1.1.1)
Download link: DSS Release-1.1.1
The directory layout of the DSS & Linkis one-click deployment package is as follows:
├── dss_linkis # main one-click deployment directory
│   ├── bin # one-click installation and one-click startup of DSS + Linkis
│   ├── conf # configuration directory for the one-click deployment
│   ├── wedatasphere-dss-x.x.x-dist.tar.gz # DSS backend package
│   ├── wedatasphere-dss-web-x.x.x-dist.zip # DSS frontend and Linkis frontend package
│   ├── wedatasphere-linkis-x.x.x-dist.tar.gz # Linkis backend package
Note: upload the downloaded package to /opt/software/ and extract it into /opt/server/ (see the sketch below)
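A sketch of that step, assuming the downloaded archive is a tar.gz named dss_linkis-1.1.1.tar.gz (adjust the file name to what you actually downloaded; use unzip if it is a zip):
cd /opt/software
tar -xzvf dss_linkis-1.1.1.tar.gz -C /opt/server/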
4. Modify the configuration
- You need to modify config.sh and db.sh in the xx/dss_linkis/conf directory.
- Open config.sh and modify the configuration parameters as needed. The parameters are described below:
#################### Basic configuration for the one-click installation ####################
### deploy user (the deployment user; defaults to the currently logged-in user)
deployUser=root
### Linkis_VERSION
LINKIS_VERSION=1.1.1
### DSS Web (usually no change is needed for a local install, but check that the port is free; if it is taken, change it to any available port)
DSS_NGINX_IP=127.0.0.1
DSS_WEB_PORT=8085
### DSS VERSION
DSS_VERSION=1.1.1
############## Other Linkis default configuration: start ##############
### Specifies the user workspace, which is used to store the user's script files and log files.
### Generally local directory
## file:// required. Directory used as the user's workspace, generally for storing the user's script files and log files
WORKSPACE_USER_ROOT_PATH=file:///tmp/linkis/
### User's root hdfs path
## hdfs:// required. Path for result sets, logs, and similar files; used to store job result-set files
HDFS_USER_ROOT_PATH=hdfs:///tmp/linkis
### Path to store job ResultSet:file or hdfs path
## hdfs:// required. Path used to store job result-set files; if not configured, the HDFS_USER_ROOT_PATH setting is used
RESULT_SET_ROOT_PATH=hdfs:///tmp/linkis
### Path to store started engines and engine logs, must be local. Working directory of the execution engines; the deploy user needs write permission to this local directory
ENGINECONN_ROOT_PATH=/appcom/tmp
### Environment information for the base components
### HADOOP CONF DIR #/appcom/config/hadoop-config (change according to your environment)
HADOOP_CONF_DIR=/opt/server/hadoop/etc/hadoop
### HIVE CONF DIR #/appcom/config/hive-config (change according to your environment)
HIVE_CONF_DIR=/opt/server/hive/conf
### SPARK CONF DIR #/appcom/config/spark-config (change according to your environment)
SPARK_CONF_DIR=/opt/server/spark/conf
### for install (change according to your environment)
LINKIS_PUBLIC_MODULE=lib/linkis-commons/public-module
## YARN REST URL, required by the Spark engine (change the IP and port as appropriate)
YARN_RESTFUL_URL=http://hadoop01:8088
## Engine version
# SPARK_VERSION (change to the actual installed version)
SPARK_VERSION=2.4.5
## HIVE_VERSION (change to the actual installed version)
HIVE_VERSION=2.3.3
## PYTHON_VERSION (change to the actual installed version)
PYTHON_VERSION=python3
## LDAP is for enterprise authorization, if you just want to have a try, ignore it.
#LDAP_URL=ldap://localhost:1389/
#LDAP_BASEDN=dc=webank,dc=com
#LDAP_USER_NAME_FORMAT=cn=%s@xxx.com,OU=xxx,DC=xxx,DC=com
############## Other Linkis default configuration: end ##############
################### The install Configuration of all Linkis's Micro-Services #####################
################### The IPs and ports below can be changed according to your environment ###################
#
# NOTICE:
# 1. If you just wanna try, the following micro-service configuration can be set without any settings.
# These services will be installed by default on this machine.
# 2. In order to get the most complete enterprise-level features, we strongly recommend that you install
# the following microservice parameters
#
### EUREKA install information
### You can access it in your browser at the address below:http://${EUREKA_INSTALL_IP}:${EUREKA_PORT}
### Microservices Service Registration Discovery Center
LINKIS_EUREKA_INSTALL_IP=127.0.0.1
LINKIS_EUREKA_PORT=9600
#LINKIS_EUREKA_PREFER_IP=true
### Gateway install information
#LINKIS_GATEWAY_INSTALL_IP=127.0.0.1
To avoid a port conflict, the gateway port is changed here from the default 9001 to 9011 (9001 is already used above as the SecondaryNameNode HTTP address in hdfs-site.xml)
LINKIS_GATEWAY_PORT=9011
### ApplicationManager
#LINKIS_MANAGER_INSTALL_IP=127.0.0.1
LINKIS_MANAGER_PORT=9101
### EngineManager
#LINKIS_ENGINECONNMANAGER_INSTALL_IP=127.0.0.1
LINKIS_ENGINECONNMANAGER_PORT=9102
### EnginePluginServer
#LINKIS_ENGINECONN_PLUGIN_SERVER_INSTALL_IP=127.0.0.1
LINKIS_ENGINECONN_PLUGIN_SERVER_PORT=9103
### LinkisEntrance
#LINKIS_ENTRANCE_INSTALL_IP=127.0.0.1
LINKIS_ENTRANCE_PORT=9104
### publicservice
#LINKIS_PUBLICSERVICE_INSTALL_IP=127.0.0.1
LINKIS_PUBLICSERVICE_PORT=9105
### cs
#LINKIS_CS_INSTALL_IP=127.0.0.1
LINKIS_CS_PORT=9108
########## Linkis microservice configuration complete ##########
################### The install Configuration of all DataSphereStudio's Micro-Services #####################
#################### Uncommented parameters must be configured; commented-out parameters can be changed as needed ####################
# NOTICE:
# 1. If you just wanna try, the following micro-service configuration can be set without any settings.
# These services will be installed by default on this machine.
# 2. In order to get the most complete enterprise-level features, we strongly recommend that you install
# the following microservice parameters
#
# Used to store the temporary ZIP package files published to Schedulis
WDS_SCHEDULER_PATH=file:///appcom/tmp/wds/scheduler
### DSS_SERVER
### This service is used to provide dss-server capability.
### project-server
#DSS_FRAMEWORK_PROJECT_SERVER_INSTALL_IP=127.0.0.1
#DSS_FRAMEWORK_PROJECT_SERVER_PORT=9002
### orchestrator-server
#DSS_FRAMEWORK_ORCHESTRATOR_SERVER_INSTALL_IP=127.0.0.1
#DSS_FRAMEWORK_ORCHESTRATOR_SERVER_PORT=9003
### apiservice-server
#DSS_APISERVICE_SERVER_INSTALL_IP=127.0.0.1
#DSS_APISERVICE_SERVER_PORT=9004
### dss-workflow-server
#DSS_WORKFLOW_SERVER_INSTALL_IP=127.0.0.1
#DSS_WORKFLOW_SERVER_PORT=9005
### dss-flow-execution-server
#DSS_FLOW_EXECUTION_SERVER_INSTALL_IP=127.0.0.1
#DSS_FLOW_EXECUTION_SERVER_PORT=9006
###dss-scriptis-server
#DSS_SCRIPTIS_SERVER_INSTALL_IP=127.0.0.1
#DSS_SCRIPTIS_SERVER_PORT=9008
########## DSS microservice configuration complete ##########
############## other default configuration ##############
## java application default jvm memory (heap size for the Java applications. If the deploy machine has less than 8 GB of memory, 128M is recommended;
## with 16 GB of memory, at least 256M is recommended; for a really smooth user experience, the deploy machine should have at least 32 GB of memory)
export SERVER_HEAP_SIZE="128M"
## sendemail configuration; only affects the email-sending feature in DSS workflows
EMAIL_HOST=smtp.163.com
EMAIL_PORT=25
EMAIL_USERNAME=xxx@163.com
EMAIL_PASSWORD=xxxxx
EMAIL_PROTOCOL=smtp
### Save the file path exported by the orchestrator service
ORCHESTRATOR_FILE_PATH=/appcom/tmp/dss
### Save DSS flow execution service log path
EXECUTION_LOG_PATH=/appcom/tmp/dss
############## other default configuration ##############
- Modify the database configuration. Make sure the configured database is reachable from the install machine, otherwise the DDL and DML imports will fail. Open db.sh and modify the parameters as needed. The parameters are described below:
### DSS database configuration
MYSQL_HOST=192.168.182.139
MYSQL_PORT=3306
MYSQL_DB=dss
MYSQL_USER=root
MYSQL_PASSWORD=root
## Database configuration for the Hive metastore, used by Linkis to access Hive metadata
HIVE_HOST=192.168.182.139
HIVE_PORT=10000
HIVE_DB=hive
HIVE_USER=hive
HIVE_PASSWORD=hive
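Before running the installer it is worth confirming that the configured database is reachable from the install machine; a sketch using the DSS values above (requires the mysql client locally):
mysql -h192.168.182.139 -P3306 -uroot -proot -e "SHOW DATABASES;"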
5. Installation and usage
1. Stop all DSS and Linkis services on the machine
- If DSS and Linkis have never been installed on this machine, skip this step
2. Change the current directory to the bin directory
cd xx/dss_linkis/bin
3. Run the installation script
sh install.sh
- The installation script checks the required commands in the environment; if any are missing, install them as prompted (a yum example follows this list). The following commands are required:
yum; java; mysql; unzip; expect; telnet; tar; sed; dos2unix; nginx
- During installation the script asks whether to initialize the database and import the metadata; both Linkis and DSS will ask, and on the first installation you must answer yes.
- Check the log output printed to the console to confirm the installation succeeded; if there are error messages, they show the specific cause of the failure.
- This command only needs to be run once, unless you want to reinstall the whole application.
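A hedged example of installing the missing dependencies with yum (java and mysql are assumed to be installed already from the earlier sections; nginx comes from the EPEL repository):
yum install -y unzip expect telnet tar sed dos2unix nginx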
4. Start the services
- If your Linkis package was built from source and you want to enable the data source management feature, change the configuration to enable it; this is not needed when using the downloaded package:
## Switch to the Linkis configuration directory
cd xx/dss_linkis/linkis/conf
## Open the configuration file linkis-env.sh
vi linkis-env.sh
## Change the following setting to true
export ENABLE_METADATA_MANAGER=true
- If your Linkis package was built from source, change the password used later to match the deploy username before starting the services; this is not needed when using the downloaded package:
## Switch to the Linkis configuration directory
cd xx/dss_linkis/linkis/conf/
## Open the configuration file linkis-mg-gateway.properties
vi linkis-mg-gateway.properties
## Change the password
wds.linkis.admin.password=hadoop
- Run the startup script in the xx/dss_linkis/bin directory:
sh start-all.sh
- If errors occur during startup, inspect the messages for the specific cause. After startup, each microservice performs a communication check, which helps locate the abnormal logs and causes if anything goes wrong.
5. Install the default AppConns
# Change to the dss directory; normally the dss directory is directly under xx/dss_linkis
cd xx/dss_linkis/dss/bin
# Run the default AppConn installation script
sh install-default-appconn.sh
- This command only needs to be run once, unless you want to reinstall the whole application
6. Verify the installation
- You can check the startup status of the Linkis & DSS backend microservices on the Eureka page. By default DSS has 7 microservices and Linkis has 10 (including the two microservices added when the data source management feature is enabled). (The Eureka address is configured in xx/dss_linkis/conf/config.sh.)
- You can open the following frontend address in Chrome:
http://DSS_NGINX_IP:DSS_WEB_PORT
The startup log prints this access address (it is also configured in xx/dss_linkis/conf/config.sh). The default administrator username and password at login are both the deploy user, hadoop (to change the password, modify the wds.linkis.admin.password parameter in xx/dss_linkis/linkis/conf/linkis-mg-gateway.properties).
7. Stop the services
sh stop-all.sh