Quick install (for hosts with internet access; I did not use this method)
pip3 install acryl-datahub
datahub docker quickstart
I. Install and start DataHub
# Install the Python client
pip3 install acryl-datahub
# Load the images
(base) [root@b28-16p4p170-lijia lijia]# ls *.tar
cp-kafka.tar datahub-actions.tar datahub-frontend-react.tar datahub-kafka-setup.tar datahub-upgrade.tar mysql.tar
cp-schema-registry.tar datahub-elasticsearch-setup.tar datahub-gms.tar datahub-mysql-setup.tar elasticsearch.tar zookeeper.tar
(base) [root@b28-16p4p170-lijia lijia]# for i in *.tar; do docker load -i "$i"; done
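A loop that parses `ls` output breaks on file names containing spaces, and a plain `for` loop keeps going after a failed load. A slightly more defensive variant, as a sketch only: the function name `load_images` and the `DOCKER` override variable are illustrative assumptions, not part of Docker or DataHub.

```shell
# load_images [DIR]: run `docker load -i` on every *.tar archive in DIR
# (default: current directory) and stop at the first failure.
# DOCKER is a hypothetical override for the docker binary.
load_images() {
  dir="${1:-.}"
  for archive in "$dir"/*.tar; do
    # If nothing matches, the glob stays literal; skip it.
    [ -e "$archive" ] || continue
    "${DOCKER:-docker}" load -i "$archive" || return 1
  done
}

# Usage (run from the directory that holds the archives):
#   load_images .
```

Returning at the first failure makes a corrupt or truncated archive visible immediately instead of surfacing later as a missing image when the containers start.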
# Start the containers
docker-compose -f docker-compose-without-neo4j.quickstart.yml up -d
Run the command several times if not all containers come up on the first try.
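Re-running the command by hand can also be scripted. The sketch below retries until the frontend on port 9002 answers. The helper names `retry` and `start_datahub`, the attempt count, and the delay are assumptions chosen for illustration, and it assumes `curl` is installed on the host.

```shell
# retry ATTEMPTS DELAY CMD...: run CMD until it succeeds, at most ATTEMPTS
# times, sleeping DELAY seconds between tries.
retry() {
  attempts=$1
  delay=$2
  shift 2
  n=1
  while [ "$n" -le "$attempts" ]; do
    "$@" && return 0
    echo "attempt $n/$attempts failed" >&2
    n=$((n + 1))
    # Sleep only if another attempt will follow.
    if [ "$n" -le "$attempts" ]; then
      sleep "$delay"
    fi
  done
  return 1
}

# Hypothetical wrapper: bring the stack up, then probe the web UI port.
start_datahub() {
  docker-compose -f docker-compose-without-neo4j.quickstart.yml up -d &&
    curl -fsS -o /dev/null http://localhost:9002
}

# Usage: up to 10 attempts, 30 seconds apart
#   retry 10 30 start_datahub
```

Probing the UI port rather than trusting the exit code of `up -d` is a design choice here: a detached `up` can return successfully while a container is still starting or has already exited.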
Check that the containers are running:
docker ps
Then open http://<ip>:9002 in a browser.
II. Configure metadata ingestion
1. Ingest Hive metadata
List the installed plugins:
datahub check plugins
# Install the Hive plugin
pip3 install 'acryl-datahub[hive]'
(base) [root@b28-16p4p170-lijia opt]# cat recipe.yml
source:
  type: hive
  config:
    database: report_pci_bj
    host_port: "172.24.3.101:10000"
sink:
  type: "datahub-rest"
  config:
    server: 'http://localhost:8080'
(base) [root@b28-16p4p170-lijia opt]# datahub ingest -c /opt/recipe.yml
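Indentation is significant in a recipe, so generating the file from a template can avoid copy-paste mistakes. A minimal sketch, assuming a hypothetical helper `write_hive_recipe` (not part of the DataHub CLI) that prints a recipe with the same keys as the one used above:

```shell
# write_hive_recipe DATABASE HOST_PORT [GMS_URL]: print a Hive recipe to stdout.
# Hypothetical helper; GMS_URL defaults to the sink address used in these notes.
write_hive_recipe() {
  database=$1
  host_port=$2
  gms_url=${3:-http://localhost:8080}
  cat <<EOF
source:
  type: hive
  config:
    database: $database
    host_port: "$host_port"
sink:
  type: "datahub-rest"
  config:
    server: "$gms_url"
EOF
}

# Usage:
#   write_hive_recipe report_pci_bj 172.24.3.101:10000 > /opt/recipe.yml
#   datahub ingest -c /opt/recipe.yml
```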
2. Ingest S3 metadata
# Install the S3 plugin
pip3 install 'acryl-datahub[s3]'
(base) [root@b28-16p4p170-lijia opt]# cat recipeceph.yml
source:
  type: s3
  config:
    path_specs:
      - include: "s3://test1/*.*"
    aws_config:
      aws_endpoint_url: "http://172.16.4.164:8899"
      aws_access_key_id: JIW64W1BBJXWMEWJOPM3
      aws_secret_access_key: ZzVsJ4igOx3QcDgnxPC5BhophS4IGI3i5PI3oZ0N
      aws_region: us-east-2
sink:
  type: "datahub-rest"
  config:
    server: 'http://localhost:8080'
(base) [root@b28-16p4p170-lijia opt]# datahub ingest -c /opt/recipeceph.yml
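The S3 recipe above stores the access key and secret in plain text. DataHub recipes support `${VAR}` environment-variable substitution, so the secrets can be kept out of the file. A sketch, where the variable names `S3_ACCESS_KEY` and `S3_SECRET_KEY` and the helper `write_s3_recipe` are assumptions for illustration:

```shell
# write_s3_recipe: print an S3 recipe whose credentials are read from the
# environment when `datahub ingest` runs. The quoted 'EOF' keeps the shell
# from expanding ${...}, so the placeholders reach the recipe file verbatim.
write_s3_recipe() {
  cat <<'EOF'
source:
  type: s3
  config:
    path_specs:
      - include: "s3://test1/*.*"
    aws_config:
      aws_endpoint_url: "http://172.16.4.164:8899"
      aws_access_key_id: "${S3_ACCESS_KEY}"
      aws_secret_access_key: "${S3_SECRET_KEY}"
      aws_region: us-east-2
sink:
  type: "datahub-rest"
  config:
    server: 'http://localhost:8080'
EOF
}

# Usage (the values are placeholders to fill in):
#   export S3_ACCESS_KEY=... S3_SECRET_KEY=...
#   write_s3_recipe > /opt/recipeceph.yml
#   datahub ingest -c /opt/recipeceph.yml
```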
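Then check the ingested metadata.
For other sources, see the integrations page for how to configure metadata ingestion: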
https://datahubproject.io/integrations