Contents
1. Knox Introduction
2. Configuring LDAP Authentication in Ambari
3. Verifying the Knox Gateway
3.1 YARN UI
3.2 HDFS UI
3.3 HDFS RESTful API
3.4 Spark History Server
3.5 HBase UI
1. Knox Introduction
The Apache Knox Gateway is an application gateway for interacting with the REST APIs and UIs of Apache Hadoop deployments. It provides a single access point for all REST and HTTP interactions with an Apache Hadoop cluster.
Knox is a proxy gateway that secures the Hadoop ecosystem by acting as the cluster's sole proxy entry point. Sitting in front of the cluster like a reverse proxy, it hides deployment details (such as port numbers and hostnames) and takes over all user HTTP traffic (such as Web UI console access and RESTful service calls), thereby protecting the cluster.
Knox provides three groups of user-facing services:
- Proxying services: the primary goal of the Apache Knox project is to provide access to Apache Hadoop by proxying HTTP resources.
- Authentication services: authentication for REST API access and WebSSO flows for the UIs. LDAP/AD, header-based pre-authentication, Kerberos, SAML, and OAuth are all available options.
- Client services: client development can be done with scripting through DSLs or by using the Knox Shell classes directly as an SDK. The KnoxShell interactive scripting environment combines the interactive Groovy shell with the Knox Shell SDK classes for interacting with data in the deployed Hadoop cluster.
At its core, the Knox Gateway is a high-performance reverse proxy built on Jetty that processes URL requests through a built-in filter chain and supports LDAP-based user authentication. Architecturally, Knox is designed for extensibility through two frameworks: Services and Providers. The Service extensibility framework provides a way to add new HTTP or RESTful endpoints to the gateway; WebHDFS, for example, joined Knox as a new Service. The Provider extensibility framework defines and implements the features a given Service offers, such as endpoint authentication or file upload in WebHDFS. Once Knox is deployed as the proxy gateway, the logical topology of the Hadoop platform changes: every HTTP request flows through the single Knox entry point instead of reaching the services directly.
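To make the proxying concrete, here is an illustrative mapping for a WebHDFS call (hostname, port, and the topology name default are placeholders consistent with the configuration used later in this article):
# Direct access exposes the NameNode host and port:
curl 'http://windp-aio:50070/webhdfs/v1/tmp?op=LISTSTATUS'
# The same call through Knox: one TLS-protected entry point, backend details hidden:
curl -k -u admin:admin-password 'https://windp-aio:8443/gateway/default/webhdfs/v1/tmp?op=LISTSTATUS'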
2. Configuring LDAP Authentication in Ambari
Knox installed through Ambari defaults to the /usr/hdp/xxxxx/knox directory. The default cluster name is default, and the corresponding topology file is:
/usr/hdp/current/knox-server/conf/topologies/default.xml
There are two ways to provide an LDAP server: installing OpenLDAP by hand, or using the Demo LDAP bundled with Knox:
- To install OpenLDAP manually, see "Installing OpenLDAP on CentOS 7".
- To use the bundled Demo LDAP server, go to Services -> Knox -> Service Actions -> Start Demo LDAP in Ambari.
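Outside Ambari, the Demo LDAP server can also be started straight from the Knox installation; a sketch assuming the HDP layout mentioned above:
/usr/hdp/current/knox-server/bin/ldap.sh start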
The tests below use the bundled Demo LDAP. We use its default user admin (dn: uid=admin,ou=people,dc=hadoop,dc=apache,dc=org), whose password is admin-password. The Demo LDAP server listens on port 33389 (see the contextFactory.url in the topology below); you can install JXplorer and connect to it to browse the directory data.
Knox's default cluster topology file (default.xml) looks like this:
<topology>
    <gateway>
        <provider>
            <role>authentication</role>
            <name>ShiroProvider</name>
            <enabled>true</enabled>
            <param>
                <name>sessionTimeout</name>
                <value>30</value>
            </param>
            <param>
                <name>main.ldapRealm</name>
                <value>org.apache.hadoop.gateway.shirorealm.KnoxLdapRealm</value>
            </param>
            <param>
                <name>main.ldapRealm.userDnTemplate</name>
                <value>uid={0},ou=people,dc=hadoop,dc=apache,dc=org</value>
            </param>
            <param>
                <name>main.ldapRealm.contextFactory.url</name>
                <value>ldap://windp-aio:33389</value>
            </param>
            <param>
                <name>main.ldapRealm.contextFactory.authenticationMechanism</name>
                <value>simple</value>
            </param>
            <param>
                <name>urls./**</name>
                <value>authcBasic</value>
            </param>
        </provider>
        <provider>
            <role>identity-assertion</role>
            <name>Default</name>
            <enabled>true</enabled>
        </provider>
        <provider>
            <role>authorization</role>
            <name>XASecurePDPKnox</name>
            <enabled>true</enabled>
        </provider>
    </gateway>
    <service>
        <role>NAMENODE</role>
        <url>hdfs://windp-aio:8020</url>
    </service>
    <service>
        <role>JOBTRACKER</role>
        <url>rpc://windp-aio:8050</url>
    </service>
    <service>
        <role>WEBHDFS</role>
        <url>http://windp-aio:50070/webhdfs</url>
    </service>
    <service>
        <role>HDFSUI</role>
        <version>2.7.0</version>
        <url>http://192.168.2.17:50070</url>
    </service>
    <service>
        <role>HBASEUI</role>
        <version>2.1.0</version>
        <url>http://192.168.2.17:16010</url>
    </service>
    <service>
        <role>YARNUI</role>
        <url>http://windp-aio:8088</url>
    </service>
    <service>
        <role>SPARKHISTORYUI</role>
        <url>http://192.168.2.17:18081</url>
    </service>
    <service>
        <role>JOBHISTORYUI</role>
        <url>http://192.168.2.17:19888</url>
    </service>
    <service>
        <role>WEBHBASE</role>
        <url>http://windp-aio:8080</url>
    </service>
    <service>
        <role>HIVE</role>
        <url>http://windp-aio:10001/cliservice</url>
    </service>
    <service>
        <role>RESOURCEMANAGER</role>
        <url>http://windp-aio:8088/ws</url>
    </service>
</topology>
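After editing a topology file, it can be sanity-checked before a restart. A sketch using the Knox CLI bundled with the installation (the cluster name default is assumed):
/usr/hdp/current/knox-server/bin/knoxcli.sh validate-topology --cluster default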
The Demo LDAP users are defined in the configuration file /etc/knox/conf/users.ldif:
version: 1

# Please replace with site specific values
dn: dc=hadoop,dc=apache,dc=org
objectclass: organization
objectclass: dcObject
o: Hadoop
dc: hadoop

# Entry for a sample people container
# Please replace with site specific values
dn: ou=people,dc=hadoop,dc=apache,dc=org
objectclass: top
objectclass: organizationalUnit
ou: people

# Entry for a sample end user
# Please replace with site specific values
dn: uid=guest,ou=people,dc=hadoop,dc=apache,dc=org
objectclass: top
objectclass: person
objectclass: organizationalPerson
objectclass: inetOrgPerson
cn: Guest
sn: User
uid: guest
userPassword: guest-password

# entry for sample user admin
dn: uid=admin,ou=people,dc=hadoop,dc=apache,dc=org
objectclass: top
objectclass: person
objectclass: organizationalPerson
objectclass: inetOrgPerson
cn: Admin
sn: Admin
uid: admin
userPassword: admin-password

# entry for sample user winner_spark
dn: uid=winner_spark,ou=people,dc=hadoop,dc=apache,dc=org
objectclass: top
objectclass: person
objectclass: organizationalPerson
objectclass: inetOrgPerson
cn: winner_spark
sn: winner_spark
uid: winner_spark
userPassword: winner@001

# entry for sample user sam
dn: uid=sam,ou=people,dc=hadoop,dc=apache,dc=org
objectclass: top
objectclass: person
objectclass: organizationalPerson
objectclass: inetOrgPerson
cn: sam
sn: sam
uid: sam
userPassword: sam-password

# entry for sample user tom
dn: uid=tom,ou=people,dc=hadoop,dc=apache,dc=org
objectclass: top
objectclass: person
objectclass: organizationalPerson
objectclass: inetOrgPerson
cn: tom
sn: tom
uid: tom
userPassword: tom-password

# create FIRST Level groups branch
dn: ou=groups,dc=hadoop,dc=apache,dc=org
objectclass: top
objectclass: organizationalUnit
ou: groups
description: generic groups branch

# create the analyst group under groups
dn: cn=analyst,ou=groups,dc=hadoop,dc=apache,dc=org
objectclass: top
objectclass: groupofnames
cn: analyst
description: analyst group
member: uid=sam,ou=people,dc=hadoop,dc=apache,dc=org
member: uid=tom,ou=people,dc=hadoop,dc=apache,dc=org

# create the scientist group under groups
dn: cn=scientist,ou=groups,dc=hadoop,dc=apache,dc=org
objectclass: top
objectclass: groupofnames
cn: scientist
description: scientist group
member: uid=sam,ou=people,dc=hadoop,dc=apache,dc=org

# create the admin group under groups
dn: cn=admin,ou=groups,dc=hadoop,dc=apache,dc=org
objectclass: top
objectclass: groupofnames
cn: admin
description: admin group
member: uid=admin,ou=people,dc=hadoop,dc=apache,dc=org

# create the winner_spark group under groups
dn: cn=winner_spark,ou=groups,dc=hadoop,dc=apache,dc=org
objectclass: top
objectclass: groupofnames
cn: winner_spark
description: winner_spark group
member: uid=winner_spark,ou=people,dc=hadoop,dc=apache,dc=org
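With the gateway and Demo LDAP running, authentication against these entries can be verified from the command line before opening any UI; a sketch using the Knox CLI (topology name default assumed):
/usr/hdp/current/knox-server/bin/knoxcli.sh user-auth-test --cluster default --u admin --p admin-password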
Log in to the Knox admin UI to check that the gateway is working:
https://192.168.2.75:8443/gateway/manager/admin-ui/
The default credentials are admin / admin-password. The login succeeds.
3. Verifying the Knox Gateway
3.1 YARN UI
We need to edit the topology to add the Web UI consoles we want proxied. In Ambari, open the Knox configuration page, select the Advanced topology item, and add the YARN UI entry shown below; after saving, restart the Knox gateway service.
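The entry to add matches the YARNUI service from the topology listed earlier:
<service>
    <role>YARNUI</role>
    <url>http://windp-aio:8088</url>
</service>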
Then open the following URL in a browser:
https://192.168.2.17:8443/gateway/default/yarn
Checking the Knox service log shows the following error:
2024-11-28 11:28:58,695 bbce4516-9686-4217-aba9-f57262dc6b4b ERROR knox.gateway (GatewayDispatchFilter.java:isDispatchAllowed(167)) - The dispatch to http://windp-aio:8088/cluster was disallowed because it fails the dispatch whitelist validation. See documentation for dispatch whitelisting
To fix this, edit the gateway.dispatch.whitelist.services property and remove YARNUI from its value, exempting the YARN UI from dispatch whitelist validation.
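The property sits in Knox's gateway site configuration (Knox > Configs in Ambari). The value below is only illustrative; the point is that YARNUI no longer appears in the list:
gateway.dispatch.whitelist.services=DATANODE,HBASEUI,HDFSUI,JOBHISTORYUI,NODEUI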
After the change, restart Knox; the YARN UI is now reachable through the gateway.
Grant the relevant service permissions to the user in Ranger, then submit a test job:
spark-submit \
--class org.apache.spark.examples.SparkPi \
--master yarn \
--executor-memory 1G \
--num-executors 1 \
/usr/hdp/3.1.4.0-315/spark2/examples/jars/spark-examples_2.11-2.3.2.3.1.4.0-315.jar \
2
View the running job in the YARN UI.
View the job's logs.
Accessing the job logs may produce errors like these:
User [knox] is not authorized to view the logs for container_e04_1733724537903_0002_01_000001 in log file [windp-aio_45454_1733725107348]
No logs available for container container_e04_1733724537903_0002_01_000001
Adding the knox user to the YARN admin ACL makes the logs accessible:
# add knox to the following property
yarn.admin.acl
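For example, if the ACL currently lists only the yarn user, the updated value would be (illustrative; keep whatever entries your cluster already has):
yarn.admin.acl=yarn,knox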
3.2 HDFS UI
Add the HDFSUI configuration in Ambari, as shown below.
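The entry matches the HDFSUI service from the topology listed earlier:
<service>
    <role>HDFSUI</role>
    <version>2.7.0</version>
    <url>http://192.168.2.17:50070</url>
</service>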
Normally, either of the following two addresses reaches the HDFS UI:
https://192.168.2.17:8443/gateway/default/hdfs/?host=http://192.168.2.17:50070
https://192.168.2.17:8443/gateway/default/hdfs
The HDFS UI page loads successfully.
3.3 HDFS RESTful API
WEBHDFS is already configured in the default topology shown above, so no additional setup is needed. Test it with curl:
curl -i -k -u admin:admin-password -X GET 'https://localhost:8443/gateway/default/webhdfs/v1/?op=LISTSTATUS'
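On success, Knox relays the WebHDFS JSON listing of the root directory; an abbreviated, illustrative response:
HTTP/1.1 200 OK
{"FileStatuses":{"FileStatus":[{"pathSuffix":"tmp","type":"DIRECTORY",...}]}}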
3.4 Spark History Server
The Spark History Server is reachable through the gateway at:
https://192.168.2.17:8443/gateway/default/sparkhistory/
From there you can view the execution details of an individual job.
3.5 HBase UI
The configuration is the HBASEUI service entry shown in the topology above. Access the UI at:
https://192.168.2.17:8443/gateway/default/hbase/webui/master?&host=192.168.2.17&port=16010