【DataSophon】Configuring the YARN JobHistory Server and Integrating the Spark HistoryServer

2024/12/25 8:53:55

Table of Contents

1. Overview

1.1 The JobHistory Server

1.2 The Spark HistoryServer

2. Integration Configuration


1. Overview


1.1 The JobHistory Server


To review how finished applications ran, a history server needs to be configured. Once it is up, historical task logs can be reached from the YARN web UI at xxxx:8088.

1.2 The Spark HistoryServer


The HistoryServer service lets users inspect the execution details of completed applications through the Spark UI, including job, stage, and task information. Because this feature is built on Spark event logs, the event-log switch must be enabled. Historical task logs are then also reachable from the YARN web UI at xxxx:8088.
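For reference, the event-log switch mentioned above, together with the directories the HistoryServer reads from, maps to the following spark-defaults.conf fragment (the HDFS host `namenode` is a placeholder; use your cluster's NameNode address):

```properties
# Enable Spark event logs so the HistoryServer can reconstruct finished apps.
spark.eventLog.enabled          true
# Where running apps write event logs; the HistoryServer must read the same directory.
spark.eventLog.dir              hdfs://namenode:8020/spark3-history
spark.history.fs.logDirectory   hdfs://namenode:8020/spark3-history
```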


2. Integration Configuration


yarn-site.xml

<property>
    <name>yarn.log.server.url</name>
    <value>http://xxxxx:19888/jobhistory/logs</value>
</property>

mapred-site.xml

 <property>
     <name>mapreduce.jobhistory.address</name>
     <value>xxxxx:10020</value>
 </property>
 
 <property>
     <name>mapreduce.jobhistory.webapp.address</name>
     <value>xxxxx:19888</value>
 </property>
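After editing these files, it is easy to fat-finger a property name, so a quick grep check helps before restarting. The sketch below writes a throwaway copy to /tmp so it is self-contained; on a real cluster, point CONF at ${HADOOP_HOME}/etc/hadoop/mapred-site.xml (the host name `jh-host` is a placeholder):

```shell
# Sanity-check that mapred-site.xml carries both JobHistory addresses.
CONF=/tmp/mapred-site.xml
cat > "$CONF" <<'EOF'
<property>
    <name>mapreduce.jobhistory.address</name>
    <value>jh-host:10020</value>
</property>
<property>
    <name>mapreduce.jobhistory.webapp.address</name>
    <value>jh-host:19888</value>
</property>
EOF
for key in mapreduce.jobhistory.address mapreduce.jobhistory.webapp.address; do
    grep -q "$key" "$CONF" && echo "$key: present"
done
```

On the cluster itself, the JobHistory daemon is then started with `mapred --daemon start historyserver` (Hadoop 3.x) for the settings to take effect.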
 

The complete YARN service_ddl.json is shown below:

cat YARN/service_ddl.json

{
  "name": "YARN",
  "label": "YARN",
  "description": "分布式资源调度与管理平台",
  "version": "3.3.3",
  "sortNum": 2,
  "dependencies":["HDFS"],
  "packageName": "hadoop-3.3.3.tar.gz",
  "decompressPackageName": "hadoop-3.3.3",
  "roles": [
    {
      "name": "ResourceManager",
      "label": "ResourceManager",
      "roleType": "master",
      "cardinality": "1",
      "runAs": {
        "user": "yarn",
        "group": "hadoop"
      },
      "sortNum": 1,
      "logFile": "logs/hadoop-yarn-resourcemanager-${host}.log",
      "jmxPort": 9323,
      "startRunner": {
        "timeout": "60",
        "program": "control_hadoop.sh",
        "args": [
          "start",
          "resourcemanager"
        ]
      },
      "stopRunner": {
        "timeout": "600",
        "program": "control_hadoop.sh",
        "args": [
          "stop",
          "resourcemanager"
        ]
      },
      "statusRunner": {
        "timeout": "60",
        "program": "control_hadoop.sh",
        "args": [
          "status",
          "resourcemanager"
        ]
      },
      "restartRunner": {
        "timeout": "60",
        "program": "control_hadoop.sh",
        "args": [
          "restart",
          "resourcemanager"
        ]
      },
      "externalLink": {
        "name": "ResourceManager UI",
        "label": "ResourceManager UI",
        "url": "http://${host}:8088/ui2"
      }
    },
    {
      "name": "NodeManager",
      "label": "NodeManager",
      "roleType": "worker",
      "cardinality": "1",
      "runAs": {
        "user": "yarn",
        "group": "hadoop"
      },
      "sortNum": 2,
      "logFile": "logs/hadoop-yarn-nodemanager-${host}.log",
      "jmxPort": 9324,
      "startRunner": {
        "timeout": "60",
        "program": "control_hadoop.sh",
        "args": [
          "start",
          "nodemanager"
        ]
      },
      "stopRunner": {
        "timeout": "600",
        "program": "control_hadoop.sh",
        "args": [
          "stop",
          "nodemanager"
        ]
      },
      "statusRunner": {
        "timeout": "60",
        "program": "control_hadoop.sh",
        "args": [
          "status","nodemanager"
        ]
      }
    },
    {
      "name": "HistoryServer",
      "label": "HistoryServer",
      "roleType": "master",
      "cardinality": "1",
      "runAs": {
        "user": "mapred",
        "group": "hadoop"
      },
      "sortNum": 3,
      "logFile": "logs/hadoop-mapred-historyserver-${host}.log",
      "jmxPort": 9325,
      "startRunner": {
        "timeout": "60",
        "program": "control_hadoop.sh",
        "args": [
          "start",
          "historyserver"
        ]
      },
      "stopRunner": {
        "timeout": "600",
        "program": "control_hadoop.sh",
        "args": [
          "stop",
          "historyserver"
        ]
      },
      "statusRunner": {
        "timeout": "60",
        "program": "control_hadoop.sh",
        "args": [
          "status","historyserver"
        ]
      }
    },{
      "name": "YarnClient",
      "label": "YarnClient",
      "roleType": "client",
      "cardinality": "1",
      "runAs": {
        "user": "yarn",
        "group": "hadoop"
      }
    }
  ],
  "configWriter": {
    "generators": [
      {
        "filename": "yarn-site.xml",
        "configFormat": "xml",
        "outputDirectory": "etc/hadoop/",
        "includeParams": [
          "yarn.nodemanager.resource.cpu-vcores",
          "yarn.nodemanager.resource.memory-mb",
          "yarn.scheduler.minimum-allocation-mb",
          "yarn.scheduler.minimum-allocation-vcores",
          "yarn.log.server.url",
          "yarn.nodemanager.aux-services",
          "yarn.log-aggregation-enable",
          "yarn.resourcemanager.ha.enabled",
          "yarn.resourcemanager.hostname",
          "yarn.resourcemanager.address",
          "yarn.resourcemanager.webapp.address",
          "yarn.resourcemanager.scheduler.address",
          "yarn.resourcemanager.resource-tracker.address",
          "yarn.resourcemanager.store.class",
          "yarn.application.classpath",
          "yarn.nodemanager.local-dirs",
          "yarn.nodemanager.log-dirs",
          "yarn.nodemanager.address",
          "yarn.nodemanager.resource.count-logical-processors-as-cores",
          "yarn.nodemanager.resource.detect-hardware-capabilities",
          "yarn.nodemanager.resource.pcores-vcores-multiplier",
          "yarn.resourcemanager.am.max-attempts",
          "yarn.node-labels.enabled",
          "yarn.node-labels.fs-store.root-dir",
          "yarn.resourcemanager.principal",
          "yarn.resourcemanager.keytab",
          "yarn.nodemanager.principal",
          "yarn.nodemanager.keytab",
          "yarn.nodemanager.container-executor.class",
          "yarn.nodemanager.linux-container-executor.group",
          "yarn.nodemanager.linux-container-executor.path",
          "custom.yarn.site.xml"
        ]
      },
      {
        "filename": "mapred-site.xml",
        "configFormat": "xml",
        "outputDirectory": "etc/hadoop/",
        "includeParams": [
          "mapreduce.jobhistory.keytab",
          "mapreduce.jobhistory.principal",
          "mapreduce.cluster.local.dir",
          "mapreduce.jobhistory.address",
          "mapreduce.jobhistory.webapp.address",
          "custom.mapred.site.xml"
        ]
      }
    ]
  },
  "parameters": [
    {
      "name": "yarn.node-labels.fs-store.root-dir",
      "label": "Node Label存储目录",
      "description": "Node Label存储目录",
      "required": true,
      "type": "input",
      "value": "hdfs://${dfs.nameservices}/user/yarn/nodeLabels",
      "configurableInWizard": true,
      "hidden": false,
      "defaultValue": "hdfs://${dfs.nameservices}/user/yarn/nodeLabels"
    },
    {
      "name": "yarn.nodemanager.resource.cpu-vcores",
      "label": "nodemanager虚拟核数",
      "description": "nodemanager虚拟核数",
      "required": true,
      "type": "input",
      "value": "",
      "configurableInWizard": true,
      "hidden": false,
      "defaultValue": "-1"
    },
    {
      "name": "yarn.nodemanager.resource.memory-mb",
      "label": "NodeManager节点上YARN可使用的物理内存总量",
      "description": "NodeManager节点上YARN可使用的物理内存总量",
      "required": true,
      "type": "input",
      "value": "",
      "configurableInWizard": true,
      "hidden": false,
      "defaultValue": "2048"
    },
    {
      "name": "yarn.scheduler.minimum-allocation-mb",
      "label": "最小可分配容器的大小",
      "description": "最小可分配容器的大小",
      "required": true,
      "type": "input",
      "value": "",
      "configurableInWizard": true,
      "hidden": false,
      "defaultValue": "1024"
    }, 
    {
      "name": "yarn.scheduler.minimum-allocation-vcores",
      "label": "nodemanager最小虚拟核数",
      "description": "nodemanager最小虚拟核数",
      "required": true,
      "type": "input",
      "value": "",
      "configurableInWizard": true,
      "hidden": false,
      "defaultValue": "1"
    },
    {
      "name": "yarn.nodemanager.resource.count-logical-processors-as-cores",
      "label": "是否将物理核数作为虚拟核数",
      "description": "是否将物理核数作为虚拟核数",
      "required": true,
      "type": "switch",
      "value": "",
      "configurableInWizard": true,
      "hidden": false,
      "defaultValue": true
    },
    {
      "name": "yarn.nodemanager.resource.detect-hardware-capabilities",
      "label": "是否让yarn自己检测硬件进行配置",
      "description": "是否让yarn自己检测硬件进行配置",
      "required": true,
      "type": "switch",
      "value": "",
      "configurableInWizard": true,
      "hidden": false,
      "defaultValue": true
    },
    {
      "name": "yarn.nodemanager.resource.pcores-vcores-multiplier",
      "label": "虚拟核数与物理核数比例",
      "description": "",
      "required": true,
      "type": "input",
      "value": "",
      "configurableInWizard": true,
      "hidden": false,
      "defaultValue": "0.75"
    },
    {
      "name": "yarn.resourcemanager.am.max-attempts",
      "label": "AM重试次数",
      "description": "",
      "required": true,
      "type": "input",
      "value": "",
      "configurableInWizard": true,
      "hidden": false,
      "defaultValue": "4"
    },
    {
      "name": "yarn.nodemanager.aux-services",
      "label": "yarn服务机制",
      "description": "yarn服务机制",
      "required": true,
      "type": "input",
      "value": "mapreduce_shuffle",
      "configurableInWizard": true,
      "hidden": false,
      "defaultValue": "mapreduce_shuffle"
    },
    {
      "name": "yarn.log-aggregation-enable",
      "label": "是否开启yarn日志聚合",
      "description": "开启yarn日志聚合",
      "required": true,
      "type": "switch",
      "value":"",
      "configurableInWizard": true,
      "hidden": false,
      "defaultValue": true
    },
    {
      "name": "yarn.resourcemanager.ha.enabled",
      "label": "是否启用resourcemanager ha",
      "description": "是否启用resourcemanager ha",
      "configType": "ha",
      "required": true,
      "type": "switch",
      "value":"",
      "configurableInWizard": true,
      "hidden": false,
      "defaultValue": false
    },
    {
      "name": "yarn.resourcemanager.hostname",
      "label": "rm主机名",
      "description": "",
      "configType": "ha",
      "required": true,
      "type": "input",
      "value":"",
      "configurableInWizard": true,
      "hidden": false,
      "defaultValue": "${host}"
    },
    {
      "name": "yarn.resourcemanager.webapp.address",
      "label": "rm web地址",
      "description": "",
      "required": true,
      "type": "input",
      "value":"",
      "configurableInWizard": true,
      "hidden": false,
      "defaultValue": "${host}:8088"
    },
    {
      "name": "yarn.resourcemanager.address",
      "label": "ResourceManager对客户端暴露的地址rm",
      "description": "",
      "required": true,
      "type": "input",
      "value":"",
      "configurableInWizard": true,
      "hidden": true,
      "defaultValue": "${host}:8032"
    },
    {
      "name": "yarn.resourcemanager.scheduler.address",
      "label": "ResourceManager对ApplicationMaster暴露的访问地址rm",
      "description": "",
      "required": true,
      "type": "input",
      "value":"",
      "configurableInWizard": true,
      "hidden": true,
      "defaultValue": "${host}:8030"
    },
    {
      "name": "yarn.resourcemanager.resource-tracker.address",
      "label": "ResourceManager对NodeManager暴露的地址rm",
      "description": "",
      "configType": "ha",
      "required": true,
      "type": "input",
      "value":"",
      "configurableInWizard": true,
      "hidden": true,
      "defaultValue": "${host}:8031"
    },
    {
      "name": "yarn.resourcemanager.store.class",
      "label": "yarn状态信息存储类",
      "description": "",
      "configType": "ha",
      "required": true,
      "type": "input",
      "value":"",
      "configurableInWizard": true,
      "hidden": false,
      "defaultValue": "org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore"
    },
    {
      "name": "yarn.nodemanager.address",
      "label": "nodemanager地址",
      "description": "",
      "required": true,
      "type": "input",
      "value":"",
      "configurableInWizard": true,
      "hidden": false,
      "defaultValue": "0.0.0.0:45454"
    },
    {
      "name": "yarn.log.server.url",
      "label": "historyserver地址",
      "description": "",
      "required": true,
      "type": "input",
      "value":"",
      "configurableInWizard": true,
      "hidden": false,
      "defaultValue": "http://${historyserverHost}:19888/jobhistory/logs"
    },
    {
      "name": "yarn.log-aggregation.retain-seconds",
      "label": "日志保留时长",
      "description": "",
      "required": true,
      "type": "input",
      "value":"",
      "configurableInWizard": true,
      "hidden": false,
      "defaultValue": "2592000"
    },
    {
      "name": "yarn.nodemanager.remote-app-log-dir",
      "label": "日志保留位置",
      "description": "",
      "required": true,
      "type": "input",
      "value":"",
      "configurableInWizard": true,
      "hidden": false,
      "defaultValue": "hdfs://${dfs.nameservices}/user/yarn/yarn-logs/"
    },
    {
      "name": "yarn.nodemanager.local-dirs",
      "label": "NodeManager本地存储目录",
      "description": "NodeManager本地存储目录,可配置多个,按逗号分隔",
      "required": true,
      "configType": "path",
      "separator": ",",
      "type": "multiple",
      "value": [
        "/data/nm"
      ],
      "configurableInWizard": true,
      "hidden": false,
      "defaultValue": ""
    },
    {
      "name": "yarn.nodemanager.log-dirs",
      "label": "NodeManager日志存储目录",
      "description": "NodeManager日志存储目录,可配置多个,按逗号分隔",
      "required": true,
      "configType": "path",
      "separator": ",",
      "type": "multiple",
      "value": [
        "/data/nm/userlogs"
      ],
      "configurableInWizard": true,
      "hidden": false,
      "defaultValue": ""
    },
    {
      "name": "mapreduce.cluster.local.dir",
      "label": "MapReduce本地存储目录",
      "description": "MapReduce本地存储目录",
      "required": true,
      "configType": "path",
      "separator": ",",
      "type": "input",
      "value": "/data/mapred/local",
      "configurableInWizard": true,
      "hidden": false,
      "defaultValue": "/data/mapred/local"
    },
    {
      "name": "mapreduce.jobhistory.address",
      "label": "MapReduceJobHistoryServer地址",
      "description": "MapReduceJobHistoryServer地址",
      "required": true,
      "type": "input",
      "value":"",
      "configurableInWizard": true,
      "hidden": false,
      "defaultValue": "${host}:10020"
    },
    {
      "name": "mapreduce.jobhistory.webapp.address",
      "label": "MapReduceJobHistoryServerWeb-UI地址",
      "description": "MapReduceJobHistoryServerWeb-UI地址",
      "required": true,
      "type": "input",
      "value":"",
      "configurableInWizard": true,
      "hidden": false,
      "defaultValue": "${host}:19888"
    },
    {
      "name": "yarn.application.classpath",
      "label": "yarn应用程序加载的classpath",
      "description": "yarn应用程序加载的classpath",
      "required": true,
      "separator": ",",
      "type": "multiple",
      "value":["${HADOOP_HOME}/etc/hadoop","${HADOOP_HOME}/share/hadoop/common/lib/*","${HADOOP_HOME}/share/hadoop/common/*","${HADOOP_HOME}/share/hadoop/hdfs","${HADOOP_HOME}/share/hadoop/hdfs/lib/*","${HADOOP_HOME}/share/hadoop/hdfs/*","${HADOOP_HOME}/share/hadoop/mapreduce/lib/*","${HADOOP_HOME}/share/hadoop/mapreduce/*","${HADOOP_HOME}/share/hadoop/yarn","${HADOOP_HOME}/share/hadoop/yarn/lib/*","${HADOOP_HOME}/share/hadoop/yarn/*"],
      "configurableInWizard": true,
      "hidden": false,
      "defaultValue": ["${HADOOP_HOME}/etc/hadoop","${HADOOP_HOME}/share/hadoop/common/lib/*","${HADOOP_HOME}/share/hadoop/common/*","${HADOOP_HOME}/share/hadoop/hdfs","${HADOOP_HOME}/share/hadoop/hdfs/lib/*","${HADOOP_HOME}/share/hadoop/hdfs/*","${HADOOP_HOME}/share/hadoop/mapreduce/lib/*","${HADOOP_HOME}/share/hadoop/mapreduce/*","${HADOOP_HOME}/share/hadoop/yarn","${HADOOP_HOME}/share/hadoop/yarn/lib/*","${HADOOP_HOME}/share/hadoop/yarn/*"]
    },
    {
      "name": "yarn.resourcemanager.principal",
      "label": "ResourceManager服务的Kerberos主体",
      "description": "",
      "required": false,
      "configType": "kb",
      "configWithKerberos": true,
      "type": "input",
      "value": "rm/_HOST@HADOOP.COM",
      "configurableInWizard": true,
      "hidden": true,
      "defaultValue": "rm/_HOST@HADOOP.COM"
    },
    {
      "name": "yarn.resourcemanager.keytab",
      "label": "ResourceManager服务的Kerberos密钥文件路径",
      "description": "",
      "required": false,
      "configType": "kb",
      "configWithKerberos": true,
      "type": "input",
      "value": "/etc/security/keytab/rm.service.keytab",
      "configurableInWizard": true,
      "hidden": true,
      "defaultValue": "/etc/security/keytab/rm.service.keytab"
    },
    {
      "name": "yarn.nodemanager.principal",
      "label": "NodeManager服务的Kerberos主体",
      "description": "",
      "required": false,
      "configType": "kb",
      "configWithKerberos": true,
      "type": "input",
      "value": "nm/_HOST@HADOOP.COM",
      "configurableInWizard": true,
      "hidden": true,
      "defaultValue": "nm/_HOST@HADOOP.COM"
    },
    {
      "name": "yarn.nodemanager.keytab",
      "label": "NodeManager服务的Kerberos密钥文件路径",
      "description": "",
      "required": false,
      "configType": "kb",
      "configWithKerberos": true,
      "type": "input",
      "value": "/etc/security/keytab/nm.service.keytab",
      "configurableInWizard": true,
      "hidden": true,
      "defaultValue": "/etc/security/keytab/nm.service.keytab"
    },
    {
      "name": "mapreduce.jobhistory.principal",
      "label": "JobHistory服务的Kerberos主体",
      "description": "",
      "required": false,
      "configType": "kb",
      "configWithKerberos": true,
      "type": "input",
      "value": "jhs/_HOST@HADOOP.COM",
      "configurableInWizard": true,
      "hidden": true,
      "defaultValue": "jhs/_HOST@HADOOP.COM"
    },
    {
      "name": "mapreduce.jobhistory.keytab",
      "label": "JobHistory服务的Kerberos密钥文件路径",
      "description": "",
      "required": false,
      "configType": "kb",
      "configWithKerberos": true,
      "type": "input",
      "value": "/etc/security/keytab/jhs.service.keytab",
      "configurableInWizard": true,
      "hidden": true,
      "defaultValue": "/etc/security/keytab/jhs.service.keytab"
    },
    {
      "name": "yarn.nodemanager.container-executor.class",
      "label": "使用LinuxContainerExecutor管理Container",
      "description": "",
      "required": false,
      "configType": "kb",
      "configWithKerberos": true,
      "type": "input",
      "value": "org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor",
      "configurableInWizard": true,
      "hidden": true,
      "defaultValue": "org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor"
    },
    {
      "name": "yarn.nodemanager.linux-container-executor.group",
      "label": "NodeManager的启动用户的所属组",
      "description": "",
      "required": false,
      "configType": "kb",
      "configWithKerberos": true,
      "type": "input",
      "value": "hadoop",
      "configurableInWizard": true,
      "hidden": true,
      "defaultValue": "hadoop"
    },
    {
      "name": "yarn.nodemanager.linux-container-executor.path",
      "label": "LinuxContainerExecutor脚本路径",
      "description": "",
      "required": false,
      "configType": "kb",
      "configWithKerberos": true,
      "type": "input",
      "value": "${HADOOP_HOME}/bin/container-executor",
      "configurableInWizard": true,
      "hidden": true,
      "defaultValue": "${HADOOP_HOME}/bin/container-executor"
    },
    {
      "name": "yarn.node-labels.enabled",
      "label": "启用YARN标签调度",
      "description": "开启 YARN Node Labels",
      "required": true,
      "type": "switch",
      "value": "",
      "configurableInWizard": true,
      "hidden": false,
      "defaultValue": false
    },
    {
      "name": "enableKerberos",
      "label": "启用Kerberos认证",
      "description": "启用Kerberos认证",
      "required": true,
      "type": "switch",
      "value": false,
      "configurableInWizard": true,
      "hidden": false,
      "defaultValue": false
    },
    {
      "name": "custom.yarn.site.xml",
      "label": "自定义配置yarn-site.xml",
      "description": "自定义配置",
      "configType": "custom",
      "required": false,
      "type": "multipleWithKey",
      "value": [],
      "configurableInWizard": true,
      "hidden": false,
      "defaultValue": ""
    },
    {
      "name": "custom.mapred.site.xml",
      "label": "自定义配置mapred-site.xml",
      "description": "自定义配置",
      "configType": "custom",
      "required": false,
      "type": "multipleWithKey",
      "value": [],
      "configurableInWizard": true,
      "hidden": false,
      "defaultValue": ""
    }
  ]
}
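DataSophon parses this file when the service is installed, so a malformed DDL is worth catching before it is dropped into the package directory. A minimal syntax check using only Python's stdlib JSON parser (a toy DDL is written to /tmp here so the snippet is self-contained; on a real node, point DDL at the actual YARN/service_ddl.json):

```shell
# Validate service_ddl.json syntax before deploying.
DDL=/tmp/service_ddl.json
cat > "$DDL" <<'EOF'
{
  "name": "YARN",
  "roles": [{"name": "HistoryServer", "roleType": "master"}],
  "parameters": [{"name": "mapreduce.jobhistory.address"}]
}
EOF
if python3 -m json.tool "$DDL" > /dev/null 2>&1; then
    echo "service_ddl.json: valid JSON"
else
    echo "service_ddl.json: SYNTAX ERROR"
fi
```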

The complete SPARK3 service_ddl.json is shown below:

cat SPARK3/service_ddl.json

{
  "name": "SPARK3",
  "label": "Spark3",
  "description": "分布式计算系统",
  "version": "3.1.3",
  "sortNum": 7,
  "dependencies":[],
  "packageName": "spark-3.1.3.tar.gz",
  "decompressPackageName": "spark-3.1.3",
  "roles": [
    {
      "name": "SparkClient3",
      "label": "SparkClient3",
      "roleType": "client",
      "cardinality": "1",
      "logFile": "logs/hadoop-${user}-datanode-${host}.log"
    },
    {
      "name": "SparkHistoryServer",
      "label": "SparkHistoryServer",
      "roleType": "master",
      "runAs": {
        "user": "hdfs",
        "group": "hadoop"
      },
      "cardinality": 1,
      "logFile": "logs/spark-HistoryServer-${host}.out",
      "jmxPort": 9094,
      "startRunner": {
        "timeout": "60",
        "program": "control_histroy_server.sh",
        "args": [
          "start",
          "histroyserver"
        ]
      },
      "stopRunner": {
        "timeout": "600",
        "program": "control_histroy_server.sh",
        "args": [
          "stop",
          "histroyserver"
        ]
      },
      "statusRunner": {
        "timeout": "60",
        "program": "control_histroy_server.sh",
        "args": [
          "status",
          "histroyserver"
        ]
      },
      "restartRunner": {
        "timeout": "60",
        "program": "control_histroy_server.sh",
        "args": [
          "restart",
          "histroyserver"
        ]
      },
      "externalLink": {
        "name": "SparkHistoryServer Ui",
        "label": "SparkHistoryServer Ui",
        "url": "http://${host}:18081"
      }
    }
  ],
  "configWriter": {
    "generators": [
      {
        "filename": "spark-env.sh",
        "configFormat": "custom",
        "templateName": "spark-env.ftl",
        "outputDirectory": "conf",
        "includeParams": [
          "SPARK_DIST_CLASSPATH",
          "HADOOP_CONF_DIR",
          "YARN_CONF_DIR",
          "custom.spark.env.sh"
        ]
      },
      {
        "filename": "spark-defaults.conf",
        "configFormat": "properties2",
        "outputDirectory": "conf",
        "includeParams": [
          "spark.eventLog.dir",
          "spark.history.fs.logDirectory",
          "spark.eventLog.enabled",
          "spark.master",
          "spark.history.ui.port",
          "spark.yarn.historyServer.address",
          "spark.yarn.queue",
    	  "spark.history.fs.cleaner.enabled",
          "spark.history.fs.cleaner.interval",
          "spark.history.fs.cleaner.maxAge",
          "custom.spark.defaults.conf"
        ]
      }
    ]
  },
  "parameters": [
    {
      "name": "spark.eventLog.dir",
      "label": "eventLog输出的hdfs路径",
      "description": "eventLog输出的hdfs路径",
      "required": true,
      "type": "input",
      "value":"",
      "configurableInWizard": true,
      "hidden": false,
      "defaultValue": "hdfs://${host}:8020/spark3-history"
    },
    {
      "name": "spark.history.fs.logDirectory",
      "label": "Spark历史日志HDFS目录",
      "description": "Spark历史日志HDFS目录",
      "required": true,
      "type": "input",
      "value":"",
      "configurableInWizard": true,
      "hidden": false,
      "defaultValue": "hdfs://${host}:8020/spark3-history"
    },
   {
      "name": "spark.eventLog.enabled",
      "label": "开启spark事件日志",
      "description": "开启spark事件日志",
      "required": true,
      "type": "input",
      "value": "",
      "configurableInWizard": true,
      "hidden": false,
      "defaultValue": "true"
    },
    {
      "name": "spark.master",
      "label": "开启sparkmaster",
      "description": "开启sparkmaster",
      "required": true,
      "type": "input",
      "value": "",
      "configurableInWizard": true,
      "hidden": false,
      "defaultValue": "yarn"
    },
    {
      "name": "spark.history.ui.port",
      "label": "sparkhistoryweb端口",
      "description": "sparkhistoryweb端口",
      "required": true,
      "type": "input",
      "value": "18081",
      "configurableInWizard": true,
      "hidden": false,
      "defaultValue": "18081"
    },
    {
      "name": "spark.yarn.queue",
      "label": "指定提交到Yarn的资源池",
      "description": "指定提交到Yarn的资源池",
      "required": true,
      "type": "input",
      "value": "",
      "configurableInWizard": true,
      "hidden": false,
      "defaultValue": "default"
    },
    {
      "name": "spark.yarn.historyServer.address",
      "label": "application的日志访问地址",
      "description": "application的日志访问地址",
      "required": true,
      "type": "input",
      "value":"",
      "configurableInWizard": true,
      "hidden": false,
      "defaultValue": "${host}:18081"
    },
    {
      "name": "spark.history.fs.cleaner.enabled",
      "label": "sparkhistory日志是否定时清除",
      "description": "sparkhistory日志是否定时清除",
      "required": true,
      "type": "input",
      "value": "",
      "configurableInWizard": true,
      "hidden": false,
      "defaultValue": "true"
    },
    {
      "name": "spark.history.fs.cleaner.interval",
      "label": "history-server的日志检查间隔",
      "description": "history-server的日志检查间隔",
      "required": true,
      "type": "input",
      "value": "",
      "configurableInWizard": true,
      "hidden": false,
      "defaultValue": "1d"
    },
    {
      "name": "spark.history.fs.cleaner.maxAge",
      "label": "history-server日志生命周期",
      "description": "history-server日志生命周期",
      "required": true,
      "type": "input",
      "value": "",
      "configurableInWizard": true,
      "hidden": false,
      "defaultValue": "30d"
    },
    {
      "name": "SPARK_DIST_CLASSPATH",
      "label": "spark加载Classpath路径",
      "description": "",
      "required": true,
      "configType": "map",
      "type": "input",
      "value": "",
      "configurableInWizard": true,
      "hidden": false,
      "defaultValue": "$(${HADOOP_HOME}/bin/hadoop classpath)"
    },
    {
      "name": "HADOOP_CONF_DIR",
      "label": "Hadoop配置文件目录",
      "description": "",
      "configType": "map",
      "required": true,
      "type": "input",
      "value": "",
      "configurableInWizard": true,
      "hidden": false,
      "defaultValue": "${HADOOP_HOME}/etc/hadoop"
    },{
      "name": "YARN_CONF_DIR",
      "label": "Yarn配置文件目录",
      "description": "",
      "configType": "map",
      "required": true,
      "type": "input",
      "value": "",
      "configurableInWizard": true,
      "hidden": false,
      "defaultValue": "${HADOOP_HOME}/etc/hadoop"
    },
    {
      "name": "custom.spark.env.sh",
      "label": "自定义配置spark-env.sh",
      "description": "自定义配置spark-env.sh",
      "configType": "custom",
      "required": false,
      "type": "multipleWithKey",
      "value": [{"SPARK_CLASSPATH":"${INSTALL_PATH}/spark-3.1.3/carbonlib/*"}],
      "configurableInWizard": true,
      "hidden": false,
      "defaultValue": ""
    },
    {
      "name": "custom.spark.defaults.conf",
      "label": "自定义配置spark-defaults.conf",
      "description": "自定义配置",
      "configType": "custom",
      "required": false,
      "type": "multipleWithKey",
      "value": [],
      "configurableInWizard": true,
      "hidden": false,
      "defaultValue": ""
    }
  ]
}

The Spark HistoryServer start/stop script:

cat spark-3.1.3/control_histroy_server.sh

#!/bin/bash
#
#  Licensed to the Apache Software Foundation (ASF) under one or more
#  contributor license agreements.  See the NOTICE file distributed with
#  this work for additional information regarding copyright ownership.
#  The ASF licenses this file to You under the Apache License, Version 2.0
#  (the "License"); you may not use this file except in compliance with
#  the License.  You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
#  Unless required by applicable law or agreed to in writing, software
#  distributed under the License is distributed on an "AS IS" BASIS,
#  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
#  See the License for the specific language governing permissions and
#  limitations under the License.
#

usage="Usage: control_histroy_server.sh (start|stop|status|restart) <command> "

# if no args specified, show usage
if [ $# -le 1 ]; then
  echo $usage
  exit 1
fi
startStop=$1
shift
command=$1
SH_DIR=`dirname $0`
export LOG_DIR=$SH_DIR/logs
export PID_DIR=$SH_DIR/pid

export HOSTNAME=`hostname`


if [ ! -d "$LOG_DIR" ]; then
  mkdir $LOG_DIR
fi

source  /etc/profile.d/datasophon-env.sh
# Create the Spark history directory on HDFS if it does not exist yet
sudo -u hdfs /opt/datasophon/hdfs/bin/hdfs  dfs -test -d /spark3-history
if [ $? -ne 0 ] ;then
    source /etc/profile.d/datasophon-env.sh
    sudo -u hdfs /opt/datasophon/hdfs/bin/hdfs  dfs -mkdir -p  /spark3-history
    sudo -u hdfs /opt/datasophon/hdfs/bin/hdfs  dfs -chown -R hdfs:hadoop /spark3-history/
    sudo -u hdfs /opt/datasophon/hdfs/bin/hdfs  dfs -chmod -R 777  /spark3-history/

fi

start(){
	[ -w "$PID_DIR" ] ||  mkdir -p "$PID_DIR"
    ifOrNot=`ps -ef |grep HistoryServer | grep spark |grep -v "grep" |wc -l`
    if [ 1 == $ifOrNot ]
    then
      echo "$command is  running "
       exit 1
    else
      echo "$command  is not running"
    fi

  echo starting $command, logging
  exec_command="$SH_DIR/sbin/start-history-server.sh"
  echo " $exec_command"
  $exec_command

}
stop(){
    ifOrNot=`ps -ef |grep HistoryServer | grep spark |grep -v "grep" |wc -l`
    if [ 1 == $ifOrNot ];then
       ifStop=`jps | grep -E 'HistoryServer' | grep  -v 'Job*' | awk   '{print $1}'`
       if [ -n "$ifStop" ]; then
           echo "stop $command  "
           kill -9 $ifStop
       fi
    else
      echo "$command  is not running"
      exit 1
    fi

}
status(){
    ifOrNot=`ps -ef |grep HistoryServer | grep spark |grep -v "grep" |wc -l`
    if [ 1 == $ifOrNot ]
    then
      echo "$command is  running "
    else
      echo "$command  is not running"
      exit 1
    fi
}
restart(){
	stop
	sleep 10
	start
}
case $startStop in
  (start)
    start
    ;;
  (stop)
    stop
      ;;
  (status)
	  status
	;;
  (restart)
	  restart
      ;;
  (*)
    echo $usage
    exit 1
    ;;
esac


echo "End $startStop $command."
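The script decides whether the HistoryServer is already up by counting `ps` output lines that mention the process name and filtering out the grep itself. The same pattern can be exercised in isolation; the sketch below applies it to a throwaway `sleep` process instead of the real daemon:

```shell
# Demo of the script's "is it running?" check, applied to a dummy process.
sleep 30 &
DUMMY_PID=$!
count=$(ps -ef | grep "sleep 30" | grep -v grep | wc -l)
if [ "$count" -ge 1 ]; then
    echo "process is running"
else
    echo "process is not running"
fi
kill "$DUMMY_PID"
```

Note that matching on process names is fragile: an unrelated command line containing the same strings would be miscounted. A pid-file check is more reliable; Spark's stock `start-history-server.sh` already maintains one under the directory controlled by `SPARK_PID_DIR`.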

Repackage Spark, then deploy and install it through DataSophon.

Select the installation node for the SparkHistoryServer.

Select the SparkClient node.

Service configuration.

Installation complete.

The spark-defaults.conf generated after installation:

cat spark3/conf/spark-defaults.conf


spark.eventLog.dir hdfs://windp-aio:8020/spark3-history
spark.history.fs.logDirectory hdfs://windp-aio:8020/spark3-history
spark.eventLog.enabled true
spark.master yarn
spark.history.ui.port 18081
spark.yarn.historyServer.address windp-aio:18081
spark.yarn.queue default
spark.history.fs.cleaner.enabled true
spark.history.fs.cleaner.interval 1d
spark.history.fs.cleaner.maxAge 30d

The DataSophon page.

Submit a spark-examples test job:

spark-submit \
--class org.apache.spark.examples.SparkPi \
--master yarn \
--executor-memory 1G \
--num-executors 2 \
/opt/datasophon/spark3/examples/jars/spark-examples_2.12-3.1.3.jar \
10
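If the job succeeds, the driver output ends with a line such as `Pi is roughly 3.14...`, and the run then appears in both the YARN and Spark history UIs. The estimate can be pulled out of a captured driver log with awk; the sketch below runs against a hardcoded sample line so it is self-contained:

```shell
# Extract the Pi estimate from SparkPi driver output (sample line hardcoded).
sample_line='Pi is roughly 3.1415551415551415'
estimate=$(echo "$sample_line" | awk '{print $4}')
echo "estimate=$estimate"
```

For a real run, the full application log can be fetched afterwards with `yarn logs -applicationId <appId>`.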

Check the job in YARN.

Check the job in Spark.

The integration is now complete.
