Installing Hadoop 3 on Ubuntu
Prerequisites:
- Python (3.8 recommended)
- JDK (1.8 recommended)
Extract and install
sudo tar -zxvf hadoop-3.3.0.tar.gz -C /usr/local
cd /usr/local
sudo mv hadoop-3.3.0 hadoop
sudo chown -R hadoop ./hadoop
Configure environment variables
Open ~/.bashrc, append the export lines below, then reload the file:
vim ~/.bashrc
# hadoop
export HADOOP_HOME=/usr/local/hadoop
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
export HADOOP_CLASSPATH=$HADOOP_CONF_DIR
export PATH=$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH
source ~/.bashrc
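To confirm that the variables took effect in the current shell, a quick check along these lines may help. This is only a sketch: the helper name check_dir_var is made up for illustration and is not part of Hadoop.

```shell
# Hypothetical helper: report whether a variable is set and points to an existing directory.
check_dir_var() {
    # $1 = variable name, $2 = its current value
    if [ -z "$2" ]; then
        echo "$1 is not set"
        return 1
    elif [ ! -d "$2" ]; then
        echo "$1=$2 (directory not found)"
        return 1
    fi
    echo "$1=$2 (ok)"
}

# "|| true" keeps the script going so that both variables are always reported
check_dir_var HADOOP_HOME "${HADOOP_HOME:-}" || true
check_dir_var HADOOP_CONF_DIR "${HADOOP_CONF_DIR:-}" || true
```

If either line reports a problem, re-check the export lines in ~/.bashrc and run source ~/.bashrc again.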
Verify the installation
cd /usr/local/hadoop/bin
./hadoop version
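Because PATH now includes $HADOOP_HOME/bin, the same check should also work from any directory. The sketch below pulls the version number out of the first line of the output, which has the form "Hadoop 3.3.0"; the helper name parse_hadoop_version is an assumption for illustration.

```shell
# Hypothetical helper: extract the version number from the first line of `hadoop version`.
parse_hadoop_version() {
    # Reads stdin, e.g. "Hadoop 3.3.0" on line 1, and prints "3.3.0"
    awk 'NR == 1 && $1 == "Hadoop" { print $2 }'
}

if command -v hadoop >/dev/null 2>&1; then
    hadoop version | parse_hadoop_version
else
    echo "hadoop is not on PATH; check HADOOP_HOME and PATH in ~/.bashrc"
fi
```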
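Pseudo-distributed configuration
Edit the configuration files under etc/hadoop in the Hadoop installation directory (/usr/local/hadoop/etc/hadoop).
Edit hadoop-env.sh
Run echo $JAVA_HOME to find the JDK path, then set that path in hadoop-env.sh: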
cd /usr/local/hadoop/etc/hadoop
sudo vim hadoop-env.sh
export JAVA_HOME=/usr/local/jdk
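If echo $JAVA_HOME prints nothing, one way to find a candidate path is to resolve the java binary and strip the trailing /bin/java. This is a sketch, not the only approach; the helper name java_home_from_bin is hypothetical, and on some JDK 8 packages the result ends in /jre, so the printed path should be checked before it goes into hadoop-env.sh.

```shell
# Hypothetical helper: turn the resolved path of the java binary into a JAVA_HOME candidate.
java_home_from_bin() {
    # Strip the trailing /bin/java, e.g. /usr/local/jdk/bin/java -> /usr/local/jdk
    printf '%s\n' "${1%/bin/java}"
}

if command -v java >/dev/null 2>&1; then
    # readlink -f follows symlinks such as /usr/bin/java down to the real JDK location
    java_home_from_bin "$(readlink -f "$(command -v java)")"
else
    echo "java not found on PATH"
fi
```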
Edit core-site.xml
sudo vim core-site.xml
<configuration>
<property>
<name>hadoop.tmp.dir</name>
<value>file:/usr/local/hadoop/tmp</value>
<description>A base for other temporary directories.</description>
</property>
<property>
<name>fs.defaultFS</name>
<value>hdfs://localhost:9000</value>
</property>
</configuration>
Edit hdfs-site.xml
sudo vim hdfs-site.xml
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:/usr/local/hadoop/tmp/dfs/name</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:/usr/local/hadoop/tmp/dfs/data</value>
</property>
</configuration>
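After editing the XML files it is easy to overlook a typo in a property. The sketch below reads a value back out of a site file so it can be compared with what was intended. It assumes <name> and <value> sit on separate lines, as in the snippets in this guide; get_prop is a made-up helper, not a Hadoop command. Hadoop's own hdfs getconf -confKey <name> gives the effective value once the installation is working.

```shell
# Hypothetical helper: print the <value> that follows a given <name> in a Hadoop site file.
get_prop() {
    # $1 = property name, $2 = config file
    awk -v name="$1" '
        index($0, "<name>" name "</name>") { found = 1; next }
        found && /<value>/ {
            sub(/.*<value>/, "")
            sub(/<\/value>.*/, "")
            print
            exit
        }
    ' "$2"
}

# Falls back to the install path used in this guide when HADOOP_CONF_DIR is unset
conf="${HADOOP_CONF_DIR:-/usr/local/hadoop/etc/hadoop}/hdfs-site.xml"
if [ -f "$conf" ]; then
    get_prop dfs.replication "$conf"
fi
```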
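Format the NameNode (run from the installation directory)
cd /usr/local/hadoop
./bin/hdfs namenode -format
Start the NameNode and DataNode processes
cd /usr/local/hadoop
./sbin/start-dfs.sh
# run stop-dfs.sh only when you want to shut HDFS down again
./sbin/stop-dfs.sh
Check which processes are running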
jps
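In a pseudo-distributed setup, start-dfs.sh should leave NameNode, DataNode and SecondaryNameNode in the jps listing while HDFS is running. The sketch below checks jps-style output for those names; missing_daemons is a hypothetical helper written for this guide.

```shell
# Hypothetical helper: read jps-style output on stdin and report daemons that are absent.
missing_daemons() {
    out=$(cat)
    for d in "$@"; do
        # jps prints "<pid> <ClassName>", so compare the second field exactly
        printf '%s\n' "$out" | awk -v d="$d" '$2 == d { ok = 1 } END { exit !ok }' \
            || echo "missing: $d"
    done
}

if command -v jps >/dev/null 2>&1; then
    jps | missing_daemons NameNode DataNode SecondaryNameNode
fi
```

No output means all three daemons were found. If YARN is started as well, ResourceManager and NodeManager can be added to the argument list.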
If startup fails with "hadoop: hadoop@hadoop: Permission denied (publickey,password).", set up passwordless SSH to the local host:
ssh-keygen -t rsa
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
chmod 600 ~/.ssh/authorized_keys
chmod 700 ~/.ssh
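sshd tends to ignore keys when these permissions are looser than expected, so it can be worth confirming that the two chmod commands took effect. The sketch below uses stat -c, the GNU coreutils form available on Ubuntu; check_mode is a hypothetical helper.

```shell
# Hypothetical helper: compare a path's permission bits with an expected octal mode.
check_mode() {
    # $1 = path, $2 = expected mode such as 700
    actual=$(stat -c '%a' "$1" 2>/dev/null) || { echo "$1: not found"; return 1; }
    if [ "$actual" = "$2" ]; then
        echo "$1: ok ($actual)"
    else
        echo "$1: mode is $actual, expected $2"
        return 1
    fi
}

check_mode "${HOME:-}/.ssh" 700 || true
check_mode "${HOME:-}/.ssh/authorized_keys" 600 || true
```

Once both lines report ok, ssh -o BatchMode=yes localhost true should return without asking for a password.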
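In Hadoop 3 the NameNode web UI is served at localhost:9870 instead of the old port 50070.
Configure YARN (optional)
cd /usr/local/hadoop/
# Hadoop 3.x normally ships etc/hadoop/mapred-site.xml already; the copy is only needed when just the .template file exists (as in 2.x)
[ -f etc/hadoop/mapred-site.xml ] || cp etc/hadoop/mapred-site.xml.template etc/hadoop/mapred-site.xml
Edit etc/hadoop/mapred-site.xml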
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>
Edit etc/hadoop/yarn-site.xml
<configuration>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
</configuration>
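Start YARN (the resource manager)
./sbin/start-yarn.sh
# mr-jobhistory-daemon.sh is deprecated in Hadoop 3; mapred --daemon is its replacement
./bin/mapred --daemon start historyserver
If ./sbin/start-yarn.sh reports insufficient permissions, grant permissions on the sbin directory:
sudo chmod 777 sbin
Distributed cluster deployment (to be continued)
Edit the following files under hadoop-3.3.0/etc/hadoop: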
- core-site.xml
- hadoop-env.sh
- hdfs-site.xml
- mapred-site.xml
- yarn-site.xml
- workers
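Of the files listed, workers is the only one not covered earlier; it is a plain list of worker hostnames, one per line. The sketch below writes such a list to a scratch file. The hostnames worker1 and worker2 and the helper write_workers are placeholders for illustration, not values from this guide.

```shell
# Hypothetical helper: write one hostname per line to a workers file.
write_workers() {
    # $1 = output file, remaining arguments = worker hostnames
    out_file=$1
    shift
    printf '%s\n' "$@" > "$out_file"
}

# Placeholder hostnames, written to a scratch file rather than the real etc/hadoop/workers
write_workers /tmp/workers.example worker1 worker2
cat /tmp/workers.example
```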