Installing Hadoop on YARN in Pseudo-Distributed Mode (Mac)

Prerequisites

SSH, Homebrew, and JDK 8 are already installed on the machine.

Configure passwordless SSH login to your own machine

Run the following command in a terminal:

$ ssh localhost

If it prompts for a password, run the following commands:

$ ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa
$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
$ chmod 0600 ~/.ssh/authorized_keys

Running ssh localhost again should now log in without a password.
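
The cat ... >> step above appends the public key every time it runs, so repeating the setup leaves duplicate lines in authorized_keys. A minimal sketch of a repeatable variant follows. The helper name append_key_once is hypothetical and not part of OpenSSH, and the sketch assumes the public key file holds a single line.

```shell
# append_key_once PUBKEY_FILE AUTH_FILE
# Appends the public key to AUTH_FILE only if that exact line is not already there.
# (append_key_once is a hypothetical helper name used for illustration.)
append_key_once() {
    pubkey_file=$1
    auth_file=$2
    key=$(cat "$pubkey_file")
    mkdir -p "$(dirname "$auth_file")"
    touch "$auth_file"
    # -q: quiet, -x: match the whole line, -F: treat the key as a fixed string
    if ! grep -qxF -- "$key" "$auth_file"; then
        printf '%s\n' "$key" >> "$auth_file"
    fi
    chmod 0600 "$auth_file"
}
```

It would be called as append_key_once ~/.ssh/id_rsa.pub ~/.ssh/authorized_keys, in place of the cat and chmod lines.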

Install Hadoop with Homebrew

Run the following command to install Hadoop:

$ brew install hadoop

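This installs the latest Hadoop version available from Homebrew. The paths below assume version 2.8.1; replace 2.8.1 with the version that was actually installed.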
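
Since the version number is part of every path in this article, it can help to ask Homebrew for the install prefix instead of typing it. The sketch below uses brew --prefix, which prints the prefix of an installed formula. The helper name hadoop_conf_dir is hypothetical, and the libexec/etc/hadoop layout is assumed from the paths used in this article.

```shell
# hadoop_conf_dir [PREFIX]
# Prints the configuration directory under a Homebrew-style Hadoop prefix.
# With no argument it asks Homebrew for the prefix of the hadoop formula.
# (hadoop_conf_dir is a hypothetical helper name used for illustration.)
hadoop_conf_dir() {
    prefix=${1:-$(brew --prefix hadoop)}
    printf '%s\n' "$prefix/libexec/etc/hadoop"
}
```

For example, cd "$(hadoop_conf_dir)" would change into the directory that holds core-site.xml and the other files edited below.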

Configure Hadoop for pseudo-distributed mode

Open /usr/local/Cellar/hadoop/2.8.1/libexec/etc/hadoop/core-site.xml and add the following configuration:

<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://localhost:9000</value>
    </property>
</configuration>

Open /usr/local/Cellar/hadoop/2.8.1/libexec/etc/hadoop/hdfs-site.xml and add the following configuration:

<configuration>
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
</configuration>
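
To confirm that an edit landed in the right file, the value can be read back from the XML. The sketch below is a rough line-based reader, not a real XML parser. It assumes <name> and <value> each sit on their own line, as in the snippets above, and the helper name get_prop is hypothetical. Once Hadoop is on the PATH, hdfs getconf -confKey fs.defaultFS asks Hadoop itself for the effective value.

```shell
# get_prop FILE NAME
# Prints the <value> that follows <name>NAME</name> in a Hadoop *-site.xml file.
# (get_prop is a hypothetical helper name; this is a line-based sketch, not an XML parser.)
get_prop() {
    awk -v want="$2" '
        # Remember that the wanted <name> line was seen, then move to the next line
        index($0, "<name>" want "</name>") { found = 1; next }
        # The first <value> line after it carries the setting
        found && /<value>/ {
            sub(/.*<value>/, "")
            sub(/<\/value>.*/, "")
            print
            exit
        }
    ' "$1"
}
```

For example, get_prop core-site.xml fs.defaultFS run in the configuration directory should print the HDFS address configured above.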

Open /usr/local/Cellar/hadoop/2.8.1/libexec/etc/hadoop/hadoop-env.sh and change

export HADOOP_OPTS="$HADOOP_OPTS -Djava.net.preferIPv4Stack=true"

to

export HADOOP_OPTS="$HADOOP_OPTS -Djava.net.preferIPv4Stack=true -Djava.security.krb5.realm= -Djava.security.krb5.kdc="

Open /usr/local/Cellar/hadoop/2.8.1/libexec/etc/hadoop/yarn-env.sh and add

YARN_OPTS="$YARN_OPTS -Djava.security.krb5.realm=OX.AC.UK -Djava.security.krb5.kdc=kdc0.ox.ac.uk:kdc1.ox.ac.uk"

Configure YARN. First create mapred-site.xml from its template:

cp /usr/local/Cellar/hadoop/2.8.1/libexec/etc/hadoop/mapred-site.xml.template /usr/local/Cellar/hadoop/2.8.1/libexec/etc/hadoop/mapred-site.xml
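
Running the cp again later would overwrite an already edited mapred-site.xml with the empty template. A minimal sketch of a guarded copy follows. The helper name ensure_mapred_site is hypothetical, and its argument is the configuration directory used throughout this article.

```shell
# ensure_mapred_site CONF_DIR
# Copies mapred-site.xml.template to mapred-site.xml unless the target already exists.
# (ensure_mapred_site is a hypothetical helper name used for illustration.)
ensure_mapred_site() {
    conf_dir=$1
    if [ ! -f "$conf_dir/mapred-site.xml" ]; then
        cp "$conf_dir/mapred-site.xml.template" "$conf_dir/mapred-site.xml"
    fi
}
```

For example: ensure_mapred_site /usr/local/Cellar/hadoop/2.8.1/libexec/etc/hadoop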

Open mapred-site.xml and add the following configuration:

<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
  <property>
    <name>mapred.child.java.opts</name>
    <value>-Xmx4096m</value>
  </property>
</configuration>

Open /usr/local/Cellar/hadoop/2.8.1/libexec/etc/hadoop/yarn-site.xml and add the following configuration:

<configuration>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
</configuration>

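Format HDFS

Remove any HDFS data left over from a previous run (replace yourusername with your macOS user name), then format the NameNode:

$ rm -rf /tmp/hadoop-yourusername
$ hdfs namenode -format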
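
The directory name comes from Hadoop's default hadoop.tmp.dir, which is /tmp/hadoop-${user.name}. Assuming that default has not been overridden in core-site.xml, the path can be derived from the login name rather than typed by hand. The helper name hadoop_tmp_dir below is hypothetical.

```shell
# hadoop_tmp_dir [USER]
# Prints /tmp/hadoop-USER, defaulting to the current login name.
# Assumes hadoop.tmp.dir is left at its default value.
# (hadoop_tmp_dir is a hypothetical helper name used for illustration.)
hadoop_tmp_dir() {
    printf '/tmp/hadoop-%s\n' "${1:-$(id -un)}"
}
```

The cleanup step then becomes rm -rf "$(hadoop_tmp_dir)".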

Start HDFS and YARN

Start HDFS

$ /usr/local/Cellar/hadoop/2.8.1/sbin/start-dfs.sh

Start YARN

$ /usr/local/Cellar/hadoop/2.8.1/sbin/start-yarn.sh

Check that everything started

$ jps

After a normal startup, jps should list the following five Java processes (in addition to Jps itself):

NameNode
SecondaryNameNode
DataNode
NodeManager
ResourceManager
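
Checking the list by eye is easy to get wrong, since NameNode is also a substring of SecondaryNameNode. The sketch below reads jps output on standard input and prints whichever of the five daemons is missing. The helper name missing_daemons is hypothetical, and it assumes the usual jps format of a process ID followed by the class name.

```shell
# missing_daemons
# Reads jps output on stdin and prints each expected daemon that is absent.
# Returns 0 when all five are present, 1 otherwise.
# (missing_daemons is a hypothetical helper name used for illustration.)
missing_daemons() {
    jps_output=$(cat)
    rc=0
    for daemon in NameNode SecondaryNameNode DataNode NodeManager ResourceManager; do
        # Compare against the whole second field so NameNode does not match SecondaryNameNode
        if ! printf '%s\n' "$jps_output" |
            awk -v d="$daemon" '$2 == d { ok = 1 } END { exit !ok }'; then
            echo "$daemon"
            rc=1
        fi
    done
    return $rc
}
```

It would be used as jps | missing_daemons; no output means all five daemons are running.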

Web interfaces

Cluster status: http://localhost:8088
HDFS status: http://localhost:50070
SecondaryNameNode: http://localhost:50090

Author: MrHook
Link: https://bigjar.github.io/2018/01/29/Hadoop%E5%9C%A8YARN%E4%B8%8A%E7%9A%84%E4%BC%AA%E5%88%86%E5%B8%83%E5%BC%8F%E5%AE%89%E8%A3%85-Mac/
Copyright Notice: All articles in this blog are licensed under CC BY-NC-SA 4.0 unless stating additionally.