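System Configuration
This guide builds a highly available (HA) Hadoop 2.10 cluster on CentOS 7. Prepare six servers; several roles share a host, so the role counts below add up to more than six:
- NameNode: 2
- DataNode: 4
- JournalNode: 3
- ZooKeeper: 3

| IP address | Hostname | Roles |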
|---|---|---|
| 192.168.30.141 | s141 | NameNode1, ZKFC, ZooKeeper |
| 192.168.30.142 | s142 | DataNode, JournalNode, ZooKeeper |
| 192.168.30.143 | s143 | DataNode, JournalNode, ZooKeeper |
| 192.168.30.144 | s144 | DataNode, JournalNode |
| 192.168.30.145 | s145 | DataNode |
| 192.168.30.146 | s146 | NameNode2, ZKFC |
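The choice of three JournalNodes and three ZooKeeper servers follows from majority-quorum arithmetic: an ensemble of N members keeps working while more than half of them are reachable, so it tolerates floor((N - 1) / 2) failures. A minimal sketch of that calculation (the function name `tolerated_failures` is illustrative and not part of Hadoop or ZooKeeper):

```shell
#!/usr/bin/env bash
# Number of member failures a majority-quorum ensemble of size N can survive.
tolerated_failures() {
  local n=$1
  echo $(( (n - 1) / 2 ))
}

# Three JournalNodes (or three ZooKeeper servers), as in the table above.
tolerated_failures 3
```

Adding a fourth member does not raise the tolerance, which is why odd ensemble sizes are the usual choice.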
Basic Setup
Hostnames and Network Configuration
Edit /etc/hosts on every server so that it contains the following entries:
127.0.0.1 localhost
192.168.30.141 s141
192.168.30.142 s142
192.168.30.143 s143
192.168.30.144 s144
192.168.30.145 s145
192.168.30.146 s146
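One way to confirm that a hosts file contains every cluster entry is sketched below. The function name `missing_hosts` and the hard-coded host list are assumptions made for illustration; the function prints each expected hostname that does not appear as a name field in the given file.

```shell
#!/usr/bin/env bash
# Print every expected cluster hostname that is absent from a hosts file.
# Usage: missing_hosts /etc/hosts
missing_hosts() {
  local hosts_file=$1 host
  for host in s141 s142 s143 s144 s145 s146; do
    # Skip comment lines, then look for the name in any field after the IP address.
    if ! awk -v h="$host" '
        /^[[:space:]]*#/ { next }
        { for (i = 2; i <= NF; i++) if ($i == h) found = 1 }
        END { exit found ? 0 : 1 }' "$hosts_file"; then
      echo "$host"
    fi
  done
}
```

Running `missing_hosts /etc/hosts` on each server should print nothing once all entries are in place.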
SSH Key Authentication
Generate a key pair on each NameNode server (s141 and s146):
ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa
Distribute the public key to all other servers:
# Distribute from s141
scp ~/.ssh/id_rsa.pub hdfs@s142:~/.ssh/id_rsa_141.pub
scp ~/.ssh/id_rsa.pub hdfs@s143:~/.ssh/id_rsa_141.pub
scp ~/.ssh/id_rsa.pub hdfs@s144:~/.ssh/id_rsa_141.pub
scp ~/.ssh/id_rsa.pub hdfs@s145:~/.ssh/id_rsa_141.pub
scp ~/.ssh/id_rsa.pub hdfs@s146:~/.ssh/id_rsa_141.pub
# Distribute from s146
scp ~/.ssh/id_rsa.pub hdfs@s141:~/.ssh/id_rsa_146.pub
scp ~/.ssh/id_rsa.pub hdfs@s142:~/.ssh/id_rsa_146.pub
scp ~/.ssh/id_rsa.pub hdfs@s143:~/.ssh/id_rsa_146.pub
scp ~/.ssh/id_rsa.pub hdfs@s144:~/.ssh/id_rsa_146.pub
scp ~/.ssh/id_rsa.pub hdfs@s145:~/.ssh/id_rsa_146.pub
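On each server, append the received keys to authorized_keys. On s141 and s146 only the other NameNode's key was copied over, so append the local ~/.ssh/id_rsa.pub in place of the missing file:
cat ~/.ssh/id_rsa_141.pub >> ~/.ssh/authorized_keys
cat ~/.ssh/id_rsa_146.pub >> ~/.ssh/authorized_keys
# Keep the file readable and writable by its owner only
chmod 600 ~/.ssh/authorized_keys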
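Before moving on, it can help to confirm that key-based login actually works from each NameNode. The following is a minimal sketch; the function name `check_passwordless_ssh` is an assumption made for illustration, while `BatchMode` and `ConnectTimeout` are standard OpenSSH client options.

```shell
#!/usr/bin/env bash
# Report which hosts accept key-based login for the hdfs user.
# Returns non-zero if at least one host fails.
check_passwordless_ssh() {
  local host failed=0
  for host in "$@"; do
    # BatchMode=yes makes ssh fail instead of prompting for a password.
    if ssh -o BatchMode=yes -o ConnectTimeout=5 "hdfs@$host" true 2>/dev/null; then
      echo "$host: ok"
    else
      echo "$host: FAILED"
      failed=1
    fi
  done
  return "$failed"
}
```

On s141 or s146, a call such as `check_passwordless_ssh s141 s142 s143 s144 s145 s146` should report `ok` for every host. A host whose key is not yet in known_hosts will likely be reported as failed as well, because batch mode cannot answer the host-key prompt; logging in once interactively resolves that.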
Hadoop Configuration
core-site.xml
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://mycluster</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/home/hdfs/hadoop</value>
</property>
<property>
<name>ha.zookeeper.quorum</name>
<value>s141:2181,s142:2181,s143:2181</value>
</property>
</configuration>
hdfs-site.xml
<configuration>
<property>
<name>dfs.replication</name>
<value>3</value>
</property>
<property>
<name>dfs.nameservices</name>
<value>mycluster</value>
</property>
<property>
<name>dfs.ha.namenodes.mycluster</name>
<value>nn1,nn2</value>
</property>
<property>
<name>dfs.namenode.rpc-address.mycluster.nn1</name>
<value>s141:8020</value>
</property>
<property>
<name>dfs.namenode.rpc-address.mycluster.nn2</name>
<value>s146:8020</value>
</property>
<property>
<name>dfs.namenode.http-address.mycluster.nn1</name>
<value>s141:50070</value>
</property>
<property>
<name>dfs.namenode.http-address.mycluster.nn2</name>
<value>s146:50070</value>
</property>
<property>
<name>dfs.namenode.shared.edits.dir</name>
<value>qjournal://s142:8485;s143:8485;s144:8485/mycluster</value>
</property>
<property>
<name>dfs.client.failover.proxy.provider.mycluster</name>
<value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
<property>
<name>dfs.ha.fencing.methods</name>
<value>sshfence</value>
</property>
<property>
<name>dfs.ha.fencing.ssh.private-key-files</name>
<value>/home/hdfs/.ssh/id_rsa</value>
</property>
<property>
<name>dfs.journalnode.edits.dir</name>
<value>/home/hdfs/hadoop/journal</value>
</property>
<property>
<name>dfs.ha.automatic-failover.enabled</name>
<value>true</value>
</property>
</configuration>
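After the files are deployed, the values Hadoop actually resolves can be printed with `hdfs getconf -confKey`, which helps catch a typo in a property name. A small sketch, assuming the `hdfs` command is on the PATH and the nameservice is called `mycluster` as above (the function name `show_ha_conf` is illustrative):

```shell
#!/usr/bin/env bash
# Print the effective value of each HA-related key as Hadoop resolves it.
show_ha_conf() {
  local key
  for key in fs.defaultFS dfs.nameservices dfs.ha.namenodes.mycluster \
             dfs.namenode.shared.edits.dir dfs.ha.automatic-failover.enabled; do
    printf '%s=%s\n' "$key" "$(hdfs getconf -confKey "$key")"
  done
}
```

Calling `show_ha_conf` on any cluster host should echo back the values set in core-site.xml and hdfs-site.xml.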
mapred-site.xml
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>
yarn-site.xml
<configuration>
<property>
<name>yarn.resourcemanager.hostname</name>
<value>s141</value>
</property>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
</configuration>
Cluster Initialization
Starting the JournalNodes
Run on each JournalNode host (s142, s143, and s144):
hadoop-daemon.sh start journalnode
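Initializing the NameNodes
Format the file system on the primary NameNode (s141). The JournalNodes must already be running, because in an HA configuration formatting also initializes the shared edits directory on them:
hdfs namenode -format
Start the primary NameNode so that the standby can copy its metadata:
hadoop-daemon.sh start namenode
On the standby NameNode (s146), synchronize the metadata from the primary:
hdfs namenode -bootstrapStandby
Initializing the edit log is needed only when converting an existing non-HA NameNode to HA; a freshly formatted cluster can skip this step:
hdfs namenode -initializeSharedEdits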
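Starting the Cluster
# Start the NameNode (on each NameNode host where it is not already running)
hadoop-daemon.sh start namenode
# Start the DataNodes (hadoop-daemons.sh runs the command on every host listed in etc/hadoop/slaves)
hadoop-daemons.sh start datanode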
ZooKeeper Configuration
Example zoo.cfg
tickTime=2000
initLimit=10
syncLimit=5
dataDir=/home/hdfs/zookeeper
clientPort=2181
server.1=s141:2888:3888
server.2=s142:2888:3888
server.3=s143:2888:3888
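Each ZooKeeper server also needs a `myid` file in its `dataDir` containing the number from its own `server.N` line. The sketch below derives that number from zoo.cfg; the function name `write_myid` is hypothetical, and the parsing assumes one `key=value` setting per line as in a real zoo.cfg file.

```shell
#!/usr/bin/env bash
# Look up a host's server id in zoo.cfg and write it to <dataDir>/myid.
# Usage: write_myid /path/to/zoo.cfg "$(hostname -s)"
write_myid() {
  local cfg=$1 host=$2 data_dir id
  data_dir=$(sed -n 's/^dataDir=//p' "$cfg")
  # Match "server.<id>=<host>:..." and keep only <id>.
  id=$(sed -n "s/^server\.\([0-9][0-9]*\)=${host}:.*/\1/p" "$cfg")
  if [ -z "$data_dir" ] || [ -z "$id" ]; then
    echo "no dataDir or server entry for $host in $cfg" >&2
    return 1
  fi
  mkdir -p "$data_dir"
  echo "$id" > "$data_dir/myid"
}
```

With the example configuration, running this on s142 would write `2` to /home/hdfs/zookeeper/myid. ZooKeeper can then be started on each of the three hosts with `zkServer.sh start`.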
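Initializing the HA State
With the ZooKeeper ensemble running (zkServer.sh start on s141, s142, and s143), run the following on one of the NameNode hosts to create the failover znode in ZooKeeper:
hdfs zkfc -formatZK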
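Enabling Automatic Failover
Because dfs.ha.automatic-failover.enabled is set to true, start-dfs.sh also starts a ZKFC daemon on each NameNode host, and one NameNode is then elected active:
start-dfs.sh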
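Once the daemons are up, `hdfs haadmin -getServiceState` reports whether each NameNode is active or standby. A minimal sketch that checks for exactly one active NameNode, assuming the NameNode IDs `nn1` and `nn2` from hdfs-site.xml (the function name `check_ha_state` is illustrative):

```shell
#!/usr/bin/env bash
# Print the HA state of each NameNode; succeed only if exactly one is active.
check_ha_state() {
  local nn state active=0
  for nn in nn1 nn2; do
    # A NameNode that cannot be queried is reported as "unreachable".
    state=$(hdfs haadmin -getServiceState "$nn" 2>/dev/null) || state=unreachable
    echo "$nn: $state"
    if [ "$state" = "active" ]; then
      active=$((active + 1))
    fi
  done
  [ "$active" -eq 1 ]
}
```

To exercise failover, one option is to stop the active NameNode with `hadoop-daemon.sh stop namenode` and call `check_ha_state` again; the former standby should report `active` after a short delay.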