Hadoop Installation Tutorial
 
    Hadoop, an open-source distributed computing framework, has become increasingly popular in China; large companies such as Taobao and Baidu use it at scale. I had looked into Hadoop a bit before, so here is a rough walkthrough of its distributed installation.
  1. Pre-installation prerequisites
    1. Java JDK 1.6.x
    2. ssh and sshd, with passwordless SSH access working between all the servers
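A minimal sketch of the passwordless-SSH setup (assuming OpenSSH and the hrj account used later in this tutorial; repeat the copy step for every node):

```shell
# Generate an RSA key pair with an empty passphrase (once per machine that initiates logins)
ssh-keygen -t rsa -N "" -f ~/.ssh/id_rsa
# Install the public key on each node of the cluster (repeat for myna5..myna8)
ssh-copy-id hrj@myna7
# Verify: this should print the remote hostname without prompting for a password
ssh hrj@myna7 hostname
```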
  2. Download Hadoop: http://labs.renren.com/apache-mirror//hadoop/core/hadoop-0.21.0/hadoop-0.21.0.tar.gz
  3. This package needs to be downloaded and extracted on every server in the Hadoop cluster
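The download-and-unpack step on each node might look like this (using the mirror URL given above; any Apache mirror works):

```shell
cd ~
# Fetch the 0.21.0 tarball from the mirror listed above
wget http://labs.renren.com/apache-mirror//hadoop/core/hadoop-0.21.0/hadoop-0.21.0.tar.gz
# Unpack into ~/hadoop-0.21.0
tar -xzf hadoop-0.21.0.tar.gz
```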
  4. We plan to install the cluster across 4 servers, named myna[5-8]: myna5 is the NameNode and myna6 the JobTracker, so myna[5-6] are the masters; the remaining two, myna[7-8], are the slaves.
  5. Configure the environment variables on these four servers
    export JAVA_HOME=/path/to/jdk
    export HADOOP_HOME=~/hadoop-0.21.0
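To make these variables survive new shell sessions, one option (a sketch, assuming bash) is to append them to ~/.bashrc on each node:

```shell
# Point JAVA_HOME at the JDK install directory (not at the java binary itself)
cat >> ~/.bashrc <<'EOF'
export JAVA_HOME=/path/to/jdk
export HADOOP_HOME=~/hadoop-0.21.0
EOF
# Load the new settings into the current shell
source ~/.bashrc
```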
  6. Configure myna5 (NameNode) and myna6 (JobTracker):
    1. ~/hadoop-0.21.0/conf/core-site.xml:
      <property>
          <name>fs.default.name</name>
          <value>hdfs://myna5:54320</value>
          <description>The name of the default file system. A URI whose
          scheme and authority determine the FileSystem implementation. The
          uri's scheme determines the config property (fs.SCHEME.impl) naming
          the FileSystem implementation class. The uri's authority is used to
          determine the host, port, etc. for a filesystem.</description>
      </property>
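Note that this and all the <property> fragments below must sit inside the file's <configuration> root element; a complete core-site.xml would look roughly like this (description omitted for brevity):

```xml
<?xml version="1.0"?>
<configuration>
  <property>
      <name>fs.default.name</name>
      <value>hdfs://myna5:54320</value>
  </property>
</configuration>
```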

    2. ~/hadoop-0.21.0/conf/hdfs-site.xml:
      <property>
          <name>hadoop.tmp.dir</name>
          <value>/home/hrj/hadooptmp/hadoop-${user.name}</value>
          <description>A base for other temporary directories.</description>
      </property>
      <property>
               <name>dfs.upgrade.permission</name>
               <value>777</value>
      </property>

      <property>
               <name>dfs.umask</name>
               <value>022</value>
      </property>

    3. ~/hadoop-0.21.0/conf/mapred-site.xml:
      <property>
          <name>mapred.job.tracker</name>
          <value>myna6:54321</value>
          <description>The host and port that the MapReduce job tracker runs
          at. If "local", then jobs are run in-process as a single map
          and reduce task.
          </description>
      </property>
      <property>
          <name>mapred.compress.map.output</name>
          <value>true</value>
          <description>Should the outputs of the maps be compressed before being
          sent across the network. Uses SequenceFile compression.
          </description>
      </property>
      <property>
               <name>mapred.child.java.opts</name>
               <value>-Xmx1024m</value>
      </property>

    4. Configure the masters file, on myna5 only:
      myna6

    5. Configure the slaves file on both myna5 and myna6:
      myna7
      myna8
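Writing those two files can be sketched as follows (run in ~/hadoop-0.21.0/conf; the masters file on myna5 only, the slaves file on myna5 and myna6):

```shell
cd ~/hadoop-0.21.0/conf
# masters (on myna5 only): host that runs the secondary master daemon
echo myna6 > masters
# slaves (on myna5 and myna6): one worker hostname per line
printf 'myna7\nmyna8\n' > slaves
```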

  7. Configure myna7 and myna8 (slaves):
    1. ~/hadoop-0.21.0/conf/core-site.xml:
      <property>
              <name>fs.default.name</name>
              <value>hdfs://myna5:54320</value>
              <description>The name of the default file system. A URI whose
              scheme and authority determine the FileSystem implementation. The
              uri's scheme determines the config property (fs.SCHEME.impl) naming
              the FileSystem implementation class. The uri's authority is used to
              determine the host, port, etc. for a filesystem.</description>
      </property>

    2. ~/hadoop-0.21.0/conf/hdfs-site.xml:
      <property>
              <name>hadoop.tmp.dir</name>
              <value>/disk/hadooptmp/hadoop-${user.name}</value>
      </property>
      <property>
              <name>dfs.data.dir</name>
              <value>/home/hadoopdata,/disk/hadoopdata</value>
      </property>
      <property>
               <name>dfs.upgrade.permission</name>
               <value>777</value>
      </property>
      <property>
               <name>dfs.umask</name>
               <value>022</value>
      </property>
    3. ~/hadoop-0.21.0/conf/mapred-site.xml:
      <property>
          <name>mapred.job.tracker</name>
          <value>myna6:54321</value>
          <description>The host and port that the MapReduce job tracker runs
          at. If "local", then jobs are run in-process as a single map
          and reduce task.
          </description>
      </property>
      <property>
          <name>mapred.compress.map.output</name>
          <value>true</value>
          <description>Should the outputs of the maps be compressed before being
          sent across the network. Uses SequenceFile compression.
          </description>
      </property>
      <property>
               <name>mapred.child.java.opts</name>
               <value>-Xmx1024m</value>
      </property>
      <property>
          <name>mapred.tasktracker.map.tasks.maximum</name>
          <value>4</value>
      </property>
      <property>
          <name>mapred.tasktracker.reduce.tasks.maximum</name>
          <value>2</value>
      </property>

  8. Edit ~/hadoop-0.21.0/conf/hadoop-env.sh on myna[5-8] and add the JAVA_HOME environment variable:
    export JAVA_HOME=/path/to/jdk
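One way to script this edit on every node (a sketch; adjust the JDK path to each machine):

```shell
# Append JAVA_HOME to hadoop-env.sh; Hadoop's start scripts source this file
echo 'export JAVA_HOME=/path/to/jdk' >> ~/hadoop-0.21.0/conf/hadoop-env.sh
```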

  9. The Hadoop cluster configuration is now complete. Start Hadoop:
    1. Start HDFS on myna5
      hrj$ ~/hadoop-0.21.0/bin/start-dfs.sh

    2. Start MapReduce on myna6
      hrj$ ~/hadoop-0.21.0/bin/start-mapred.sh
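To check that the daemons actually came up, the JDK's jps tool can be run on each node (a rough sketch; the expected daemon names follow from the roles above):

```shell
# Expected, roughly: NameNode on myna5; JobTracker on myna6
# (plus SecondaryNameNode, per the masters file); DataNode and
# TaskTracker on myna7 and myna8.
jps
```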

  10. Once HDFS and MapReduce are up, you can reach their web front ends:
    1. HDFS: http://myna5:50070
    2. MapReduce: http://myna6:50030