Hadoop简易快速搭建(官网流程版)

    技术2022-07-10  197

    环境准备

    一个纯净版的centos7 虚拟机,配置好静态ip,主机名,主机映射

    配置静态ip

    vi /etc/sysconfig/network-scripts/ifcfg-eno16777736 ----------------------------------------------------- TYPE=Ethernet BOOTPROTO=none DEFROUTE=yes IPV4_FAILURE_FATAL=no IPV6INIT=yes IPV6_AUTOCONF=yes IPV6_DEFROUTE=yes IPV6_FAILURE_FATAL=no NAME=eno16777736 UUID=f19fae49-46da-4b52-b704-6e1ec4c0470e ONBOOT=yes HWADDR=00:0C:29:EC:14:1F IPADDR0=192.168.191.101 PREFIX0=24 GATEWAY0=192.168.191.2 IPV6_PEERDNS=yes IPV6_PEERROUTES=yes DNS1=114.114.114.114 DNS2=8.8.8.8

    配置主机映射

    vi /etc/hosts ----------------------------------------------------- 192.168.191.101 hadoop1

    在/opt下创建两个文件夹,software,install

    cd /opt mkdir software install

    免密登陆

    ssh-keygen #3次回车 #拷贝密钥 ssh-copy-id hadoop1

    安装jdk

    #解压 tar zxvf jdk-8u171-linux-x64.tar.gz -C /opt/install/ #配置环境变量 vi /etc/profile export JAVA_HOME=/opt/install/jdk1.8.0_171 export PATH=$JAVA_HOME/bin:$PATH #刷新环境变量 source /etc/profile #验证 java -version java version "1.8.0_171" Java(TM) SE Runtime Environment (build 1.8.0_171-b11) Java HotSpot(TM) 64-Bit Server VM (build 25.171-b11, mixed mode)

    安装Hadoop

    Hadoop 的安装模式分为3种:单机(本地)模式,伪分布式,完全分布式(集群模式)

    本地模式安装

    解压

    tar zxvf hadoop-2.6.0-cdh5.14.2.tar.gz -C /opt/install/

    环境变量

    vi /etc/profile --------------------------- # hadoop export HADOOP_HOME=/opt/install/hadoop-2.6.0-cdh5.14.2 export PATH=$HADOOP_HOME/bin:$PATH --------------------------------- source /etc/profile

    测试

    bin/hadoop

    官方示例演示

    mkdir input cp etc/hadoop/*.xml input bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.0-cdh5.14.2.jar grep input output 'dfs[a-z.]+' cat output/*

    伪分布式搭建

    修改HDFS配置文件

    etc/hadoop/hadoop-env.sh
    export JAVA_HOME=/opt/install/jdk1.8.0_221
    etc/hadoop/core-site.xml
    <property> <name>fs.defaultFS</name> <value>hdfs://hadoop1:9000</value> </property>
    etc/hadoop/hdfs-site.xml
    <property> <name>dfs.replication</name> <value>1</value> </property>
    etc/hadoop/slaves
    hadoop1

    格式化文件系统

    bin/hdfs namenode -format

    启动HDFS

    sbin/start-dfs.sh

    验证是否成功

    jps 20896 Jps 20787 SecondaryNameNode 20521 NameNode 20638 DataNode

    使用web浏览器访问50070端口,查看是否能打开

    文件管理界面

    官方示例演示

    hdfs dfs -mkdir /input hdfs dfs -put etc/hadoop/*.xml /input bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.0-cdh5.14.2.jar grep /input /output 'dfs[a-z.]+' hdfs dfs -cat /output/part-r-00000

    修改Yarn配置

    mapred-site.xml

    cp etc/hadoop/mapred-site.xml.template etc/hadoop/mapred-site.xml -------------------------- <property> <name>mapreduce.framework.name</name> <value>yarn</value> </property>

    yarn-site.xml

    <configuration> <property> <name>yarn.nodemanager.aux-services</name> <value>mapreduce_shuffle</value> </property> </configuration>

    启动Yarn

    sbin/start-yarn.sh

    查看jps进程

    jps 22304 ResourceManager 21989 DataNode 22389 NodeManager 21868 NameNode 22143 SecondaryNameNode 22703 Jps

    使用web浏览器访问8088端口,查看是否能打开

    官方示例演示

    hdfs dfs -rm -r -f /output bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.0-cdh5.14.2.jar grep /input /output 'dfs[a-z.]+' hdfs dfs -cat /output/part-r-00000
    Processed: 0.012, SQL: 9