Hadoop Pseudo-Distributed Installation (Linux)


    Infrastructure

    Make sure the network works:

    ping www.baidu.com

    Switch to the root user:

    su -

    Install via yum so that the ifconfig command works and files can be dragged into the VM (lrzsz provides rz/sz):

    yum -y install net-tools

    yum -y install lrzsz

    Configure the network

    Set a static IP:

    cd /etc/sysconfig/network-scripts

    vi ifcfg-ens33

    TYPE=Ethernet
    PROXY_METHOD=none
    BROWSER_ONLY=no
    BOOTPROTO=static            # changed from dhcp to static (required)
    DEFROUTE=yes
    IPV4_FAILURE_FATAL=no
    IPV6INIT=yes
    IPV6_AUTOCONF=yes
    IPV6_DEFROUTE=yes
    IPV6_FAILURE_FATAL=no
    IPV6_ADDR_GEN_MODE=stable-privacy
    NAME=ens33
    UUID=6cd6aa17-1522-4003-a117-e6301e3d20c2
    DEVICE=ens33
    ONBOOT=yes
    IPV6_PRIVACY=no
    IPADDR=192.168.204.131      # the VM's IP (required)
    PREFIX=24                   # required
    GATEWAY=192.168.204.1       # required
    DNS1=8.8.8.8                # required
    DNS2=114.114.114.114        # required

    service network restart
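
    A quick sanity check after the restart (a minimal sketch; ens33 and the addresses come from the config above):

    ip addr show ens33         # the static IP should be listed on the interface
    ping -c 3 www.baidu.com    # DNS and the default route still work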

    Set the hostname:

    vi /etc/sysconfig/network

    NETWORKING=yes
    HOSTNAME=node01
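
    On a systemd-based CentOS 7 system (which the firewalld commands below suggest), the hostname can also be applied directly; this assumes that distro version and is not part of the original steps:

    hostnamectl set-hostname node01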

    Map this machine's IP to its hostname:

    vi /etc/hosts

    192.168.204.131 node01

    Disable the firewall:

    systemctl stop firewalld.service
    systemctl disable firewalld.service
    firewall-cmd --state

    Disable SELinux:

    vi /etc/selinux/config

    SELINUX=disabled
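
    The config file change only takes effect after a reboot; to also relax SELinux for the current session (standard SELinux commands, not in the original steps):

    setenforce 0    # switch the running system to permissive mode
    getenforce      # verify the current mode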

    Set up time synchronization:

    yum install ntp -y
    vi /etc/ntp.conf

    server ntp1.aliyun.com

    service ntpd start
    chkconfig ntpd on
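
    To verify that ntpd is actually syncing (ntpq ships with the ntp package; not part of the original steps):

    ntpq -p    # the aliyun server should appear in the peer list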

    Install the JDK:

    rpm -i jdk-8u251-linux-x64.rpm
    vi /etc/profile

    export JAVA_HOME=/usr/java/default
    export PATH=$PATH:$JAVA_HOME/bin

    source /etc/profile
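
    A quick check that the JDK landed where /etc/profile expects it:

    java -version      # should report 1.8.0_251
    echo $JAVA_HOME    # should print /usr/java/default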

    Passwordless SSH:

    ssh localhost    # accept the host key (this also creates ~/.ssh), then exit
    ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
    cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
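
    To confirm passwordless login works (the chmod calls are a standard fix for sshd's permission checks, not part of the original steps):

    chmod 700 ~/.ssh
    chmod 600 ~/.ssh/authorized_keys
    ssh node01 date    # should print the date with no password prompt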

    Hadoop configuration (building out the application)

    Plan the installation paths:

    mkdir /opt/bigdata
    tar -zxvf hadoop-2.9.2.tar.gz
    mv hadoop-2.9.2 /opt/bigdata/
    vi /etc/profile

    export JAVA_HOME=/usr/java/default
    export PATH=$PATH:$JAVA_HOME/bin
    export HADOOP_HOME=/opt/bigdata/hadoop-2.9.2
    export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin    # bin is needed too: the hdfs command lives there

    source /etc/profile
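
    A quick check that the new PATH entries work (hadoop version is part of the stock distribution):

    hadoop version    # should report Hadoop 2.9.2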

    Configure Hadoop's roles

    cd $HADOOP_HOME/etc/hadoop

    JAVA_HOME must be configured for Hadoop here; otherwise the daemons started over ssh cannot find it, because non-interactive ssh sessions do not load the login profile:

    vi hadoop-env.sh

    export JAVA_HOME=/usr/java/default

    Specify where the NameNode (NN) role starts (the property goes inside the file's <configuration> element):

    vi core-site.xml

    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://node01:9000</value>
    </property>

    Set the HDFS replication factor to 1, along with the NN/DN storage directories and the SecondaryNameNode address:

    vi hdfs-site.xml

    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
    <property>
        <name>dfs.namenode.name.dir</name>
        <value>/var/bigdata/hadoop/local/dfs/name</value>
    </property>
    <property>
        <name>dfs.datanode.data.dir</name>
        <value>/var/bigdata/hadoop/local/dfs/data</value>
    </property>
    <property>
        <name>dfs.namenode.secondary.http-address</name>
        <value>node01:50090</value>
    </property>
    <property>
        <name>dfs.namenode.checkpoint.dir</name>
        <value>/var/bigdata/hadoop/local/dfs/secondary</value>
    </property>

    Specify where the DataNode (DN) role starts:

    vi slaves

    node01

    Format & start:

    hdfs namenode -format
    start-dfs.sh
    jps
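
    jps should list NameNode, DataNode, and SecondaryNameNode in a pseudo-distributed setup; if so, the cluster report should show one live DataNode (hdfs dfsadmin -report is a standard HDFS command, not part of the original steps):

    hdfs dfsadmin -report    # should report one live datanode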

    NameNode web UI: http://node01:50070

    On Windows, edit C:\Windows\System32\drivers\etc\hosts so the browser can resolve node01:

    192.168.204.131 node01

    Basic usage:

    hdfs dfs -mkdir /bigdata
    hdfs dfs -mkdir -p /user/root
    cd /root
    hdfs dfs -put hadoop-2.9.2.tar.gz /user/root
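
    To confirm the upload (paths without a leading / resolve relative to /user/<current user>, which is why /user/root was created first):

    hdfs dfs -ls /user/root    # hadoop-2.9.2.tar.gz should be listed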

    Verifying the key concepts:

    cd /var/bigdata/hadoop/local/dfs/name/current
    # observe whether the edit log transaction IDs come after the fsimage's

    cd /var/bigdata/hadoop/local/dfs/secondary/current
    # the SNN only needs to copy the latest fsimage and the incremental edit logs from the NN

    hdfs dfs -put hadoop*.tar.gz /user/root

    cd /var/bigdata/hadoop/local/dfs/data/current/BP-281147636-192.168.150.11-1560691854170/current/finalized/subdir0/subdir0
    # note: the BP-... block pool ID is generated at format time, so it will differ on your machine

    for i in `seq 100000`; do echo "hello hadoop $i" >> data.txt; done
    hdfs dfs -D dfs.blocksize=1048576 -put data.txt

    cd /var/bigdata/hadoop/local/dfs/data/current/BP-281147636-192.168.150.11-1560691854170/current/finalized/subdir0/subdir0
    # inspect the blocks data.txt was split into and see what their contents look like
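
    An easier way to see how data.txt was split, without digging through local directories (hdfs fsck is a standard tool; the path and glob assume the uploads and layout above):

    hdfs fsck /user/root/data.txt -files -blocks -locations
    # with dfs.blocksize=1048576, the roughly 1.8 MB file maps to two blocks

    tail -1 blk_*[0-9]    # run in the finalized subdir; skips .meta checksum files.
                          # HDFS splits on byte boundaries, so the first block's last line is cut mid-record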