• 2020-09-04


    1. Configure the five configuration files

    1. Configure core-site.xml

    fs.defaultFS

    hdfs://hadoop102:9820

    hadoop.tmp.dir

    /opt/module/hadoop-3.1.3/data

    hadoop.http.staticuser.user

    atguigu

    hadoop.proxyuser.atguigu.hosts *

    hadoop.proxyuser.atguigu.groups *
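Assembled into core-site.xml, the properties above might look like the following sketch (hostnames, ports, and the `atguigu` user are taken from the values listed above):

```xml
<configuration>
    <!-- NameNode address (internal RPC port) -->
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://hadoop102:9820</value>
    </property>
    <!-- Base directory for Hadoop data -->
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/opt/module/hadoop-3.1.3/data</value>
    </property>
    <!-- Static user shown in the HDFS web UI -->
    <property>
        <name>hadoop.http.staticuser.user</name>
        <value>atguigu</value>
    </property>
    <!-- Allow atguigu to proxy from any host and any group -->
    <property>
        <name>hadoop.proxyuser.atguigu.hosts</name>
        <value>*</value>
    </property>
    <property>
        <name>hadoop.proxyuser.atguigu.groups</name>
        <value>*</value>
    </property>
</configuration>
```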

    2. Configure hdfs-site.xml

    dfs.namenode.http-address

    hadoop102:9870

    dfs.namenode.secondary.http-address

    hadoop104:9868
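In hdfs-site.xml form, the two web addresses above might be written as:

```xml
<configuration>
    <!-- NameNode web UI -->
    <property>
        <name>dfs.namenode.http-address</name>
        <value>hadoop102:9870</value>
    </property>
    <!-- SecondaryNameNode web UI -->
    <property>
        <name>dfs.namenode.secondary.http-address</name>
        <value>hadoop104:9868</value>
    </property>
</configuration>
```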

    3. Configure yarn-site.xml

    yarn.nodemanager.aux-services

    mapreduce_shuffle

    yarn.resourcemanager.hostname

    hadoop103

    yarn.nodemanager.env-whitelist

    JAVA_HOME,HADOOP_COMMON_HOME,HADOOP_HDFS_HOME,HADOOP_CONF_DIR,CLASSPATH_PREPEND_DISTCACHE,HADOOP_YARN_HOME,HADOOP_MAPRED_HOME

    yarn.log-aggregation-enable

    true

    yarn.log.server.url

    http://hadoop102:19888/jobhistory/logs

    yarn.log-aggregation.retain-seconds

    604800

    yarn.scheduler.minimum-allocation-mb 512

    yarn.scheduler.maximum-allocation-mb 4096

    yarn.nodemanager.resource.memory-mb 4096

    yarn.nodemanager.pmem-check-enabled false

    yarn.nodemanager.vmem-check-enabled false
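Collected into yarn-site.xml, the properties in this section might look like this sketch (all values are the ones listed above):

```xml
<configuration>
    <!-- Shuffle service required by MapReduce -->
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
    <!-- ResourceManager host -->
    <property>
        <name>yarn.resourcemanager.hostname</name>
        <value>hadoop103</value>
    </property>
    <!-- Environment variables inherited by containers -->
    <property>
        <name>yarn.nodemanager.env-whitelist</name>
        <value>JAVA_HOME,HADOOP_COMMON_HOME,HADOOP_HDFS_HOME,HADOOP_CONF_DIR,CLASSPATH_PREPEND_DISTCACHE,HADOOP_YARN_HOME,HADOOP_MAPRED_HOME</value>
    </property>
    <!-- Log aggregation: keep logs on the history server for 7 days -->
    <property>
        <name>yarn.log-aggregation-enable</name>
        <value>true</value>
    </property>
    <property>
        <name>yarn.log.server.url</name>
        <value>http://hadoop102:19888/jobhistory/logs</value>
    </property>
    <property>
        <name>yarn.log-aggregation.retain-seconds</name>
        <value>604800</value>
    </property>
    <!-- Container memory limits; disable physical/virtual memory checks -->
    <property>
        <name>yarn.scheduler.minimum-allocation-mb</name>
        <value>512</value>
    </property>
    <property>
        <name>yarn.scheduler.maximum-allocation-mb</name>
        <value>4096</value>
    </property>
    <property>
        <name>yarn.nodemanager.resource.memory-mb</name>
        <value>4096</value>
    </property>
    <property>
        <name>yarn.nodemanager.pmem-check-enabled</name>
        <value>false</value>
    </property>
    <property>
        <name>yarn.nodemanager.vmem-check-enabled</name>
        <value>false</value>
    </property>
</configuration>
```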

    4. Configure mapred-site.xml

    mapreduce.framework.name

    yarn

    mapreduce.jobhistory.address

    hadoop102:10020

    mapreduce.jobhistory.webapp.address

    hadoop102:19888
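As mapred-site.xml, this section might look like the following sketch (note that 19888 is the history server's web UI port, `mapreduce.jobhistory.webapp.address`, while 10020 is its RPC address):

```xml
<configuration>
    <!-- Run MapReduce on YARN -->
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
    <!-- Job history server RPC address -->
    <property>
        <name>mapreduce.jobhistory.address</name>
        <value>hadoop102:10020</value>
    </property>
    <!-- Job history server web UI -->
    <property>
        <name>mapreduce.jobhistory.webapp.address</name>
        <value>hadoop102:19888</value>
    </property>
</configuration>
```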

    5. Configure workers

    vim /opt/module/hadoop-3.1.3/etc/hadoop/workers

    hadoop102

    hadoop103

    hadoop104

    6. Distribute the configuration files:

    [atguigu@hadoop102 hadoop]$xsync /opt/module/hadoop-3.1.3/etc/hadoop/

    7. Start the cluster.

    Format the NameNode (first start only):

    [atguigu@hadoop102 ~]$hdfs namenode -format

    // Start HDFS

    [atguigu@hadoop102 hadoop-3.1.3]$ sbin/start-dfs.sh

    // Start YARN (on hadoop103, where the ResourceManager is configured)

    [atguigu@hadoop103 hadoop-3.1.3]$ sbin/start-yarn.sh

    Access HDFS through the web UI:

    http://hadoop102:9870

    Access the ResourceManager web UI:

    http://hadoop103:8088

    (4) View the HDFS NameNode from the web

    (a) Open http://hadoop102:9870 in a browser

    (b) Browse the data stored on HDFS

    (5) View the YARN ResourceManager from the web

    (a) Open http://hadoop103:8088 in a browser

    (b) View the jobs running on YARN.

    Basic cluster test.

    Create a directory on HDFS (it will be visible in the NameNode web UI):

    [atguigu@hadoop102 ~]$ hadoop fs -mkdir /input

    Upload a file into the directory:

    [atguigu@hadoop102 ~]$ hadoop fs -put /home/atguigu/wcinput/word.txt /input

    // Where the uploaded file is stored on the DataNode's local disk:

    /opt/module/hadoop-3.1.3/data/dfs/data/current/BP-440821944-192.168.16.102-1599035131869/current/finalized/subdir0/subdir0

    Run the MapReduce WordCount example:

    [atguigu@hadoop102 hadoop-3.1.3]$ hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-3.1.3.jar wordcount /input /output

    The output directory /output must not already exist.

    8. Ways to start and stop the cluster.

    1) Start daemons individually.

    Start and stop HDFS daemons

    [atguigu@hadoop102 ~]$ hdfs --daemon start/stop namenode/datanode/secondarynamenode

    Start and stop YARN daemons

    [atguigu@hadoop102 ~]$ yarn --daemon start/stop resourcemanager/nodemanager

    Start and stop the history server

    [atguigu@hadoop102 ~]$ mapred --daemon start/stop historyserver

    2) Start HDFS and YARN module by module.

    Start and stop HDFS

    [atguigu@hadoop102 ~]$ start-dfs.sh / stop-dfs.sh

    Start and stop YARN

    [atguigu@hadoop102 ~]$ start-yarn.sh / stop-yarn.sh

    Start and stop the history server

    [atguigu@hadoop102 ~]$ mapred --daemon start/stop historyserver


    9. // A script to check the Java processes on every node

    $cd /home/atguigu/bin/

    $vim jpsall

    $chmod 755 jpsall

    #!/bin/bash

    for host in hadoop102 hadoop103 hadoop104

    do

    echo "=== $host ==="

    ssh "$host" jps "$@" | grep -v Jps

    done

    // A script to start and stop HDFS, YARN, and the history server

    $touch myhadoop.sh

    $vim myhadoop.sh

    #!/bin/bash

    if [ $# -lt 1 ]

    then

    echo "No Args Input"

    exit;

    fi

    case $1 in

    "start")

    echo "=== Starting cluster ==="

    echo "--- Starting HDFS ---"

    ssh hadoop102 "/opt/module/hadoop-3.1.3/sbin/start-dfs.sh"

    echo "--- Starting YARN ---"

    ssh hadoop103 "/opt/module/hadoop-3.1.3/sbin/start-yarn.sh"

    echo "--- Starting history server ---"

    ssh hadoop102 "/opt/module/hadoop-3.1.3/bin/mapred --daemon start historyserver"

    ;;

    "stop")

    echo "=== Stopping cluster ==="

    echo "--- Stopping history server ---"

    ssh hadoop102 "/opt/module/hadoop-3.1.3/bin/mapred --daemon stop historyserver"

    echo "--- Stopping YARN ---"

    ssh hadoop103 "/opt/module/hadoop-3.1.3/sbin/stop-yarn.sh"

    echo "--- Stopping HDFS ---"

    ssh hadoop102 "/opt/module/hadoop-3.1.3/sbin/stop-dfs.sh"

    ;;

    *)

    echo "Invalid input"

    exit

    ;;

    esac

    $chmod 755 myhadoop.sh

    // Distribute to the other two servers

    $xsync /home/atguigu/bin/

    10. Configure cluster time synchronization:

    1) **Time server configuration (must be done as the root user)**

    (0) Check the ntpd service status and auto-start status on all nodes

    [atguigu@hadoop102 ~]$ sudo systemctl status ntpd

    [atguigu@hadoop102 ~]$ sudo systemctl is-enabled ntpd

    (1) Stop the ntpd service and disable auto-start on all nodes

    [atguigu@hadoop102 ~]$ sudo systemctl stop ntpd

    [atguigu@hadoop102 ~]$ sudo systemctl disable ntpd

    (2) Modify the ntp.conf file on hadoop102

    [atguigu@hadoop102 ~]$ sudo vim /etc/ntp.conf

    Make the following changes:

    a) Change 1 (authorize all machines on the 192.168.1.0-192.168.1.255 subnet to query and synchronize time from this server). Change

    #restrict 192.168.1.0 mask 255.255.255.0 nomodify notrap

    to

    restrict 192.168.1.0 mask 255.255.255.0 nomodify notrap

    b) Change 2 (the cluster is on a LAN; do not use Internet time sources). Change

    server 0.centos.pool.ntp.org iburst

    server 1.centos.pool.ntp.org iburst

    server 2.centos.pool.ntp.org iburst

    server 3.centos.pool.ntp.org iburst

    to

    #server 0.centos.pool.ntp.org iburst

    #server 1.centos.pool.ntp.org iburst

    #server 2.centos.pool.ntp.org iburst

    #server 3.centos.pool.ntp.org iburst

    c) Addition 3 (when this node loses its network connection, it can still use its local clock to serve time to the other cluster nodes). Add:

    server 127.127.1.0

    fudge 127.127.1.0 stratum 10

    (3) Modify the /etc/sysconfig/ntpd file on hadoop102

    [atguigu@hadoop102 ~]$ sudo vim /etc/sysconfig/ntpd

    Add the following line (synchronize the hardware clock with the system time):

    SYNC_HWCLOCK=yes

    (4) Restart the ntpd service

    [atguigu@hadoop102 ~]$ sudo systemctl start ntpd

    (5) Enable ntpd to start on boot

    [atguigu@hadoop102 ~]$ sudo systemctl enable ntpd

  • Original source: https://blog.csdn.net/segegefe/article/details/126325546